Source: author: [email protected] www.expectfly.com/2017/08/15/…
Why do we need timed tasks
Let’s consider the solutions for the following business scenarios:
- The payment system runs the daily settlement at 1:00 a.m. every day, and settles the previous month on the first day of every month
- During an e-commerce flash sale, product discounts start at 8 o'clock
- In the 12306 ticket purchase system, orders that have not been paid within 30 minutes are reclaimed
- After a product is successfully shipped, an SMS notification is sent to the customer
There are many similar business scenarios. How do we solve them?
Many business scenarios require a task to run at a certain time, and scheduled tasks address exactly this need. In general, message-oriented middleware can replace some scheduled tasks: the two have much in common and are interchangeable in some scenarios.
For example, in the SMS notification scenario above, we can send an MQ message to a queue after the shipment succeeds, then consume the message and send the SMS.
But the two are not interchangeable in certain situations:
A) Time-driven/event-driven: Internal systems can generally be event-driven, but when external systems are involved, only a time-driven approach is possible. For example, scraping prices from an external website once every hour
B) Batch processing/item-by-item processing: When real-time handling is not needed, processing accumulated data in batches is more efficient than message-oriented middleware. Some business logic can only be processed in batches, such as a mobile carrier's monthly settlement of phone bills
C) Real-time/non-real-time: Message-oriented middleware can process data in real time, but in some cases this is not necessary, such as a VIP upgrade
D) Intra-system/system decoupling: Scheduled tasks generally run within one system, while message-oriented middleware can be used between two systems
What scheduled-task frameworks are available
Stand-alone
- Timer: a timer class that runs configured tasks at specified times. TimerTask is the task class, and it implements the Runnable interface
- ScheduledExecutorService: schedules tasks to run after a relative delay or periodically. Its drawback is that it cannot schedule by absolute date or time
- Spring scheduling: simple to configure and feature-rich. If the system runs on a single machine, Spring's scheduler is the preferred choice
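Because ScheduledExecutorService only accepts relative delays, a task such as "run at 1:00 a.m. every day" has to be translated into an initial delay plus a period. The following is a minimal sketch of that translation; the class name `DailySettlementScheduler` and its methods are hypothetical names chosen for illustration, and the fixed 24-hour period ignores daylight-saving shifts.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class DailySettlementScheduler {

    // Delay from `now` until the next occurrence of `runAt`:
    // later today if that time is still ahead, otherwise tomorrow.
    static Duration delayUntilNext(LocalDateTime now, LocalTime runAt) {
        LocalDateTime next = now.toLocalDate().atTime(runAt);
        if (!next.isAfter(now)) {
            next = next.plusDays(1);
        }
        return Duration.between(now, next);
    }

    // Registers `task` to run every day at `runAt` on the given executor.
    // Assumption: a fixed 24-hour period is acceptable (no DST handling).
    static void scheduleDaily(ScheduledExecutorService scheduler, Runnable task, LocalTime runAt) {
        long initialDelayMs = delayUntilNext(LocalDateTime.now(), runAt).toMillis();
        scheduler.scheduleAtFixedRate(task, initialDelayMs,
                TimeUnit.DAYS.toMillis(1), TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) {
        LocalDateTime now = LocalDateTime.of(2017, 8, 15, 23, 30);
        // At 23:30 the next 01:00 is 90 minutes away.
        System.out.println(delayUntilNext(now, LocalTime.of(1, 0))); // prints PT1H30M
    }
}
```

In a real application, `scheduleDaily` would be called once at startup with an executor from `Executors.newSingleThreadScheduledExecutor()`.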
Distributed
- Quartz: the de facto standard for scheduled tasks in Java. However, Quartz focuses on scheduling rather than data, and offers no workflow tailored to data processing. Although Quartz can achieve high availability for jobs through a database, it lacks distributed parallel scheduling
- TBSchedule: Alibaba's early open source distributed task scheduling system. The code is somewhat dated and uses Timer rather than a thread pool for task scheduling. Timer is known to handle exceptions poorly. In addition, TBSchedule supports only one job type, a data fetch/process pattern. Its documentation is also seriously lacking
- Elastic-job: an elastic distributed task scheduling system developed by Dangdang. It uses ZooKeeper for distributed coordination, high availability of tasks, and sharding. Currently in version 2.15, it supports cloud development
- Saturn: a distributed scheduling platform for scheduled tasks developed in-house by Vipshop. It is based on version 1 of Dangdang's Elastic-Job and can be deployed to Docker containers
- Xxl-job: a lightweight distributed task scheduling framework released in 2015 by Xu Xueli, an employee of Dianping. Its core design goals are rapid development, ease of learning, light weight, and easy extension
Comparison of distributed task scheduling systems
The candidate systems compared below are Elastic-Job (E-job) and XXL-Job (X-job)
Project background and community strength
X-job: developed by Xu Xueli, an employee of Dianping, and 3 contributors; 2470 stars and 1015 forks; 6 QQ discussion groups; more than 40 companies registered as users; complete documentation
E-job: open-sourced by Dangdang, with 17 contributors; 2524 stars and 1015 forks; 1 QQ discussion group and 1 source-code discussion group; more than 50 companies registered as users; complete documentation; a clear development plan
Cluster Deployment
X-job: The scheduling center supports cluster deployment. The only requirement is that all cluster nodes use the same DB configuration and the same login account configuration; the scheduling center uses the DB configuration to distinguish clusters.
Executors also support cluster deployment, which improves the availability and task-processing capacity of the scheduling system. The only requirement is that every executor in the cluster is configured with the same xxl.job.admin.addresses (the scheduling center address); executors then register themselves automatically based on this configuration.
E-job: Reimplements Quartz's database-based distributed functionality, using ZooKeeper as the registry
Job registry: a global job registration and control center built on ZooKeeper and its client Curator. It is used to register, control, and coordinate distributed job execution.
Avoiding duplicate execution when multiple nodes are deployed
X-job: Uses Quartz’s database-based distributed capabilities
E-job: A job is split into N shard items, and each server executes only the items assigned to it. If a new server joins the cluster or an existing server goes offline, Elastic-Job re-shards the job before the next run starts, without affecting the current run.
Log traceability
X-job: supports log query
E-job: Important events in the scheduling process are published and can be handled by subscribing to them, for querying, statistics, and monitoring. Currently, Elastic-Job provides two event subscription methods for recording events, both based on a relational database.
Monitoring alarm
X-job: when scheduling fails, failure alarms are triggered, such as sending alarm emails.
Multiple email addresses can be configured. Use commas (,) to separate multiple email addresses
E-job: Alarms can be implemented by users themselves by subscribing to events. The following can be monitored:
- Job running status
- Job server liveness
- Recent data-processing successes (for dataflow-type jobs): the number of recently successful data-processing operations indicates whether job traffic is normal. If it falls below the threshold for normal operation, an alarm can be raised
- Recent data-processing failures: the number of recent failures indicates the result of job processing. If it is greater than 0, an alarm can be raised
Elastic capacity expansion and reduction
X-job: Relies on Quartz’s database-based distributed capabilities, so beyond a certain number of servers the database comes under noticeable load
E-job: Uses ZK to register, control, and coordinate services
Support parallel scheduling
X-job: Multiple threads in the scheduling system (10 threads by default) trigger the scheduling to ensure that the scheduling is executed accurately and not blocked.
E-job: Implements task sharding. A task is divided into N independent shard items, which are executed in parallel by the distributed servers.
High availability policy
X-job: The scheduling center uses DB locks to keep distributed scheduling consistent across the cluster. Each scheduled trigger results in only one execution.
E-job: High availability of the scheduler is achieved by running several instances of elastic-job-cloud-scheduler that point to the same ZooKeeper cluster. The cluster consists of at least two scheduler instances, of which only one provides service while the others are on standby. If the current master elastic-job-cloud-scheduler instance fails, ZooKeeper is used to run a leader election, and one of the remaining instances takes over.
Failure Handling Strategy
X-job: Provides handling policies for scheduling failures: failure alarm (the default) and retry.
E-job: Elastic scaling takes effect through re-sharding before the next job run; during the current run, shards assigned to a server that has gone offline are not re-assigned. With failover, an idle server can pick up these orphaned shards within the current run. Failover also costs some performance.
Dynamic Sharding strategy
X-job: Sharding broadcast tasks are sharded by executor. The executor cluster can be expanded dynamically, which increases the number of shards that cooperate on the work. This can significantly improve processing capacity and speed for large data operations.
When the task routing policy is sharding broadcast, one scheduling trigger makes every executor in the cluster run the task and passes each of them its sharding parameters. Sharded tasks can then be developed against these parameters.
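To illustrate how a sharded task can use the sharding parameters, the sketch below assumes each executor receives a shard index and a shard total, and keeps only the records whose ID maps to its own index. It deliberately does not use the XXL-Job API; `ShardBroadcastExample` and `idsForShard` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ShardBroadcastExample {

    // Returns the record IDs that the executor with `shardIndex`
    // (0-based, out of `shardTotal` executors) should process.
    static List<Long> idsForShard(List<Long> allIds, int shardIndex, int shardTotal) {
        List<Long> mine = new ArrayList<>();
        for (long id : allIds) {
            // floorMod keeps the result non-negative even for negative IDs.
            if (Math.floorMod(id, (long) shardTotal) == shardIndex) {
                mine.add(id);
            }
        }
        return mine;
    }

    public static void main(String[] args) {
        List<Long> ids = Arrays.asList(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L);
        // Every executor runs the same code with a different shard index.
        System.out.println(idsForShard(ids, 0, 3)); // prints [3, 6, 9]
        System.out.println(idsForShard(ids, 1, 3)); // prints [1, 4, 7, 10]
        System.out.println(idsForShard(ids, 2, 3)); // prints [2, 5, 8]
    }
}
```

Adding a fourth executor only changes the shard total passed to each node, so the same code redistributes the work without modification.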
E-job: Supports multiple sharding policies, and custom policies can be added
Three sharding policies are built in: a policy based on the average allocation algorithm; a policy that sorts server IP addresses in ascending or descending order, chosen by the hash value of the job name; and a policy that rotates the job instance list based on the hash value of the job name. You can also define your own sharding policy
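As an illustration of average allocation, the sketch below gives every server an equal block of consecutive shard items and then hands any remainder out one item at a time, starting from the first server. This is an independent sketch under that assumption, not Elastic-Job's source code; `AverageAllocationSketch` and `assign` are hypothetical names, and server names are assumed to be unique.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class AverageAllocationSketch {

    // Assigns shard items 0..shardTotal-1 to the given servers as evenly as possible.
    static Map<String, List<Integer>> assign(List<String> servers, int shardTotal) {
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        if (servers.isEmpty()) {
            return result;
        }
        int perServer = shardTotal / servers.size();
        int next = 0;
        // Step 1: each server gets a block of `perServer` consecutive items.
        for (String server : servers) {
            List<Integer> items = new ArrayList<>();
            for (int i = 0; i < perServer; i++) {
                items.add(next++);
            }
            result.put(server, items);
        }
        // Step 2: leftover items go one by one to the first servers.
        for (String server : servers) {
            if (next >= shardTotal) {
                break;
            }
            result.get(server).add(next++);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(assign(Arrays.asList("A", "B", "C"), 8));
        // prints {A=[0, 1, 6], B=[2, 3, 7], C=[4, 5]}
        // Re-sharding after server C goes offline:
        System.out.println(assign(Arrays.asList("A", "B"), 8));
        // prints {A=[0, 1, 2, 3], B=[4, 5, 6, 7]}
    }
}
```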
Elastic-Job sharding is implemented using ZooKeeper. Shards are allocated by the master node. The sharding algorithm on the master node is triggered in the following three situations: a. New job instances are added to the cluster. b.
Comparison with the Quartz framework
- Tasks are managed by calling the API, which is not user-friendly.
- The business QuartzJobBean has to be persisted into the underlying data tables, which is quite intrusive.
- Scheduling logic and QuartzJobBean are coupled in the same project. As the number of scheduled tasks and the amount of task logic grow, the performance of the scheduling system becomes heavily constrained by the business code.
- Quartz focuses on scheduling rather than data, and offers no workflow tailored to data processing. Although Quartz can achieve high availability for jobs through a database, it lacks distributed parallel scheduling.
Comprehensive comparison
Summary and conclusion
In common:
E-job and X-job both have a wide user base and complete technical documentation, and both meet the basic functional requirements of scheduled tasks.
Differences:
X-job emphasizes simple business implementation, easy management, and a low learning cost, and offers rich failure policies and routing policies. It is recommended when the user base is relatively small and the number of servers stays within a certain range.
E-job focuses on data, adding elastic scaling and data sharding to make the most of distributed server resources. However, its learning cost is higher. It is recommended when the data volume is large and many servers are deployed
Other approaches to scheduled-task scenarios
Consider this requirement: the system automatically confirms receipt when the customer has not confirmed receipt within 10 days after shipment.
Every day at midnight, select the orders that will become eligible for automatic confirmation the next day. Then, during that day, run the confirmation every 10 minutes over this pre-selected set. This is inexpensive and the timing is reasonably accurate.
If the auto-confirmed status only needs to be shown to the customer, it is enough to compute it the next time the user comes online.
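Both approaches above can be sketched with plain date arithmetic. In the sketch below, `dueOn` stands for the midnight pre-selection step and `isAutoConfirmed` for the on-demand check; the class, the method names, and the in-memory map that stands in for the order table are all hypothetical.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class AutoConfirmSketch {

    static final int CONFIRM_AFTER_DAYS = 10;

    // Midnight pre-selection: IDs of orders whose auto-confirm moment falls on `day`.
    static List<String> dueOn(Map<String, LocalDateTime> shippedAt, LocalDate day) {
        List<String> due = new ArrayList<>();
        for (Map.Entry<String, LocalDateTime> e : shippedAt.entrySet()) {
            LocalDateTime confirmAt = e.getValue().plusDays(CONFIRM_AFTER_DAYS);
            if (confirmAt.toLocalDate().equals(day)) {
                due.add(e.getKey());
            }
        }
        return due;
    }

    // On-demand check: true once 10 days have passed since shipment.
    static boolean isAutoConfirmed(LocalDateTime shippedAt, LocalDateTime now) {
        return !now.isBefore(shippedAt.plusDays(CONFIRM_AFTER_DAYS));
    }

    public static void main(String[] args) {
        Map<String, LocalDateTime> shippedAt = new LinkedHashMap<>();
        shippedAt.put("order-1", LocalDateTime.of(2017, 8, 5, 14, 0));
        shippedAt.put("order-2", LocalDateTime.of(2017, 8, 6, 9, 0));
        System.out.println(dueOn(shippedAt, LocalDate.of(2017, 8, 15))); // prints [order-1]
        System.out.println(isAutoConfirmed(shippedAt.get("order-1"),
                LocalDateTime.of(2017, 8, 15, 13, 59))); // prints false
    }
}
```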
Delayed and timed message delivery
ActiveMQ provides a broker-side message scheduling mechanism. It suits cases where a message should not be delivered to consumers immediately but, for example, 60 seconds later.
RabbitMQ can set a TTL on a queue (x-message-ttl) or on an individual message to control how long messages live. When the TTL expires, the message becomes a dead letter. With a dead letter exchange (DLX), a message that becomes a dead letter in one queue can be republished to another exchange, where it can be consumed again.
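The TTL plus DLX combination is configured through queue arguments. The sketch below only builds that argument map, so it runs without a broker or client library; the argument keys are RabbitMQ's documented queue arguments, while the class name, exchange name, and routing key are hypothetical placeholders.

```java
import java.util.HashMap;
import java.util.Map;

class DelayQueueArguments {

    // Arguments for a "waiting" queue: messages sit here with no consumer,
    // expire after `ttlMillis`, and are then republished to the dead letter exchange.
    static Map<String, Object> delayQueueArgs(int ttlMillis,
                                              String deadLetterExchange,
                                              String deadLetterRoutingKey) {
        Map<String, Object> args = new HashMap<>();
        args.put("x-message-ttl", ttlMillis);
        args.put("x-dead-letter-exchange", deadLetterExchange);
        args.put("x-dead-letter-routing-key", deadLetterRoutingKey);
        return args;
    }

    public static void main(String[] args) {
        // Hypothetical names: expired messages are routed to "order.dlx"
        // with routing key "order.timeout" after 60 seconds.
        Map<String, Object> queueArgs = delayQueueArgs(60_000, "order.dlx", "order.timeout");
        System.out.println(queueArgs);
        // With the RabbitMQ Java client (not included here), the map would be passed as
        // the last parameter of channel.queueDeclare(queue, durable, exclusive, autoDelete, arguments).
    }
}
```

A consumer bound to the dead letter exchange then receives each message roughly one TTL after it was published, which gives the delayed delivery described above.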