This article is reposted from the vivo Internet Technology WeChat public account: mp.weixin.qq.com/s/l4vuYpNRj… Author: Chen Wangrong
A distributed task scheduling framework is a necessary tool for almost every large application. This article introduces the background and pain points behind adopting a task scheduling framework, explores the open source distributed task scheduling frameworks widely used in the industry, and analyzes the advantages and disadvantages of these frameworks together with our own business considerations.
I. Business background
1.1 Why Is Scheduled Task Scheduling Required
(1) Time-driven processing scenario: send coupons on the hour, update revenue every day, refresh label data and crowd data every day.
(2) Batch data processing: monthly batch statistics report data, batch update SMS status, real-time requirements are not high.
(3) Asynchronous execution decoupling: activity state refresh, asynchronous execution of offline query, and internal logic decoupling.
1.2 Usage requirements and pain points
(1) Alarm monitoring capability for task execution.
(2) Tasks can be flexibly and dynamically configured without restart.
(3) Business transparency, low coupling, simplified configuration and convenient development.
(4) Easy to test.
(5) High availability, no single point of failure.
(6) Tasks must not be executed repeatedly, to prevent logical exceptions.
(7) The ability to split large tasks and process them in parallel.
II. Open source framework practice and exploration
2.1 Java native Timer and ScheduledExecutorService
2.1.1 Using Timer
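A minimal Timer usage sketch, with an illustrative task body:

import java.util.Timer;
import java.util.TimerTask;

public class TimerDemo {
    public static void main(String[] args) {
        Timer timer = new Timer();
        // Run every 5 seconds after an initial 1-second delay,
        // on Timer's single background thread.
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                System.out.println("timer task fired at " + System.currentTimeMillis());
            }
        }, 1000, 5000);
    }
}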
Timer defects:
- At the bottom, Timer uses a single thread to process multiple timer tasks, which means all tasks are actually executed serially, and a delay in one task affects the execution of subsequent tasks.
- Because of the single thread, if an unhandled exception occurs while a scheduled task is running, not only does the current thread stop, but all scheduled tasks stop.
- Timer task execution depends on absolute system time, so a change in system time changes the execution plan.
Because of the above defects, try not to use Timer; IDEA will even give a clear hint suggesting ScheduledThreadPoolExecutor as a replacement for Timer.
2.1.2 Using ScheduledExecutorService
ScheduledExecutorService fixes the defects of Timer. First, ScheduledExecutorService is backed by the ScheduledThreadPoolExecutor thread pool, so multiple tasks can run concurrently.
ScheduledExecutorService schedules based on time intervals (delays), so its execution does not change when the system time changes.
Of course, ScheduledExecutorService has its own limitation: it can only schedule based on task delays and cannot meet requirements for scheduling by absolute time or calendar.
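A minimal sketch of the same kind of task on ScheduledThreadPoolExecutor, with illustrative task bodies:

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class ScheduledExecutorDemo {
    public static void main(String[] args) {
        // A pool of 2 threads, so one slow or failing task does not block the others.
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(2);

        // Fixed rate: fire every 5 seconds after an initial 1-second delay.
        scheduler.scheduleAtFixedRate(
                () -> System.out.println("task A at " + System.currentTimeMillis()),
                1, 5, TimeUnit.SECONDS);

        // Fixed delay: the next run starts 5 seconds after the previous run finishes.
        scheduler.scheduleWithFixedDelay(
                () -> System.out.println("task B at " + System.currentTimeMillis()),
                1, 5, TimeUnit.SECONDS);
    }
}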
2.2 Spring Task
2.2.1 Using Spring Task
Spring Task is a lightweight scheduled task framework independently developed by Spring. It does not need to rely on additional packages and is easy to configure.
Annotation-based configuration is used here.
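A minimal sketch, assuming a Spring Boot application; the class and method names are illustrative:

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.scheduling.annotation.EnableScheduling;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@SpringBootApplication
@EnableScheduling // turn on Spring Task scheduling
public class TaskApplication {
    public static void main(String[] args) {
        SpringApplication.run(TaskApplication.class, args);
    }
}

@Component
class ReportTask {
    // Fixed rate: run every 60 seconds.
    @Scheduled(fixedRate = 60000)
    public void refreshRevenue() {
        // business logic
    }

    // Cron: run every day at 01:00.
    @Scheduled(cron = "0 0 1 * * ?")
    public void dailyReport() {
        // business logic
    }
}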
2.2.2 Spring Task Defects
Spring Task itself does not support persistence, and the official project does not provide a distributed cluster mode. Developers can only extend it manually in their business applications, which cannot meet requirements for visualization and easy configuration.
2.3 Forever classic Quartz
2.3.1 Basic Introduction
The Quartz framework is the most well-known open source task scheduling tool in Java and the de facto scheduled task standard. Almost all open source scheduled task frameworks are built based on the Quartz core scheduling.
2.3.2 Principle analysis
Core components and architecture
Key concepts
(1) Scheduler: the task scheduler, a controller that performs task scheduling. It is essentially a scheduling container that registers all Triggers and the corresponding JobDetails, and uses a thread pool as the basic component for running tasks to improve execution efficiency.
(2) Trigger: defines the time rules for task scheduling and tells the scheduler when to trigger a task. CronTrigger is a powerful trigger built on cron expressions.
(3) Calendar: a collection of specific calendar points in time. A trigger can reference multiple calendars, which are used to exclude or include certain points in time.
(4) JobDetail: the description of an executable Job, covering the Job implementation class and other static information such as the job name and listeners.
(5) Job: the task execution interface, with a single execute method that runs the real business logic.
(6) JobStore: the task storage mode, mainly RAMJobStore and JDBCJobStore. RAMJobStore keeps everything in JVM memory, with the risk of loss and size limits; JDBCJobStore persists task information to a database and supports clustering.
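A minimal sketch tying these concepts together with the Quartz 2.x API; the job, group, and trigger names are illustrative:

import org.quartz.CronScheduleBuilder;
import org.quartz.Job;
import org.quartz.JobBuilder;
import org.quartz.JobDetail;
import org.quartz.JobExecutionContext;
import org.quartz.Scheduler;
import org.quartz.Trigger;
import org.quartz.TriggerBuilder;
import org.quartz.impl.StdSchedulerFactory;

public class QuartzQuickStart {

    // Job: the execution interface with a single execute method.
    public static class ReportJob implements Job {
        @Override
        public void execute(JobExecutionContext context) {
            System.out.println("report job fired at " + context.getFireTime());
        }
    }

    public static void main(String[] args) throws Exception {
        // Scheduler: the scheduling container that registers triggers and job details.
        Scheduler scheduler = StdSchedulerFactory.getDefaultScheduler();

        // JobDetail: static description of the job (implementation class, name, group).
        JobDetail jobDetail = JobBuilder.newJob(ReportJob.class)
                .withIdentity("reportJob", "demoGroup")
                .build();

        // CronTrigger: fire every day at 01:00.
        Trigger trigger = TriggerBuilder.newTrigger()
                .withIdentity("reportTrigger", "demoGroup")
                .withSchedule(CronScheduleBuilder.cronSchedule("0 0 1 * * ?"))
                .build();

        scheduler.scheduleJob(jobDetail, trigger);
        scheduler.start();
    }
}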
2.3.3 Practice Description
(1) Basic use of Quartz
- See the official Quartz documentation and practical tutorials on blogs.
(2) To ensure that dynamic modifications are not lost on restart, a database is generally needed for persistence.
- Quartz itself supports JDBCJobStore, but it requires a large number of tables; refer to the official documentation for the recommended configuration. With more than 10 tables, it is heavyweight for business services.
- In practice, only a basic trigger configuration, the corresponding tasks, and related execution log tables are needed to meet most requirements.
(3) Componentization
- The Quartz dynamic task configuration information is persisted to the database, and the data operations are wrapped into a basic JAR package shared across projects. A consuming project only needs to import the JAR dependency and configure the corresponding data tables; the Quartz configuration is transparent to it.
(4) Extension
- Cluster mode
High availability of tasks is achieved through failover and load balancing, and unique execution of a task is guaranteed by the database locking mechanism. However, the cluster feature only provides HA: adding nodes does not improve the execution efficiency of a single task, so horizontal scaling is not possible. (An illustrative cluster configuration sketch follows this list.)
- Quartz plugin
Extensions can be made for specific needs, such as adding trigger and task execution logs, or scenarios that depend on serial processing of tasks. See: Quartz plugin — serial scheduling between tasks
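For reference on the JDBCJobStore and cluster mode points above, a minimal illustrative quartz.properties sketch follows; the data source name, database URL, and credentials are placeholders rather than an actual recommended configuration:

# Persist jobs and triggers to the database (the QRTZ_ tables from the official scripts)
org.quartz.scheduler.instanceName = DemoClusteredScheduler
org.quartz.scheduler.instanceId = AUTO
org.quartz.jobStore.class = org.quartz.impl.jdbcjobstore.JobStoreTX
org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.StdJDBCDelegate
org.quartz.jobStore.tablePrefix = QRTZ_
org.quartz.jobStore.dataSource = quartzDS
# Cluster mode: nodes coordinate through database locks (HA only, no horizontal scaling of a single task)
org.quartz.jobStore.isClustered = true
org.quartz.jobStore.clusterCheckinInterval = 20000
# Placeholder data source
org.quartz.dataSource.quartzDS.driver = com.mysql.cj.jdbc.Driver
org.quartz.dataSource.quartzDS.URL = jdbc:mysql://127.0.0.1:3306/quartz
org.quartz.dataSource.quartzDS.user = user
org.quartz.dataSource.quartzDS.password = pwd
# Thread pool used to run jobs
org.quartz.threadPool.threadCount = 10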
2.3.4 Defects and deficiencies
(1) Task information needs to be persisted into business database tables, coupling the scheduler with the business.
(2) Scheduling logic and execution logic live in the same project; with fixed machine capacity, business and scheduling inevitably affect each other.
(3) In Quartz cluster mode, a database exclusive lock is used to claim tasks uniquely, and task execution does not achieve a proper load balancing mechanism.
2.4 Lightweight artifact XXL-job
2.4.1 Basic Introduction
XXL-JOB is a lightweight distributed task scheduling platform. Its features include platformization, easy deployment, rapid development, a low learning curve, light weight, and easy extension, and its code is still actively updated.
The "scheduling center" is the task scheduling console: the platform itself carries no business logic and is only responsible for unified management and scheduling of task execution, providing a task management platform. The "executor" is responsible for receiving schedules from the scheduling center and executing them; executors can be deployed directly, or integrated into existing business projects. By decoupling task scheduling control from task execution, the business side only needs to focus on developing business logic.
It mainly provides functional modules for dynamic task configuration management, task monitoring and statistical reports, and scheduling logs. It supports multiple operating modes and routing strategies, and can do simple sharded data processing based on the number of machines in the corresponding executor cluster.
2.4.2 Principle Analysis
Before version 2.1.0, the core scheduling module was based on the Quartz framework. In version 2.1.0, the scheduling component was self-developed, removing the Quartz dependency and using time-wheel scheduling instead.
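XXL-JOB's actual scheduler implementation is not reproduced here; the following is only a conceptual sketch of the time-wheel idea, with all names invented for illustration: tasks are bucketed into per-second slots, and a single ticker thread fires whatever is due in the current slot.

import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class SimpleTimeWheel {
    private static final int WHEEL_SIZE = 60; // one slot per second of a minute
    private final Map<Integer, List<Runnable>> slots = new ConcurrentHashMap<>();
    private final ScheduledExecutorService ticker = Executors.newSingleThreadScheduledExecutor();

    // Put a task into the slot for its absolute trigger time (epoch seconds).
    public void submit(long triggerEpochSecond, Runnable task) {
        int slot = (int) (triggerEpochSecond % WHEEL_SIZE);
        slots.computeIfAbsent(slot, k -> new CopyOnWriteArrayList<>()).add(task);
    }

    // Advance one slot per second and fire everything that is due in it.
    public void start() {
        ticker.scheduleAtFixedRate(() -> {
            int currentSlot = (int) (System.currentTimeMillis() / 1000 % WHEEL_SIZE);
            List<Runnable> due = slots.remove(currentSlot);
            if (due != null) {
                due.forEach(Runnable::run);
            }
        }, 0, 1, TimeUnit.SECONDS);
    }
}

Compared with scanning the whole task table on every tick, a time wheel only needs to look at the current slot, which keeps per-tick scheduling overhead low as the number of tasks grows.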
2.4.3 Practice Description
For details, see the official documents.
2.4.3.1 Usage demo
Example 1: Implementing a simple task only requires extending the IJobHandler abstract class and declaring the annotation @JobHandler(value="offlineTaskJobHandler"). (Note: Dubbo is introduced here, as described later.)
@JobHandler(value="offlineTaskJobHandler")
@Component
public class OfflineTaskJobHandler extends IJobHandler {
@Reference(check = false,version = "cms-dev",group="cms-service")
private OfflineTaskExecutorFacade offlineTaskExecutorFacade;
@Override
public ReturnT<String> execute(String param) throws Exception {
XxlJobLogger.log(" offlineTaskJobHandler start.");
try {
offlineTaskExecutorFacade.executeOfflineTask();
} catch (Exception e) {
XxlJobLogger.log("offlineTaskJobHandler-->exception." , e);
return FAIL;
}
XxlJobLogger.log("XXL-JOB, offlineTaskJobHandler end.");
returnSUCCESS; }}Copy the code
Example 2: Sharding broadcast task.
@JobHandler(value="shardingJobHandler") @Service public class ShardingJobHandler extends IJobHandler { @Override public ReturnT<String> execute(String param) Throws the Exception {/ / shard parameter ShardingUtil. ShardingVO ShardingVO = ShardingUtil. GetShardingVo (); XxlJobLogger.log("Shard parameter: Current shard number = {}, total shard number = {}", shardingVO.getIndex(), shardingVO.getTotal()); // Business logicfor (int i = 0; i < shardingVO.getTotal(); i++) {
if (i == shardingVO.getIndex()) {
XxlJobLogger.log({} slice, hit fragment start processing, i);
} else {
XxlJobLogger.log("{} slice, ignore", i); }}returnSUCCESS; }}Copy the code
2.4.3.2 Integrating Dubbo
(1) Introduce the dubbo-spring-boot-starter dependency and the service facade JAR package.
<dependency>
    <groupId>com.alibaba.spring.boot</groupId>
    <artifactId>dubbo-spring-boot-starter</artifactId>
    <version>2.0.0</version>
</dependency>
<dependency>
    <groupId>com.demo.service</groupId>
    <artifactId>xxx-facade</artifactId>
    <version>1.9-SNAPSHOT</version>
</dependency>
(2) Add the Dubbo consumer configuration to the configuration file (multiple profiles can be defined per environment and switched via profile).
## Dubbo service consumer configuration
spring.dubbo.application.name=xxl-job
spring.dubbo.registry.address=zookeeper://zookeeper.xyz:2183
spring.dubbo.port=20880
spring.dubbo.version=demo
spring.dubbo.group=demo-service
(3) Inject the facade interface into the code through @Reference.
@Reference(check = false,version = "demo",group="demo-service")
private OfflineTaskExecutorFacade offlineTaskExecutorFacade;
(4) Add the @EnableDubboConfiguration annotation to the startup class.
@SpringBootApplication
@EnableDubboConfiguration
public class XxlJobExecutorApplication {

    public static void main(String[] args) {
        SpringApplication.run(XxlJobExecutorApplication.class, args);
    }
}
2.4.4 Visualized Task Configuration
The built-in platform project facilitates the developer’s task management and execution log monitoring, and provides some easy-to-test functions.
2.4.5 Extensions
(1) Optimization of task monitoring and reports.
(2) Expansion of task alarm methods, such as adding an alarm center and providing internal-message and SMS alarms.
(3) Different alarm monitoring and retry strategies for different business exceptions.
2.5 High availability Elastic-Job
2.5.1 Basic Introduction
Elastic-Job is a distributed scheduling solution consisting of two independent subprojects, Elastic-Job-Lite and Elastic-Job-Cloud.
Elastic-Job-Lite is positioned as a lightweight, decentralized solution that provides coordination of distributed tasks as a JAR package.
Elastic-Job-Cloud uses a Mesos + Docker solution to additionally provide services such as resource governance, application distribution, and process isolation.
Unfortunately, there has been no iteration or update for the past two years.
2.5.2 Principle Analysis
2.5.3 Practice Description
2.5.3.1 Usage demo
(1) Install ZooKeeper, configure the registry center config, and add the ZooKeeper registry configuration to the configuration file.
@Configuration
@ConditionalOnExpression("'${regCenter.serverList}'.length() > 0")
public class JobRegistryCenterConfig {
@Bean(initMethod = "init")
public ZookeeperRegistryCenter regCenter(@Value("${regCenter.serverList}") final String serverList,
@Value("${regCenter.namespace}") final String namespace) {
        return new ZookeeperRegistryCenter(new ZookeeperConfiguration(serverList, namespace));
    }
}
spring.application.name=demo_elasticjob
regCenter.serverList=localhost:2181
regCenter.namespace=demo_elasticjob
spring.datasource.url=jdbc:mysql://127.0.0.1:3306/xxl-job?Unicode=true&characterEncoding=UTF-8
spring.datasource.username=user
spring.datasource.password=pwd
(2) Configure the data source config and add the data source configuration to the configuration file.
@Getter
@Setter
@NoArgsConstructor
@AllArgsConstructor
@ToString
@Configuration
@ConfigurationProperties(prefix = "spring.datasource")
public class DataSourceProperties {
private String url;
private String username;
private String password;
@Bean
@Primary
public DataSource getDataSource() {
DruidDataSource dataSource = new DruidDataSource();
dataSource.setUrl(url);
dataSource.setUsername(username);
dataSource.setPassword(password);
        return dataSource;
    }
}
spring.datasource.url=jdbc:mysql://127.0.0.1:3306/xxl-job?Unicode=true&characterEncoding=UTF-8
spring.datasource.username=user
spring.datasource.password=pwd
(3) Configure event config.
@Configuration
public class JobEventConfig {
@Autowired
private DataSource dataSource;
@Bean
public JobEventConfiguration jobEventConfiguration() {
        return new JobEventRdbConfiguration(dataSource);
    }
}
(4) Add the ElasticSimpleJob annotation so that different task trigger events can be configured flexibly.
@Target({ElementType.TYPE})
@Retention(RetentionPolicy.RUNTIME)
public @interface ElasticSimpleJob {
@AliasFor("cron")
String value() default "";
@AliasFor("value")
String cron() default "";
String jobName() default "";
int shardingTotalCount() default 1;
String shardingItemParameters() default "";
String jobParameter() default "";
}
(5) Initialize the configuration.
@Configuration
@ConditionalOnExpression("'${elaticjob.zookeeper.server-lists}'.length() > 0")
public class ElasticJobAutoConfiguration {
@Value("${regCenter.serverList}")
private String serverList;
@Value("${regCenter.namespace}")
private String namespace;
@Autowired
private ApplicationContext applicationContext;
@Autowired
private DataSource dataSource;
@PostConstruct
    @PostConstruct
    public void initElasticJob() {
        ZookeeperRegistryCenter regCenter = new ZookeeperRegistryCenter(new ZookeeperConfiguration(serverList, namespace));
        regCenter.init();
        Map<String, SimpleJob> map = applicationContext.getBeansOfType(SimpleJob.class);
        for (Map.Entry<String, SimpleJob> entry : map.entrySet()) {
            SimpleJob simpleJob = entry.getValue();
            ElasticSimpleJob elasticSimpleJobAnnotation = simpleJob.getClass().getAnnotation(ElasticSimpleJob.class);
            String cron = StringUtils.defaultIfBlank(elasticSimpleJobAnnotation.cron(), elasticSimpleJobAnnotation.value());
            SimpleJobConfiguration simpleJobConfiguration = new SimpleJobConfiguration(
                    JobCoreConfiguration.newBuilder(simpleJob.getClass().getName(), cron, elasticSimpleJobAnnotation.shardingTotalCount())
                            .shardingItemParameters(elasticSimpleJobAnnotation.shardingItemParameters())
                            .build(),
                    simpleJob.getClass().getCanonicalName());
            LiteJobConfiguration liteJobConfiguration = LiteJobConfiguration.newBuilder(simpleJobConfiguration).overwrite(true).build();
            JobEventRdbConfiguration jobEventRdbConfiguration = new JobEventRdbConfiguration(dataSource);
            SpringJobScheduler jobScheduler = new SpringJobScheduler(simpleJob, regCenter, liteJobConfiguration, jobEventRdbConfiguration);
            jobScheduler.init();
        }
    }
}
(5) Implement the SimpleJob interface, integrate Dubbo following the approach above, and complete the business logic.
@ElasticSimpleJob(
cron = "*/10 * * * *?",
jobName = "OfflineTaskJob",
shardingTotalCount = 2,
jobParameter = "Test parameters",
shardingItemParameters = "0=A,1=B")
@Component
public class MySimpleJob implements SimpleJob {
Logger logger = LoggerFactory.getLogger(MySimpleJob.class);
@Reference(check = false, version = "cms-dev", group = "cms-service")
private OfflineTaskExecutorFacade offlineTaskExecutorFacade;
@Override
public void execute(ShardingContext shardingContext) {
offlineTaskExecutorFacade.executeOfflineTask();
        logger.info(String.format("Thread ID: %s, total shards: %s, " +
                        "current shard item: %s, current parameter: %s, " +
                        "job name: %s, job custom parameter: %s",
                Thread.currentThread().getId(),
                shardingContext.getShardingTotalCount(),
                shardingContext.getShardingItem(),
                shardingContext.getShardingParameter(),
                shardingContext.getJobName(),
                shardingContext.getJobParameter()));
    }
}
2.6 Other Open source frameworks
(1) Saturn: Vipshop's open-source distributed task scheduling platform, modified on the basis of Elastic Job.
(2) SIA-TASK: CreditEase's open-source distributed task scheduling platform.
III. Comparison of advantages and disadvantages and business scenario fit
Business thinking:
- Enrich task monitoring data and alarm strategies.
- Integrate unified login and permission control.
- Simplify the steps for business access.
IV. Conclusion
For systems with low concurrency, XXL-JOB is easy to configure and deploy, requires no extra components, and provides a visual console that is very user-friendly, making it a good choice. For systems that want to build directly on the capabilities of an open-source distributed framework, it is recommended to choose according to your own situation.
Attached: References
- Quartz plugin — Serial scheduling between tasks