I. Background
The Qingming holiday finally gave me a little time to look back on the past few months of work and life. It has been a long time since my last update. It's not that I didn't want to write, but about a year ago I moved to a top-10 Internet company. For one thing, the workload is heavy and I often wish I could be in two places at once; for another, the place is full of talented people and the pressure is high, and I sometimes wish I could go back and rebuild myself from scratch. On the whole, though, I am very happy and I cherish this opportunity. What I value most is the exposure to the R&D process of a big company: the internal Confluence space has an inexhaustible supply of technical articles and design documents, distilled from the company's history, and I dig into them like a hungry wolf whenever I get time. One of my tasks over the past few months has been to investigate and deliver a practical solution for distributed task scheduling, to be used both in our SaaS scenario and in our privatized (on-premises) scenario. At first I wondered: surely a company of this size already has a proven solution? It turns out there is one, but in a privatized SaaS deployment the company's internal platform is not available. To avoid heavy changes to the project during private deployment, we decided to implement a solution independently.
II. Technology selection
There are many options available, and plenty of experts have already written comprehensive comparisons. I tried to condense their suggestions into the comparison below. Options such as Timer and Spring Task are not included, because they are relatively simple and not suitable for the production environment of most projects.
Product | Quartz | Elastic-Job | XXL-JOB | Apache Airflow |
---|---|---|---|---|
Scheduling type | Cron | Cron | Cron / fixed rate | Cron |
Workflow | Not supported | Not supported | Not supported | Supported |
HA | Supported: multi-node deployment | Supported: based on ZooKeeper | Supported: multi-node deployment | Supported |
Distributed tasks | None | Static sharding | Dynamic sharding | Supported |
Custom task parameters | Not supported | Supported | Supported | Supported |
Web-based task management | None | Execution records: no; dashboard: yes; run logs: yes; re-run in place: no; data backfill: no | Execution records: yes; dashboard: yes; run logs: yes; re-run in place: no; data backfill: no | Execution records: yes; dashboard: yes; run logs: yes; re-run in place: yes; data backfill: yes |
Task types | Java | Java/Shell | Java/Shell/Python/PHP/Node.js | Customizable via Operators; ships mainly with big-data and Shell operators, no Java |
Alerting and monitoring | None | Self-developed | Self-developed | |
Deployment cost | DB + one or more servers | DB + ZooKeeper + console + server | DB + scheduling center + executors | DB + master + workers + MQ |
Maintenance activity | Recently active | Last updated about three years ago | Recently active | Recently active |
Adoption notes | Heavy secondary-development workload | Unmaintained for years, unknown pitfalls | Full-featured, low secondary-development workload | Best suited to big-data workloads |
Based on the comparison above, combined with our team's needs and characteristics, XXL-JOB is the best fit for us: it positions itself as lightweight, supports a rich set of task types, is reasonably full-featured, is still actively maintained, and its source code is relatively easy to read and understand.
Shortcomings of XXL-JOB
XXL-JOB positions itself as lightweight, which means it does not depend on many other JAR packages or middleware and tries to implement its features with what the JDK already provides: RPC communication based on java.net.URL, log output through FileOutputStream, and authentication by simple accessToken string matching. Lightweight it is, but there are obvious shortcomings in performance and security, and it is at least not what we wanted. Splitting scheduling and execution into two projects is one of XXL-JOB's best features, but the scheduling center has to be deployed independently. We want it to be integrated into our general capability platform as one of the platform's capabilities, while keeping changes to the management pages to a minimum to avoid excessive manpower investment; we also need the ability to customize functionality in privatized deployments. Therefore, we need to transform the scheduling center into an embedded, pluggable scheduling module.
The transformation plan
Our current goal is to integrate it into our overall architecture first, so we do not make significant changes to the core capability layer (such as the routing strategy and the scheduling strategy).
1. Embedded scheduling center
The scheduling center module of xxl-job is xxl-job-admin. It works out of the box, but I need to transform it into an embedded module.
Transformation goal
After the transformation, any project that introduces this module gains scheduling-center capability, without affecting any existing function of that project.
Implementation plan
xxl-job-admin is a typical MVC project whose template engine is FreeMarker. This means I need to think about how to handle the large number of resource files and configuration files in the resources directory.
There are two possible approaches:

1. Migrate the resource and configuration files under xxl-job-admin into the project that will integrate it. This requires no planning and saves brainpower, but costs a lot of manual effort; it does not fit a programmer's style (cough ~) and is not elegant.
2. xxl-job-admin is built with Maven, so we can customize the configuration in pom.xml to achieve our goal. pom.xml provides a `<build>` tag that defines how the project is compiled and packaged, delegating the work to internal plugins. The default build plugins include the following:
Plugin | Function | Lifecycle phase |
---|---|---|
clean plugin | Cleans up the output of the previous build, i.e. the target directory | clean |
resources plugin | Processes main and test resource files | resources, testResources |
jar plugin | Packages the compiled output into a jar | jar |
deploy plugin | Publishes the jar to a remote Maven repository | deploy |
If we don't provide any configuration, Maven executes the default plugins during the build. Maven is not the focus of this article, but by customizing the `<resources>` configuration of the `<build>` node we can control how resource files are processed. The configuration is as follows:
```xml
<build>
<resources>
<resource>
<directory>src/main/resources</directory>
<includes>
<include>**/*.*</include>
</includes>
<excludes>
<exclude>**/*.properties</exclude>
<exclude>**/*.xml</exclude>
</excludes>
<filtering>false</filtering>
</resource>
</resources>
</build>
```
With the configuration above, Maven does not filter resources during compilation. The configuration files (.properties and .xml) need to be provided by the host project itself, so they are excluded here. After compilation, the xxl-job-admin resource files end up in the host project's resource directory. With that, the first step toward an embedded scheduling center is done.
2. Unified user authentication
XXL-JOB provides its own user system, including registration, login, and basic permission checks. But for us, that is not enough.
Transformation goal
Registration and login must connect to the group's unified user authentication center and security center. The scheduling center itself needs complete RBAC authorization. All interfaces connect to the unified gateway.
Implementation plan
The main reason for transforming this part is that, as an embedded module, it needs to connect with the team's other foundational capabilities rather than remain a small, self-contained system; only then can it be fully integrated into the overall solution. The core of XXL-JOB's login capability lies in the PermissionInterceptor and CookieInterceptor. To avoid large-scale changes, my approach is as follows: keep XXL-JOB's own login and registration module, but connect it to the group's unified user authentication and security center; in the embedded scenario, use the group's unified login and registration module directly. This requires some changes to the core of the login capability. Because confidentiality is involved, I will only describe the idea here rather than posting the source code.
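As an illustration only (not the actual internal code), here is a minimal sketch of the idea, assuming a hypothetical `GroupAuthClient`/`GroupUser` SDK and a made-up token header name: a Spring MVC interceptor that replaces the cookie-based check with a call to the group-wide authentication center.

```java
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.springframework.web.servlet.HandlerInterceptor;

/** Hypothetical client for the group's unified auth center. */
interface GroupAuthClient {
    GroupUser verify(String token);
}

/** Hypothetical resolved user identity. */
interface GroupUser {
    String username();
}

/**
 * Sketch: delegate login checks to the group's unified auth center
 * instead of xxl-job-admin's own cookie/session logic.
 */
public class UnifiedAuthInterceptor implements HandlerInterceptor {

    private final GroupAuthClient authClient;

    public UnifiedAuthInterceptor(GroupAuthClient authClient) {
        this.authClient = authClient;
    }

    @Override
    public boolean preHandle(HttpServletRequest request,
                             HttpServletResponse response,
                             Object handler) throws Exception {
        // Token issued by the group SSO; the header name is an assumption.
        String token = request.getHeader("X-Group-Auth-Token");
        GroupUser user = (token == null) ? null : authClient.verify(token);
        if (user == null) {
            response.sendError(HttpServletResponse.SC_UNAUTHORIZED, "login required");
            return false;
        }
        // Expose the resolved identity to downstream controllers, mirroring
        // how the admin module keeps its own login identity per request.
        request.setAttribute("LOGIN_IDENTITY", user);
        return true;
    }
}
```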
3. Mount the gateway
We need all interfaces to go through the gateway, so I wrote a utility class that registers them with the gateway automatically. Again, the source code is not posted here; the idea is to write an event listener that, once the container has finished initializing, scans all API endpoints under the embedded scheduling center and then registers them with the gateway according to certain rules.
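A minimal sketch of that idea, assuming a hypothetical `GatewayClient` SDK (the real registration rules and gateway API are internal), could look like this: listen for container startup, enumerate the Spring MVC request mappings, and push the scheduling-center endpoints to the gateway.

```java
import java.util.Map;
import org.springframework.context.ApplicationListener;
import org.springframework.context.event.ContextRefreshedEvent;
import org.springframework.stereotype.Component;
import org.springframework.web.method.HandlerMethod;
import org.springframework.web.servlet.mvc.method.RequestMappingInfo;
import org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerMapping;

/** Hypothetical SDK for the internal API gateway. */
interface GatewayClient {
    void register(String path, String handlerName);
}

/**
 * Sketch: after the Spring container finishes initializing, scan all MVC
 * endpoints of the embedded scheduling center and register them with the gateway.
 */
@Component
public class GatewayApiRegistrar implements ApplicationListener<ContextRefreshedEvent> {

    private final GatewayClient gatewayClient;

    public GatewayApiRegistrar(GatewayClient gatewayClient) {
        this.gatewayClient = gatewayClient;
    }

    @Override
    public void onApplicationEvent(ContextRefreshedEvent event) {
        RequestMappingHandlerMapping mapping =
                event.getApplicationContext().getBean(RequestMappingHandlerMapping.class);

        for (Map.Entry<RequestMappingInfo, HandlerMethod> entry
                : mapping.getHandlerMethods().entrySet()) {
            HandlerMethod method = entry.getValue();
            // Only register endpoints that belong to the embedded scheduling center.
            if (!method.getBeanType().getName().startsWith("com.xxl.job.admin")) {
                continue;
            }
            RequestMappingInfo info = entry.getKey();
            if (info.getPatternsCondition() == null) {
                continue; // path-pattern parsing differs across Spring versions; skip if unavailable
            }
            for (String pattern : info.getPatternsCondition().getPatterns()) {
                gatewayClient.register(pattern, method.getMethod().getName());
            }
        }
    }
}
```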
4. The RPC extension
In real business scenarios, a project that has no scheduled tasks of its own may still need to actively trigger scheduled tasks in other projects, for example to run initialization or manual compensation. Without that ability, someone has to go to the scheduling center's management platform and trigger the task manually every time. Within a closed business loop, the fewer manual nodes there are, the clearer the process. We therefore want to provide this kind of programmatic trigger and hide the manual one, so an RPC package was added to xxl-job-admin to expose some of its capabilities.
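Purely as an illustration of the kind of facade such a package might expose (the names, parameters, and caller below are all hypothetical, not the real internal API):

```java
/**
 * Sketch of a facade exposed by the added RPC package.
 * All names here are hypothetical; the real internal API is not shown.
 */
public interface JobTriggerFacade {

    /**
     * Actively trigger a job registered in the embedded scheduling center.
     *
     * @param appName    executor appname the job belongs to
     * @param jobHandler handler name of the job to run
     * @param param      execution parameter passed to the handler
     * @return true if the trigger request was accepted by the scheduling center
     */
    boolean trigger(String appName, String jobHandler, String param);
}

/** Example caller: a service with no jobs of its own triggering a compensation task elsewhere. */
class OrderCompensationService {

    private final JobTriggerFacade jobTriggerFacade;

    OrderCompensationService(JobTriggerFacade jobTriggerFacade) {
        this.jobTriggerFacade = jobTriggerFacade;
    }

    void compensate(String orderId) {
        // Instead of asking someone to trigger the task manually in the admin UI,
        // the business code triggers it directly through the RPC facade.
        jobTriggerFacade.trigger("order-center-executor", "orderCompensationHandler", orderId);
    }
}
```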
5. Distributed scenario optimization
xxl-job-admin does not check the uniqueness of the jobHandler or the executor when a task is added, which causes two problems: 1. Visual confusion: when operations staff manage tasks, identical names are easy to mix up. Names should of course be given clear semantics at creation time, but this validation ought to be enforced at the application level. 2. Executor registration in distributed scenarios: in the xxl-job-admin source code, executors register themselves with the scheduling center automatically, and the registration record is identified by the executor's appname. XXL-JOB registers executors asynchronously through `registryUpdate` and `registrySave`, which ultimately execute the SQL configured in the MyBatis mapper against the `xxl_job_registry` table. The source code is as follows:
```xml
<update id="registryUpdate" >
UPDATE xxl_job_registry
SET `update_time` = #{updateTime}
WHERE `registry_group` = #{registryGroup}
AND `registry_key` = #{registryKey}
AND `registry_value` = #{registryValue}
</update>
<insert id="registrySave" >
INSERT INTO xxl_job_registry( `registry_group` , `registry_key` , `registry_value`, `update_time`)
VALUES( #{registryGroup} , #{registryKey} , #{registryValue}, #{updateTime})
</insert>
```
See the problem? If two executors use the same appname, both get registered. In a distributed task scheduling scenario this will inevitably cause unpredictable problems! I therefore added uniqueness checks for the executor and the jobHandler so that duplicates are rejected at the source (I have also raised an issue about this upstream).
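The actual patch is not reproduced here; as a rough illustration only, such a check can be enforced at the application layer before the insert (ideally backed by a unique key in the database). The service and DAO types below are hypothetical placeholders, not the real xxl-job-admin classes.

```java
import org.springframework.stereotype.Service;

/** Hypothetical DAO over the executor table. */
interface JobGroupDao {
    int countByAppName(String appName);
}

/** Hypothetical DAO over the task table. */
interface JobInfoDao {
    int countByGroupAndHandler(int jobGroupId, String jobHandler);
}

/**
 * Sketch: reject duplicate executor appnames and duplicate jobHandlers
 * at the application layer before they reach the database.
 */
@Service
public class UniquenessChecker {

    private final JobGroupDao jobGroupDao;
    private final JobInfoDao jobInfoDao;

    public UniquenessChecker(JobGroupDao jobGroupDao, JobInfoDao jobInfoDao) {
        this.jobGroupDao = jobGroupDao;
        this.jobInfoDao = jobInfoDao;
    }

    /** Called before a new executor is saved. */
    public void checkExecutor(String appName) {
        if (jobGroupDao.countByAppName(appName) > 0) {
            throw new IllegalArgumentException("executor appname already exists: " + appName);
        }
    }

    /** Called before a new task is saved under an executor. */
    public void checkJobHandler(int jobGroupId, String jobHandler) {
        if (jobInfoDao.countByGroupAndHandler(jobGroupId, jobHandler) > 0) {
            throw new IllegalArgumentException("jobHandler already exists in this executor: " + jobHandler);
        }
    }
}
```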
6. Log component optimization
The log component of XXL-JOB is simple and crude: in synchronous mode it appends log output to a file directly through a FileOutputStream and reads logs back straight from the I/O stream. Performance aside, our architecture includes a unified log management platform, and the logs of every module need to be written to it, so the XXL-JOB log component has to be modified to meet our acceptance criteria. The log component lives under com.xxl.job.core.log; the source code is not posted here.
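The modified component is not shown, but to illustrate the direction, here is a minimal sketch: job log lines are handed to an abstraction that the host project backs with the unified log platform, falling back to a file append similar to the original behavior. `UnifiedLogClient` is a hypothetical placeholder for the internal log SDK.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical client for the unified log management platform. */
interface UnifiedLogClient {
    void write(String module, String traceId, String line);
}

/**
 * Sketch: route job execution logs to the unified log platform,
 * falling back to a local file append when the platform is unavailable.
 */
public class PlatformJobLogAppender {

    private final UnifiedLogClient logClient;
    private final String logFilePath;  // file-based destination kept as a fallback

    public PlatformJobLogAppender(UnifiedLogClient logClient, String logFilePath) {
        this.logClient = logClient;
        this.logFilePath = logFilePath;
    }

    public void append(long jobId, String line) {
        try {
            // Preferred path: ship the line to the unified log management platform.
            logClient.write("xxl-job", String.valueOf(jobId), line);
        } catch (Exception e) {
            // Fallback: append to a local file, similar to xxl-job's original behavior.
            try (FileOutputStream out = new FileOutputStream(logFilePath, true)) {
                out.write((line + System.lineSeparator()).getBytes(StandardCharsets.UTF_8));
            } catch (IOException ignored) {
                // swallow; logging must never break job execution
            }
        }
    }
}
```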
7. Underlying communication
The underlying RPC communication of XXL-JOB uses java.net.URL and Netty: the executor receives scheduling requests over Netty, and everything else uses HttpURLConnection. Using URL in XXL-JOB is not a problem in itself; after all, it positions itself as lightweight. But in a large-scale distributed task scheduling scenario, and since we mostly use it as an embedded scheduler, we are more sensitive to performance, so we decided to switch to the team's unified HTTP request stack.
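Our unified HTTP stack is internal, so purely as an illustration of the direction: hide the transport behind a small interface so the implementation can be swapped. The default implementation below uses the JDK 11 `java.net.http.HttpClient`, which reuses connections unlike per-request `HttpURLConnection`; the access-token header name follows xxl-job's remoting convention, and everything else is an assumption.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

/**
 * Sketch: abstract the scheduler <-> executor transport behind an interface,
 * so the embedded module can plug in the team's unified HTTP client.
 */
interface SchedulerTransport {
    String post(String url, String jsonBody, String accessToken) throws Exception;
}

class JdkHttpTransport implements SchedulerTransport {

    // One shared client per process: connections are pooled and reused.
    private final HttpClient client = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(3))
            .build();

    @Override
    public String post(String url, String jsonBody, String accessToken) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(10))
                .header("Content-Type", "application/json;charset=UTF-8")
                .header("XXL-JOB-ACCESS-TOKEN", accessToken)
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```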
III. Summary
This article outlines how we carried out secondary development on XXL-JOB to build a distributed task scheduling platform that meets our team's standards. Limited by space, many detailed considerations and conclusions are not covered here; I have recorded only the points I consider most important, for future reference. Overall, XXL-JOB contains many ideas worth learning from, its source code is not hard to understand, and it is a very good open source framework for scheduled tasks.