This article is intended for readers with basic Java knowledge.

Author: HelloGitHub – Salieri

This article is part of HelloGitHub's series on open source projects. It begins the technical analysis of the PowerJob framework. As the first article of the technical series, it describes the overall architecture of PowerJob and introduces the technologies that the later articles will cover.

Project Address:

https://github.com/KFCFans/PowerJob

1. Architectural design

PowerJob is designed to be an enterprise-level distributed task scheduling framework: a scheduling center is deployed once and serves as the task scheduling middleware for the whole company. For this reason, PowerJob's architecture introduces a scheduling center that handles task configuration and scheduling in a unified manner, unlike Quartz, which is a self-contained Jar package that does everything by itself. The specific architecture is shown in the following figure:


As the figure shows, the whole PowerJob system consists of the scheduling center (powerjob-server) and the executor (powerjob-worker).

The scheduling center is a web application based on Spring Boot. According to whom it serves, it can be divided into an external layer and an internal layer. The external layer is user-facing: it provides HTTP services that let developers configure and manage tasks, workflows, and other information visually through the front-end interface. The internal layer is responsible for scheduling and dispatching the tasks entered by developers, and for maintaining the status of all executor clusters registered in the registry.

The executor is an ordinary Jar package. Once an application that needs to connect to the scheduling center depends on this Jar package and initializes it, powerjob-worker starts and begins providing services. The overall logic of the executor is very simple (the complex part is the implementation of MapReduce, broadcast, and other advanced processing modes, which later articles will cover). It listens for task execution requests from the scheduling center and, once a task arrives, allocates resources and initializes a processor to start working on it. Meanwhile, it maintains a group of background threads that regularly report its health status and task execution status.
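
The executor loop described above can be sketched in plain Java. This is only an illustration under stated assumptions, not PowerJob's actual code: the class `WorkerSketch`, its methods, and the `reports` list (a stand-in for messages sent to the scheduling center) are all hypothetical names, and a local thread pool replaces real network communication.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of an executor; none of these names come from PowerJob.
class WorkerSketch {
    // Resources allocated for running the tasks that arrive.
    private final ExecutorService taskPool = Executors.newFixedThreadPool(4);
    // Background thread that periodically reports health status.
    private final ScheduledExecutorService reporter =
            Executors.newSingleThreadScheduledExecutor();
    private final AtomicInteger runningTasks = new AtomicInteger();
    // Stand-in for the reports that would be sent to the scheduling center.
    final List<String> reports = new CopyOnWriteArrayList<>();

    // Starts the periodic health report.
    void start(long reportIntervalMs) {
        reporter.scheduleAtFixedRate(
                this::reportHealth, 0, reportIntervalMs, TimeUnit.MILLISECONDS);
    }

    void reportHealth() {
        reports.add("heartbeat: running=" + runningTasks.get());
    }

    // Called when a task execution request arrives from the scheduling center.
    CompletableFuture<Void> onTaskRequest(String taskId, Runnable processor) {
        runningTasks.incrementAndGet();
        return CompletableFuture.runAsync(() -> {
            try {
                processor.run();
                reports.add("task " + taskId + " SUCCEED");
            } catch (RuntimeException e) {
                reports.add("task " + taskId + " FAILED: " + e.getMessage());
            } finally {
                runningTasks.decrementAndGet();
            }
        }, taskPool);
    }

    void stop() {
        taskPool.shutdown();
        reporter.shutdown();
    }

    public static void main(String[] args) {
        WorkerSketch worker = new WorkerSketch();
        worker.start(100);
        worker.onTaskRequest("demo-1",
                () -> System.out.println("processing demo-1")).join();
        worker.stop();
        worker.reports.forEach(System.out::println);
    }
}
```

Task processing and status reporting run on separate threads here, so a long-running task does not delay the heartbeat.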

The scheduling center and the executors communicate through Akka-remote. The scheduling center can be deployed as multiple instances, which scales scheduling performance horizontally and at the same time makes the scheduling center highly available. Executors achieve high availability through cluster deployment. In addition, if the developer implements a MapReduce processor, which is capable of distributed processing, the computing resources of the whole cluster can be mobilized to complete a distributed computing task.
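
To make the MapReduce idea concrete, here is a minimal sketch in plain Java. It does not use PowerJob's processor API. The class `MapReduceSketch` and its methods are hypothetical, and a local thread pool stands in for the executor cluster. The job (summing the numbers from 1 to n) is split into sub-tasks, each sub-task is processed independently, and the partial results are combined at the end.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical illustration of the map/reduce pattern; not PowerJob's API.
class MapReduceSketch {
    // Map phase: split the range [1, n] into sub-tasks of at most batchSize numbers.
    static List<long[]> split(long n, long batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<long[]> subTasks = new ArrayList<>();
        for (long start = 1; start <= n; start += batchSize) {
            subTasks.add(new long[] {start, Math.min(n, start + batchSize - 1)});
        }
        return subTasks;
    }

    // The work a single executor would do for one sub-task.
    static long process(long[] range) {
        long sum = 0;
        for (long i = range[0]; i <= range[1]; i++) {
            sum += i;
        }
        return sum;
    }

    // Distributes the sub-tasks, then reduces the partial sums into one result.
    static long sumOneToN(long n, long batchSize, int workers) {
        // The thread pool is a stand-in for a cluster of executors.
        ExecutorService cluster = Executors.newFixedThreadPool(workers);
        try {
            List<CompletableFuture<Long>> partials = new ArrayList<>();
            for (long[] subTask : split(n, batchSize)) {
                partials.add(CompletableFuture.supplyAsync(() -> process(subTask), cluster));
            }
            long total = 0;
            for (CompletableFuture<Long> partial : partials) {
                total += partial.join(); // Reduce phase
            }
            return total;
        } finally {
            cluster.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOneToN(1_000_000, 10_000, 4)); // prints 500000500000
    }
}
```

In a real cluster the sub-tasks would travel over the network to other machines, so the inputs and results would also need to be serializable.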

2. Knowledge overview

Overall, PowerJob touches on the following topics, which you can learn by reading the source code and this series of technical articles:

  • Java basics: new Java 8 features (Stream, Optional, Lambda, FunctionalInterface)
  • Advanced Java: multithreading and concurrency safety (thread pools, concurrent containers, reentrant locks, segmented locks, ThreadLocal, etc.), Java I/O (network operations, file stream operations), hot loading (custom class loaders, Jar package operations)
  • Java Web: Spring Boot web knowledge, including basic Controller usage, WebSocket, file upload and download, global exception handling with ControllerAdvice, cross-origin requests (CORS), etc.
  • Spring: AOP (web request logging), asynchronous methods (@Async), scheduled tasks (@Scheduled), self-built containers (ClassPathXmlApplicationContext), use of the various context Aware interfaces
  • Database: writing database-independent persistence layer code (Spring Data JPA), basic database theory (various SQL statements, index usage, etc.), multiple data source configuration, use of MongoDB (GridFS)
  • Algorithms: graphs (DAG), reference counting (implementing a small GC), distributed unique ID algorithm (Snowflake), time wheel
  • Distributed knowledge: remote communication, cluster high availability, service discovery, failover and recovery, distributed consistency, distributed locking (reliable distributed locking based on database)
  • Serialization: Kryo, Jackson CBOR, object pooling
  • Akka basics: Actor model, Akka-remote, Akka-serialization
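
As a small taste of one item from the list, the following is a minimal sketch of a Snowflake-style distributed unique ID generator. It is an illustration only, not PowerJob's implementation. The class name, the custom epoch, and the bit layout (timestamp, then 10 bits of worker ID, then a 12-bit sequence) are assumptions chosen for this example.

```java
// Hypothetical Snowflake-style ID generator; layout and names are assumptions.
class SnowflakeSketch {
    private static final long EPOCH = 1577836800000L; // 2020-01-01T00:00:00Z, chosen arbitrarily
    private static final int WORKER_BITS = 10;
    private static final int SEQUENCE_BITS = 12;
    private static final long MAX_WORKER_ID = (1L << WORKER_BITS) - 1;
    private static final long SEQUENCE_MASK = (1L << SEQUENCE_BITS) - 1;

    private final long workerId;
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    SnowflakeSketch(long workerId) {
        if (workerId < 0 || workerId > MAX_WORKER_ID) {
            throw new IllegalArgumentException("workerId out of range: " + workerId);
        }
        this.workerId = workerId;
    }

    // Returns an ID that is unique per worker and increases over time.
    synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now < lastTimestamp) {
            throw new IllegalStateException("clock moved backwards");
        }
        if (now == lastTimestamp) {
            // Same millisecond: bump the sequence number.
            sequence = (sequence + 1) & SEQUENCE_MASK;
            if (sequence == 0) {
                // Sequence exhausted for this millisecond: wait for the next one.
                while ((now = System.currentTimeMillis()) <= lastTimestamp) {
                    // busy-wait
                }
            }
        } else {
            sequence = 0;
        }
        lastTimestamp = now;
        return ((now - EPOCH) << (WORKER_BITS + SEQUENCE_BITS))
                | (workerId << SEQUENCE_BITS)
                | sequence;
    }

    // Extracts the worker ID that was encoded into an ID.
    static long workerIdOf(long id) {
        return (id >> SEQUENCE_BITS) & MAX_WORKER_ID;
    }

    public static void main(String[] args) {
        SnowflakeSketch generator = new SnowflakeSketch(5);
        long id = generator.nextId();
        System.out.println(id + " was generated by worker " + workerIdOf(id));
    }
}
```

Because the timestamp occupies the high bits, IDs from one generator sort by creation time, and distinct worker IDs keep different machines from colliding without any coordination.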

If you are new to Java, this project and this tutorial will help you gain a better understanding of the basics of Java.

If you are an experienced developer, this project and this tutorial will help you learn about distributed computing and task scheduling.

3. Summary and preview

This article introduced the overall architecture of PowerJob and the technical topics the project involves. In the next article, I will introduce the Akka toolkit, the cornerstone of PowerJob, and show how to use it.

See you next time! A heads-up: the next article is dense with hard technical content, so bring plenty of water!
