“This is the second day of my participation in the November Gwen Challenge. For details of the event, see: The Last Gwen Challenge 2021.”

1. Flink program structure

The basic building blocks of a Flink program are streams and transformations (note that the DataSet used in Flink’s DataSet API is internally also a stream). Conceptually, a stream is a (potentially endless) flow of data records, while a transformation is an operation that takes one or more streams as input and produces one or more streams as output.

The Flink application structure looks like this:

Source: the data source. Flink provides four categories of Source for stream and batch processing: collection-based Sources, file-based Sources, network-socket-based Sources, and custom Sources. Custom Sources include connectors such as Apache Kafka and RabbitMQ, and you can also implement your own Source.

Transformation: data transformation operations. There are many of them — Map, FlatMap, Filter, KeyBy, Reduce, Fold, Aggregations, Window, WindowAll, Union, Window Join, Split, Select, and so on — which let you transform the data into the form you need.

Sink: the data output. After the data has been transformed, you usually need to store or emit it. Flink’s common Sinks are: writing to files, printing to standard output, writing to sockets, and custom Sinks. Custom Sinks include Apache Kafka, RabbitMQ, MySQL, ElasticSearch, Apache Cassandra, Hadoop FileSystem, and so on; you can also define your own Sink.
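The Source → Transformation → Sink structure can be mirrored with a plain-Java sketch. This uses `java.util.stream` rather than the real Flink DataStream API, so it runs without any Flink dependency — it illustrates the shape of a pipeline, not Flink itself:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class PipelineSketch {
    public static void main(String[] args) {
        // Source: a local collection stands in for a Flink Source.
        Stream<String> source = Stream.of("flink", "spark", "flink");

        // Transformation: map and filter, analogous to DataStream operators.
        List<String> result = source
                .map(String::toUpperCase)       // Map
                .filter(s -> s.startsWith("F")) // Filter
                .collect(Collectors.toList());

        // Sink: "print out" is one of Flink's built-in sinks.
        result.forEach(System.out::println);   // prints FLINK twice
    }
}
```

In a real Flink job the same three-part shape appears as `env.addSource(...)`, a chain of operators, and `.addSink(...)` / `.print()`.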

2. Flink parallel data flow

When executed, a Flink program is mapped to a Streaming Dataflow, which consists of a set of Streams and Transformation operators. Each dataflow starts with one or more Source operators and ends with one or more Sink operators.

Flink programs are parallel and distributed in nature. During execution, a stream has one or more stream partitions, and each operator has one or more operator subtasks. The operator subtasks are independent of one another and run in different threads, possibly even on different machines or containers. The number of operator subtasks is the parallelism of that particular operator, and different operators in the same program may have different levels of parallelism.

A Stream can be divided into multiple Stream partitions, and an Operator can likewise be divided into multiple Operator subtasks. In the figure above, the Source is split into Source1 and Source2, which are its Operator subtasks. Each Operator subtask executes independently in its own thread. The parallelism of an Operator equals the number of its Operator subtasks — the parallelism of the Source in the figure above is 2 — and the parallelism of a Stream equals the parallelism of the Operator that produces it.

There are two modes when data is passed between two operators:

One-to-one mode: when two operators transfer data in this mode, the number of partitions and the ordering of the data are preserved. For example, from Source1 to Map1 in the figure above, the Map retains the partitioning of the Source and the order in which the partition’s elements are processed.

Redistributing mode: this mode changes the number of data partitions. Each operator subtask sends data to different target subtasks depending on the selected transformation — for example, keyBy() repartitions by hashing the key, broadcast() sends every record to every target subtask, and rebalance() repartitions randomly.
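The keyBy()-style redistribution can be illustrated with a plain-Java sketch that routes each record to a target subtask by key hash. This mirrors the idea only — Flink’s actual implementation routes through key groups — but it shows why equal keys always meet in the same partition:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RepartitionSketch {
    // keyBy()-style routing: the target subtask index is derived from
    // the key's hash, so records with equal keys always land in the
    // same partition. Illustrative only, not Flink's key-group logic.
    static int targetSubtask(String key, int parallelism) {
        return Math.floorMod(key.hashCode(), parallelism);
    }

    public static void main(String[] args) {
        int parallelism = 2;
        Map<Integer, List<String>> partitions = new HashMap<>();
        for (String word : List.of("a", "b", "a", "c")) {
            partitions.computeIfAbsent(targetSubtask(word, parallelism),
                    k -> new java.util.ArrayList<>()).add(word);
        }
        // Both "a" records land in the same partition.
        System.out.println(partitions);
    }
}
```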

3. Task and Operator chain

All of Flink’s operations are called operators. When a task is submitted, the client optimizes the operators: operators that can be combined are merged into a single Operator, and the merged Operator is called an Operator chain. It is essentially an execution chain, and each chain executes in a separate thread on a TaskManager.
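Conceptually, chaining two operators is like composing two functions so that they execute in a single call on one thread, with no hand-off between the steps. A plain-Java sketch of that idea (not Flink’s actual chaining code):

```java
import java.util.function.Function;

public class OperatorChainSketch {
    public static void main(String[] args) {
        // Two logical operators, each a simple function.
        Function<String, String> trim = String::trim;
        Function<String, Integer> length = String::length;

        // "Chained": composed into one function executed in one call,
        // loosely analogous to Flink fusing adjacent operators into an
        // Operator chain that runs on a single TaskManager thread.
        Function<String, Integer> chained = trim.andThen(length);

        System.out.println(chained.apply("  flink  ")); // prints 5
    }
}
```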

4. Task scheduling and execution

  1. When Flink executes a program, the executor automatically generates a DAG dataflow graph from the program code;

  2. The ActorSystem creates an Actor that sends the dataflow graph to an Actor in the JobManager;

  3. The JobManager continuously receives heartbeat messages from the TaskManagers to determine which TaskManagers are available.

  4. The JobManager uses the scheduler to schedule tasks for execution on the TaskManagers (in Flink, the smallest scheduling unit is a Task, which corresponds to a thread).

  5. When the program is running, data can be transmitted between tasks.

The Job Client:

  1. Its main responsibility is to submit the task; after submission it can either end the process or wait for the result to be returned;
  2. The Job Client is not an internal part of Flink program execution, but it is the starting point of task execution.
  3. The Job Client is responsible for taking the user’s program code, creating a data flow and submitting the data flow to The Job Manager for further execution. After the execution is complete, the Job Client returns the result to the user.

JobManager:

  1. Its main responsibilities are to schedule work and coordinate tasks to perform checkpoints;
  2. At least one master is configured in a cluster; the master is responsible for scheduling tasks and coordinating checkpoints and fault tolerance.
  3. A high-availability setup can have multiple masters, but exactly one of them is the leader and the others are standby.
  4. The JobManager consists of the Actor System, the Scheduler, and the CheckPoint coordinator.
  5. After the JobManager receives a task from the client, it generates an optimized execution plan and then schedules it to the TaskManagers for execution.

TaskManager:

  1. Main responsibilities: receiving tasks from the JobManager, deploying and starting them, and receiving and processing data from upstream;
  2. A TaskManager is a worker node that executes tasks in one or more threads within its JVM;
  3. Slots are set up when the TaskManager is created, and each Slot can execute one task.

5. Task slots and slot sharing

Each TaskManager is a JVM process that can execute one or more subtasks in different threads. To control how many tasks a worker accepts, the worker uses task slots (each worker has at least one task slot).

1) Task slots

Each Task slot represents a fixed-size subset of resources owned by the TaskManager.

Flink divides the process memory into multiple slots.

For example, with two TaskManagers of three slots each, every slot occupies 1/3 of its TaskManager’s memory.

Dividing the memory into distinct slots provides the following benefits:

  • The maximum number of tasks that a TaskManager can execute concurrently is controllable — three in this example, since the number of running tasks cannot exceed the number of slots.

  • Slots have exclusive memory space, so multiple different jobs can run in one TaskManager without affecting each other.
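The slot arithmetic above is easy to sketch. The memory figure below (3 GB per TaskManager) is a hypothetical example for illustration, not a Flink default:

```java
public class SlotMemorySketch {
    public static void main(String[] args) {
        int taskManagers = 2;
        int slotsPerManager = 3;     // e.g. taskmanager.numberOfTaskSlots: 3
        long managerMemoryMb = 3072; // hypothetical 3 GB per TaskManager

        // Total concurrent task capacity of the cluster.
        int totalSlots = taskManagers * slotsPerManager;

        // Each slot receives a fixed 1/3 share of its TaskManager's memory.
        long memoryPerSlotMb = managerMemoryMb / slotsPerManager;

        System.out.println("total slots: " + totalSlots);               // 6
        System.out.println("memory per slot (MB): " + memoryPerSlotMb); // 1024
    }
}
```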

2) Slot sharing

By default, Flink allows subtasks to share slots, even if they are subtasks of different tasks, as long as they come from the same job. As a result, a single slot can hold an entire pipeline of the job. Allowing slot sharing has the following benefits:

  • A job only needs as many task slots as its highest-parallelism operator; if that requirement is met, the requirements of all the other operators are met as well.

  • Resource allocation is more equitable, because relatively idle slots can take on more tasks. Without task slot sharing in the figure, low-load subtasks such as Source/Map would occupy many resources, while the high-load Window subtasks would be starved of resources.

  • With task slot sharing, the base parallelism can be increased from 2 to 6, improving the utilization of slot resources while also ensuring that the TaskManager assigns subtasks to slots more fairly.
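Under slot sharing, the number of slots a job needs is simply the maximum parallelism across its operators. A small sketch with hypothetical per-operator parallelism values:

```java
import java.util.Collections;
import java.util.Map;

public class SlotSharingSketch {
    public static void main(String[] args) {
        // Hypothetical per-operator parallelism for one job.
        Map<String, Integer> parallelism = Map.of(
                "Source", 6, "Map", 6, "Window", 6, "Sink", 1);

        // With slot sharing, one slot can hold one subtask of each
        // operator, so the job needs only max(parallelism) slots.
        int slotsNeeded = Collections.max(parallelism.values());

        System.out.println("slots needed: " + slotsNeeded); // 6
    }
}
```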