1 Scheduled Task
Netty, Quartz, Kafka, and Linux all have scheduled tasks.
- Regular JDK java.util.Timer and DelayedQueue and other tool classes can realize simple scheduled tasks. The underlying heap data structure is used, and the access complexity is O(nlog(n)), which cannot support massive scheduled tasks.
- In a scenario with a large number of scheduled tasks and high performance requirements, the time round scheme is used to reduce the complexity of task access and cancellation operations to O(1).
Time wheel model and its application
A scheduling model for efficient batch management of scheduled tasks. A time wheel is typically implemented as a ring structure, similar to a clock, divided into many slots, one slot representing a time interval, and each slot uses a bidirectional linked list to store scheduled tasks. When the pointer jumps to a slot, the scheduled task of the slot is executed.
- Architecture of the Hashed Timing Wheel
Applicable scenario
- Fault recovery
- Flow control
- Scheduling algorithm
- Controls the packet life cycle in the network
Timers are expensive to maintain if
- The processor breaks at every tick of the clock
- Use a fine-grained timer
- There are many unfinished timers
Efficient timer algorithms are needed to reduce the overall interrupt overhead.
The capacity and precision of single-layer time wheel are limited. For scenarios with high precision requirements, large time span or massive scheduled tasks that need to be scheduled, multi-level time wheel and the scheme combining persistent storage and time wheel are usually used.
Model and performance metrics
Rules in the model
- Client call:
- START_TIMER (Time interval, Request_ID, Expiry_Action)
- STOP_TIMER (Request_ID)
- Timer tick call:
- PER_TICK_BOOKKEEPING
- EXPIRY_PROCESSING
Performance indicators
- space
The memory used by the data structure
- delay
The time required to start and finish any of the above routines
3 Dubbo’s time wheel structure
Dubbo time round implementation in Dubbo – org.apache.dubbo.com mon common module. The timer in the package, let’s to analyze the time wheel core interface and implementation.
The core interface
TimerTask
In Dubbo, all scheduled tasks implement the TimerTask interface. Only one run() method is defined with a Timeout interface object as an input.
Timeout
A Timeout object corresponds one-to-one to a TimerTask, similar to the relationship between a Future object returned by a thread pool and a task object submitted to the thread pool. Using the Timeout object, you can view the status of scheduled tasks and perform operations on scheduled tasks, such as canceling associated scheduled tasks.
- Methods in the Timeout interface:
The Timer interface defines the basic behavior of timers, the core of which is newTimeout() : submit a TimerTask and return the associated Timeout object, similar to submitting a task to a thread pool.
HashedWheelTimeout
HashedWheelTimeout is the only implementation of the Timeout interface and is an inner class of HashedWheelTimer. HashedWheelTimeout plays two roles:
- The node of the bidirectional linked list in the time wheel is the container of the timed task TimerTask in HashedWheelTimer
- A Handle (Handle) returned by the TimerTask after it is submitted to the HashedWheelTimer, which is used to view and control the timed task outside the time wheel
Core fields
- Prev, next. The bidirectional linked list is used in the HashedWheelTimerBucket chain timeouts (timed tasks), and there is no need to synchronize /volatile because it only acts on the WorkerThread.
- Task indicates the actual scheduled task
- Deadline: Indicates the execution time of the scheduled task. Specified when HashedWheelTimeout is created
Calculation formula:CurrentTime (time to create HashedWheelTimeout) + delay (task delay) - startTime (startTime of HashedWheelTimer)
Ns,
- State: indicates the current status of the scheduled task
– The available states are as follows:– There is also a STATE_UPDATER field for atomicity of state changes.
- RemainingRounds, number of clock cycles remaining in the current task. The time that the time wheel can represent is limited. If the time difference between the task expiration time and the current time exceeds the time that the single turn of the time wheel can represent, a ring will appear. The value of this field is required to represent the remaining clock cycle.
Core API
- isCancelled()
- isExpired()
- state()
Check the current HashedWheelTimeout state
- The cancel () method
- The expire () method
- remove()
HashedWheelBucket
A slot in the time wheel. The slot in the time wheel is actually a container for caching and managing a bidirectional linked list. Each node in the bidirectional linked list is a HashedWheelTimeout object, which is associated with a TimerTask scheduled task.
HashedWheelBucket holds the first and last two nodes of the bidirectional linked list – head and tail, and each HashedWheelTimeout node holds the precursor and successor references, so the whole linked list can be traversed forward and backward.
Core API
- addTimeout()
- pollTimeout()
- remove()
Removes the specified HashedWheelTimeout node from the bidirectional linked list.
- clearTimeouts()
The pollTimeout() method is looped through the entire list and returns any tasks that have not timed out or been cancelled.
- expireTimeouts()
Iterate over all HashedWheelTimeout nodes in the bidirectional list. When a scheduled task expires, it is fetched by the remove() method and executed by calling its expire() method. For cancelled tasks, remove them using remove() and discard them directly. For unexpired tasks, the remainingRounds field (number of remaining clock cycles) is reduced by one.
HashedWheelTimer
Timer interface implementation, through the time round algorithm to achieve a Timer.
The function
Select the corresponding HashedWheelBucket slot according to the current time wheel pointer, iterate from the head of the linked list, and calculate each HashedWheelTimeout timed task:
- If it belongs to the current clock cycle, it is taken out and run
- If no, the remaining clock cycles are reduced by one
The core domain
- workerState
The current state of the time wheel, three optional values, byAtomicIntegerFieldUpdater
Implement its atomic modification.
- startTime
The start time of the current time cycle and the deadline field value of the scheduled task submitted to the time cycle are calculated from this timestamp.
- wheel
Time wheel ring queue, each element is a slot. When the number of slots is specified as n, the nearest power of n is raised to the 2nd power
- Timeouts, cancelledTimeouts
HashedWheelTimer will process the data of these two queues before processing HashedWheelBucket’s bidirectional linked list: – Timeouts Queue Buffers scheduled tasks in the external submission time wheel. – cancelledTimeouts Queue temporarily stores canceled scheduled tasks
- tick
Located in theHashedWheelTimer$Worker
, pointer to time wheel, monotonically increasing counter of step 1
- mask
Mask.mask = wheel.length - 1
, the implementation ofticks & mask
You can locate the corresponding clock slot
- ticksDuration
The actual time represented by each increment of the time pointer, in nanoseconds.
- pendingTimeouts
Total number of scheduled tasks remaining in the current time round.
- workerThread
The thread inside the time wheel that actually performs the scheduled task.
- worker
The logic that actually performs the scheduled task is encapsulated in the Runnable object.
newTimeout()
Before submitting a scheduled task to the Timeouts queue, the start() method is called to start the time wheel, which completes the following two key steps:
- Determine the startTime field for the time wheel
- Start the workerThread thread to execute the worker task.
After that, the deadline of the scheduled task can be calculated according to startTime. Finally, the scheduled task can be encapsulated as HashedWheelTimeout and added to timeouts queue.
4 Execution flow of a time wheel pointer rotation
HashedWheelTimer $Worker. The run () :
- The time wheel pointer rotates and the time wheel cycle begins
- This section describes how to clear scheduled tasks cancelled by users. These scheduled tasks are recorded when cancelled by users
cancelledTimeouts
In the queue. The time wheel clears the queue each time the pointer rotates - Move scheduled tasks cached in timeouts queues to corresponding slots in the time wheel
- Locate the corresponding slot based on the current pointer and process the scheduled tasks in the bidirectional linked list of the slot
- Detects the state of the time wheel. If the time wheel is running, repeat the above steps to continuously execute the scheduled task. If the time wheel is stopped, perform the following steps to obtain the scheduled tasks that are not executed and add them
unprocessedTimeouts
Queue: traverses each slot in the round and callsclearTimeouts
() method; Loop poll() on timeouts queues that are not added to slots - Finally, the scheduled tasks cancelled by the user in the cancelledTimeouts queue are cleared again.
5 Scheduled task application
It is not directly used for periodic operations, but only submits a single scheduled task to the time wheel. When the last task is completed, the newTimeout() method is called to submit the current task again, so that the task will be executed in the next period. Even if GC or I/O blocks occur during the execution of a task, which causes the task to be delayed or stuck, the same task will not be submitted continuously, resulting in a pile of tasks.
Dubbo time wheel is mainly applied in the following aspects:
- Failure retries, for example, when the Provider fails to register with the registry or when the Consumer fails to subscribe to the registry
- Periodic scheduled tasks, such as periodically sending heartbeat requests, handling request timeouts, or reconnection mechanisms when a network connection is disconnected
reference
- zhuanlan.zhihu.com/p/32906730