1. Why time wheel?

In Dubbo, in order to enhance the system’s fault tolerance, there will be corresponding monitoring judgment processing mechanism. For example, in the implementation of the timeout mechanism of RPC invocation, the consumer determines whether the RPC invocation times out and returns the timeout result to the application layer if it does.

In the original implementation of Dubbo, all returned results (DefaultFutures) were put into a collection, and all futures were scanned at regular intervals to determine whether they had timed out.

Although this approach is relatively simple, there is a problem that there will be a lot of meaningless traversal operation overhead. For example, the timeout time of an RPC call is 10 seconds, and the scheduled task for timeout determination is executed once every 2 seconds, there may be about 4 meaningless cyclic detection and judgment operations.

To solve similar problems in the above scenario, Dubbo borrowed from Netty and introduced the time round algorithm to reduce meaningless polling judgment operations.

2. Time wheel principle

For the above problems, the goal is to reduce the need for additional scanning operations. For example, if a scheduled task is executed 5 seconds later, scanning the scheduled task 4.9 seconds later can significantly reduce CPU overhead. This is where we can use the clock wheel mechanism.

In essence, the clock wheel refers to the principle of clock beat in life, so how to achieve it?

In the clock wheel mechanism, there are the concepts of time slot and clock wheel, time slot is equivalent to the scale of the clock; The clock wheel is like a cycle in which the hands beat, and we can place each task in the corresponding time slot.

If the clock wheel has 10 slots, and the cycle of the clock wheel is 10 seconds, then the unit time of each slot is 1 second, and the cycle of the next time wheel is 100 seconds, and the unit time of each slot is 10 seconds. This is like the second hand and the minute hand. Under the second hand cycle, the scale is in seconds. In the minute hand period, the scale is minutes.

Suppose we now have three tasks, task A (0.9 seconds later), task B (2.1 seconds later), and task C (12.1 seconds later). We add these three tasks to the clock wheel, with task A in slot 0 and task B in slot 2. Task C is placed in slot 2 of the next time round, as shown below:

From this scenario, we can see that the clock wheel’s scan period is still the minimum unit of 1 second, but the tasks placed in it are not scanned repeatedly. Each task is scanned only once as required, which is a good solution to the CPU waste problem.

This may lead to a problem, if the continuous stacking of clock wheels, infinite growth, efficiency will appear to decline, then how to solve the problem?

For setting three clock wheels, hour wheel, minute wheel, second level wheel.

3. Dubbo source code analysis

A Timer model is defined mainly through the interfaces of Timer, Timeout and TimerTask, and a time wheel Timer is realized through the class HashedWheelTimer (the default number of time slots is 512, which can be customized). It provides a simple and easy to use interface externally. You only need to call the newTimeout interface to realize the scheduling of tasks that only need to be executed once. With this timer, Dubbo achieves efficient task scheduling in response scenarios.

HashedWheelTimer structure:

4. The application of time wheel in RPC

  1. Call timeouts and retries treatment: the above client calls a timeout processing, can be applied to the clock round, every time we request, to create a process requests timeout timing task on the clock round, in the case of high concurrency, high traffic, the clock round each time only one time slot in the polling tasks, this will save a lot of CPU.

    FailbackRegistry source code snippet:

    	// constructor
        public FailbackRegistry(URL url) {
                super(url);
                this.retryPeriod = url.getParameter(REGISTRY_RETRY_PERIOD_KEY, DEFAULT_REGISTRY_RETRY_PERIOD);    
                // since the retry task will not be very much. 128 ticks is enough.
            	// The number of slots for the retries, set to 128
                retryTimer = new HashedWheelTimer(new NamedThreadFactory("DubboRegistryRetryTimer".true), retryPeriod, TimeUnit.MILLISECONDS, 128);
            }
        
        // Failed time task registry
        private void addFailedRegistered(URL url) {
                FailedRegisteredTask oldOne = failedRegistered.get(url);
                if(oldOne ! =null) {
                    return;
                }
                FailedRegisteredTask newTask = new FailedRegisteredTask(url, this);
                oldOne = failedRegistered.putIfAbsent(url, newTask);
                if (oldOne == null) {
                    // never has a retry task. then start a new task for retry.
                    // If the old task does not exist, place the time wheel and start a new taskretryTimer.newTimeout(newTask, retryPeriod, TimeUnit.MILLISECONDS); }}Copy the code
  2. Scheduled heartbeat detection: RPC framework calls the heartbeat detection sent periodically to the server to maintain the connection state. We can encapsulate the heartbeat logic into a heartbeat task and put it in the clock wheel. How do we deal with the tasks that need to be repeated in a heartbeat and removed once in a clock wheel? We add another piece of logic at the end of the timed task logic to reset the execution time of the task and throw it back into the clock wheel. This allows loop execution.

    HeaderExchangeServer

    .// Set up the heartbeat time wheel. The default slot number is 128
        private static final HashedWheelTimer IDLE_CHECK_TIMER = new HashedWheelTimer(new NamedThreadFactory("dubbo-server-idleCheck".true), 1, TimeUnit.SECONDS, TICKS_PER_WHEEL); .// Start heartbeat task detection
            private void startIdleCheckTask(URL url) {
                if(! server.canHandleIdle()) { AbstractTimerTask.ChannelProvider cp = () -> unmodifiableCollection(HeaderExchangeServer.this.getChannels());
                    int idleTimeout = getIdleTimeout(url);
                    long idleTimeoutTick = calculateLeastDuration(idleTimeout);
                    CloseTimerTask closeTimerTask = new CloseTimerTask(cp, idleTimeoutTick, idleTimeout);
                    this.closeTimerTask = closeTimerTask;
        
                    // init task and start timer.
                    // Start the heartbeat detection taskIDLE_CHECK_TIMER.newTimeout(closeTimerTask, idleTimeoutTick, TimeUnit.MILLISECONDS); }}...Copy the code

    Connection detection, which is performed continuously, is added to the time wheel. AbstractTimerTask source code:

    @Override
    public void run(Timeout timeout) throws Exception {
        Collection<Channel> c = channelProvider.getChannels();
        for (Channel channel : c) {
            if (channel.isClosed()) {
                continue;
            }
            // Invoke the heartbeat detection task
            doTask(channel);
        }
        // Drop it back into the time wheel
        reput(timeout, tick);
    }
    Copy the code

    You can also refer to HeartbeatTimerTask, ReconnectTimerTask source code implementation.


This article was created and shared by Mirson. For further communication, please add to QQ group 19310171 or visit www.softart.cn