Sentinel has supported the core scenarios of Alibaba’s Singles’ Day traffic surges for nearly 10 years

This article introduces the features, principles, architecture, and quick start of Sentinel, Alibaba’s open source traffic limiting and circuit breaking component, and compares it with related frameworks

Basic introduction

1 Terminology

  • Service traffic limiting: When system resources are insufficient to handle a large number of requests, the system limits traffic or functionality based on preset rules

  • Circuit breaking: When a large number of requests to a target service time out or fail, the caller executes a local default (fallback) method for subsequent calls to that service interface, which avoids long blocking that would affect other services

  • Service degradation: To keep core services running normally under heavy request volume, the priority of some services is lowered according to the actual business situation and traffic, and those services are selectively left unprocessed or handled in a simplified way

Service degradation can be triggered by a manual switch (flash sales, e-commerce promotions, etc.) or by automatic detection (timeouts, failure counts, faults). Circuit breaking can be understood as one form of degradation that handles service faults

2 Why traffic limiting and degradation are needed

Without traffic control, system resources become fully occupied, services time out, and the system becomes unavailable to all users. Service traffic limiting controls the number of requests. Service degradation frees the system resources occupied by non-core services so that the system can serve as many users as possible

3 Sentinel overview

Sentinel, the traffic guard for distributed systems, was open sourced by the Alibaba middleware team in July 2018. It is a lightweight flow control product for distributed service architectures. It takes traffic as the entry point and protects service stability along multiple dimensions such as flow control, circuit breaking and degradation, and system load protection

Sentinel’s open source ecosystem:

Features

1 General Introduction

Sentinel has the following characteristics:

Rich application scenarios: flash sale (seckill) traffic limiting, message peak shaving, cluster flow control, real-time circuit breaking of unavailable downstream applications, etc

Complete real-time monitoring: From the console, you can see second-level data for a single connected machine, or the aggregated metrics of a cluster of fewer than 500 machines

Extensive open source ecosystem: Sentinel provides out-of-the-box integration modules for other open source frameworks and libraries such as Spring Cloud, Dubbo, and gRPC. Adding the corresponding dependency and a little configuration is enough to integrate Sentinel

Well-designed SPI extension points: Sentinel provides easy-to-use SPI extension interfaces. You can quickly customize logic by implementing one, for example to customize rule management or adapt a dynamic data source

Sentinel is divided into two parts:

The Console (Dashboard) is based on Spring Boot and can be packaged to run directly without the need for additional application containers such as Tomcat

The core library (Java client) is independent of any framework/library, can run in all Java runtime environments, and has good support for frameworks such as Dubbo/Spring Cloud

2 Console Features

  • Real-time monitoring: supports automatic discovery of cluster machines, service health status, passed/rejected QPS of service calls, call latency, and chart statistics

  • Rule management and push: you can configure flow control, degrade, and hotspot rules and push them in real time

  • Authentication: the console supports user-defined authentication interfaces and provides basic login functions

3 Core library functions and features

(1) Application flow control

Application flow control applies to a specific application instance. It monitors the QPS or the number of concurrent threads of the application’s traffic and limits traffic once a specified threshold is reached, so that the application is not overwhelmed by instantaneous peaks and stays highly available
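The threshold check can be sketched as follows. This is a minimal illustration, not Sentinel's implementation: it assumes a fixed one-second window (Sentinel uses a sliding window), and the class name `QpsLimiter` and its `tryPass` method are hypothetical. The current time is passed in so the behavior is easy to reason about.

```java
/** Hypothetical sketch: counts requests per one-second window and rejects those over the threshold. */
final class QpsLimiter {
    private final int threshold;      // maximum requests admitted per second
    private long windowStartMs = -1;  // start of the current one-second window
    private int count;                // requests admitted in the current window

    QpsLimiter(int threshold) {
        this.threshold = threshold;
    }

    /** Returns true if the request passes, false if it is rejected. */
    synchronized boolean tryPass(long nowMs) {
        long start = nowMs - (nowMs % 1000);
        if (start != windowStartMs) { // a new window begins: reset the counter
            windowStartMs = start;
            count = 0;
        }
        if (count >= threshold) {
            return false;             // threshold reached: reject immediately
        }
        count++;
        return true;
    }
}
```

With a threshold of 2, the third request inside the same second is rejected, and the first request of the next second passes again.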

Flow control means include:

  • Direct rejection
  • Warm Up (preheating/cold start): traffic is allowed to grow slowly, reaching the threshold only after a certain period, which gives a cold system time to warm up instead of being overwhelmed instantly
  • Uniform queuing: the interval between requests is strictly controlled so that requests pass at a uniform rate
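To make the uniform queuing idea concrete, here is a minimal sketch under stated assumptions: each request is assigned a pass time at least one interval after the previous one, and a request is rejected when its wait would exceed a queueing timeout. The class `UniformPacer`, its parameters, and the injected clock are hypothetical and are not Sentinel's API.

```java
/** Hypothetical sketch of uniform-rate queuing: passed requests are spaced at a fixed interval. */
final class UniformPacer {
    private final long intervalMs;    // minimum spacing between two passed requests
    private final long maxQueueingMs; // longest wait tolerated before a request is rejected
    private long latestPassedMs = -1; // scheduled pass time of the most recent request

    /** Assumes 1 <= qps <= 1000 so that the interval is at least one millisecond. */
    UniformPacer(int qps, long maxQueueingMs) {
        this.intervalMs = 1000L / qps;
        this.maxQueueingMs = maxQueueingMs;
    }

    /**
     * Returns how many milliseconds the caller should wait before proceeding,
     * or -1 if the wait would exceed the queueing timeout and the request is rejected.
     */
    synchronized long acquire(long nowMs) {
        long expected = (latestPassedMs < 0)
                ? nowMs
                : Math.max(nowMs, latestPassedMs + intervalMs);
        long wait = expected - nowMs;
        if (wait > maxQueueingMs) {
            return -1;                // queue is too long: reject
        }
        latestPassedMs = expected;    // reserve this time slot for the caller
        return wait;
    }
}
```

For example, with 10 QPS (a 100 ms interval) and a 150 ms timeout, three requests arriving at the same instant get waits of 0 ms and 100 ms, and the third is rejected.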

(2) Cluster flow control

Application flow control limits traffic against the threshold of a single application instance. Cluster flow control instead limits the total number of calls across the entire cluster. It is useful in scenarios such as:

  • You need to limit the total QPS with which a user can call an API, and the application providing the API is deployed as multiple instances on multiple machines
  • Traffic is unevenly distributed across instances, so some machines start limiting before the cluster-wide total is reached

Limiting only at the single-machine level cannot control the total traffic accurately. Controlling the total number of calls for the whole cluster, with single-machine limiting as a fallback, makes flow control more effective
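The effect of uneven traffic can be illustrated with a small calculation. The sketch below is hypothetical (the class `ClusterLimitDemo` and its methods are not part of Sentinel); a shared counter stands in for a central component that hands out the cluster-wide budget.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: compares per-instance limits with one shared cluster-wide budget. */
final class ClusterLimitDemo {

    /** Requests passed when every instance enforces only its own local threshold. */
    static long passedWithLocalLimits(long[] trafficPerInstance, long localThreshold) {
        long passed = 0;
        for (long traffic : trafficPerInstance) {
            passed += Math.min(traffic, localThreshold);
        }
        return passed;
    }

    /** Requests passed when all instances draw from one shared counter. */
    static long passedWithClusterLimit(long[] trafficPerInstance, long clusterThreshold) {
        AtomicLong remaining = new AtomicLong(clusterThreshold);
        long passed = 0;
        for (long traffic : trafficPerInstance) {
            for (long i = 0; i < traffic; i++) {
                if (remaining.getAndDecrement() > 0) {
                    passed++;
                }
            }
        }
        return passed;
    }
}
```

Suppose the intended cluster total is 100 QPS over two instances, so each gets a local threshold of 50. With traffic split 80/20, local limits pass only 50 + 20 = 70 requests, while the shared budget passes all 100.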

(3) Gateway flow control

Sentinel supports traffic limiting for mainstream API gateways such as Spring Cloud Gateway and Zuul

Gateway flow control provides traffic limiting rules tailored to API gateway scenarios. Traffic can be limited per route or per user-defined API group, and rules can target the path, parameters, headers, and source IP address of a request

(4) Circuit breaking and degradation

If a resource in the call chain is unstable, requests eventually pile up. Circuit breaking and degradation restricts calls to a resource when it becomes unstable (call timeouts, a rising exception ratio, or a rising exception count), so that requests fail fast and do not affect other resources or cause cascading errors

After a resource is degraded, all calls to it are automatically rejected within the following degrade time window (the default behavior is to throw a DegradeException). Once the window has elapsed, calls are allowed again, and the resource is degraded again if it is still unstable
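The degrade time window behavior can be sketched as a small state holder. This is a simplified illustration under stated assumptions, not Sentinel's implementation: it trips on a plain failure count with no statistics window, and the class `SimpleCircuitBreaker` and its methods are hypothetical.

```java
/** Hypothetical sketch of exception-count circuit breaking with a degrade time window. */
final class SimpleCircuitBreaker {
    private final int maxFailures;   // number of failures that trips the breaker
    private final long timeWindowMs; // how long calls are rejected once tripped
    private int failures;            // failures seen since the breaker was last closed
    private long openedAtMs = -1;    // -1 means the breaker is closed

    SimpleCircuitBreaker(int maxFailures, long timeWindowMs) {
        this.maxFailures = maxFailures;
        this.timeWindowMs = timeWindowMs;
    }

    /** Returns true if a call is allowed at the given time. */
    synchronized boolean allow(long nowMs) {
        if (openedAtMs >= 0 && nowMs - openedAtMs < timeWindowMs) {
            return false;            // still inside the degrade window: fail fast
        }
        if (openedAtMs >= 0) {       // window elapsed: close and start counting afresh
            openedAtMs = -1;
            failures = 0;
        }
        return true;
    }

    /** Records a failed call and trips the breaker when the failure count reaches the limit. */
    synchronized void recordFailure(long nowMs) {
        if (openedAtMs < 0 && ++failures >= maxFailures) {
            openedAtMs = nowMs;
        }
    }
}
```

A caller would check `allow(...)` before invoking the resource and fall back to a local default when it returns false.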

(5) Hotspot parameter traffic limiting

Hotspot data is data that is accessed frequently. Hotspot parameter traffic limiting collects statistics on the hot values of the parameters passed in and, based on the configured threshold and mode, limits calls to the resource that carry those hot values. For example:

  • With the user ID as the parameter, limit the QPS of each user on an interface
  • With the commodity ID as the parameter, limit the access frequency of each commodity on an interface
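A minimal sketch of per-parameter limiting, assuming a fixed one-second window and an unbounded map of counters (a real implementation would need to bound the number of tracked values). The class `HotParamLimiter` is hypothetical and is not Sentinel's API.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: limits QPS separately for each parameter value (e.g. each user ID). */
final class HotParamLimiter {
    private final int thresholdPerValue;                         // max requests per value per second
    private final Map<Object, Integer> counts = new HashMap<>(); // admitted requests per value
    private long windowStartMs = -1;                             // start of the current window

    HotParamLimiter(int thresholdPerValue) {
        this.thresholdPerValue = thresholdPerValue;
    }

    /** Returns true if a request carrying this parameter value passes. */
    synchronized boolean tryPass(Object paramValue, long nowMs) {
        long start = nowMs - (nowMs % 1000);
        if (start != windowStartMs) { // a new window begins: forget all counters
            windowStartMs = start;
            counts.clear();
        }
        int used = counts.getOrDefault(paramValue, 0);
        if (used >= thresholdPerValue) {
            return false;             // this particular value is over its threshold
        }
        counts.put(paramValue, used + 1);
        return true;
    }
}
```

Because counters are kept per value, a hot user ID is throttled without affecting requests from other users on the same interface.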

(6) System adaptive traffic limiting

Traditional adaptive traffic limiting based on operating system load (load1, as reported by uptime on Linux) reacts with delay and lets system performance recover slowly. To address this, Sentinel takes a different approach: instead of limiting traffic on an indirect metric (system load), it balances the number of requests the system can handle against the number of requests it admits

The goal is to raise system throughput as far as possible without dragging the system down, rather than to keep the load below a certain threshold

System protection rules control inbound traffic at the application level and monitor the application by overall load, RT, inbound QPS, and thread count on a single machine. Traffic limiting is applied when the measured value reaches a threshold. The following thresholds are supported:

  • Load: System protection is triggered only when load1 exceeds the threshold and the number of concurrent threads exceeds the system capacity. The system capacity is estimated at runtime as maxQps * minRt (minimum response time)
  • RT: The average RT (response time) of all inbound traffic on a single machine
  • Number of threads: The number of concurrent threads of all inbound traffic on a single machine
  • Inbound QPS: The QPS of all inbound traffic on a single machine
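The Load condition can be worked through with a short sketch. It assumes minRt is measured in milliseconds, so the capacity in concurrent requests is maxQps * minRt / 1000; the class `SystemLoadCheck` and its method names are hypothetical.

```java
/** Hypothetical sketch of the Load rule: block only if load1 AND concurrency are both too high. */
final class SystemLoadCheck {

    /** Estimated capacity in concurrent requests, with minRt given in milliseconds. */
    static double capacity(double maxQps, double minRtMs) {
        return maxQps * minRtMs / 1000.0;
    }

    /** Returns true if inbound traffic should be limited under the Load rule. */
    static boolean shouldBlock(double load1, double loadThreshold,
                               int concurrentThreads, double maxQps, double minRtMs) {
        return load1 > loadThreshold
                && concurrentThreads > capacity(maxQps, minRtMs);
    }
}
```

For example, with a maximum of 200 QPS and a minimum RT of 50 ms, the estimated capacity is 200 * 0.05 = 10 concurrent requests. If load1 is above its threshold, 12 concurrent threads trigger protection and 8 do not.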

(7) Blacklist and whitelist control

Blacklist and whitelist control decides whether a request may access a resource based on the origin of the request. If a whitelist is configured, a request passes only when its origin is in the whitelist. If a blacklist is configured, requests whose origin is in the blacklist are rejected and all others pass
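The two modes reduce to a simple membership test, sketched below. The class `OriginAuthority` and its `Mode` enum are hypothetical names used for illustration only.

```java
import java.util.Set;

/** Hypothetical sketch of origin-based access control in whitelist or blacklist mode. */
final class OriginAuthority {
    enum Mode { WHITE, BLACK }

    private final Mode mode;
    private final Set<String> origins; // origins named by the rule

    OriginAuthority(Mode mode, Set<String> origins) {
        this.mode = mode;
        this.origins = origins;
    }

    /** Returns true if a request from the given origin may access the resource. */
    boolean pass(String origin) {
        boolean listed = origins.contains(origin);
        // Whitelist: only listed origins pass. Blacklist: only unlisted origins pass.
        return mode == Mode.WHITE ? listed : !listed;
    }
}
```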

Quick start

1 Installing the Console

Download the latest console JAR package from the GitHub release page (github.com/alibaba/Sen…)

Start the console from the command line:

java -Dserver.port=8080 -Dcsp.sentinel.dashboard.server=localhost:8080 -Dproject.name=sentinel-dashboard -jar sentinel-dashboard.jar

2 Connecting an application to Sentinel

Sentinel provides adapters for common mainstream frameworks, including Dubbo, Spring Boot, Spring WebFlux, gRPC, Zuul, Spring Cloud Gateway, RocketMQ, and Web Servlet. Resources that need traffic limiting can be protected either with native Java try-catch code or with annotations

The following uses the common Spring Boot annotation approach as an example:

<dependency>
    <groupId>com.alibaba.cloud</groupId>
    <artifactId>spring-cloud-starter-alibaba-sentinel</artifactId>
    <version>2.1.0.RELEASE</version>
</dependency>

Specify the console address in application.yml:

spring:
  cloud:
    sentinel:
      transport:
        dashboard: ip:port

Define the resources that need to be restricted:

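import com.alibaba.csp.sentinel.annotation.SentinelResource;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class TestController {

    @GetMapping(value = "/hello")
    // Define the name of the resource to be restricted as hello
    @SentinelResource("hello")
    public String hello() {
        return "Hello Sentinel";
    }
}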

After the /hello HTTP endpoint above has been requested once, Sentinel client initialization is triggered and the interface appears in the console

Add a flow control rule:

If you request the interface frequently, you can see that some requests are rejected:

Note: The above configuration is not persistent and is not recommended for production environments

3 Rule Configuration

Sentinel provides dynamic rule data source support to manage and read configured rules dynamically. Its ReadableDataSource and WritableDataSource interfaces are simple and easy to use

The dynamic rule sources adapt to common configuration centers and remote storage. Nacos, ZooKeeper, Apollo, Redis, and others are currently supported, which covers many production scenarios

Implementation principle

The following describes the basic principles of the Sentinel client

1 Basic Concepts

  • Resource: In Sentinel, any method or code block that needs traffic protection can be called a resource. Each resource must define a unique resource name used to match the related rules

  • Entry: the entry class of Sentinel’s functionality. An Entry can be created automatically by the adapters for mainstream frameworks, or explicitly through annotations or the SphU API. Once it is created, the resource is matched against the rules and checked

  • Slot: functional slots are created through the Entry class. Each resource corresponds to a series of slots that collect resource information, match rules, and perform checks. Multiple slots form a Slot Chain, whose entry() and exit() methods are called on entering and leaving the resource, following the chain of responsibility pattern

2 Working Principle

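String resourceName = "resourceName";
Entry entry = null;
try {
    // Create the Entry; this runs the rule checks for the resource
    entry = SphU.entry(resourceName);
    System.out.println("resource running");
} catch (BlockException e) {
    // The request was blocked by a traffic limiting or degrade rule
    throw e;
} catch (Throwable e) {
    e.printStackTrace();
    throw e;
} finally {
    // Always exit the Entry so that the statistics are completed
    if (entry != null) {
        entry.exit();
    }
}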

The main process is as follows:

  • Before the resource method is entered, an Entry is created through SphU. The Entry looks up the Slot Chain associated with the resource, creating it if none exists, and then calls the chain’s entry() method following the chain of responsibility pattern
  • The resource method is invoked
  • After the resource method call completes, the Entry triggers the exit() logic of the slots
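The three steps above can be sketched with a toy slot chain. This is an illustration of the chain of responsibility idea only, not Sentinel's classes: `Slot`, `TracingSlot`, `SlotChains`, and the slot names "statistic" and "flow" are hypothetical, and each slot merely records that it was called.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical slot: one processing step that runs on entering and leaving a resource. */
interface Slot {
    void entry(String resource, List<String> trace);
    void exit(String resource, List<String> trace);
}

/** A slot that only records its calls, standing in for statistics or rule checks. */
final class TracingSlot implements Slot {
    private final String name;

    TracingSlot(String name) {
        this.name = name;
    }

    public void entry(String resource, List<String> trace) {
        trace.add(name + ".entry(" + resource + ")");
    }

    public void exit(String resource, List<String> trace) {
        trace.add(name + ".exit(" + resource + ")");
    }
}

/** Keeps one slot chain per resource name and runs a body inside it. */
final class SlotChains {
    private final Map<String, List<Slot>> chains = new HashMap<>();

    /** Looks up the chain for a resource, creating it on first use. */
    List<Slot> lookup(String resource) {
        return chains.computeIfAbsent(resource, r -> {
            List<Slot> chain = new ArrayList<>();
            chain.add(new TracingSlot("statistic"));
            chain.add(new TracingSlot("flow"));
            return chain;
        });
    }

    /** Runs entry() on every slot, then the resource body, then exit() on every slot. */
    List<String> run(String resource, Runnable body) {
        List<String> trace = new ArrayList<>();
        List<Slot> chain = lookup(resource);
        for (Slot slot : chain) {
            slot.entry(resource, trace);
        }
        try {
            body.run();
        } finally {
            for (Slot slot : chain) {
                slot.exit(resource, trace);
            }
        }
        return trace;
    }
}
```

In a real chain, a rule-checking slot would throw an exception from entry() to block the request before the resource body runs.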

Framework comparison

| Feature | Sentinel | Hystrix | resilience4j |
|---|---|---|---|
| Isolation strategy | Semaphore isolation (concurrent thread limiting) | Thread pool isolation / semaphore isolation | Semaphore isolation |
| Circuit breaking and degradation strategy | Based on response time, exception ratio, number of exceptions | Based on exception ratio | Based on exception ratio, response time |
| Real-time statistics implementation | Sliding window (LeapArray) | Sliding window (based on RxJava) | Ring Bit Buffer |
| Dynamic rule configuration | Supports multiple data sources | Supports multiple data sources | Limited support |
| Extensibility | Multiple extension points | Plug-in form | Interface form |
| Annotation-based support | Supported | Supported | Supported |
| Traffic limiting | Based on QPS; limiting based on call relationships is supported | Limited support | Rate Limiter |
| Traffic shaping | Supports warm-up mode, uniform rate mode, warm-up queuing mode | Not supported | Simple Rate Limiter mode |
| System adaptive protection | Supported | Not supported | Not supported |
| Console | Out-of-the-box console for configuring rules, viewing second-level monitoring, machine discovery, and more | Simple monitoring view | No console provided; other monitoring systems can be connected |

It is worth adding that Hystrix’s thread pool isolation provides better isolation, but at the cost of too many threads and considerable thread context switching overhead, which hurts low-latency calls in particular

Sentinel’s concurrent thread limiting does not create or manage a thread pool. It simply counts the threads in the current request context and immediately rejects new requests once the threshold is exceeded, similar to semaphore isolation
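The counting approach can be sketched in a few lines. This is a minimal illustration assuming a single shared counter per resource; the class `ConcurrencyLimiter` and its methods are hypothetical and not Sentinel's API.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: limits concurrency by counting in-flight calls, without a thread pool. */
final class ConcurrencyLimiter {
    private final int maxConcurrent;                          // threshold for concurrent calls
    private final AtomicInteger inFlight = new AtomicInteger(); // calls currently executing

    ConcurrencyLimiter(int maxConcurrent) {
        this.maxConcurrent = maxConcurrent;
    }

    /** Returns true if the caller may proceed; the caller must then call exit() when done. */
    boolean tryEnter() {
        if (inFlight.incrementAndGet() > maxConcurrent) {
            inFlight.decrementAndGet(); // undo the reservation and reject immediately
            return false;
        }
        return true;
    }

    /** Marks one call as finished. */
    void exit() {
        inFlight.decrementAndGet();
    }
}
```

The request keeps running on the caller's own thread, so no extra threads or context switches are introduced; only the counter decides whether a new call is admitted.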

References

Sentinel Official Document

Github.com/alibaba/Sen…

Migrating from Hystrix to Sentinel

Github.com/alibaba/Sen…