Table of contents

- Overview
- Dependency import
- The configuration file
- Isolation mechanism
- Retry mechanism
Overview

When Netflix Zuul is used as a gateway to forward incoming requests to back-end services, there is always a chance that a back-end service is temporarily unavailable and the request fails.

When a request fails, you may want it to be retried automatically. To do this with Spring Cloud Netflix, include Spring Retry on your application's classpath. When Spring Retry is present, a load-balanced Zuul automatically retries failed requests (in the example below, Zuul retries twice when the back-end service is down).
- The default HTTP client used by Zuul is now the Apache HTTP client instead of the Ribbon's deprecated RestClient.
- The Netflix Ribbon HTTP client can be enabled by setting `ribbon.restclient.enabled=true`. This client has limitations, including no support for the PATCH method, but it does have built-in retry capability.

The corresponding client source code:
/**
* An Apache HTTP client which leverages Spring Retry to retry failed requests.
*
* @author Ryan Baxter
* @author Gang Li
*/
public class RetryableRibbonLoadBalancingHttpClient
        extends RibbonLoadBalancingHttpClient {
Dependency import
<dependency>
    <groupId>org.springframework.retry</groupId>
    <artifactId>spring-retry</artifactId>
    <version>1.3.0</version>
</dependency>
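With spring-retry on the classpath, retries can then be switched on per route or globally. A minimal sketch of the relevant property (the `zuul.retryable` property is documented in Spring Cloud Netflix; the value here is illustrative):

```yaml
zuul:
  retryable: true   # enable retry for all routes (can also be set per route)
```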
The configuration file
Configuration items:
- ribbon.MaxAutoRetries: 1 – maximum number of retries on the same server (excluding the first attempt)
- ribbon.MaxAutoRetriesNextServer: 1 – maximum number of other servers to try next (excluding the first server)
- ribbon.OkToRetryOnAllOperations: true – whether to retry all operations (not just GET) for this client
- ribbon.ServerListRefreshInterval: 2000 – refresh interval of the server list, in milliseconds
roadnet-service:
  ribbon:
    NIWSServerListClassName: com.netflix.loadbalancer.ConfigurationBasedServerList
    listOfServers: http://10.7.11.13:9006,http://localhost:8081
    ConnectTimeout: 1000
    ReadTimeout: 3000
    MaxTotalHttpConnections: 500
    MaxConnectionsPerHost: 100
    MaxAutoRetries: 1
    MaxAutoRetriesNextServer: 1
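To see what these numbers mean in practice: with `MaxAutoRetries: 1` and `MaxAutoRetriesNextServer: 1`, one client call can translate into up to four HTTP attempts — the first attempt plus one retry on the first server, then the same again on one other server. A small self-contained sketch of that arithmetic (plain Java, no Ribbon dependency; `totalAttempts` is a hypothetical helper, not a Ribbon API):

```java
// Illustration of how Ribbon's retry settings combine.
// Total attempts = (1 + MaxAutoRetries) attempts per server,
// across (1 + MaxAutoRetriesNextServer) servers.
public class RetryMath {
    static int totalAttempts(int maxAutoRetries, int maxAutoRetriesNextServer) {
        return (1 + maxAutoRetries) * (1 + maxAutoRetriesNextServer);
    }

    public static void main(String[] args) {
        // The configuration above: MaxAutoRetries: 1, MaxAutoRetriesNextServer: 1
        System.out.println(totalAttempts(1, 1)); // up to 4 HTTP attempts
    }
}
```

A practical consequence worth checking in your own setup: with `ReadTimeout: 3000`, the worst case for one call is roughly four attempts times the read timeout, so any outer timeout (for example a Hystrix command timeout) should be large enough to cover the full retry budget, or retries will be cut short.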
Isolation mechanism
In the microservices model, the coupling between applications becomes looser, and ideally any application that gets overloaded or dies should not affect the others. But at the Gateway level, is it possible for one application to become so overloaded that the Gateway itself collapses and all applications are cut off?

This is certainly possible. Imagine an application that receives many requests per second. Under normal circumstances these requests respond within 10 milliseconds, but if something goes wrong one day, every request blocks until a 30-second timeout expires (for example, frequent Full GCs fail to free memory efficiently). At that point the Gateway also accumulates a large number of threads waiting for responses, eventually exhausting its threads and affecting requests to other, healthy applications.
In Zuul, each back-end application is called a Route. To prevent one Route from preempting too many resources and affecting other routes, Zuul uses Hystrix to isolate and limit traffic for each Route.
Hystrix has two isolation strategies: thread-based and semaphore-based. Zuul uses thread-based isolation by default, which means each Route's requests execute in a fixed-size, separate thread pool; if one Route has a problem, only its thread pool blocks and the other routes are unaffected. (In 2.27, semaphore is the default.)

With Hystrix, the semaphore isolation strategy is typically used only when the call volume is so high that per-request thread overhead becomes a problem; for network requests such as those Zuul forwards, thread isolation is the safer choice.
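If you do want semaphore isolation at the gateway, Spring Cloud Zuul exposes it as configuration. A sketch (property names as documented in Spring Cloud Netflix; the limit value is illustrative):

```yaml
zuul:
  ribbon-isolation-strategy: SEMAPHORE
  semaphore:
    max-semaphores: 100   # max concurrent requests per route before rejecting
```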
Retry mechanism
In general, the health of back-end applications is unstable and the list of applications can change at any time, so the Gateway must have sufficient fault tolerance to reduce the impact of back-end application changes.
Zuul routes requests in two modes: via Eureka and via the Ribbon. The following describes the fault-tolerance configuration supported by the Ribbon.
Three switches control when a retry happens:

- OkToRetryOnConnectErrors: retry network (connection) errors only
- OkToRetryOnAllErrors: retry all errors
- OkToRetryOnAllOperations: retry all operations, not just GET
There are two types of retries:
- MaxAutoRetries: maximum number of retries on the same node
- MaxAutoRetriesNextServer: maximum number of other nodes to try
In general, we want to retry only when the network connection fails, or to retry GET requests that return a 5xx response (retrying POST requests is not recommended, because they are not idempotent and a retry can cause data inconsistency). Keep the number of retries on a single node as small as possible and the number of nodes tried as large as possible for better overall performance.

For more complicated retry scenarios, such as retrying only specific APIs or only specific return values, you can implement custom logic with RequestSpecificRetryHandler (using RetryHandler directly is not recommended, because this subclass gives you a lot of existing functionality for free).
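The policy described above — always retry connection errors, retry a 5xx response only for idempotent GETs — can be sketched as a plain decision function. This is an illustration of the rule, not Ribbon's actual implementation; `shouldRetry` and its parameters are hypothetical names:

```java
// Hypothetical sketch of the retry policy described above:
// retry network/connect errors always, retry 5xx only for GET requests.
public class RetryPolicy {
    static boolean shouldRetry(String httpMethod, boolean connectError, int httpStatus) {
        if (connectError) {
            return true; // nothing reached the server, so retrying is safe
        }
        boolean serverError = httpStatus >= 500 && httpStatus < 600;
        // Only GET is idempotent enough to retry on a 5xx here;
        // retrying a POST could duplicate a write.
        return serverError && "GET".equalsIgnoreCase(httpMethod);
    }

    public static void main(String[] args) {
        System.out.println(shouldRetry("GET", false, 502));  // true
        System.out.println(shouldRetry("POST", false, 502)); // false
        System.out.println(shouldRetry("POST", true, 0));    // true
    }
}
```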