Making an HTTP call is essentially a network request, and network requests can time out, so consider the following:
- Whether the framework's default timeout is reasonable.
- Because the network is unstable, retrying after a timeout is worth considering, but only if the server interface is idempotent and therefore safe to retry.
- Whether the framework limits the number of concurrent connections, as browsers do, so that the limit on concurrent HTTP calls does not become a bottleneck when the service handles heavy concurrency.
When Spring Cloud is used for microservice development, Feign can be used for declarative service invocation. If only Spring Boot is used, Apache HttpClient can be used for service calls.
Connection timeout and read timeout
- The connection timeout parameter, ConnectTimeout, sets the maximum time to wait while the connection is being established.
- The read timeout parameter, ReadTimeout, sets the maximum time to wait for data to be read from the Socket.
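The two parameters exist in most HTTP clients. As a minimal sketch, here is how they map onto the JDK's built-in HttpURLConnection (used here only because it needs no extra dependency; the class name TimeoutConfig, the helper open, and the URL are illustrative assumptions, not part of Feign or Apache HttpClient):

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

class TimeoutConfig {
    /**
     * Prepares (but does not yet open) a connection with explicit timeouts.
     * connectTimeoutMs bounds the TCP connection phase,
     * readTimeoutMs bounds each wait for data from the socket.
     */
    static HttpURLConnection open(String url, int connectTimeoutMs, int readTimeoutMs)
            throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setConnectTimeout(connectTimeoutMs);
        conn.setReadTimeout(readTimeoutMs);
        return conn;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical endpoint; nothing is sent until the response is requested.
        HttpURLConnection conn = open("http://localhost:45678/", 2000, 3000);
        System.out.println(conn.getConnectTimeout() + " / " + conn.getReadTimeout());
    }
}
```

Setting both values explicitly avoids depending on a client's defaults, which for HttpURLConnection is 0, meaning wait indefinitely.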
Misconceptions about the connection timeout parameter and connection timeout errors
- _The connection timeout is configured to be extremely long, such as 60 seconds._ The TCP three-way handshake is very fast, usually taking milliseconds or at most a few seconds. If the connection cannot be established for a long time, the cause is more likely a routing or firewall problem than a slow handshake, so a long connection timeout is pointless (1 to 5 seconds is enough). For calls made purely within the intranet it can be shorter still, so that the call fails fast when the downstream service is offline and cannot be reached.
- _Troubleshooting a connection timeout without knowing what the client is actually connected to._ A service usually has multiple nodes. If clients reach it through client-side load balancing, the client connects directly to a server node, and a connection timeout is most likely a problem with the server. If the server side uses a reverse proxy such as Nginx for load balancing, the client is actually connected to Nginx, not the server, so when a connection timeout occurs you should check Nginx.
Misconceptions about the read timeout parameter and read timeout errors
- _Believing that the server's execution is interrupted when a read timeout occurs._ In fact, when the client hits a read timeout, the server keeps executing; the timeout only ends the client's wait.
- _Believing that read timeout is purely a Socket-level concept, the maximum time for data transmission, and therefore setting it very short, such as 100 milliseconds._ When a read timeout occurs, the network layer cannot tell whether the server has not returned data, whether the data is taking a long time on the network, or whether packets were lost. However, because TCP establishes a connection before transmitting data, for service calls on a network that is not particularly bad it is usually reasonable to treat a connection timeout as a network problem or an offline service, and a read timeout as the service taking too long to process. Specifically, the read timeout covers the time from writing the request to the Socket until the Socket returns data, and most or all of that time is spent by the server executing business logic.
- _Believing that a longer timeout means a higher success rate for the interface, and therefore setting the read timeout very long._ HTTP requests generally need a result, so they are synchronous calls. If the timeout is long, the client thread (usually a Tomcat thread) is blocked while waiting for the server. When the downstream service times out in large numbers, the application is dragged into creating a large number of threads and may eventually crash. For scheduled or asynchronous tasks, a longer read timeout is not a problem. Requests that serve user responses, or short and fast synchronous calls between microservices, usually run at high concurrency, so set a short read timeout to avoid being slowed down by the downstream service; a read timeout of more than 30 seconds is usually not appropriate. You might object that with a 2-second read timeout and a server interface that takes 3 seconds, the client will never get a result. That is true, so the read timeout must be set according to the actual situation: too long and downstream jitter affects the caller, too short and the success rate suffers. Sometimes different client read timeouts are even needed for different server interfaces, based on the SLAs of the downstream services.
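The first point, that the server keeps working after the client gives up, can be observed with a small experiment. The sketch below is an illustration under stated assumptions: it uses the JDK's built-in com.sun.net.httpserver.HttpServer as a stand-in for a real service, and the class name, the /slow path, and the 1000 ms / 200 ms figures are all made up for the demo.

```java
import com.sun.net.httpserver.HttpServer;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.SocketTimeoutException;
import java.net.URI;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class ReadTimeoutDemo {
    /** Returns true if the client timed out AND the server still finished its work. */
    static boolean serverFinishesAfterClientTimeout() throws Exception {
        CountDownLatch serverDone = new CountDownLatch(1);
        // Port 0 lets the OS pick a free port on the loopback interface.
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/slow", exchange -> {
            try {
                Thread.sleep(1000); // simulated business logic, slower than the client timeout
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            serverDone.countDown(); // the server-side work completed
            exchange.close();
        });
        server.start();
        try {
            int port = server.getAddress().getPort();
            HttpURLConnection conn = (HttpURLConnection)
                    URI.create("http://127.0.0.1:" + port + "/slow").toURL().openConnection();
            conn.setConnectTimeout(1000);
            conn.setReadTimeout(200); // shorter than the server's processing time
            boolean clientTimedOut = false;
            try {
                conn.getResponseCode();
            } catch (SocketTimeoutException e) {
                clientTimedOut = true; // the client stops waiting here
            }
            // The handler is still running; wait for it to report completion.
            boolean serverFinished = serverDone.await(5, TimeUnit.SECONDS);
            return clientTimedOut && serverFinished;
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serverFinishesAfterClientTimeout());
    }
}
```

This is why a timed-out call cannot be assumed to have failed on the server: a write may already have been applied by the time the client sees the exception.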
Feign and Ribbon work together. How should the timeouts be configured?
Conclusion 1: By default, Feign's read timeout is 1 second. Such a short read timeout is the first pitfall.
To change the two global default timeouts of the Feign client, set the following parameters:
feign.client.config.default.readTimeout=3000
feign.client.config.default.connectTimeout=3000
Conclusion 2: To configure Feign's read timeout, you must configure the connection timeout at the same time, otherwise the setting does not take effect.
If you open FeignClientFactoryBean, you can see that Request.Options is overridden only when both ConnectTimeout and ReadTimeout are set:
if (config.getConnectTimeout() != null && config.getReadTimeout() != null) {
    builder.options(new Request.Options(config.getConnectTimeout(), config.getReadTimeout()));
}
If you want to set the timeout for a single Feign Client, you can replace default with the Client’s name:
feign.client.config.default.readTimeout=3000
feign.client.config.default.connectTimeout=3000
feign.client.config.clientsdk.readTimeout=2000
feign.client.config.clientsdk.connectTimeout=2000
Conclusion 3: A timeout configured for a single client overrides the global timeout. This is as expected and not a pitfall.
Conclusion 4: Besides configuring Feign, you can also change the two timeouts through Ribbon parameters. Note that the first letter of each Ribbon parameter must be capitalized, unlike in Feign's configuration.
ribbon.ReadTimeout=4000
ribbon.ConnectTimeout=4000
When both Feign and Ribbon parameters are configured, as below, it is the Feign timeout that takes effect:
clientsdk.ribbon.listOfServers=localhost:45678
feign.client.config.default.readTimeout=3000
feign.client.config.default.connectTimeout=3000
ribbon.ReadTimeout=4000
ribbon.ConnectTimeout=4000
Conclusion 5: When timeouts are configured for both Feign and Ribbon, the Feign values win. This is somewhat counterintuitive: Ribbon sits at a lower level, so you might expect its configuration to be the one that finally applies, but it is not.
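The conclusions above can be summarized as a small precedence model. The following is only a simplified sketch of the behavior described in this section, not the actual Spring Cloud code: the class TimeoutResolution, the Timeouts holder, and the resolve method are hypothetical names, and a null value stands for "not configured".

```java
class TimeoutResolution {
    /** A pair of timeouts in milliseconds; a null field means "not configured". */
    static final class Timeouts {
        final Integer connectMs;
        final Integer readMs;

        Timeouts(Integer connectMs, Integer readMs) {
            this.connectMs = connectMs;
            this.readMs = readMs;
        }

        /** Mirrors the rule that Feign options apply only when both values are set. */
        boolean complete() {
            return connectMs != null && readMs != null;
        }
    }

    /**
     * Resolves the effective timeouts:
     * Ribbon values are the fallback, a complete Feign default config overrides them,
     * and a complete per-client Feign config overrides the default.
     */
    static Timeouts resolve(Timeouts ribbon, Timeouts feignDefault, Timeouts feignClient) {
        Timeouts effective = ribbon;
        if (feignDefault != null && feignDefault.complete()) {
            effective = feignDefault;
        }
        if (feignClient != null && feignClient.complete()) {
            effective = feignClient;
        }
        return effective;
    }

    public static void main(String[] args) {
        Timeouts ribbon = new Timeouts(4000, 4000);
        Timeouts feignDefault = new Timeouts(3000, 3000);
        Timeouts clientsdk = new Timeouts(2000, 2000);
        System.out.println(resolve(ribbon, feignDefault, clientsdk).readMs);
        // A Feign config with only readTimeout set is ignored, so Ribbon applies.
        System.out.println(resolve(ribbon, new Timeouts(null, 3000), null).readMs);
    }
}
```

Writing the rules down this way makes the trap in Conclusion 2 easy to see: an incomplete Feign configuration silently falls through to the next level.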
The Ribbon automatically retries the request
Some HTTP clients have a built-in retry policy. The intention is good: packet loss caused by network problems is frequent but short-lived, and a retry usually succeeds. But be careful if this is not the behavior you expect.
If you look at the Ribbon source, you can see that the MaxAutoRetriesNextServer parameter defaults to 1, which means that Ribbon automatically retries a GET request once on another node if a server node has a problem (such as a read timeout):
// DefaultClientConfigImpl
public static final int DEFAULT_MAX_AUTO_RETRIES_NEXT_SERVER = 1;
public static final int DEFAULT_MAX_AUTO_RETRIES = 0;

// RibbonLoadBalancedRetryPolicy
public boolean canRetry(LoadBalancedRetryContext context) {
    HttpMethod method = context.getRequest().getMethod();
    return HttpMethod.GET == method || lbContext.isOkToRetryOnAllOperations();
}

@Override
public boolean canRetrySameServer(LoadBalancedRetryContext context) {
    return sameServerCount < lbContext.getRetryHandler().getMaxRetriesOnSameServer()
            && canRetry(context);
}

@Override
public boolean canRetryNextServer(LoadBalancedRetryContext context) {
    // this will be called after a failure occurs and we increment the counter
    // so we check that the count is less than or equals to too make sure
    // we try the next server the right number of times
    return nextServerCount <= lbContext.getRetryHandler().getMaxRetriesOnNextServer()
            && canRetry(context);
}
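To see what these defaults mean for a failing call, here is a simplified model of the retry loop. It is a sketch and not Ribbon's implementation: the class RetryModel and the method attempts are hypothetical, and the remote call is represented by a BooleanSupplier that returns true on success.

```java
import java.util.function.BooleanSupplier;

class RetryModel {
    /**
     * Counts how many times the call is attempted.
     * Only GET is retried unless okToRetryOnAllOperations is true,
     * matching the canRetry check quoted above.
     */
    static int attempts(String method, int maxRetriesSameServer, int maxRetriesNextServer,
                        boolean okToRetryOnAllOperations, BooleanSupplier call) {
        boolean retryable = "GET".equals(method) || okToRetryOnAllOperations;
        int attempts = 0;
        // Outer loop: the first server plus up to maxRetriesNextServer other servers.
        for (int server = 0; server <= maxRetriesNextServer; server++) {
            // Inner loop: the first try plus up to maxRetriesSameServer retries.
            for (int same = 0; same <= maxRetriesSameServer; same++) {
                attempts++;
                if (call.getAsBoolean()) {
                    return attempts; // success, stop here
                }
                if (!retryable) {
                    return attempts; // failure on a non-retryable method
                }
            }
        }
        return attempts;
    }

    public static void main(String[] args) {
        BooleanSupplier alwaysTimesOut = () -> false;
        // Defaults from the text: MaxAutoRetries=0, MaxAutoRetriesNextServer=1.
        System.out.println(attempts("GET", 0, 1, false, alwaysTimesOut));
        System.out.println(attempts("POST", 0, 1, false, alwaysTimesOut));
    }
}
```

With the defaults, a GET that times out is sent twice, so a GET endpoint with side effects runs its logic twice; the same request as POST is sent only once.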
Solutions:
- First, change the interface from GET to POST. This is really an API design problem: APIs that change state should not be defined as GET. According to the HTTP specification, GET requests are for querying data, while POST submits data to the server for modification or creation. The choice between GET and POST should be based on the behavior of the API, not on parameter size. A common misconception is that because GET parameters are carried in the URL query string and are limited by browser URL length, large parameters should be submitted as JSON via POST and small parameters via GET.
- Second, set the MaxAutoRetriesNextServer parameter to 0 to disable the automatic retry on the next server node after a call fails. Add a line to the configuration file:
ribbon.MaxAutoRetriesNextServer=0
Concurrency limits constrain the crawler's throughput
Besides the timeout and retry pitfalls, another common problem when making HTTP requests is that a limit on the number of concurrent requests caps the processing capacity of the program.
Check the source code of PoolingHttpClientConnectionManager and you will notice two important parameters:
- defaultMaxPerRoute = 2: the maximum number of concurrent requests to the same host/domain name is 2. Our crawler needs 10 concurrent requests, so the default is clearly too small and limits the crawler's efficiency.
- maxTotal = 20: the maximum concurrency across all hosts is 20, which is the overall concurrency of the HttpClient. At present we issue 10 requests with a maximum concurrency of 10, so 20 is not a bottleneck. As an example, if the same HttpClient accesses 10 domain names and defaultMaxPerRoute is set to 10, maxTotal must be set to 100 to guarantee 10 concurrent requests for each domain name.
public PoolingHttpClientConnectionManager(
        final HttpClientConnectionOperator httpClientConnectionOperator,
        final HttpConnectionFactory<HttpRoute, ManagedHttpClientConnection> connFactory,
        final long timeToLive, final TimeUnit timeUnit) {
    …
    this.pool = new CPool(new InternalConnectionFactory(
            this.configData, connFactory), 2, 20, timeToLive, timeUnit);
    …
}

public CPool(
        final ConnFactory<HttpRoute, ManagedHttpClientConnection> connFactory,
        final int defaultMaxPerRoute, final int maxTotal,
        final long timeToLive, final TimeUnit timeUnit) {
    …
}
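The effect of a per-route limit can be illustrated without HttpClient at all. The sketch below models the connection pool for one host as a Semaphore, which is an assumption made for the demo and not how HttpClient is implemented; RouteLimitModel and peakConcurrency are hypothetical names, and the 50 ms sleep stands in for one HTTP request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

class RouteLimitModel {
    /** Runs the given number of simulated requests and returns the peak number in flight. */
    static int peakConcurrency(int maxPerRoute, int requests) throws Exception {
        Semaphore permits = new Semaphore(maxPerRoute); // stand-in for the per-route pool
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        // One thread per request, so the thread pool itself is never the bottleneck.
        ExecutorService workers = Executors.newFixedThreadPool(requests);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                futures.add(workers.submit(() -> {
                    permits.acquire(); // wait for a free "connection"
                    try {
                        int now = inFlight.incrementAndGet();
                        peak.accumulateAndGet(now, Math::max);
                        Thread.sleep(50); // simulated request duration
                    } finally {
                        inFlight.decrementAndGet();
                        permits.release();
                    }
                    return null;
                }));
            }
            for (Future<?> f : futures) {
                f.get(); // propagate any failure from the workers
            }
        } finally {
            workers.shutdown();
        }
        return peak.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("limit 2:  peak " + peakConcurrency(2, 10));
        System.out.println("limit 10: peak " + peakConcurrency(10, 10));
    }
}
```

Even with 10 threads issuing requests, a limit of 2 means at most two are in flight at once, so the other threads just queue for a connection.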
Solution:
Declare a new HttpClient that relaxes these limits, setting maxPerRoute to 50 and maxTotal to 100:
httpClient2 = HttpClients.custom().setMaxConnPerRoute(50).setMaxConnTotal(100).build();
100 Common Mistakes in Java Business Development