Background:
In production, a large number of repayment orders were suddenly suspended. We traced the cause to Hystrix exceptions thrown during calls between internal systems. The problem never appeared in development, because the core thread count there was set fairly high. It did appear occasionally in the test environment. The fix I found online was to raise the maxQueueSize property, and after I adjusted it things did improve. When the problem came back after running in production for a while, my first thought was to check maxQueueSize, but it was already set. While wondering why maxQueueSize had no effect, I read the official documentation and found that Hystrix has another property, queueSizeRejectionThreshold, which controls the queue size at which requests start being rejected. Its default is only 5, so no matter how large maxQueueSize is set, the extra queue capacity is never used. Both properties must be configured together.
Here is the correct way to configure Hystrix.
application.yml:
hystrix:
  threadpool:
    default:
      coreSize: 200 # Maximum number of concurrent threads. Default: 10
      maxQueueSize: 1000 # Maximum size of the BlockingQueue. Default: -1
      queueSizeRejectionThreshold: 800 # Requests are rejected once the queue reaches this size, even if maxQueueSize has not been reached. Default: 5
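How the three properties combine can be approximated with a small calculation. The sketch below is a simplified model, not Hystrix code: the class and method names are hypothetical, and it assumes every request stays in flight for the whole burst, so real measurements under concurrent load may differ somewhat.

```java
// Simplified model (an assumption, not Hystrix source code): estimates how many
// requests of a single burst a thread pool can hold before it starts rejecting.
class HystrixCapacityEstimate {

    /**
     * @param coreSize           value of coreSize
     * @param maxQueueSize       value of maxQueueSize (-1 means no queue)
     * @param rejectionThreshold value of queueSizeRejectionThreshold
     * @param burst              number of requests arriving at once
     */
    static int estimateAccepted(int coreSize, int maxQueueSize, int rejectionThreshold, int burst) {
        // Without a queue the threshold is irrelevant; otherwise the smaller
        // of the two values limits how many requests can wait.
        int queueSlots = maxQueueSize <= 0 ? 0 : Math.min(maxQueueSize, rejectionThreshold);
        return Math.min(burst, coreSize + queueSlots);
    }

    public static void main(String[] args) {
        System.out.println(estimateAccepted(200, 1000, 800, 135)); // configuration above: 135
        System.out.println(estimateAccepted(10, 1000, 5, 135));    // threshold left at default: 15
    }
}
```

The point of the model is that raising maxQueueSize alone changes nothing while the threshold stays at 5, because the smaller of the two values wins.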
Next, write a test class to verify several incorrect configurations and see what happens.
Test class code (caller A):
/**
 * @Author: XiongFeng
 * @Description:
 * @Date: Created in 11:12 2018/6/11
 */
public class RepaymentHelperTest extends FundApplicationTests {

    @Autowired
    RepaymentHelper repaymentHelper;

    @Autowired
    private RouterFeign routerFeign;

    @Test
    public void hystrixTest() throws InterruptedException {
        // Start 135 threads that each send one request
        for (int i = 0; i < 135; i++) {
            new Thread(new Runnable() {
                @Override
                public void run() {
                    job();
                }
            }).start();
        }
        // A thread joining itself never returns, so this keeps the test
        // alive while the worker threads run
        Thread.currentThread().join();
    }

    public void job() {
        String repaymentNo = "xf1002";
        String transNo = "T4324324234";
        String reqNo = "xf1002";
        String begintime = "20180831130030";
        String endtime = "20180831130050";

        TransRecQueryReqDto transRecQueryReqDto = new TransRecQueryReqDto();
        transRecQueryReqDto.setTransNo(transNo);
        transRecQueryReqDto.setBeginTime(begintime);
        transRecQueryReqDto.setEndTime(endtime);
        transRecQueryReqDto.setReqNo(reqNo);

        // Call the B server through Feign (wrapped by Hystrix)
        Resp<List<TransRecDto>> queryTransRecListResp = routerFeign.queryTransRec(new Req<>(repaymentNo, "2018080200000002", null, null, transRecQueryReqDto));
        System.out.println(String.format("Obtain result: [%s]", JsonUtil.toJson(queryTransRecListResp)));
    }
}
- The purpose of this test class is to create 135 threads that make concurrent requests to the B server through the RouterFeign class, to see whether any of the requests fail.
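A load test like this can also be driven with an executor and latches, so that all calls start together and the test returns once they have finished instead of blocking forever. This is only a sketch using the JDK: ConcurrentCallDriver and runConcurrently are hypothetical names, and the empty job in main stands in for the real Feign call.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ConcurrentCallDriver {

    /** Runs job on the given number of threads at once; returns how many runs did not throw. */
    static int runConcurrently(int threads, Runnable job) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch startGate = new CountDownLatch(1);  // releases all workers together
        CountDownLatch done = new CountDownLatch(threads); // counts finished workers
        AtomicInteger succeeded = new AtomicInteger();

        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                try {
                    startGate.await();
                    job.run();
                    succeeded.incrementAndGet();
                } catch (Exception e) {
                    // A call that throws is simply not counted as successful.
                } finally {
                    done.countDown();
                }
            });
        }

        startGate.countDown();
        try {
            done.await(30, TimeUnit.SECONDS); // bounded wait instead of joining forever
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        pool.shutdownNow();
        return succeeded.get();
    }

    public static void main(String[] args) {
        // Placeholder job; a real test would call the Feign client here.
        System.out.println("Successful calls: " + runConcurrently(135, () -> { }));
    }
}
```

Only calls that throw are counted as failures here. If the Feign client returns a fallback response instead of throwing, the job would need to inspect the response to tell the two apart.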
Feign call code:
@FeignClient(value = "${core.name}", fallbackFactory = RouterFeignBackFactory.class, path = "/router")
public interface RouterFeign {

    /**
     * @param transRecQueryReqDtoReq
     * @return
     */
    @PostMapping("/queryTransRec")
    Resp<List<TransRecDto>> queryTransRec(@RequestBody Req<TransRecQueryReqDto> transRecQueryReqDtoReq);
}
- This class is the client that calls the B server through Feign
Service provider code (service provider B):
/**
 * @Author: XiongFeng
 * @Description:
 * @Date: Created in 16:04 2018/5/24
 */
@Api("Repayment Service")
@RefreshScope
@RestController
@RequestMapping("/router")
public class TestController {

    private static Logger logger = LoggerFactory.getLogger(TestController.class);

    // Counts the requests that reach this service
    private static AtomicInteger count = new AtomicInteger(1);

    @ApiOperation(value = "Withholding Result Query")
    @PostMapping("/queryTransRec")
    Resp<List<TransRecDto>> queryTransRec(@RequestBody Req<TransRecQueryReqDto> transRecQueryReqDtoReq) throws InterruptedException {
        System.out.println(String.format("Check payment results...... Count: %s", count.getAndAdd(1)));
        // Simulate 500 ms of processing time
        Thread.sleep(500);
        return Resp.success(RespStatus.SUCCESS.getDesc(), null);
    }
}
- The purpose of this class, as a service provider, is to count and return results.
Let’s take a look at several configuration errors.
Case 1 (lower the number of core threads, increase the maximum queue size, but set the queue rejection threshold to a smaller value):
hystrix:
  threadpool:
    default:
      coreSize: 10
      maxQueueSize: 1000
      queueSizeRejectionThreshold: 20
The result at this point:
- The left window is the B server and the right window is the A caller. Of the 135 calls, about 32 succeeded and the remaining threads threw exceptions. That is roughly the 10 core threads plus the 20 queued requests that the threshold allows.
Case 2 (lower the number of core threads, lower the maximum queue size, but set the queue rejection threshold higher):
hystrix:
  threadpool:
    default:
      coreSize: 10
      maxQueueSize: 15
      queueSizeRejectionThreshold: 2000
The result at this point:
java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@7d6d472b rejected from java.util.concurrent.ThreadPoolExecutor@17f8bcb7[Running, pool size = 3, active threads = 3, queued tasks = 15, completed tasks = 0]
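- The left window is the B server and the right window is the A caller. Of the 135 calls, about 25 succeeded and the remaining threads threw exceptions. That matches the 10 core threads plus the 15 slots in the queue: here the queue itself fills up before the threshold is reached.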
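The rejection in case 2 comes from the underlying java.util.concurrent.ThreadPoolExecutor, so it can be reproduced without Hystrix. The sketch below assumes a pool whose maximum size equals its core size and a bounded LinkedBlockingQueue. BoundedQueueRejection and countAccepted are hypothetical names, and a latch keeps every task in flight so the count is deterministic.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class BoundedQueueRejection {

    /** Submits blocking tasks one after another and returns how many the pool accepted. */
    static int countAccepted(int coreSize, int queueCapacity, int requests) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                coreSize, coreSize, 1, TimeUnit.MINUTES,
                new LinkedBlockingQueue<>(queueCapacity));
        CountDownLatch release = new CountDownLatch(1); // keeps accepted tasks in flight
        int accepted = 0;

        for (int i = 0; i < requests; i++) {
            try {
                pool.execute(() -> {
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
                accepted++;
            } catch (RejectedExecutionException e) {
                // All threads are busy and the queue is full: the default AbortPolicy throws.
            }
        }

        release.countDown(); // let the accepted tasks finish
        pool.shutdown();
        return accepted;
    }

    public static void main(String[] args) {
        System.out.println(countAccepted(10, 15, 135)); // prints 25
    }
}
```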
Case 3 (lower the number of core threads and increase the maximum queue size, but leave the queue rejection threshold unchanged):
hystrix:
  threadpool:
    default:
      coreSize: 10
      maxQueueSize: 1500
The result at this point:
java.util.concurrent.RejectedExecutionException: Rejected command because thread-pool queueSize is at rejection threshold.
- The left window is the B server and the right window is the A caller. The outcome is similar to case 1: of the 135 calls, about 47 succeeded and the remaining threads threw exceptions, with the same error as in case 1. The queue can hold 1500 requests, but the threshold is still at its default of 5.
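The behaviour in case 3 can be pictured as a size check performed before each task is handed to the pool. The following sketch illustrates that idea with JDK classes only. ThresholdGuardedPool is a hypothetical stand-in, not Hystrix code, and it submits tasks sequentially. The count measured above under concurrent load is higher, which may be because the check and the enqueue are not a single atomic step.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Illustrative stand-in for a thread pool guarded by a rejection threshold.
class ThresholdGuardedPool {

    private final ThreadPoolExecutor pool;
    private final int rejectionThreshold;

    ThresholdGuardedPool(int coreSize, int maxQueueSize, int rejectionThreshold) {
        this.pool = new ThreadPoolExecutor(coreSize, coreSize, 1, TimeUnit.MINUTES,
                new LinkedBlockingQueue<>(maxQueueSize));
        this.rejectionThreshold = rejectionThreshold;
    }

    /** Rejects once the queue already holds rejectionThreshold tasks, whatever its capacity. */
    void submit(Runnable task) {
        if (pool.getQueue().size() >= rejectionThreshold) {
            throw new RejectedExecutionException("queue is at the rejection threshold");
        }
        pool.execute(task);
    }

    /** Submits blocking tasks one after another and returns how many were accepted. */
    static int countAccepted(int coreSize, int maxQueueSize, int threshold, int requests) {
        ThresholdGuardedPool guarded = new ThresholdGuardedPool(coreSize, maxQueueSize, threshold);
        CountDownLatch release = new CountDownLatch(1); // keeps accepted tasks in flight
        int accepted = 0;
        for (int i = 0; i < requests; i++) {
            try {
                guarded.submit(() -> {
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
                accepted++;
            } catch (RejectedExecutionException e) {
                // Rejected by the threshold check, long before the queue is full.
            }
        }
        release.countDown();
        guarded.pool.shutdown();
        return accepted;
    }

    public static void main(String[] args) {
        // coreSize 10, maxQueueSize 1500, threshold left at 5
        System.out.println(countAccepted(10, 1500, 5, 135)); // prints 15
    }
}
```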
Case 4 (lower the number of core threads, leave the maximum queue size unset, but set the queue rejection threshold to a large value):
hystrix:
  threadpool:
    default:
      coreSize: 10
      queueSizeRejectionThreshold: 1000
The result at this point:
java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@23d268ea rejected from java.util.concurrent.ThreadPoolExecutor@66d0e2f4[Running, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0]
at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2063)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379)
at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:112)
- The left window is the B server and the right window is the A caller. The outcome is similar to case 2: of the 135 calls, about 10 succeeded and the remaining threads threw exceptions, with the same error as in case 2. With maxQueueSize left at its default of -1, Hystrix uses a SynchronousQueue, so nothing is queued and the threshold has no effect.
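The effect of having no queue at all can also be reproduced with the JDK alone. This is a minimal sketch, assuming a pool whose maximum size equals its core size and a SynchronousQueue as the work queue. NoQueueRejection and countAccepted are hypothetical names.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class NoQueueRejection {

    /** Submits blocking tasks to a pool without queue capacity; returns how many were accepted. */
    static int countAccepted(int coreSize, int requests) {
        // A SynchronousQueue holds no elements: a task is accepted only if a
        // thread can take it immediately.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                coreSize, coreSize, 1, TimeUnit.MINUTES, new SynchronousQueue<>());
        CountDownLatch release = new CountDownLatch(1); // keeps accepted tasks in flight
        int accepted = 0;

        for (int i = 0; i < requests; i++) {
            try {
                pool.execute(() -> {
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
                accepted++;
            } catch (RejectedExecutionException e) {
                // Every thread is busy and there is nowhere to wait.
            }
        }

        release.countDown();
        pool.shutdown();
        return accepted;
    }

    public static void main(String[] args) {
        System.out.println(countAccepted(10, 135)); // prints 10
    }
}
```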
Now let’s take a look at a correct configuration.
Case 1 (lower the number of core threads and set both the maximum queue size and the queue rejection threshold to larger values):
hystrix:
  threadpool:
    default:
      coreSize: 10
      maxQueueSize: 1500
      queueSizeRejectionThreshold: 1000
The result at this point:
- The left window is the B server and the right window is the A caller. At this point, the result is completely normal, with 135 concurrent requests, all successful!
Conclusion: the default queue rejection threshold is only 5. If you want to enlarge the queue, you must set both maxQueueSize and queueSizeRejectionThreshold; otherwise requests will still be rejected.
Reference Documents:
Spring Hystrix official documentation
The original address: www.seifon.cn/2018/12/08/…