Preface

  • Recently, the production logs have been printing a RocketMQ exception: defaultMQProducer Send Exception

  • The MQ messages in production were still sent successfully and consumed successfully, and no other business was affected. We were busy at the time, so I did not look into the cause.
  • Now that things are quieter, let's trace the problem from the printed exception stack.

The analysis process

  • First, the MQ messages in production are sent synchronously, with the default timeout of 3 seconds.
  • The exception information confirms that the exception is thrown while sending an MQ message, and that it is not a custom exception defined by the MQ client.

  • Following the stack trace and skipping some intermediate calls, the exception is ultimately thrown from the invokeSyncImpl() method.
  • That method calls Netty's writeAndFlush() to write the data to the socket and attaches a listener to the write. If the send fails, the listener wakes up the waiting thread.

  • Step into the waitResponse(timeoutMillis) method circled in the figure. It calls countDownLatch.await(timeoutMillis, TimeUnit.MILLISECONDS), a blocking wait with a timeout. Keep following the call.
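
The waiting pattern described above can be sketched with the JDK alone. This is a minimal illustration, not the RocketMQ source: the class name SketchResponseFuture and the putResponse() method are hypothetical stand-ins, and the only part taken from the text is that waitResponse(timeoutMillis) blocks on CountDownLatch.await() with a timeout.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the client's response future; not RocketMQ source code.
class SketchResponseFuture {
    private final CountDownLatch latch = new CountDownLatch(1);
    private volatile String response;

    // Blocks until a response arrives or the timeout elapses.
    // Returns null on timeout; throws InterruptedException if the waiting thread is interrupted.
    String waitResponse(long timeoutMillis) throws InterruptedException {
        latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        return response;
    }

    // Called by another thread (for example an I/O callback) to wake up the waiter.
    void putResponse(String response) {
        this.response = response;
        latch.countDown();
    }

    public static void main(String[] args) throws Exception {
        SketchResponseFuture future = new SketchResponseFuture();
        new Thread(() -> future.putResponse("SEND_OK")).start();
        System.out.println(future.waitResponse(3000));
    }
}
```

The point of the sketch is the method signature: the blocking wait declares InterruptedException, so an interrupt on the sending thread surfaces here rather than in any MQ-specific code.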

  • Inside await() there is a call to tryAcquireSharedNanos().

Definition: Attempts to acquire in shared mode, aborting if interrupted, and failing if the given timeout elapses. It first checks the interrupt status, then calls {@link #tryAcquireShared} at least once and returns on success. Otherwise, the thread is queued and may repeatedly block and unblock, calling {@link #tryAcquireShared} until it succeeds, the thread is interrupted, or the timeout elapses.

  • At the method entry there is a check on Thread.interrupted(), which returns true if the current thread has been interrupted and false otherwise. If it returns true, an InterruptedException is thrown.
  • According to the stack trace, this is where the exception is thrown, which means the current thread had been interrupted. That is why it is not a custom exception of the MQ client.
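
The entry check can be reproduced without any MQ client. The sketch below sets the interrupt flag on the current thread and then calls CountDownLatch.await() with a timeout. The class and method names are made up for illustration. Only JDK classes are used.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class InterruptedAwaitDemo {
    // Returns true if await() threw InterruptedException because the flag was already set.
    static boolean awaitWithInterruptFlagSet() {
        CountDownLatch latch = new CountDownLatch(1);
        Thread.currentThread().interrupt(); // set the interrupt flag before waiting
        try {
            latch.await(3000, TimeUnit.MILLISECONDS);
            return false; // only reached if no InterruptedException was thrown
        } catch (InterruptedException e) {
            // The Thread.interrupted() check cleared the flag before the exception was thrown.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("interrupted on entry: " + awaitWithInterruptFlagSet());
        System.out.println("flag still set: " + Thread.currentThread().isInterrupted());
    }
}
```

The call does not wait for the 3-second timeout: the exception is thrown as soon as the wait begins, which matches the stack trace pointing at the entry of tryAcquireSharedNanos().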

Conclusion

  • The thread sending the MQ message was interrupted, and the interrupt was detected while the client was waiting for the send response. The exception does not affect the rest of the business in production, because the message is still sent successfully and then consumed.

How was the thread that sends the MQ message interrupted?

  • In production, a thread pool runs this business task. To keep a task from running too long and blocking other services, a timeout is set, and a task that exceeds it is forcibly stopped. If that happens while the sending thread is waiting for the MQ response, the wait detects that the thread has been interrupted and throws the exception.
  • Because this is the core service of the product and the main source of revenue, this mechanism cannot be adjusted. Instead, this exception log is now filtered out.
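
A rough reproduction of this scenario is sketched below. The post does not say how the timeout is enforced, so the sketch assumes one common approach: Future.get() with a deadline followed by cancel(true). The latch that never completes stands in for the pending MQ response, and all names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

class TimeoutInterruptDemo {
    // Cancels a blocked task after a short deadline and reports whether
    // the worker thread observed an InterruptedException.
    static boolean workerInterruptedByTimeout() throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        CountDownLatch pendingResponse = new CountDownLatch(1); // never counted down
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch workerDone = new CountDownLatch(1);
        AtomicBoolean interrupted = new AtomicBoolean(false);

        Future<?> task = pool.submit(() -> {
            started.countDown();
            try {
                // Stands in for the synchronous send waiting up to 3 seconds for its response.
                pendingResponse.await(3000, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                interrupted.set(true);
            } finally {
                workerDone.countDown();
            }
        });

        started.await(); // make sure the task is running before the deadline starts
        try {
            task.get(100, TimeUnit.MILLISECONDS); // assumed business deadline
        } catch (TimeoutException e) {
            task.cancel(true); // interrupts the worker thread
        }
        workerDone.await(5, TimeUnit.SECONDS);
        pool.shutdownNow();
        return interrupted.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("worker interrupted: " + workerInterruptedByTimeout());
    }
}
```

Under these assumptions the worker is interrupted while it is still inside the timed wait, which is the same situation as the sending thread being stopped before the MQ response arrives.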

Closing

  • Learn with an open mind and make progress together