The cluster node process mysteriously disappears
The phenomenon of description
We received an alarm and o&M feedback that a RocketMQ node is missing. Log, storeerror. Log, store.log, store.log, and watermark.log in the system. Traffic in and out of the cluster does not change significantly in the normal water level, CPU usage, CPU Load, disk I/O, memory, and bandwidth.
Cause analysis,
To check the cause, you can view historical O&M operations through history. The Broker was started in the current session instead of in the background.
sh bin/mqbroker -c conf/broker-a.conf
Copy the code
The problem with this command is that the Broker node exits when the session expires.
The solution
Standardize operation and maintenance operations, review each operation of operation and maintenance, and it is better to automate the standardized operation.
Start the Broker correctly:
nohup sh bin/mqbroker -c conf/broker-a.conf &
Copy the code
The CPU of the Master node is abnormally high
The phenomenon of description
The CPU of the RocketMQ primary node frequently spikes and then falls back. Service sending times out are serious. As two secondary nodes are deployed on the same machine, the secondary node even hangs directly.
CPU burrs on the active node