1. The principle of MQ reliability
No more data, no less data. "No more" means no duplicate messages, which we solved in the previous article; "no less" means no data loss. If MQ carries the core messages that support the core business, that scenario absolutely must not lose data.
2. Data loss scenarios
Lost data generally falls into two categories: MQ itself loses the message, or the consumer loses the message.

RabbitMQ

A: The producer loses data. While sending data to RabbitMQ, the producer may lose it in transit due to network problems.

B: RabbitMQ loses data. If RabbitMQ persistence is not enabled, data is lost once RabbitMQ restarts. You must enable persistence so that messages are persisted to disk; then even if RabbitMQ goes down, the stored data is read back automatically after recovery, and data is generally not lost. The exception is the rare case where RabbitMQ dies before the message has been persisted, which can lose a small amount of data.

C: The consumer loses data. The main cause is the consumer crashing after receiving the message but before processing it: with automatic ack, RabbitMQ assumes the message has already been consumed, and that data is lost.
Kafka

A: The producer loses data.

B: Kafka itself loses data.

C: The consumer loses data.
3. How to prevent message loss
RabbitMQ

A: The producer loses the message.

①: RabbitMQ optionally provides transaction functionality: the producer opens a transaction before sending data and then sends the message. If RabbitMQ does not receive the message successfully, the producer gets an exception, can roll the transaction back, and resend. If the message is received, the transaction can be committed.
```java
channel.txSelect();        // open the transaction
try {
    // send the message
    channel.txCommit();    // commit the transaction
} catch (Exception e) {
    channel.txRollback();  // roll back the transaction
    // resend the message
}
```
Disadvantage: RabbitMQ transactions are synchronous and blocking. The producer blocks waiting to learn whether the send succeeded, which degrades throughput.
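As a minimal self-contained sketch of this transactional publish flow with the RabbitMQ Java client (the host, queue name, and payload are illustrative assumptions):

```java
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

public class TxProducer {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");                        // assumed broker address
        try (Connection conn = factory.newConnection();
             Channel channel = conn.createChannel()) {
            channel.queueDeclare("demo.queue", true, false, false, null); // hypothetical queue
            channel.txSelect();                              // open the transaction
            try {
                channel.basicPublish("", "demo.queue", null, "hello".getBytes());
                channel.txCommit();                          // commit: the message is in
            } catch (Exception e) {
                channel.txRollback();                        // roll back, then resend
            }
        }
    }
}
```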
②: Enable confirm mode. After the producer enables confirm mode, each message written to RabbitMQ is assigned a unique ID. If RabbitMQ handles the message, it sends you an ack telling you the message was sent OK. If RabbitMQ fails to handle the message, it calls back a nack interface telling you the send failed, and you can retry. You can also use this mechanism by keeping each message's ID in memory: if no callback arrives for a message after a certain amount of time, you resend it.
```java
channel.confirmSelect();   // enable confirm mode
channel.addConfirmListener(new ConfirmListener() {
    @Override
    public void handleAck(long deliveryTag, boolean multiple) {
        // RabbitMQ received the message
    }
    @Override
    public void handleNack(long deliveryTag, boolean multiple) {
        // RabbitMQ failed to handle the message: resend it
    }
});
```
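A sketch of the keep-IDs-in-memory-and-resend bookkeeping described above, assuming a map keyed by the channel's publish sequence number (the queue name and resend policy are hypothetical):

```java
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.ConfirmListener;
import java.io.IOException;
import java.util.concurrent.ConcurrentSkipListMap;

public class ConfirmProducer {
    // deliveryTag -> message body, for every message not yet confirmed
    private final ConcurrentSkipListMap<Long, byte[]> outstanding = new ConcurrentSkipListMap<>();

    public void publishWithConfirms(Channel channel, byte[] body) throws IOException {
        channel.confirmSelect();                              // enable confirm mode
        channel.addConfirmListener(new ConfirmListener() {
            @Override
            public void handleAck(long tag, boolean multiple) {
                if (multiple) outstanding.headMap(tag, true).clear(); // all tags <= tag confirmed
                else outstanding.remove(tag);
            }
            @Override
            public void handleNack(long tag, boolean multiple) {
                // failed: look the message(s) up in `outstanding` and resend
            }
        });
        outstanding.put(channel.getNextPublishSeqNo(), body); // record before publishing
        channel.basicPublish("", "demo.queue", null, body);   // hypothetical queue
    }
}
```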
The difference between the two: transactions are synchronous (you commit a transaction and it blocks there), while confirm is asynchronous (after sending one message you can send the next, and RabbitMQ calls back to tell you whether it succeeded). On the producer side, loss is generally avoided with the confirm mechanism.

B: RabbitMQ itself loses data. Set messages to be persisted to disk. Persistence takes two steps: ① when creating the queue, set it as durable; this ensures RabbitMQ persists the queue's metadata, but not the messages in it. ② When sending a message, set deliveryMode to 2; this marks the message as persistent, and RabbitMQ will persist it to disk. Both must be turned on at the same time.
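A minimal sketch of the two steps with the Java client, inside a method where `channel` is an open Channel (the queue name is an assumption):

```java
// ① durable = true: RabbitMQ persists the queue's metadata across restarts
channel.queueDeclare("demo.queue", true, false, false, null);

// ② PERSISTENT_TEXT_PLAIN carries deliveryMode = 2, so the message body
//    itself is also persisted to disk
channel.basicPublish("", "demo.queue",
        com.rabbitmq.client.MessageProperties.PERSISTENT_TEXT_PLAIN,
        "hello".getBytes());
```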
Persistence can be combined with the producer's confirm mechanism, so that the producer is notified with an ack only after the message has been persisted to disk. Then even if RabbitMQ dies before persisting, the message is lost on the broker, but the producer never receives the ack and simply resends it.

C: The consumer loses data. Use RabbitMQ's ack mechanism: first turn off RabbitMQ's automatic ack, then manually ack in code each time your processing is definitely finished. This avoids messages being acked before they have actually been processed.
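A sketch of manual ack with the Java client, with autoAck set to false, inside a method where `channel` is an open Channel (queue name assumed):

```java
import com.rabbitmq.client.AMQP;
import com.rabbitmq.client.DefaultConsumer;
import com.rabbitmq.client.Envelope;
import java.io.IOException;

// autoAck = false: RabbitMQ keeps the message until we explicitly ack it
channel.basicConsume("demo.queue", false, new DefaultConsumer(channel) {
    @Override
    public void handleDelivery(String consumerTag, Envelope envelope,
                               AMQP.BasicProperties properties, byte[] body)
            throws IOException {
        // ... process the message ...
        getChannel().basicAck(envelope.getDeliveryTag(), false); // ack only after success
    }
});
```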
Kafka

A: The consumer loses data. Turn off automatic offset commits and commit the offset manually after your own processing is done; that way the data will not be lost.
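A sketch with the Kafka Java consumer, auto-commit off and commitSync after processing (broker address, group, and topic are assumptions):

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ManualCommitConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // assumed address
        props.put("group.id", "demo-group");                // hypothetical group
        props.put("enable.auto.commit", "false");           // turn off auto offset commit
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("demo-topic"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // ... process the record ...
                }
                consumer.commitSync();   // commit only after processing succeeded
            }
        }
    }
}
```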
B: Kafka itself loses data. Set the following four parameters (config sketches follow the list):

① Set the topic-level replication.factor parameter: it must be greater than 1, meaning each partition must have at least 2 replicas.
② Set the min.insync.replicas parameter on the Kafka server: it must be greater than 1, requiring a leader to perceive at least one follower still in sync with it.
③ Set acks=all on the producer side: every piece of data must be written to all replicas before the write is considered successful.
④ Set retries=MAX on the producer side (i.e., a very large value): if a write fails, retry without limit.
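For ① and ②, a sketch that creates such a topic with the Kafka AdminClient (the topic name, partition count, and the concrete values 3 and 2 are illustrative choices):

```java
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateReplicatedTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");           // assumed address

        try (AdminClient admin = AdminClient.create(props)) {
            NewTopic topic = new NewTopic("demo-topic", 3, (short) 3)   // replication.factor = 3
                    .configs(Map.of("min.insync.replicas", "2"));       // at least 2 in-sync replicas
            admin.createTopics(List.of(topic)).all().get();
        }
    }
}
```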
C: The producer loses data. With acks=all set as above, data will not be lost: a write is considered successful only after the leader has received the message and all the followers have synchronized it. If that condition is not met, the producer automatically retries, an unlimited number of times.
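For ③ and ④, a producer config sketch (broker address, topic, and payload are assumptions):

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class NoLossProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");  // assumed address
        props.put("acks", "all");                          // wait for all in-sync replicas
        props.put("retries", Integer.MAX_VALUE);           // "retries=MAX": retry indefinitely
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("demo-topic", "key", "value"));
        }
    }
}
```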
Previous article: "How to Ensure That Messages Are Not Consumed Twice." Next article: "How to Ensure That Messages Are Executed Sequentially."