I. Background
**1.** As our business systems kept iterating, JDK, Spring Boot, the RocketMQ client and other frameworks were upgraded as well. When a newer RocketMQ client sends messages to the older cluster, the message details cannot be viewed in the console, which makes day-to-day troubleshooting difficult.
**2.** It is hard to keep message sending on the business side consistent with the local transaction, and guarding against data loss and inconsistency ourselves is expensive to develop. RocketMQ 4.x adds transactional messages (available since 4.3.0).
**3.** We now depend on MQ so heavily that it has to be as important and as stable as the database. V4.x adds new features and monitoring tools that let us monitor MQ usage more effectively.
**4.** Starting with V4.x, RocketMQ was donated by Alibaba to the Apache community, which now maintains it. This brings wider adoption, more contributors, higher reliability, and faster responses to issues.
**5.** The new version has better throughput and supports newer technology. Based on these factors, we decided to upgrade our MQ.
**Upgrade: V3_2_6 -> V4.6.0**
II. Process
Because of the nature of our services, the RocketMQ cluster has to be upgraded in place, node by node, without interrupting traffic.
Before starting, ask an architect to review the upgrade document and close any gaps before they cause an irreversible incident.
Reference material used for the upgrade:
Official documentation:
rocketmq.apache.org/docs/quick-…
rocketmq.apache.org/dowloading/…
DLedger Quick Setup Guide:
github.com/apache/rock…
Apache RocketMQ Developer Guide:
github.com/apache/rock…
Two architecture diagrams to read before upgrading:
1. Message storage
2. Message disk flushing
III. Preparation
1. Current version in each environment
DEV:
http://10.0.254.191:7080/ V3_5_8 2m
TEST:
http://10.185.240.76:8081/ V3_5_8 2m
PRO:
rocketmq.pro.siku.cn/ admin/secoo V3_2_6 2m
2. JRE versions required by each component version
Version | Client | Broker | NameServer |
---|---|---|---|
4.0.0-incubating | >=1.7 | >=1.8 | >=1.8 |
4.1.0-incubating | >=1.6 | >=1.8 | >=1.8 |
4.2.0 | >=1.6 | >=1.8 | >=1.8 |
4.3.x | >=1.6 | >=1.8 | >=1.8 |
4.4.x | >=1.6 | >=1.8 | >=1.8 |
4.5.x | >=1.6 | >=1.8 | >=1.8 |
4.6.x | >=1.6 | >=1.8 | >=1.8 |
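Since every 4.x broker and NameServer in the table needs at least Java 1.8, it can help to verify the JRE on each machine before swapping binaries. The following is a minimal sketch, not part of the original procedure; the function names `parse_java_version` and `java_at_least_8` are made up for illustration.

```shell
# Extract the version string from `java -version` output (read from stdin).
parse_java_version() {
    sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}

# Return 0 if the given version string is 1.8 or newer, 1 otherwise.
# Handles both the legacy "1.8.0_231" scheme and the newer "11.0.2" scheme.
java_at_least_8() {
    version=$1
    major=${version%%.*}
    if [ "$major" = "1" ]; then
        # Legacy scheme: the real major version is the second component.
        rest=${version#1.}
        major=$rest
    fi
    # Keep only the leading digits (drops ".0_231", "-ea", and similar).
    major=${major%%[!0-9]*}
    [ -n "$major" ] && [ "$major" -ge 8 ]
}
```

On a broker host this could be used as `java_at_least_8 "$(java -version 2>&1 | parse_java_version)" || echo "JRE too old for a 4.x broker"`.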
4. Commands used during the upgrade
#Start the NameServer and the broker
nohup sh bin/mqnamesrv &
nohup sh bin/mqbroker -c conf/2m-noslave/broker-b.properties &
#Disable write permission on the broker (4 = read-only)
bin/mqadmin updateBrokerConfig -b 192.168.x.x:10911 -n 192.168.x.x:9876 -k brokerPermission -v 4
#Restore write permission on the broker (6 = read and write)
bin/mqadmin updateBrokerConfig -b 192.168.x.x:10911 -n 192.168.x.x:9876 -k brokerPermission -v 6
#Stop the broker and the NameServer
bin/mqshutdown broker
bin/mqshutdown namesrv
#View cluster information: cluster, BrokerName, BrokerId, TPS, etc.
./bin/mqadmin clusterList -n localhost:9876
#List all topics
./bin/mqadmin topicList -n localhost:9876 -c DevCluster > topiclist
#Get a topic's routing information
./bin/mqadmin topicRoute -t demo-cluster -n localhost:9876
#Get a topic's offsets
./bin/mqadmin topicStatus -t demo-cluster -n localhost:9876
#Print topic subscriptions, TPS, backlog, 24-hour read/write totals, etc.
./bin/mqadmin statsAll -n localhost:9876
#Modify a broker parameter
./bin/mqadmin updateBrokerConfig -n localhost:9876 -b 10.0.xxx.2:10911 -k waitTimeMillsInSendQueue -v 500 -c TestCluster
#Send a message
./bin/mqadmin sendMessage -n localhost:9876 -t lqtest -p "this is test"
#Consume messages
./bin/mqadmin consumeMessage -n localhost:9876 -t lqtest
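The `brokerPermission` values 4 and 6 used above are bit flags. As a small illustration, assuming the usual RocketMQ encoding of 4 for read and 2 for write, a helper like the hypothetical `describe_broker_permission` below spells out what a value means before you apply it:

```shell
# Decode a brokerPermission value into readable flags.
# Assumption: bit 4 = read, bit 2 = write, so 6 = read + write and 4 = read-only.
describe_broker_permission() {
    perm=$1
    flags=""
    if [ $((perm & 4)) -ne 0 ]; then
        flags="read"
    fi
    if [ $((perm & 2)) -ne 0 ]; then
        flags="${flags:+$flags+}write"
    fi
    echo "${flags:-none}"
}
```

For example, `describe_broker_permission 4` prints `read`, which is the state a broker should be in while it is being drained before shutdown.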
5. Notable features by official release
Only the most important features are listed here; for the rest, see the release notes at rocketmq.apache.org/release_not…
4.0.0 (incubating): the project enters Apache
4.4.0: message trace and ACL support
4.5.0: DLedger multi-replica technology introduced
6. Choosing a cluster mode for the new 4.6.0 cluster
1. Single Master mode
This is risky because if the Broker restarts or goes down, the entire service may become unavailable. This is not recommended for online environments, but can be used for local testing.
2. Multi-master mode
The cluster has no slaves, only masters, for example two or three of them. The advantages and disadvantages of this mode are as follows:
**Advantages:** Simple configuration; taking a single master down for maintenance, or losing it, has no impact on applications. With RAID10 disks, which are very reliable, messages are not lost even if the machine fails and cannot be recovered (asynchronous flushing loses a small number of messages, synchronous flushing loses none). This mode has the highest performance.
**Disadvantages:** While a machine is down, unconsumed messages on it cannot be consumed until the machine recovers, which hurts message timeliness.
3. Multi-master multi-slave mode – Asynchronous replication
Each master has one slave, giving multiple master-slave pairs. HA uses asynchronous replication, so the slave lags the master by a short interval (milliseconds). The advantages and disadvantages of this mode are as follows:
**Advantages:** Even if a disk is damaged, very few messages are lost and message timeliness is not affected. When a master goes down, consumers can still consume from the slave; this is transparent to applications and needs no manual intervention. Performance is almost the same as in multi-master mode.
**Disadvantages:** If a master goes down and its disk is damaged, a small number of messages are lost.
4. Multi-master multi-slave mode – Synchronous double-write (the mode we selected, based on our current services and concurrency)
Each master has one slave, giving multiple master-slave pairs. HA uses synchronous double-write: success is returned to the application only after both the master and the slave have written the message.
**Advantages:** Neither data nor service has a single point of failure. When a master goes down, messages are not delayed, and both service availability and data availability are very high.
**Disadvantages:** Performance is slightly lower than with asynchronous replication (about 10%), the RT of sending a single message is slightly higher, and in the current version a slave cannot be promoted to master automatically when the master goes down.
Choose the mode that fits your own cluster's characteristics and stability needs; the newest cluster mode is not necessarily the best fit for you. Make a smooth upgrade the priority.
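In the broker configuration, the difference between these modes comes down mostly to `brokerRole`. The sketch below prints the master-side settings for each mode. The mode labels passed as arguments are invented for this example; the property values are those used by the sample configs shipped under `conf/` (`2m-noslave`, `2m-2s-async`, `2m-2s-sync`).

```shell
# Print the brokerRole / flushDiskType lines for a master in the chosen mode.
# The mode labels are hypothetical names for this sketch only.
master_settings_for_mode() {
    case $1 in
        multi-master|async-replication)
            printf 'brokerRole=ASYNC_MASTER\nflushDiskType=ASYNC_FLUSH\n' ;;
        sync-double-write)
            printf 'brokerRole=SYNC_MASTER\nflushDiskType=ASYNC_FLUSH\n' ;;
        *)
            echo "unknown mode: $1" >&2
            return 1 ;;
    esac
}
```

Slaves in either multi-slave mode use `brokerRole=SLAVE` with a non-zero `brokerId`, so only the master side needs this distinction.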
7. Sorting out topics
You can write a script that exports the existing topic list before the upgrade, then check the topic list and queues against it once the upgrade is complete.
In RocketMQ, a topic groups messages that belong to the same business logic. It is only a logical concept: a topic contains several logical queues (message queues), message content is actually stored in those queues, and the queues live on brokers.
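One way to do that comparison is to save the topic list to a file before and after the upgrade and diff the two. This is a minimal sketch assuming one topic name per line in each file; `missing_topics` is a hypothetical helper name.

```shell
# Print topics present in the pre-upgrade list but absent from the post-upgrade list.
# Both files are assumed to contain one topic name per line.
missing_topics() {
    before=$1
    after=$2
    tmp_before=$(mktemp) || return 1
    tmp_after=$(mktemp) || { rm -f "$tmp_before"; return 1; }
    # comm needs both inputs sorted with the same collation.
    LC_ALL=C sort -u "$before" > "$tmp_before"
    LC_ALL=C sort -u "$after" > "$tmp_after"
    # -23 keeps only the lines unique to the first file.
    LC_ALL=C comm -23 "$tmp_before" "$tmp_after"
    rm -f "$tmp_before" "$tmp_after"
}
```

Empty output from `missing_topics topiclist.before topiclist.after` means every topic survived the upgrade.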
Be sure to identify the topics used in special business scenarios.
2. Topics configured on a single broker
If a topic's queues exist on only one broker, its messages can be lost while that broker is stopped for replacement, so find these topics before you start.
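To find such topics, you can count how many distinct brokers appear in each topic's route. The sketch below reads a route from standard input and assumes it is JSON containing `"brokerName":"<name>"` fields, which is how `mqadmin topicRoute` is expected to print it; `count_route_brokers` is a made-up name.

```shell
# Count the distinct broker names that appear in a topic route read from stdin.
# Assumption: the route is JSON with "brokerName":"<name>" fields.
count_route_brokers() {
    grep -o '"brokerName"[[:space:]]*:[[:space:]]*"[^"]*"' \
        | sed 's/.*:[[:space:]]*"\([^"]*\)"$/\1/' \
        | sort -u \
        | wc -l \
        | tr -d ' '
}
```

For example, `./bin/mqadmin topicRoute -t lqtest -n localhost:9876 | count_route_brokers` printing `1` would flag `lqtest` as living on a single broker.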
8. Configure the new cluster
brokerClusterName=MQCluster
brokerName=broker-ali-76
brokerId=0
deleteWhen=04
fileReservedTime=360
brokerRole=ASYNC_MASTER
flushDiskType=ASYNC_FLUSH
storePathCommitLog=/data/rocketmq/store/commitlog
storePathConsumerQueue=/data/rocketmq/store/consumequeue
storePathRootDir=/data/rocketmq/store
autoCreateSubscriptionGroup=true
#If message tracing is enabled, this flag must be true
traceTopicEnable=true
listenPort=10911
namesrvAddr=10.48.xx.76:9876;10.48.xx.77:9876
IV. Upgrade steps
Once the rehearsal is done, we can begin. The final architecture mode I chose: multi-master multi-slave, synchronous double-write.
Process overview:
1. Modify the 2m-2s-sync, runbroker, and runserver configuration parameters.
2. Stop the 3.2.6 NameServer and start the 4.6.0 NameServer on the original IP and port; replace the nodes one at a time until all are done.
3. Stop the 3.2.6 broker and start the 4.6.0 broker (check for the single-broker topic problem first); replace the nodes one at a time.
4. Test the stability of the cluster, then add slaves to the new cluster. The upgrade is complete.
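The per-broker part of this overview can be written down as a checklist of commands. The sketch below only prints the sequence for one broker and does not execute anything. It combines the permission and start/stop commands listed earlier; the ordering (drain, stop, start, restore) is an assumption about how they fit together rather than a procedure spelled out above, and `print_broker_replacement_plan` is a hypothetical name.

```shell
# Print (not run) the command sequence for replacing one broker in place.
# Arguments: broker address, NameServer address, broker properties file.
print_broker_replacement_plan() {
    broker_addr=$1
    namesrv_addr=$2
    broker_conf=$3
    cat <<EOF
bin/mqadmin updateBrokerConfig -b $broker_addr -n $namesrv_addr -k brokerPermission -v 4
bin/mqshutdown broker
nohup sh bin/mqbroker -c $broker_conf &
bin/mqadmin updateBrokerConfig -b $broker_addr -n $namesrv_addr -k brokerPermission -v 6
bin/mqadmin clusterList -n $namesrv_addr
EOF
}
```

The first two commands would run from the old 3.2.6 directory and the rest from the new 4.6.0 directory. Wait for consumers to drain the broker between the first and second command, and review the printed plan before running any of it by hand.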
Detailed steps:
Preparation
1. Download the 4.6.0 deployment package
cd /data/xxx_tomcat
wget http://mirrors.tuna.tsinghua.edu.cn/apache/rocketmq/4.6.0/rocketmq-all-4.6.0-bin-release.zip
unzip rocketmq-all-4.6.0-bin-release.zip
2. Modify the configuration
cd /data/xxx_tomcat/rocketmq-4.6.0/conf/2m-2s-sync
#Edit the broker configuration files under 2m-2s-sync for machines 50 and 51
#Edit the JVM settings in runbroker so the broker does not start with the defaults and run out of memory
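For the runbroker change, the heap flags can be rewritten with `sed` rather than edited by hand on every machine. This is a sketch under the assumption that the file has a `JAVA_OPT` line carrying `-Xms`, `-Xmx` and `-Xmn` flags, as the stock `bin/runbroker.sh` does; `set_broker_heap` and the sizes in the usage line are placeholders, not recommended values.

```shell
# Rewrite the -Xms/-Xmx/-Xmn values in a runbroker.sh-style file.
# Arguments: file, heap size (used for both -Xms and -Xmx), young-generation size.
set_broker_heap() {
    file=$1
    heap=$2
    young=$3
    sed -e "s/-Xms[0-9]*[kmgKMG]/-Xms$heap/" \
        -e "s/-Xmx[0-9]*[kmgKMG]/-Xmx$heap/" \
        -e "s/-Xmn[0-9]*[kmgKMG]/-Xmn$young/" \
        "$file" > "$file.tmp" || return 1
    # Write back through the original file so its permissions are preserved.
    cat "$file.tmp" > "$file" && rm -f "$file.tmp"
}
```

Usage would look like `set_broker_heap bin/runbroker.sh 4g 2g`, with sizes chosen to fit the memory actually available on the host.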
3. The configuration of the two masters is as follows:
#brokerName differs on each master
brokerClusterName=MQCluster
brokerName=broker-60-50
brokerId=0
deleteWhen=04
fileReservedTime=48
brokerRole=ASYNC_MASTER
flushDiskType=ASYNC_FLUSH
storePathCommitLog=/data/alibaba-rocketmq/store/commitlog
storePathConsumerQueue=/data/rocketmq/store/consumequeue
storePathRootDir=/data/alibaba-rocketmq/store
autoCreateSubscriptionGroup=true
#If message tracing is enabled, this flag must be true
traceTopicEnable=true
listenPort=10911
namesrvAddr=192.168.xxx.xx:9876;192.168.xxx.xx:9876
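Because the cluster modes differ mainly in `brokerRole`, it may be worth checking that each properties file carries the role you intend before starting the broker with it. The helpers below are a hypothetical sketch, not part of RocketMQ.

```shell
# Print the value of a key from a .properties file (the last assignment wins).
# Tolerates spaces around "=" and ignores commented-out lines.
broker_prop() {
    file=$1
    key=$2
    sed -n "s/^[[:space:]]*$key[[:space:]]*=[[:space:]]*//p" "$file" \
        | sed 's/[[:space:]]*$//' \
        | tail -n 1
}

# Return non-zero (with a message on stderr) if brokerRole is not the expected one.
check_broker_role() {
    file=$1
    expected=$2
    actual=$(broker_prop "$file" brokerRole)
    if [ "$actual" != "$expected" ]; then
        echo "brokerRole is '$actual', expected '$expected' in $file" >&2
        return 1
    fi
}
```

For a synchronous double-write master, `check_broker_role conf/2m-2s-sync/broker-b.properties SYNC_MASTER` should succeed; a file that still says `ASYNC_MASTER` would be reported.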
4. Replace the NameServer
jps -l
sh bin/mqshutdown namesrv
cd /data/xxx_tomcat/rocketmq-4.6.0
nohup sh bin/mqnamesrv &
./bin/mqadmin clusterList -n localhost:9876
5. Replace the broker
jps -l
sh bin/mqshutdown broker
cd /data/xxx_tomcat/rocketmq-4.6.0
#Check that the old process has exited
ps -ef | grep mq
#Start the broker with the configuration file
nohup sh bin/mqbroker -c conf/2m-2s-sync/broker-b.properties &
./bin/mqadmin clusterList -n localhost:9876
Finally, test the cluster by sending and consuming a message:
./bin/mqadmin sendMessage -n localhost:9876 -t lqtest -p "this is test"
./bin/mqadmin consumeMessage -n localhost:9876 -t lqtest
Congratulations! If you have made it this far, your cluster upgrade is complete.