Chapter 2: RocketMQ Installation and Startup
I. Basic Concepts
1 Message
A message is the physical carrier of information transmitted by a message system, the smallest unit of data produced and consumed, and each message must belong to a topic.
2 Topic
A Topic is a collection of messages of one category; each Topic contains many messages, each message belongs to only one Topic, and the Topic is RocketMQ's basic unit of message subscription. (topic : message = 1 : n, message : topic = 1 : 1)
A producer can send messages to multiple Topics at the same time, while a consumer is only interested in one particular Topic, i.e. it subscribes to and consumes messages of only one Topic. (producer : topic = 1 : n, consumer : topic = 1 : 1)
3 Tag
A Tag is a label set on messages to distinguish different kinds of messages under the same Topic. Messages from the same business unit can carry different tags under one Topic according to their business purpose. Tags are an effective way to keep your code clear and consistent and to make good use of the query facilities RocketMQ provides. Consumers can implement different consumption logic for different tags, which gives better extensibility.
Topic is the primary classification of messages, and Tag is the secondary classification of messages.
For example, a goods Topic could carry messages tagged by region, and different consumers could subscribe with different tag filters:

```
Topic = goods
  Tag = shanghai
  Tag = jiangsu
  Tag = zhejiang

------- consumers -------
Consumer 1: Topic = goods, Tag = shanghai
Consumer 2: Topic = goods, Tag = shanghai || zhejiang
Consumer 3: Topic = goods, Tag = *
```
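To make this concrete, here is a minimal producer sketch in Java. It assumes a locally running NameServer at 127.0.0.1:9876 and uses illustrative names (goods topic, goods_producer_group); the matching consumer-side tag filter is shown in a later sketch under the Consumer section.

```java
import java.nio.charset.StandardCharsets;
import org.apache.rocketmq.client.producer.DefaultMQProducer;
import org.apache.rocketmq.client.producer.SendResult;
import org.apache.rocketmq.common.message.Message;

public class TaggedProducer {
    public static void main(String[] args) throws Exception {
        // Producer group name and NameServer address are assumptions for this sketch.
        DefaultMQProducer producer = new DefaultMQProducer("goods_producer_group");
        producer.setNamesrvAddr("127.0.0.1:9876");
        producer.start();

        // One Topic ("goods"), three Tags: the Tag is the secondary classification.
        for (String tag : new String[]{"shanghai", "jiangsu", "zhejiang"}) {
            Message msg = new Message("goods", tag,
                    ("order from " + tag).getBytes(StandardCharsets.UTF_8));
            SendResult result = producer.send(msg);
            System.out.printf("sent tag=%s status=%s%n", tag, result.getSendStatus());
        }
        producer.shutdown();
    }
}
```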
4 Queue
The physical entity that stores messages. A Topic can contain multiple queues, each containing messages for that Topic. A Topic Queue is also called a Partition of messages within a Topic.
Messages in a Topic Queue can only be consumed by one consumer in one consumer group. Messages in a Queue are not allowed to be consumed simultaneously by multiple consumers in the same consumer group.
Another concept you will see in related material is sharding. Sharding is different from partitioning: in RocketMQ, a shard refers to a Broker that hosts the given Topic. A corresponding number of partitions (Queues) are created on each shard, and every Queue has the same size.
5 Message identifiers (MessageId / Key)
Each message in RocketMQ has a unique MessageId and can carry a Key with a business meaning to make the message easy to query. Note, however, that there are two MessageIds: one (msgId) is generated automatically on the producer side when send() is called, and another (offsetMsgId) is generated automatically when the message reaches the Broker. msgId, offsetMsgId, and the Key are all referred to as message identifiers.
- msgId: generated by the producer from producerIp + process PID + the hashCode of MessageClientIDSetter's ClassLoader + the current time + an AtomicInteger auto-increment counter
- offsetMsgId: generated by the Broker from brokerIp + the physical offset of the message (the offset within the Queue)
- Key: a unique business-related identifier specified by the user
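As an illustration of the Key and the two MessageIds, the sketch below attaches a business Key when building the message and reads both identifiers from the SendResult of a synchronous send. It is only a sketch: the NameServer address, group, topic, tag, and key values are made-up examples.

```java
import java.nio.charset.StandardCharsets;
import org.apache.rocketmq.client.producer.DefaultMQProducer;
import org.apache.rocketmq.client.producer.SendResult;
import org.apache.rocketmq.common.message.Message;

public class KeyedProducer {
    public static void main(String[] args) throws Exception {
        DefaultMQProducer producer = new DefaultMQProducer("goods_producer_group");
        producer.setNamesrvAddr("127.0.0.1:9876"); // assumption: local NameServer
        producer.start();

        // Topic, tag, and the business Key (3rd argument) are illustrative values.
        Message msg = new Message("goods", "shanghai", "ORDER_20230101_0001",
                "pay ok".getBytes(StandardCharsets.UTF_8));

        SendResult result = producer.send(msg);
        System.out.println("msgId       = " + result.getMsgId());       // generated on the producer side
        System.out.println("offsetMsgId = " + result.getOffsetMsgId()); // generated on the Broker side

        producer.shutdown();
    }
}
```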
II. System Architecture
RocketMQ architecture is divided into four parts:
1 Producer
Message producer, responsible for producing messages. The Producer uses MQ’s load balancing module to select the corresponding Broker cluster queue for message delivery, which supports fast failure and low latency.
For example, when a business system writes its generated logs into MQ, that is message production.
Another example: when a flash-sale (seckill) request submitted by a user on an e-commerce platform is written into MQ, that is also message production.
Message producers in RocketMQ appear in the form of Producer Groups. A Producer Group is a collection of producers of the same type, which send messages of the same Topic type. A producer group can send messages of multiple Topics at the same time.
2 Consumer
Message consumers, responsible for consuming messages. A message consumer retrieves the message from the Broker server and performs relevant business processing on the message.
For example, the process of a QoS system reading logs from MQ and parsing logs is the process of message consumption.
Another example: when the business system of an e-commerce platform reads a flash-sale (seckill) request from MQ and processes it, that is message consumption.
Message consumers in RocketMQ appear in the form of Consumer Groups. A Consumer Group is a collection of consumers of the same type, which consume messages of the same Topic type. With Consumer Groups, load balancing (distributing the different Queues of a Topic evenly across the consumers in the same Consumer Group, which is not message-level load balancing) and fault tolerance (when one Consumer fails, the other consumers in the group can take over the Queues it was consuming) become very easy to achieve.
The number of consumers in a Consumer Group should be no greater than the number of Queues of the subscribed Topic; if it exceeds the number of Queues, the extra consumers will have no messages to consume.
However, the messages of one Topic can be consumed by multiple Consumer Groups at the same time.
Note:
1) A Consumer Group can only consume messages of one Topic; it cannot consume multiple Topics at the same time.
2) The consumers in a Consumer Group must subscribe to exactly the same Topic.
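The sketch below shows the consumer side of the earlier tag example, again assuming a local NameServer and illustrative names (goods topic, goods_consumer_group). Every instance started with this same group name shares the Topic's Queues between them; an instance started under a different group name receives the full message stream independently.

```java
import java.nio.charset.StandardCharsets;
import org.apache.rocketmq.client.consumer.DefaultMQPushConsumer;
import org.apache.rocketmq.client.consumer.listener.ConsumeConcurrentlyStatus;
import org.apache.rocketmq.client.consumer.listener.MessageListenerConcurrently;

public class TaggedConsumer {
    public static void main(String[] args) throws Exception {
        // The group name decides how Queues are shared between consumer instances.
        DefaultMQPushConsumer consumer = new DefaultMQPushConsumer("goods_consumer_group");
        consumer.setNamesrvAddr("127.0.0.1:9876"); // assumption: local NameServer

        // Tag filter expression; "*" would subscribe to every tag under the topic.
        consumer.subscribe("goods", "shanghai || zhejiang");

        consumer.registerMessageListener((MessageListenerConcurrently) (msgs, context) -> {
            msgs.forEach(m -> System.out.printf("topic=%s tag=%s body=%s%n",
                    m.getTopic(), m.getTags(), new String(m.getBody(), StandardCharsets.UTF_8)));
            return ConsumeConcurrentlyStatus.CONSUME_SUCCESS;
        });

        consumer.start();
        System.out.println("consumer started");
    }
}
```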
3 Name Server
Function overview
NameServer is a Broker and Topic routing registry that supports dynamic registration and discovery of brokers.
RocketMQ's design originated from Kafka, which relies on Zookeeper; as a result, the earlier versions, MetaQ v1.0 and v2.0, also relied on Zookeeper. MetaQ v3.0 (i.e. RocketMQ) removed the Zookeeper dependency and uses its own NameServer instead.
It mainly includes two functions:
- Broker management: Receives the registration information of Broker clusters and saves it as basic routing information. Provides a heartbeat detection mechanism to check whether the Broker is still alive.
- Routing information management: each NameServer holds the complete routing information of the Broker cluster and the Queue information used by clients for queries. Producers and Consumers obtain the routing information of the whole Broker cluster through NameServer, so that messages can be delivered and consumed smoothly.
Route registration
NameServer is usually also deployed as a cluster, but it is stateless: the nodes of a NameServer cluster are all identical and do not communicate with each other. How, then, does each node get its data? When a Broker starts, it iterates over the NameServer list, establishes a long connection to every NameServer node, and sends a registration request. Each NameServer maintains an internal Broker list to record Broker information dynamically.
Note that this is different from other registries like ZK, Eureka, Nacos, etc.
What are the advantages and disadvantages of NameServer’s stateless approach?
Advantages: NameServer clusters are easy to build and expand.
Disadvantages: a Broker must be configured with the addresses of all NameServers; otherwise the ones that are not listed will not receive its registration. Because of this, NameServer cannot simply be scaled out: if the Brokers are not reconfigured, a newly added NameServer is invisible to them and they will not register with it.
Broker nodes send a heartbeat packet to every NameServer every 30 seconds to prove they are alive and to keep the long connections open. The heartbeat packet contains the BrokerId, the Broker address (IP + port), the Broker name, the name of the cluster the Broker belongs to, and so on. When NameServer receives a heartbeat packet, it updates the heartbeat timestamp, recording the latest time the Broker was known to be alive.
Route removal
If NameServer stops receiving heartbeats from a Broker, whether because of a normal shutdown, a crash, or network jitter, NameServer removes that Broker from the Broker list.
NameServer has a timed task that scans the Broker table every 10 seconds to see if the latest heartbeat timestamp of each Broker is more than 120 seconds from the current time. If it is, the Broker is deemed invalid and removed from the list of brokers.
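The following is only a conceptual sketch of this liveness bookkeeping (not NameServer's actual source): Brokers refresh a timestamp on every heartbeat, and a scheduled task scans the table every 10 seconds, evicting entries older than 120 seconds.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Illustrative model of NameServer's broker liveness table.
public class BrokerLivenessTable {
    private static final long EXPIRE_MS = 120_000; // 120 seconds without a heartbeat => evict
    private final Map<String, Long> lastHeartbeat = new ConcurrentHashMap<>();
    private final ScheduledExecutorService scanner = Executors.newSingleThreadScheduledExecutor();

    public void start() {
        // Scan every 10 seconds, mirroring the interval described above.
        scanner.scheduleAtFixedRate(this::scanNotActiveBroker, 5, 10, TimeUnit.SECONDS);
    }

    // Called whenever a heartbeat packet arrives from a Broker.
    public void onHeartbeat(String brokerAddr) {
        lastHeartbeat.put(brokerAddr, System.currentTimeMillis());
    }

    private void scanNotActiveBroker() {
        long now = System.currentTimeMillis();
        lastHeartbeat.entrySet().removeIf(e -> {
            boolean expired = now - e.getValue() > EXPIRE_MS;
            if (expired) {
                System.out.println("broker expired, removing its route info: " + e.getKey());
            }
            return expired;
        });
    }
}
```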
Extension: For RocketMQ routine operations, such as Broker upgrades, the Broker needs to be stopped. What does the OP need to do?
The OP first disables read and write permission on that Broker. After that, when a client (Consumer or Producer) sends a request to the Broker, it receives NO_PERMISSION in response and retries against other Brokers.
When the OP observes that there is no traffic to the Broker, it closes it and removes the Broker from the NameServer.
OP: Operation and maintenance engineer
SRE: Site Reliability Engineer
Route discovery
RocketMQ uses the Pull model for route discovery. When Topic routing information changes, NameServer does not actively push it to the client. Instead, the client periodically pulls the latest routing information of the Topic. By default, the client pulls the latest route every 30 seconds.
Extension:
1) Push model: a publish-subscribe model with good real-time behaviour, but it requires maintaining long connections, which is costly in resources. This model suits scenarios where:
- real-time requirements are high;
- the number of clients is small and the server-side data changes frequently.
2) Pull model: the client fetches data on demand; its drawback is poor real-time behaviour.
3) Long Polling model: a combination of the Push and Pull models that takes advantage of both while masking their drawbacks.
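Back to RocketMQ's pull-based route discovery: the 30-second route-pull interval (and the 30-second Broker heartbeat mentioned later for consumers) are ordinary client settings inherited from ClientConfig. The sketch below merely shows where they would be tuned; the values shown are the defaults and are usually left alone.

```java
import org.apache.rocketmq.client.consumer.DefaultMQPushConsumer;
import org.apache.rocketmq.client.producer.DefaultMQProducer;

public class RouteRefreshConfig {
    public static void main(String[] args) {
        DefaultMQProducer producer = new DefaultMQProducer("demo_producer_group");
        // How often the client pulls the latest Topic route from NameServer (default 30 000 ms).
        producer.setPollNameServerInterval(30_000);

        DefaultMQPushConsumer consumer = new DefaultMQPushConsumer("demo_consumer_group");
        consumer.setPollNameServerInterval(30_000);
        // How often the consumer sends heartbeats to the Brokers it uses (default 30 000 ms).
        consumer.setHeartbeatBrokerInterval(30_000);
    }
}
```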
Client NameServer selection strategy
Here, the clients are the producers and consumers.
The client must be configured with the NameServer cluster address. Which NameServer node does the client connect to? The client first generates a random number, takes it modulo the number of NameServer nodes to obtain the index of the node to connect to, and then attempts the connection. If the connection fails, it tries the remaining nodes one by one in round-robin order.
A random strategy is used for the first attempt; after a failure, a round-robin strategy is used.
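A minimal sketch of that selection strategy, written against a plain list of addresses rather than the RocketMQ client internals (tryConnect is a hypothetical helper):

```java
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

public class NameServerChooser {
    // Returns the first address that accepts a connection: random start, then round-robin.
    public static String choose(List<String> nameServerList) {
        int size = nameServerList.size();
        int start = ThreadLocalRandom.current().nextInt(size); // random first pick
        for (int i = 0; i < size; i++) {
            String addr = nameServerList.get((start + i) % size); // round-robin after a failure
            if (tryConnect(addr)) {
                return addr;
            }
        }
        throw new IllegalStateException("no reachable NameServer in " + nameServerList);
    }

    // Placeholder for the real connection attempt (hypothetical helper).
    private static boolean tryConnect(String addr) {
        return true;
    }
}
```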
Extension: How does the Zookeeper Client select the Zookeeper Server?
Simply put, the client shuffles twice and takes the first Zookeeper Server.
Specifically, it first shuffles the ZK Server addresses in the configuration file and picks one at random; this pick is usually a hostname. It then resolves all IP addresses of that hostname, shuffles those IPs a second time, and connects to the first address in the shuffled result.
4 Broker
Function overview
The Broker acts as the message broker that stores and forwards messages. In RocketMQ, the Broker receives and stores the messages sent by producers and serves consumers' pull requests. The Broker also stores message-related metadata, including consumer-group consumption progress (offsets), Topics, Queues, and so on.
After Kafka 0.8, offset is stored in the Broker, and before that in Zookeeper.
Modules
The functional modules of a Broker Server are as follows.
- Remoting Module: the entry module of the whole Broker; it handles requests coming from clients. The Broker entity is made up of the modules below.
- Client Manager: manages the clients (producers/consumers), receiving and parsing their requests; for example, it maintains a Consumer's Topic subscription information.
- Store Service: the storage service; it provides simple and convenient APIs for storing messages on the physical disk and for querying them.
- HA Service: the high-availability service; it provides data synchronization between Master and Slave Brokers.
- Index Service: the index service; it indexes messages delivered to the Broker by a specific message Key and offers fast message lookup by Key.
Cluster deployment
Brokers are typically clustered to improve performance and throughput. There may be different queues of the same Topic in each cluster node. However, there is a problem. If a Broker node goes down, how do you ensure that data is not lost? The solution is to scale each Broker cluster node horizontally, that is, re-build the Broker node into an HA cluster, solving the single point problem.
A Broker node cluster is a Master/Slave cluster with Master and Slave roles: the Master handles read and write requests, while the Slave backs up the data on the Master and takes over when the Master fails. So this Broker cluster is a master-slave cluster. A Master can have multiple Slaves, but a Slave can belong to only one Master. The Master/Slave relationship is established by giving them the same BrokerName and different BrokerIds: BrokerId 0 means Master, non-0 means Slave. Each Broker establishes long connections to all nodes in the NameServer cluster and periodically registers its Topic information with every NameServer.
5 Workflow
The specific process
1) Start NameServer. After starting, NameServer listens on its port and waits for Brokers, Producers, and Consumers to connect.
2) When the Broker is started, it establishes and maintains long connections to all Nameservers and sends heartbeat packets to NameServer periodically every 30 seconds.
3) You can create a Topic before sending a message. When creating a Topic, you need to specify which Broker the Topic will be stored on. Of course, the relationship between the Topic and the Broker will also be written into the NameServer when creating a Topic. However, this step is optional, and the Topic can also be created automatically when the message is sent.
4) When the Producer sends messages, it establishes a long connection with one of the NameServer clusters and obtains routing information from NameServer, that is, the mapping relationship between the Queue of the currently sent Topic messages and the address (IP+Port) of the Broker. Then select a Queue from the Queue according to the algorithmic policy and establish a long connection with the Broker of the Queue to send messages to the Broker. Of course, after obtaining the route information, Producer caches the route information locally and updates the route information from NameServer every 30 seconds.
5) Similar to Producer, Consumer establishes a long connection with one NameServer, obtains routing information of the Topic it subscribes to, obtains the Queue it wants to consume from the routing information according to the algorithm strategy, and directly establishes a long connection with the Broker to consume the messages. The Consumer also updates the routing information from NameServer every 30 seconds after it gets the routing information. Unlike Producer, however, the Consumer also sends heartbeats to the Broker to keep it alive.
Topic creation pattern
When creating a Topic manually, there are two modes:
- Cluster mode: the Topic created in this mode belongs to the whole cluster, and every Broker in the cluster has the same number of Queues for it.
- Broker mode: the Topic created in this mode also belongs to the cluster, but the number of Queues may differ from Broker to Broker.
When topics are automatically created, the default is Broker mode, with four queues created for each Broker by default.
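For reference, the sketch below shows both creation paths from the client side. It is only a sketch: it assumes the Broker allows auto-creation (autoCreateTopicEnable=true, as in the configuration samples later in this chapter), "TBW102" is the conventional default-topic key used for explicit creation, and the topic names and queue counts are illustrative.

```java
import org.apache.rocketmq.client.producer.DefaultMQProducer;

public class TopicCreationSketch {
    public static void main(String[] args) throws Exception {
        DefaultMQProducer producer = new DefaultMQProducer("demo_producer_group");
        producer.setNamesrvAddr("127.0.0.1:9876"); // assumption: local NameServer

        // Number of queues the producer asks for when a topic is auto-created on first send
        // (only takes effect if the Broker has autoCreateTopicEnable=true).
        producer.setDefaultTopicQueueNums(4);
        producer.start();

        // Explicit creation: "TBW102" is the Broker's default topic, used here as the creation key.
        producer.createTopic("TBW102", "goods", 8);

        producer.shutdown();
    }
}
```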
Read/write queues
Physically, the read/write queues are the same queue. Therefore, there is no read/write queue data synchronization problem. Read/write queues are logically differentiated concepts. In general, the number of read/write queues is the same.
For example, when creating a Topic, the number of write queues is set to 8 and the number of read queues is set to 4. In this case, the system creates eight queues, namely 0, 1, 2, 3, 4, 5, 6, and 7. The Producer writes messages to these eight queues, but the Consumer only consumes messages in queues 0, 1, 2, and 3. Messages in queues 4, 5, 6, and 7 are not consumed.
Another example: when creating a Topic, set the number of write queues to 4 and the number of read queues to 8. The system again creates 8 queues, namely 0 through 7, but the Producer writes only to queues 0, 1, 2, 3, while the Consumer consumes from all 8 queues; queues 4, 5, 6, and 7 simply never contain any messages. Suppose the Consumer Group contains two consumers, with Consumer1 assigned queues 0, 1, 2, 3 and Consumer2 assigned queues 4, 5, 6, 7; in practice Consumer2 then has no messages to consume.
That is, there is always a problem when the number of read/write queues is set differently. So why design it this way?
This is designed to make it easier for Topic queues to shrink.
For example, if you create a Topic that contains 16 queues, how can you shrink it to 8 queues without losing messages? You can dynamically change the number of write queues to 8, but the number of read queues remains unchanged. At this point, new messages can only be written to the first eight queues, while consumers consume data in 16 queues. When messages in the last eight queues are consumed, the number of read queues can be dynamically set to 8. No messages are lost during the entire scaling process.
Perm is used to set the operation permissions on the currently created Topic. 2 indicates write only, 4 indicates read-only, and 6 indicates read and write.
III. Standalone Installation and Startup
1 Preparations
Hardware and software requirements
The operating system must be 64-bit, and JDK 1.8 or above is required.
Download the RocketMQ installation package
Upload the downloaded installation package to the Linux host and decompress it.
2 Modify the initial memory
Modify runserver.sh
Open the bin/runserver.sh file with vim and lower the JVM memory settings in JAVA_OPT (-Xms, -Xmx, -Xmn). The defaults assume a production-sized machine (several gigabytes of heap), which is usually too much for a small test VM, so reduce them to values that fit your available memory (a few hundred megabytes is typically enough for learning purposes).
Modify runbroker.sh
Open the bin/runbroker.sh file with vim and lower its JVM memory settings in the same way; the Broker's defaults are even larger than the NameServer's.
3 Start
Start the NameServer
```
nohup sh bin/mqnamesrv &
tail -f ~/logs/rocketmqlogs/namesrv.log
```
The log file is under /root/logs/rocketmqlogs.
Start the broker
```
nohup sh bin/mqbroker -n 127.0.0.1:9876 &
tail -f ~/logs/rocketmqlogs/broker.log
```
The log file is under /root/logs/rocketmqlogs.
4 Send/receive message test
Send a message
```
export NAMESRV_ADDR=127.0.0.1:9876
sh bin/tools.sh org.apache.rocketmq.example.quickstart.Producer
```
Receive messages
```
sh bin/tools.sh org.apache.rocketmq.example.quickstart.Consumer
```
5 Shut down the servers
The bin/mqshutdown command is used to shut down the NameServer and the Broker.
Close the broker
```
sh bin/mqshutdown broker
```
Close the namesrv
```
sh bin/mqshutdown namesrv
```
IV. Console Installation and Startup
RocketMQ provides a visual console (dashboard) that lets you view a lot of runtime data graphically.
1 Download
Download address:Github.com/apache/rock…
2 Modify the configuration
Modify its application.properties configuration file under src/main/resources:
- The default port is 8080; change it to a less commonly used one.
- Specify the RocketMQ NameServer address.
3 Add dependencies
Add the following JAXB dependencies to the pom.xml in the unpacked rocketmq-console directory.
JAXB (Java Architecture for XML Binding) is an industry standard; it is a technology for generating Java classes from an XML Schema.
```xml
<!-- JAXB: Java Architecture for XML Binding -->
<dependency>
    <groupId>javax.xml.bind</groupId>
    <artifactId>jaxb-api</artifactId>
    <version>2.3.0</version>
</dependency>
<dependency>
    <groupId>com.sun.xml.bind</groupId>
    <artifactId>jaxb-impl</artifactId>
    <version>2.3.0</version>
</dependency>
<dependency>
    <groupId>com.sun.xml.bind</groupId>
    <artifactId>jaxb-core</artifactId>
    <version>2.3.0</version>
</dependency>
<dependency>
    <groupId>javax.activation</groupId>
    <artifactId>activation</artifactId>
    <version>1.1.1</version>
</dependency>
```
4 Package
Run Maven’s package command in rocketmq-console.
```
mvn clean package -Dmaven.test.skip=true
```
After packaging succeeds, the jar is in the target directory.
5 Start
```
java -jar rocketmq-console-ng-1.0.0.jar
```
- If the port and NameServer address were not set in the configuration file, they can be supplied on the command line at startup:
```
java -jar rocketmq-console-ng-1.0.0.jar --server.port=7777 --rocketmq.config.namesrvAddr=127.0.0.1:9876
```
6 Access
http://127.0.0.1:7000/#/
V. Cluster Building Theory
1 Data replication and disk flushing policies
Replication strategy
The replication strategy is the synchronization of data between the Master and Slave of the Broker. It is classified into synchronous replication and asynchronous replication:
- Synchronous replication: After a message is written to the master, the master waits for the slave to synchronize data successfully before sending an ACK to the producer
- Asynchronous replication: After a message is written to the master, the master immediately sends an ACK to the producer without waiting for the slave to complete data synchronization
The asynchronous replication strategy reduces write latency and RT and increases the system's throughput.
Flush strategy
The flush strategy determines how messages are persisted to disk after they have been written to the Broker's memory. There are two kinds, synchronous flush and asynchronous flush:
- Synchronous flush: A message is not written until it has been persisted to the broker’s disk.
- Asynchronous flush: When a message is written to the memory of the broker, it is successfully written without waiting for the message to persist to disk.
1) The asynchronous flush strategy reduces the write delay of the system, reduces RT, and improves the throughput of the system
2) Messages are written to the Broker’s memory, typically to the PageCache
3) With the asynchronous flush strategy, a success ACK is returned as soon as the message is written to the PageCache; the write to disk does not happen immediately, but is performed automatically once enough data has accumulated in the PageCache.
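These Broker-side choices are visible to the producer through the SendStatus of a synchronous send: with synchronous flush or synchronous replication configured, a timeout on either step is reported instead of plain success. A minimal sketch, reusing the illustrative names from the earlier examples:

```java
import java.nio.charset.StandardCharsets;
import org.apache.rocketmq.client.producer.DefaultMQProducer;
import org.apache.rocketmq.client.producer.SendResult;
import org.apache.rocketmq.common.message.Message;

public class SendStatusCheck {
    public static void main(String[] args) throws Exception {
        DefaultMQProducer producer = new DefaultMQProducer("goods_producer_group");
        producer.setNamesrvAddr("127.0.0.1:9876"); // assumption: local NameServer
        producer.start();

        SendResult result = producer.send(
                new Message("goods", "shanghai", "hello".getBytes(StandardCharsets.UTF_8)));

        switch (result.getSendStatus()) {
            case SEND_OK:              // stored according to the Broker's flush/replication settings
                break;
            case FLUSH_DISK_TIMEOUT:   // synchronous flush did not finish in time
            case FLUSH_SLAVE_TIMEOUT:  // synchronous replication to the Slave did not finish in time
            case SLAVE_NOT_AVAILABLE:  // synchronous replication configured but no Slave available
                System.err.println("message stored with warning: " + result.getSendStatus());
                break;
            default:
                break;
        }
        producer.shutdown();
    }
}
```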
2 Broker cluster mode
Broker clusters can be divided into the following categories according to the relationships between nodes in the cluster:
A single Master
There is only one Broker (which by nature is not a cluster). This setup can only be used for testing; it must not be used in production because of the single point of failure.
Multi-Master
A broker cluster consists only of multiple masters, with no slaves. Queues for the same Topic are evenly distributed among master nodes.
- Advantages: simple configuration; taking a single Master offline for maintenance has no impact on the application. When the disks are configured as RAID10, messages are not lost even if the machine goes down and cannot be recovered, because RAID10 is very reliable (with asynchronous flushing a small number of messages are lost; with synchronous flushing none are lost).
- Disadvantages: During a single machine outage, unconsumed messages on that machine cannot be subscribed (consumed) until the machine is restored, affecting the real-time performance of messages.
The prerequisite for the above advantages is that these masters are configured with RAID arrays. If it is not configured, a large number of messages will be lost if a Master goes down.
Multi-Master multi-Slave mode: asynchronous replication
A broker cluster consists of multiple masters, each of which is configured with multiple slaves (if a RAID array is configured, one master is usually configured with one slave). The master and slave work in active/standby mode. That is, the master processes read/write requests for messages, while the slave backs up messages and performs role switchover after the master breaks down.
Asynchronous replication refers to the asynchronous replication policy mentioned above. After a message is successfully written to the master, the master immediately returns an ACK to the producer without waiting for the slave to successfully synchronize data.
One of the biggest features of this mode is that the slave can automatically switch to master when the master goes down. However, since the slave synchronizes with the master with a short delay (in milliseconds), there may be a small loss of messages when the master goes down.
The shorter the delay for the Slave to synchronize from the Master, the fewer messages it may lose
For a Master backed by a RAID array, if asynchronous writing is also used there is likewise a latency window and messages may be lost; however, a RAID array's latency is on the microsecond scale (it is hardware-based), so it loses far less data.
Multi-Master multi-Slave mode: synchronous dual-write
This mode uses the synchronous replication strategy in a multi-Master multi-Slave setup. Synchronous dual-write means that after a message is written to the Master, the Master waits for the Slave to finish synchronizing the data before returning an ACK to the producer; in other words, the ACK is returned only after both the Master and the Slave have successfully written the message.
Compared with the asynchronous replication mode, this mode has the advantage of higher message security and no message loss. But the RT of individual messages is slightly higher, resulting in slightly lower performance (about 10%).
There is one big problem with this mode: with the current version, when the Master goes down, the Slave does not automatically switch to the Master.
Best practices
Typically, a RAID10 disk array is configured for the Master and then a Slave is configured for it. It takes advantage of the efficiency and security of RAID10 disk arrays while addressing issues that might affect subscriptions.
1) The efficiency of RAID disk arrays is higher than that of master-slave clusters. Because RAID is hardware supported. Because of this, the cost of setting up RAID arrays is high.
2) What is the difference between a multi-master +RAID array and a multi-master/Slave cluster?
- Multi-Master + RAID array: it only guarantees that data is not lost and that message writing is unaffected, but message subscription (consumption) may be affected while a Master is down. Its execution efficiency, however, is much higher than that of a multi-Master multi-Slave cluster.
- Multi-Master multi-Slave cluster: it not only guarantees that data is not lost, it also keeps message writing and subscription unaffected; its running efficiency, though, is lower than that of a multi-Master + RAID array.
VI. Disk Array RAID (Supplement)
1 RAID history
In 1988, Professor D. A. Patterson and his colleagues at the University of California, Berkeley first proposed the concept of RAID, the Redundant Array of Inexpensive Disks, in the paper "A Case for Redundant Arrays of Inexpensive Disks". Large-capacity disks were expensive at the time, so the basic idea of RAID was to combine several small-capacity, relatively cheap disks so as to obtain the capacity, performance, and reliability of an expensive large-capacity disk at a lower cost. As disk costs and prices kept falling, "inexpensive" lost its meaning, so the RAID Advisory Board (RAB) decided to replace "Inexpensive" with "Independent"; RAID thus became a Redundant Array of Independent Disks. This was only a change of name, not of substance.
2 RAID levels
The idea behind RAID was quickly accepted by the industry, and RAID has since been used very widely as a high-performance, highly reliable storage technology. RAID achieves high performance, reliability, fault tolerance, and scalability mainly through three techniques: mirroring, data striping, and data parity. Depending on the strategy and architecture built around these three techniques, RAID is divided into different levels to meet the requirements of different data applications.
D. A. Patterson et al. defined RAID0 to RAID6 primitive RAID levels in their paper. Subsequently, storage vendors continuously introduced RAID levels such as RAID7, RAID10, RAID01, RAID50, RAID53, and RAID100, but there is no unified standard for these RAID levels. Currently, the accepted standards in the industry and academia are RAID0 to RAID6, and the most commonly used RAID levels in practical applications are RAID0, RAID1, RAID3, RAID5, RAID6, and RAID10.
Each RAID level represents an implementation method and technology, and there is no higher or lower level. In actual applications, you need to select the appropriate RAID level and implementation mode based on the data application characteristics of users and considering the availability, performance, and cost.
3 Key Technologies
Mirror technology
Mirroring is a redundancy technique that keeps a backup copy of a disk's data to protect against data loss when a disk fails. The classic use of mirroring in RAID is to produce two identical copies of the data on two different disks of the array at the same time. Mirroring provides complete data redundancy: when one copy becomes unavailable, the system can access the other copy without affecting the running application or its performance. Moreover, mirroring needs no extra parity computation, so recovery is very fast, amounting to a simple copy. Mirroring can also serve reads concurrently from multiple copies, giving higher read I/O performance; it cannot do the same for writes, since every copy must be written, which may reduce write I/O performance.
Mirroring technology provides very high data security and is very expensive, requiring at least double the storage space. The high cost limits the wide application of mirroring, mainly for critical data protection, where data loss can be very costly.
Data striping technology
Data striping is a technology that automatically balances the load of I/O operations across multiple physical disks. More specifically, a contiguous piece of data is divided into many small pieces and stored separately on different disks. This enables multiple processes to concurrently access different parts of the data, maximizing I/O parallelism and greatly improving performance.
In simple terms, different data is stored on different disks to achieve concurrent access and improve performance.
Data verification technique
Data verification is performed when data is written into a RAID array and the parity data is stored on member disks. The verification data can be centrally stored on a disk or distributed on multiple disks. When a part of the data fails, the remaining data and parity data can be reverse-checked to reconstruct the lost data.
Compared with mirroring, data parity saves a lot of cost; however, every read and write involves a large amount of parity computation, which demands considerable computing power and usually a hardware RAID controller. For data reconstruction and recovery, parity is also much more complex and much slower than mirroring.
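The "reverse check" works because the parity is usually a bitwise XOR of the data blocks in a stripe: XOR-ing the surviving blocks with the parity block yields the missing block. A tiny illustration, not tied to any specific RAID level:

```java
public class XorParityDemo {
    // XOR two equal-length blocks.
    static byte[] xor(byte[] a, byte[] b) {
        byte[] out = new byte[a.length];
        for (int i = 0; i < a.length; i++) {
            out[i] = (byte) (a[i] ^ b[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] d0 = {1, 2, 3, 4};
        byte[] d1 = {9, 8, 7, 6};
        byte[] d2 = {5, 5, 5, 5};

        // Parity block stored alongside (or distributed across) the data blocks.
        byte[] parity = xor(xor(d0, d1), d2);

        // The disk holding d1 fails: rebuild it from the remaining blocks plus parity.
        byte[] rebuiltD1 = xor(xor(d0, d2), parity);
        System.out.println(java.util.Arrays.equals(d1, rebuiltD1)); // true
    }
}
```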
4 RAID classification
From the perspective of implementation, RAID consists of soft RAID, hard RAID, and hybrid RAID.
Soft RAID
All functions are performed by the operating system and the CPU; there is no dedicated RAID controller chip or I/O processing chip, so the efficiency is naturally the lowest.
Hard RAID
It has a dedicated RAID controller chip, an I/O processing chip, and an array buffer, and does not consume CPU resources. It is very efficient, but also expensive.
Hybrid RAID
It has a RAID controller chip but no dedicated I/O processing chip; the I/O work is done by the CPU and the driver. Its performance and cost sit between soft RAID and hard RAID.
5 Details about common RAID levels
JBOD
JBOD, Just a Bunch of Disks, is a collection of disks with no control software providing coordinated control; that is the main difference between RAID and JBOD. JBOD connects multiple physical disks in series to provide one large logical disk.
In the JBOD, data is stored from the first disk in sequence. When the storage space on the current disk is used up, data is stored on subsequent disks in sequence. JBOD storage performs exactly as well as a single disk and does not provide data security.
JBOD simply provides a mechanism to expand storage space. The available storage capacity of JBOD is equal to the total storage space of all member disks
JBOD usually refers to a disk array cabinet, whether it provides RAID or not. However, JBOD is not the official term; it’s officially called Spanning.
RAID0
RAID0 is a simple, zero-parity data striping technology. It’s not really a true RAID, because it doesn’t provide any kind of redundancy strategy. RAID0 strips the disks where RAID0 resides to form large-capacity storage space and stores data on all disks to implement independent read and write access on multiple disks.
Theoretically, the read/write performance of a RAID0 consisting of n disks is n times that of a single disk. However, due to various factors such as bus bandwidth, the actual performance improvement is lower than the theoretical value. The bus bandwidth is fully utilized because I/O operations can be performed concurrently. Combined with the fact that no data verification is required, RAID0 provides the highest performance of all RAID levels.
RAID0 has advantages such as low cost, high read/write performance, and 100% storage space utilization. However, RAID0 does not provide data redundancy protection. Once data is damaged, it cannot be recovered.
- Application scenario: A scenario that has low requirements on sequential data read and write and low requirements on data security and reliability but high requirements on system performance.
Similarities between RAID0 and JBOD:
1) Storage capacity: total capacity of member disks
2) The disk utilization is 100%, that is, there is no data backup
RAID0 differs from JBOD:
JBOD: Data is stored in sequence. When one disk is full, data is stored in the next disk
RAID0: data is written to all member disks in parallel using data striping, so its read/write performance is roughly n times that of JBOD.
RAID1
RAID1 is based on mirroring: it writes exactly the same data to a working disk and a mirror disk, so its disk-space utilization is 50%. Response time is affected when writing data, but not when reading. RAID1 provides the best data protection: if the working disk fails, the system automatically switches to the mirror disk.
RAID1 enhances data security and enables data on two disks to be completely mirrored, achieving high security, simple technology, and easy management. RAID1 has full fault tolerance, but is expensive to implement.
- Application scenario: High requirements on sequential read/write performance or data security.
RAID10
RAID10 is a combination of RAID1 and RAID0, so it inherits both the speed of RAID0 and the security of RAID1.
In simple terms, strip first, then mirror. Send incoming data to different disks, and then mirror the data on the disks.
RAID01
RAID01 is a combination of RAID0 and RAID1, so it inherits both the speed of RAID0 and the security of RAID1.
In simple terms, mirror first and then strip. The incoming data is mirrored first, and then the mirrored data is written to disks that are different from the original data, that is, stripes are created.
RAID01 has worse fault tolerance than RAID10, so RAID01 is generally not used in production environments.
The reason: in RAID01, as soon as one disk fails, the entire striped set on that side (a whole mirror copy) is out of service; in RAID10, one disk on either side can fail and the array keeps working, because only that disk's mirror partner is needed.
VII. Cluster Building Practice
1 Cluster Architecture
We will build a dual-Master, dual-Slave Broker cluster with asynchronous replication. For convenience, two hosts are used. Their functions and Broker roles are assigned as follows.
No. | Hostname | IP | Function | Broker role |
---|---|---|---|---|
1 | rocketmqOS1 | 192.168.4.55 | NameServer + Broker | Master1 + Slave2 |
2 | rocketmqOS2 | 192.168.4.55 | NameServer + Broker | Master2 + Slave1 |
2 Clone to create rocketmqOS1
Clone the rocketmqOS host and modify the configuration. Specify the host name rocketmqOS1.
3 Modify the rocketmqOS1 configuration file
Configuration file location
The configuration files to modify are in the conf/2m-2s-async directory of the RocketMQ installation directory.
Modify broker-a.properties
Modify the configuration file as follows:
```properties
# name of the entire Broker cluster, i.e. the RocketMQ cluster
brokerClusterName=DefaultCluster
# name of this master-slave cluster; a RocketMQ cluster can contain multiple master-slave clusters
brokerName=broker-a
# brokerId of a Master is 0
brokerId=0
# delete expired message store files at 4 a.m.
deleteWhen=04
# keep message store files that have not been updated for 48 hours; after that they are deleted
fileReservedTime=48
# this broker is an asynchronous-replication Master
brokerRole=ASYNC_MASTER
# flush strategy: asynchronous
flushDiskType=ASYNC_FLUSH
# Name Server addresses
namesrvAddr=192.168.4.55:9876;192.168.4.55:9877
# port the Broker uses to talk to producers and consumers; default 10911
listenPort=10911
```
Modify broker-b-s.properties
Modify the configuration file as follows:
```properties
brokerClusterName=DefaultCluster
# this is the other master-slave cluster
brokerName=broker-b
# brokerId of a Slave is non-0
brokerId=1
deleteWhen=04
fileReservedTime=48
# this broker is a Slave
brokerRole=SLAVE
flushDiskType=ASYNC_FLUSH
namesrvAddr=192.168.4.55:9876;192.168.4.55:9877
# port the Broker uses to talk to producers and consumers; default 10911.
# This host also runs Master1, which uses the default port, so a different
# port is needed here to tell Master1 and Slave2 apart
listenPort=11911
# message store paths; the default is ~/store. This host also runs Master1,
# which uses the default path, so Slave2 must use different paths
storePathRootDir=~/store-s
storePathCommitLog=~/store-s/commitlog
storePathConsumeQueue=~/store-s/consumequeue
storePathIndex=~/store-s/index
storeCheckpoint=~/store-s/checkpoint
abortFile=~/store-s/abort
```
Other configuration
In addition to the above configuration, other properties can be set in these configuration files.
```properties
# name of the entire Broker cluster, i.e. the RocketMQ cluster
brokerClusterName=rocket-MS
# name of this master-slave cluster; a RocketMQ cluster can contain multiple master-slave clusters
brokerName=broker-a
# 0 means Master, >0 means Slave
brokerId=0
# Name Server addresses, separated by semicolons
namesrvAddr=nameserver1:9876;nameserver2:9876
# default number of queues created for a new Topic
defaultTopicQueueNums=4
# whether the Broker is allowed to create Topics automatically
autoCreateTopicEnable=true
# whether the Broker is allowed to create subscription groups automatically
autoCreateSubscriptionGroup=true
# port the Broker uses to talk to producers and consumers
listenPort=10911
# HA listening port, used for Master-Slave communication; default is listenPort+1
haListenPort=10912
# delete expired message store files at 4 a.m.
deleteWhen=04
# keep message store files that have not been updated for 48 hours; after that they are deleted
fileReservedTime=48
# size of each file in the commitlog directory; default 1 GB
mapedFileSizeCommitLog=1073741824
# number of message entries each ConsumeQueue file (per Queue of each Topic) can hold
mapedFileSizeConsumeQueue=300000
# When an expired file is being purged but is still referenced by another thread
# (reference count greater than zero, e.g. a message is being read), deletion is blocked.
# The delete task records the timestamp of the first refused attempt; this property is the
# maximum time the file may survive after that first refusal. Within this window, deletion
# is still refused while the reference count is non-zero; once the window expires, the
# file is deleted forcibly.
destroyMapedFileIntervalForcibly=120000
# maximum usage of the disk partition holding the commitlog and consumequeue;
# when this ratio is exceeded, expired files are deleted immediately
diskMaxUsedSpaceRatio=88
# message store root path; the default is under the current user's home directory
storePathRootDir=/usr/local/rocketmq-all-4.5.0/store
# commitlog directory path
storePathCommitLog=/usr/local/rocketmq-all-4.5.0/store/commitlog
# consumequeue directory path
storePathConsumeQueue=/usr/local/rocketmq-all-4.5.0/store/consumequeue
# index directory path
storePathIndex=/usr/local/rocketmq-all-4.5.0/store/index
# checkpoint file path
storeCheckpoint=/usr/local/rocketmq-all-4.5.0/store/checkpoint
# abort file path
abortFile=/usr/local/rocketmq-all-4.5.0/store/abort
# maximum message size
maxMessageSize=65536
# role of the Broker
# - ASYNC_MASTER: asynchronous-replication Master
# - SYNC_MASTER: synchronous dual-write Master
# - SLAVE
brokerRole=SYNC_MASTER
# flush strategy
# - ASYNC_FLUSH: asynchronous flush
# - SYNC_FLUSH: synchronous flush
flushDiskType=SYNC_FLUSH
# size of the send-message thread pool
sendMessageThreadPoolNums=128
# size of the pull-message thread pool
pullMessageThreadPoolNums=128
# Force the Broker IP; must be adjusted on each machine. Officially it can be left empty
# and is detected automatically, but with multiple network cards the wrong IP may be picked.
brokerIP1=192.168.3.105
```
4 Clone rocketmqOS2
Clone the rocketmqOS1 host and modify the configuration. Specify the host name rocketmqOS2.
5 Modify the rocketmqOS2 configuration file
For the rocketmqOS2 host, you also need to modify the two configuration files in the 2m-2s-async subdirectory of the conf directory of the rocketMQ unzip directory.
Modify broker-b.properties
Modify the configuration file as follows:
```properties
brokerClusterName=DefaultCluster
brokerName=broker-b
brokerId=0
deleteWhen=04
fileReservedTime=48
brokerRole=ASYNC_MASTER
flushDiskType=ASYNC_FLUSH
namesrvAddr=192.168.4.55:9876;192.168.4.55:9877
# port the Broker uses to talk to producers and consumers; default 10911
listenPort=12911
```
Modify broker-a-s.properties
Modify the configuration file as follows:
```properties
brokerClusterName=DefaultCluster
brokerName=broker-a
brokerId=1
deleteWhen=04
fileReservedTime=48
brokerRole=SLAVE
flushDiskType=ASYNC_FLUSH
namesrvAddr=192.168.4.55:9876;192.168.4.55:9877
# port the Broker uses to talk to producers and consumers; default 10911
listenPort=13911
storePathRootDir=~/store-s
storePathCommitLog=~/store-s/commitlog
storePathConsumeQueue=~/store-s/consumequeue
storePathIndex=~/store-s/index
storeCheckpoint=~/store-s/checkpoint
abortFile=~/store-s/abort
```
6 Starting the Server
Start the NameServer cluster
Start NameServer in rocketmqOS1 and rocketmqOS2 respectively. The startup commands are identical.
```
nohup sh bin/mqnamesrv &
tail -f ~/logs/rocketmqlogs/namesrv.log
```
Start two masters
Start the Broker Master on rocketmqOS1 and rocketmqOS2 hosts respectively. Note that they specify that the configuration files to load are different.
On rocketmqOS1:
```
nohup sh bin/mqbroker -c conf/2m-2s-async/broker-a.properties &
tail -f ~/logs/rocketmqlogs/broker.log
```
On rocketmqOS2:
```
nohup sh bin/mqbroker -c conf/2m-2s-async/broker-b.properties &
tail -f ~/logs/rocketmqlogs/broker.log
```
Start two slaves
Start the Broker slaves in rocketmqOS1 and rocketmqOS2 hosts respectively. Note that they specify that the configuration files to load are different.
On rocketmqOS1:
```
nohup sh bin/mqbroker -c conf/2m-2s-async/broker-b-s.properties &
tail -f ~/logs/rocketmqlogs/broker.log
```
On rocketmqOS2:
```
nohup sh bin/mqbroker -c conf/2m-2s-async/broker-a-s.properties &
tail -f ~/logs/rocketmqlogs/broker.log
```
VIII. The mqadmin Command
In the bin directory of the RocketMQ installation there is an mqadmin command, an operations tool for managing Topics, clusters, Brokers, and other information.
1 Modify bin/tools.sh
Before running mqadmin, the JDK ext directory configured in bin/tools.sh of the RocketMQ installation needs to be extended with the ext directory of the local JDK; on this machine it is /opt/apps/jdk1.8/jre/lib/ext.
Open the tools.sh file with vim and append that ext path to the -Djava.ext.dirs entry in JAVA_OPT.
JAVA_OPT="${JAVA_OPT} -server -Xms1g -Xmx1g -Xmn256m - XX:MetaspaceSize=128m -XX:MaxMetaspaceSize=128m" JAVA_OPT="${JAVA_OPT} - Djava.ext.dirs=${BASE_DIR}/lib:${JAVA_HOME}/jre/lib/ext:${JAVA_HOME}/lib/ Ext: / opt/apps/jdk1.8 / jre/lib/ext "JAVA_OPT =" ${JAVA_OPT} - cp ${CLASSPATH}"Copy the code
2 Run mqadmin
Running the command with no arguments lists the subcommands it supports; a great deal of administration can be done with them.
```
[root@localhost rocketmq]# ./bin/mqadmin
The most commonly used mqadmin commands are:
   updateTopic               Update or create topic
   deleteTopic               Delete topic from broker and NameServer.
   updateSubGroup            Update or create subscription group
   deleteSubGroup            Delete subscription group from broker.
   updateBrokerConfig        Update broker's config
   updateTopicPerm           Update topic perm
   topicRoute                Examine topic route info
   topicStatus               Examine topic Status info
   topicClusterList          Get cluster info for topic
   brokerStatus              Fetch broker runtime status data
   queryMsgById              Query Message by Id
   queryMsgByKey             Query Message by Key
   queryMsgByUniqueKey       Query Message by Unique key
   queryMsgByOffset          Query Message by offset
   QueryMsgTraceById         Query a message trace
   printMsg                  Print Message Detail
   printMsgByQueue           Print Message Detail
   sendMsgStatus             Send msg to broker.
   brokerConsumeStats        Fetch broker consume stats data
   producerConnection        Query producer's socket connection and client version
   consumerConnection        Query consumer's socket connection, client version and subscription
   consumerProgress          Query consumers's progress, speed
   consumerStatus            Query consumer's internal data structure
   cloneGroupOffset          Clone offset from other group.
   clusterList               List all of clusters
   topicList                 Fetch all topic list from name server
   updateKvConfig            Create or update KV config.
   deleteKvConfig            Delete KV config.
   wipeWritePerm             Wipe write perm of broker in all name server
   resetOffsetByTime         Reset consumer offset by timestamp(without client restart).
   updateOrderConf           Create or update or delete order conf
   cleanExpiredCQ            Clean expired ConsumeQueue on broker.
   cleanUnusedTopic          Clean unused topic on broker.
   startMonitoring           Start Monitoring
   statsAll                  Topic and Consumer tps stats
   allocateMQ                Allocate MQ
   checkMsgSendRT            Check message send response time
   clusterRT                 List All clusters Message Send RT
   getNamesrvConfig          Get configs of name server.
   updateNamesrvConfig       Update configs of name server.
   getBrokerConfig           Get broker config by cluster or special broker!
   queryCq                   Query cq command.
   sendMessage               Send a message
   consumeMessage            Consume message
   updateAclConfig           Update acl config yaml file in broker
   deleteAccessConfig        Delete Acl Config Account in broker
   clusterAclConfigVersion   List all of acl config version information in cluster
   updateGlobalWhiteAddr     Update global white address for acl Config File in broker
   getAccessConfigSubCommand List all of acl config information in cluster

See 'mqadmin help <command>' for more information on a specific command.
```
3 Details about the command on the official website
The command is explained in detail on the official website.
Github.com/apache/rock…