RocketMQ users can now seamlessly migrate to Apache Pulsar. Since then, Apache Pulsar has added compatibility with mainstream message queuing protocols.

We are pleased to announce Tencent cloud Middleware open source RoP! RoP introduced the RocketMQ protocol processing plug-in into Pulsar Broker so that Pulsar can support the native RocketMQ protocol.

The authors introduce

Ran Xiaolong **, ** Tencent Senior Engineer, Apache Pulsar Committer, Apache BookKeeper Contributor

Introduction of RoP

Like KoP, MoP, and AoP, RoP is a pluggable protocol processing plug-in.

By adding the RoP protocol processing plug-in to an existing Pulsar cluster, users can migrate their existing RocketMQ applications and services to Pulsar without having to change their code, while also taking advantage of Pulsar’s powerful features, such as:

• Separate computing and storage

• multi-tenant

• Replication across geographies

• Layered sharding

• Lightweight computing framework — Pulsar Functions

•…

Why develop RoP

Apache Pulsar is the next generation cloud native distributed message flow platform that integrates messaging, storage, and lightweight functional computing. Pulsar has been widely adopted since it became open source in 2016 and was designated an Apache Top-level project in 2018.

RocketMQ is a powerful open source distributed messaging system that provides low-latency, highly reliable message publishing and subscription services based on highly available distributed clustering technology.

Pulsar and RocketMQ have broad user bases and strong development support, and are used by many head companies around the world. At the same time, we have received requests from customers to transfer data between Pulsar and RocketMQ to take advantage of both messaging systems.

Apache Pulsar provides a unified abstraction of the queue and stream consumption models by abstracting the Consumer layer. Pulsar is based on the Protobuf binary protocol for higher performance and lower latency in client-broker interaction. In addition, Pulsar can more easily support and implement multi-language clients such as Java, CPP, Python and Go through Protobuf protocol.

However, for applications written using other messaging protocols (for example, RocketMQ), since the messaging protocol used is different from Pulsar, if Pulsar wants to be compatible with RocketMQ, In order to adapt RocketMQ’s protocol to Pulsar’s messaging protocol layer, users had to rewrite the entire protocol layer, which introduced significant costs for user migration and switching.

To solve this problem, the most intuitive approach is to import the user’s existing data from RocketMQ into a Pulsar cluster using a RocketMQ Wrapper similar to the Pulsar Connector. However, this requires the business side to change its own business code logic and ensure that the data on both sides is consistent, which presents a significant technical challenge for users of RocketMQ. So, can you provide users with an out-of-the-box migration strategy and solution that doesn’t require any code changes? This was the original purpose of RoP.

Apache Pulsar in PIP – 41 (github.com/apache/puls…). In this paper, a new access method is introduced. Netty’s channel and Pulsar’s Broker Service(github.com/apache/puls…) by exposing the Protocol Handler plug-in on the Broker side. Objects are exposed to the user. This allows users to directly manipulate and call the lower-level apis in Pulsar, such as PersistentTopic and ManagerLedger. With this Protocol, the user does not need to change the code, but simply forwards the service request to the RoP, which uses the Protocol Handler’s plug-in to forward the user’s request to Pulsar.

How to develop RoP

RoP architecture

A comparison of the protocols between Pulsar and RocketMQ shows that there are many similarities in the way they handle messages. For example, both protocols contain the following operations:

  • Topic Lookup: all Clients Lookup the Owner Broker of the current Topic before establishing a connection with any Broker. After the metadata is retrieved, the Client establishes a TCP connection with the Owner Broker to interact with the data.

  • Produce: Clients communicates with all the Owner brokers of a Topic and appends messages to the corresponding distributed log.

  • Consume: Clients communicates with all the Owner brokers of the Topic and reads the specified messages from the distributed log.

  • Offset: Messages produced by Producer to a topic are assigned a unique Offset, which is identified by MessageID in Pulsar. Consumers can use offset to retrieve messages from the log at a specified location.

The storage layer of Apache Pulsar uses Apache BookKeeper. Pulsar is the Client of BookKeeper and can easily be used for distributed log operations by calling ManagerLedger objects. Because of this, RoP does a good job of mapping RocketMQ operations to The commitLog and queueLog operations in BookKeeper.

Concept of RoP

Offset and MessageID

In RocketMQ, offsets are used to identify the location of messages, and each message is assigned a unique offset after it is produced to a specified Topic. In Pulsar, MessageID is used to uniquely identify each message, and each MessageID is composed of three parts, ledgerID, entryID and partitionID. We map messageID and offset through reasonable division to uniquely identify each message in the Topic.

Message

For a message, RocketMQ and Pulsar both contain headers and payload fields. By analyzing the message protocol, we can easily convert RocketMQ message to Pulsar message format. In order to be compatible with Tag messages, a special 8-byte field is added in the processing of message protocols to distinguish whether the message belongs to Tag messages.

Topic Lookup

In Pulsar, before establishing a connection with the broker, the client will perform a Lookup operation based on the current incoming Topic to find the Owner broker of the current Topic in the broker cluster. The Owner Broker’s address is then returned and a TCP connection is established with the client, followed by data interaction. In RocketMQ, the client processes the GET_ROUTEINTO_BY_TOPIC command before establishing a connection with the broker. After obtaining the routing information of the topic, the client establishes a TCP connection and then interacts with the broker.

How to use RoP

Currently, RoP has released version 0.1.0 and you can participate in the project in any of the following ways:

Want to give it a try?

RoP can be downloaded and the user guide is available at github.com/streamnativ… Standalone RoP is easily available or deployed within an existing Pulsar cluster.

In addition, to facilitate quick use and validation of RoP, we have provided common usage scenarios and use cases for RocketMQ. You can directly use these code examples to validate the service: github.com/streamnativ…

Want to solve a problem?

If you have any questions, you can create an issue in the RoP GitHub repo or join the RoP wechat group for discussion. Either way, RoP senior experts are always available: github.com/streamnativ…

Want to contribute?

RoP is open source and hosted on GitHub: github.com/streamnativ… Bug, welcome to submit PR.

Special thanks to

I would like to express my special thanks to Zhang Yonghua, Ran Xiaolong, Han Mingze, Xia Zicheng and other students from Tencent Cloud middleware team for their support, as well as the good suggestions of StreamNative in architecture design and scheme, which jointly promoted the smooth implementation of RoP project. In the future, both parties will continue to work hand in hand and make more contributions to the message service.