Abstract:

On September 25, the Apache Software Foundation officially announced that RocketMQ, an open source project donated to Apache by Alibaba, has graduated from the Apache Incubator and become an Apache Top-Level Project (TLP). Apache RocketMQ is the first Apache top-level project from China outside the Hadoop ecosystem. Since being open sourced, RocketMQ has been widely used by hundreds of enterprises in China and abroad.

Foreword

November 11, 2016 held special significance for the Alibaba middleware messaging team. On that day, the team at Bright Top watched RocketMQ's low-latency storage architecture pass its first real test and meet its stated goal of keeping Singles Day as smooth as silk. On another front, after three months of reworking the project for open source, Bruce, the Apache RocketMQ Champion, formally started the donation to Apache on behalf of the team. RocketMQ was now on its way to becoming world-class open source software.

As of today, RocketMQ has been in the Apache Incubator for more than seven months. Over this half year and more, the team has come a long way and gone through many firsts.

  • Going through the Apache release process in accordance with the Apache Way for the first time, and completing the release at RC3, which is extremely rare across the whole IPMC. This reflected the rigor and efficiency of a Chinese team and won praise from the community.

  • Admitting the first foreign Committer, Dr. Roman from Japan. In two months he submitted nearly 20 pull requests and helped RocketMQ connect with other top-level Apache projects, contributing significantly to getting the community off the ground.

  • Holding the first hackathon for the community. PMC members followed the whole process and helped contestants with design reviews and code reviews.

  • Recruiting PMC members for the first time. In the RocketMQ community, PMC membership is further merit earned through a Committer's ongoing contributions.

  • Presenting the vision for the OpenMessaging standard to the Apache community for the first time. Soon afterwards, ASF Director Jim volunteered to join RocketMQ's star-studded global mentor team.

  • Holding the first large meetup for the domestic community, with a live text-and-image broadcast worldwide and five purely technical talks on messaging and streaming, conveying the team's commitment to building a good community and a good ecosystem.

  • And more firsts…

Life is a journey. Only after walking for a long time do you know sorrow and hardship, and only then do you find tenacity and desire. The road ahead is long and bumpy, but our pace, our pursuit, and our direction remain unchanged.

On August 3, 2017, the team officially opened discussions on pushing for TLP status. For the next month, discussions and votes were almost a daily routine. Finally, ahead of the release of Java 9, the Apache board passed the resolution ending RocketMQ's incubation and making it one of Apache's 340+ top-level projects. With that, a distributed messaging engine from Alibaba, built entirely by a Chinese team, became a bright new star in global Internet middleware and big data. Its next stop is fierce competition with the world's established messaging middleware on the international stage. Guided by Apache's community philosophy, RocketMQ will continue to thrive and fulfill its mission.

Next, we look back at RocketMQ's Apache incubation over the past half year or more from three angles: product, community, and ecosystem.

Product

Unlike most Apache top-level projects, RocketMQ had already been open source for three years before it entered Apache. It had built a good reputation in China and had some active contributors in its community, but that was far from enough. Before officially starting the donation process, the team reworked RocketMQ and even its community extensively. For internationalization, the documents on GitHub were redesigned and reorganized using the sidebar feature. Today, User Guide, Quick Start, Architecture & Design, How to Contribute, Community, and FAQ are almost the standard document structure for the team's new products.

The code was optimized even more aggressively: GBK characters were removed in favor of UTF-8, the API Javadoc was rewritten, the code was cleaned up and restructured, abstract dependencies between components were optimized with JDepend, and FindBugs was used to scan for bugs and guide coding conventions. On the delivery side, the release process was standardized and release notes were classified into New Features, Improvements, and Bugs. At the community level, all interaction moved to English, with guidance on how to ask questions.

As a result of these preparations, RocketMQ was quietly upgraded from 3.0 to 4.0. Version 4.0 is a transitional release (with no major architectural changes compared with 3.0) and is also the release incubated at Apache. Through incubation, the team relearned the importance of the software development process, especially the parts that are often overlooked, such as detailed design, code review, coding conventions, branching models, and release conventions. These matter even more when you are participating in and leading a globally collaborative open source project.

Coding conventions

RocketMQ's coding conventions are loose, at least in version 4.0. Unlike other Apache top-level projects, we did not constrain coding details such as how static variables are used, how logs are written, or how exceptions are caught and handled. Instead, we set general constraints through three pieces: Code Style, Copyright, and Code Template. To keep non-conforming code out of the repository, we automate validation through the PR checklist, CheckStyle, and continuous integration. We believe that wherever there is human intervention there is a need for checks, and wherever there is human intervention there is potential for automation.

Development process

RocketMQ's open source model is not an open-core model in the traditional sense; it is closer to open source platforms such as Apache Hadoop and OpenStack. We try to combine the best of the open world with the best of the proprietary world, building proprietary products on a truly collaborative platform. We hope the kind of synergy seen among Red Hat, CentOS, and Fedora will show up in RocketMQ's future evolution. To meet this challenge, the team must change its software development process and build an ecosystem that supports the internal and open source communities as well as commercialization.

Branch model

RocketMQ was designed from the start with a "One Kernel" vision of its future product forms. Over the years, the team has built internal and commercial products around it, including AliwareMQ, MetaQ, and Notify 3.0. After RocketMQ moved to Apache, the evolution of the code became more complex, and the traditional Master branch model could hardly meet the need for open source, commercial, in-group, public cloud, and private cloud products to share one RocketMQ kernel. After continued thinking and practice, we solved this problem by introducing a mirror repository mechanism. Specifically, each repository is configured with Master and Develop branches, and a mirror node carries out conversion and feature buffering between the different repositories. In this way, features incubated at Apache can be merged internally and even delivered to public and private clouds, while features incubated internally can be fed back to the open source community as patches. As shown below:

Continuous integration, continuous stress testing, continuous delivery

We built two sets of infrastructure, one with Travis and one with Jenkins. On GitHub, Travis triggers continuous integration for every commit and every PR, which is one of its strengths. It also supports matrix validation across JDKs and platforms (though unfortunately OS X builds seem to have problems all the time). Internally, continuous integration, continuous stress testing, and continuous delivery (Docker, Pandora, AWS Cloud, Azure Cloud) are implemented with Jenkins 2.0 Pipeline. As shown below:

As the figure above shows, continuous integration is divided into several stages: static code analysis, unit tests, integration tests, and Sonar quality evaluation. Unit tests must not only cover the core logic but also run fast (a built-in 3-minute threshold triggers an error report and an email). Experience has shown time and again that if unit tests are not fast enough, more developers will skip them when building locally, and over time the tests lose their meaning and become harder and harder to maintain. Unit tests have to be complete, fast, and maintainable. This is a big challenge for whoever writes them: avoid sleeping when coordinating multiple threads, mock from the lowest layer possible, give object structures elegant names, and assert exceptions rather than throw them.
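The unit-testing advice above can be sketched in a few lines. This is a minimal illustration in Python rather than RocketMQ's own test code: the `Mailbox` class and its methods are hypothetical and exist only to show two of the points, synchronizing threads on an event instead of sleeping, and asserting an expected exception instead of letting it propagate.

```python
import threading
import unittest


class Mailbox:
    """Hypothetical single-slot mailbox, used only for this illustration."""

    def __init__(self):
        self._ready = threading.Event()
        self._value = None

    def put(self, value):
        if value is None:
            raise ValueError("value must not be None")
        self._value = value
        self._ready.set()  # wake up any waiting consumer

    def take(self, timeout):
        # Block until a value arrives or the timeout expires: no polling, no sleep.
        if not self._ready.wait(timeout):
            raise TimeoutError("no message arrived in time")
        return self._value


class MailboxTest(unittest.TestCase):
    def test_consumer_sees_value_from_producer_thread(self):
        box = Mailbox()
        producer = threading.Thread(target=box.put, args=("hello",))
        producer.start()
        # Synchronize on the event instead of sleeping for a guessed duration.
        self.assertEqual(box.take(timeout=1.0), "hello")
        producer.join()

    def test_none_is_rejected(self):
        # The expected exception is asserted, so a missing check fails the test.
        with self.assertRaises(ValueError):
            Mailbox().put(None)

    def test_take_times_out_quickly(self):
        with self.assertRaises(TimeoutError):
            Mailbox().take(timeout=0.01)
```

Because nothing in the tests sleeps for a fixed interval, the suite finishes in a fraction of a second, which is what keeps developers from skipping it locally.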

Releasing code

After entering Apache, it is important to learn and follow the Apache release process, especially during incubation.

The figure above depicts the release process of an Apache incubating project. Fast and steady: that describes the RocketMQ team's first release to the community. Across all Apache incubating projects, learning the process and releasing in such a short time and with so few attempts (RC3) shows that Chinese engineers also bring a rigorous and professional attitude to their work. Of course, we were also lucky to have Justin and Bruce, two warm-hearted and professional mentors. A few important files produced by a release deserve to be highlighted, as shown below:

Source code (src) and binaries (bin) are packaged separately. The .asc file is a signature file: a committer signs the distribution with their private key, and anyone can verify it with our publicly available 512-bit public key.

One more thing has to be mentioned: Apache classifies licenses strictly. Apache License 2.0/1.1, BSD, and MIT/X11, for example, belong to Category A; these licenses are compatible with the Apache License and can be included without concern. Category B licenses, such as EPL 1.0 and MPL 1.0/1.1/2.0, can be included in binary form only. Category X licenses, such as GPL and LGPL, cannot be included in products.
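The three categories lend themselves to a simple lookup when auditing dependencies. The sketch below is an assumption-laden illustration, not an official ASF tool: the license identifiers, the `check_dependency` function, and its parameters are hypothetical, and the table lists only the licenses named in the paragraph above rather than Apache's full policy.

```python
from enum import Enum


class Category(Enum):
    A = "A"  # compatible, may be included
    B = "B"  # may be included in binary form only
    X = "X"  # must not be included


# Only the licenses named in the text; identifiers are illustrative.
LICENSE_CATEGORY = {
    "Apache-2.0": Category.A,
    "Apache-1.1": Category.A,
    "BSD": Category.A,
    "MIT": Category.A,
    "X11": Category.A,
    "EPL-1.0": Category.B,
    "MPL-1.0": Category.B,
    "MPL-1.1": Category.B,
    "MPL-2.0": Category.B,
    "GPL": Category.X,
    "LGPL": Category.X,
}


def check_dependency(license_id, in_source_release):
    """Return (allowed, reason) for bundling a dependency under `license_id`."""
    category = LICENSE_CATEGORY.get(license_id)
    if category is None:
        return False, "unknown license, needs manual review"
    if category is Category.A:
        return True, "Category A: compatible with the Apache License"
    if category is Category.B:
        if in_source_release:
            return False, "Category B: binary form only"
        return True, "Category B: allowed in binary form"
    return False, "Category X: cannot be included"


print(check_dependency("MIT", in_source_release=True))
print(check_dependency("EPL-1.0", in_source_release=True))
print(check_dependency("LGPL", in_source_release=False))
```

Treating an unknown license as "not allowed until reviewed" is a deliberately conservative choice for this sketch.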

RocketMQ discussed licensing extensively in the community at the time of its first release: how to handle dual-licensed dependencies, how to distinguish the licenses of dependencies in the source release from those in the binary release, and how to handle the NOTICE file. We now use this knowledge to guide community contributions, which both reinforces our own understanding and passes it on to the community.

PR & Jira handling

This is the RocketMQ Way in practice. For PRs, we designed a checklist, as shown in the figure below:

Every PR must be reviewed by three or more Committers, and a Committer who submits a PR should try not to merge it personally. Merging PRs used to be time-consuming and laborious. After some research, team members wrote an automation script in Python that cut the routine work from a few minutes to a few seconds, which to some extent reflects Chinese engineers' knack for "winning by being lazy". In Jira, we emphasize Components, each of which is managed by two to three Committers or Maintainers. A Jira issue usually goes through the Resolve phase and then the Close phase, and must specify the affected version and the fix version.
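The two merge rules stated above (three or more Committer reviews, and no self-merge) are easy to encode as a pre-merge check. The following is a hypothetical sketch, not the team's actual script: the `PullRequest` record, the `merge_blockers` function, and the decision not to count the author's own approval are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

REQUIRED_COMMITTER_REVIEWS = 3  # "3+ Committer Review" from the text


@dataclass
class PullRequest:
    """Hypothetical PR record; field names are illustrative only."""
    author: str
    approvals: set = field(default_factory=set)  # names of approving reviewers


def merge_blockers(pr, merger, committers):
    """Return the reasons why `merger` may not merge `pr` (empty list if none)."""
    blockers = []
    # Assumption: only committer approvals count, and not the author's own.
    reviews = {r for r in pr.approvals if r in committers and r != pr.author}
    if len(reviews) < REQUIRED_COMMITTER_REVIEWS:
        blockers.append(
            f"needs {REQUIRED_COMMITTER_REVIEWS} committer reviews, has {len(reviews)}"
        )
    if merger not in committers:
        blockers.append("only committers can merge")
    if merger == pr.author:
        blockers.append("committers should not merge their own PR")
    return blockers


committers = {"alice", "bob", "carol", "dave"}
pr = PullRequest(author="alice", approvals={"bob", "carol", "dave", "eve"})
print(merge_blockers(pr, merger="bob", committers=committers))
print(merge_blockers(pr, merger="alice", committers=committers))
```

Returning a list of reasons rather than a single boolean makes it straightforward to post the outcome back to the PR as a comment.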

Community

In the product chapter, we reviewed the encouraging changes of the past half year in coding conventions, the branch model, continuous delivery, and release conventions. In this chapter, we focus on the community: the Apache Way, the RocketMQ Way, and the community maturity model. For brand and community building, the industry has many practices, such as DevRel and JUGs. Here is a picture to give a feel for it.

Apache Way

An important concept in the Apache community is Community over Code. Community is a key criterion for judging whether an incubating project can graduate. But that does not mean code is unimportant. Of all the Apache top-level projects you have worked with, is there one whose code left a deep impression on you? (If we may say so, code is a mirror: it is the face of an engineer and the most direct reflection of a craftsman's pride or shame.) Those who have studied many Apache top-level projects will surely feel the same. Given how strongly Apache emphasizes community, it is easy to conclude, wrongly, that design and code do not matter. But if the product's design and quality are poor, how can it expect to have a healthy and diverse community? So we care about community, and we also care about design, code quality, and the user experience.

The image above is an excerpt from our previous MeetUp sharing and is our understanding of Apache Way.

  • Community over Code – First you need to understand what a community is. It is broad: developers, writers, testers, sysadmins, DevOps, users, and so on, and none of them needs to be an employee. A community made up only of employees is unhealthy, unsustainable, and lacking in diversity. Some of the world's biggest companies, early in their Apache projects, simply tried to control the community, without realizing that the essence of the Apache community concept is to foster collaboration with the community, with other companies, and with individual contributors, to accelerate the growth of both the project and the individual. Communities do not form automatically just because you put your project on GitHub or open source it. Running a community takes a lot of thought. Developing a community member, or winning an international contributor, is a long-term process. We followed up with and encouraged the RocketMQ community's first international contributor for nearly two months. Many people abroad have also expressed interest in contributing. Great oaks grow from little acorns: start with the most basic work, let the community see your effort and your determination to keep following up and contributing, and becoming an Apache Committer will not be far away.

  • I once wondered, "What can I gain by participating in building the community?" Everything a Contributor does in the community is there for all to see and track: how many issues and PRs I have submitted, how many code reviews I have taken part in, whether I regularly offer insight on the mailing list. All of this accumulates in GitHub, JIRA, and the mailing list. Once recognized by the community, a Contributor becomes a Committer, and a Committer becomes a PMC Member. Step by step, the community recognizes your contributions and you grow.

  • Open Development – Everything is open, especially development. Everyone can take part in every line of code, from the DISCUSS thread before development, to code review in the pull request, to feedback from the user's point of view. All discussions, clashes of ideas, and even arguments are public. For the Chinese community, we have been encouraging all important offline discussions (including those on QQ) to be reflected on the mailing list. Everything is open and transparent, and no decisions are made in private.

  • Decision Making – Consensus & Votes. Absolute openness in decision making also brings certain drawbacks, such as a cumbersome and slow process. To make a decision in the community, you first start a discussion on the mailing list. Once consensus is reached, you start a separate vote and wait at least 72 hours (to account for time zones around the world) before announcing the result. The process is complicated, but once a resolution passes it is carried out resolutely, which is also a sign of the community's ability to execute.
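The 72-hour voting window described in the last item above can be expressed as a small helper. This is only a sketch under stated assumptions: the function names are hypothetical, the date is made up, and the +1 / 0 / -1 tally simply counts votes without deciding whether a vote passes, since the text does not spell out the passing rule.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

MIN_VOTE_WINDOW = timedelta(hours=72)  # "wait at least 72 hours" from the text


def earliest_announcement(opened_at):
    """Earliest moment the result of a vote opened at `opened_at` may be announced."""
    return opened_at + MIN_VOTE_WINDOW


def can_announce(opened_at, now):
    """True once the vote has been open for the full 72-hour window."""
    return now >= earliest_announcement(opened_at)


def tally(votes):
    """Count +1 / 0 / -1 votes; `votes` maps a voter name to 1, 0, or -1."""
    counts = Counter(votes.values())
    return {"+1": counts[1], "0": counts[0], "-1": counts[-1]}


# Illustrative values only, not a real RocketMQ vote.
opened = datetime(2020, 1, 6, 9, 0, tzinfo=timezone.utc)
print(earliest_announcement(opened).isoformat())
print(can_announce(opened, opened + timedelta(hours=71)))
print(tally({"alice": 1, "bob": 1, "carol": 0}))
```

Using timezone-aware timestamps keeps the window unambiguous for voters spread across time zones, which is the reason the text gives for the 72-hour wait.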

In learning and practicing the Apache Way, the team also developed a set of best practices we call the RocketMQ Way.

  • International Collaboration – Chinese comments are not allowed. Questions in Chinese are allowed on the User mailing list, but the Dev mailing list must be in English. Please refer to the official instructions on how to use the mailing lists.

  • PR Management – Submitting and merging pull requests must follow a checklist. Each PR must include adequate unit tests and integration tests and meet the incremental code coverage requirement. A PR is merged only with three or more +1 votes, and Committers do not merge their own PRs.

  • Diversity Review – CTR (Commit Then Review) applies to Committers' code and RTC (Review Then Commit) to Contributors' code.

  • Branch Model – Adopt the dual-trunk model (Master and Develop) and uphold the One Kernel concept.

  • Community CodeMarathon – Occasional hackathons to discover active contributors and to train and mentor deeply involved community members.

  • Ecosystem Assemble – A separate repository serves as the home for community projects. Before contributing, each individual must first submit Apache's ICLA.

  • One Contributor, One Mentor – Every PMC Member has an obligation to help community contributors, mentoring and guiding them along the Apache Way to become Committers or even PMC Members.

Changes in site traffic

In a little over half a year, we published a number of technical articles on the Yunqi community, CSDN, and InfoQ's Chinese and international sites, wrote international papers, delivered keynote speeches at ApacheCon, LinuxCon, and other top international open source summits, and broadcast our meetup worldwide with a live text feed. With this series of international outreach, website traffic changed encouragingly.

As this chart shows, the US is now the country with the second most visits to RocketMQ's website. This lays a good foundation for the team's upcoming Aliware MQ international strategy. At the same time, this sustained high traffic keeps pushing us to write high-quality code, build high-quality products, and show the world intelligent, quality engineering made in China.

Maturity assessment

To be honest, RocketMQ did not study this model at all when it came to Apache (we dislike teaching to the test). We hoped to build a global community and graduate through our own understanding and practice. When the team started reviewing the graduation TODO list in August, Apache's theoretical guidance turned out to be surprisingly consistent with our practice. Long live understanding! Below is RocketMQ's community maturity assessment.



Ecosystem

As the first Apache top-level project from China outside the Hadoop and Spark big data ecosystem, we are working hard to build an ecosystem with messaging at its core while building a diverse community. As mentioned earlier, we provide a separate Apache repository for ecosystem contributors from the community, a practice now being followed by other Apache top-level projects. So far, community projects such as RocketMQ-Console, RocketMQ-JMS, RocketMQ-Flume, and RocketMQ-CPP have been validated in production inside companies, and the code of integrations such as RocketMQ-Druid, RocketMQ-Ignite, and RocketMQ-Storm has been merged into the counterpart projects' repositories, giving RocketMQ more connections to other top open source projects and communities. For more information about ecosystem projects, please refer to the official website.

Messaging ecosystem

The picture above shows the team's efforts to build an ecosystem around the OpenMessaging standard. OpenMessaging distills the team's many years of experience in the messaging field. After more than half a year of development, the standard has entered the Linux Foundation and will next enter the CNCF, becoming one of the indispensable standards of cloud computing. This API-level, cross-platform, cross-language standard will try to solve the problems we have run into before but never completely solved, as shown in the figure below:

  • Distributed transactions – Our internal products MetaQ and Notify have similar features, but they are sender-side distributed transactions rather than true distributed transactions.

  • Load balancing – The strategy differs between pull mode and push mode, and the sharding of messages differs across business scenarios.

  • Distributed tracing – Mainly considering OpenTracing under the CNCF.

  • Protocol bridging – How to bridge seamlessly with existing standards such as JMS.

  • Stream computing – Introducing stream computing operators to compute on message content during delivery.

  • Benchmark – Similar to SPECjms, putting all messaging engines on the same stage for performance testing.

To sum up, RocketMQ 5.0, the next-generation messaging engine based on OpenMessaging, will continue to work in four areas: e-commerce, where peak shaving and valley filling under high concurrency are needed; the Internet of Things, where massive numbers of connections are online at the same time; big data, where throughput is king; and finance, where high reliability and data redundancy are essential.

Afterword

Finally, let's look more closely at how much the team invests in open source.

In principle, we expect everyone on the team to invest some of their time (10-20%). From time to time, though, we are surprised to find that the actual investment is much lower than that. This is, of course, determined by the team's reality. Besides open source, the team currently provides R&D support for internal use, the public cloud, and private clouds. Besides RocketMQ, the team continues to work on AliwareMQ, Kafka, Relay, and other products. And don't forget OpenMessaging, the challenging evolution of the specification. Naturally, recruiting has become urgent. We hope that people with strong expertise and ambition in distributed systems, big data, and multi-language architecture will join us, and the team will do its best to give them a challenging space in which to shine. We value growth and focus on sustainable development. Graduating as an Apache top-level project is just the beginning for Apache RocketMQ. The road ahead is long and the future is promising.

This article was started on the evening of August 3, 2017, and finalized before the Apache board resolution on September 20, 2017.

This article is original content of the Yunqi community and may not be reproduced without permission.
