Recently, the Apache Software Foundation, the world’s largest open source foundation, announced the latest batch of Committers (core contributors) for Hadoop. Zhu Qi, a member of the iQiyi Big Data team, accepted the invitation of the Apache community and officially became a Hadoop Committer.
Zhu Qi has been contributing to the Apache community since 2018 and has had more than 10,000 lines of code merged. He is iQiyi’s first Hadoop Committer.
What qualities do you need to be a Hadoop Committer? What are the responsibilities? How do you balance open source work with your day job? We spoke with Zhu Qi to learn what it takes to become a young Hadoop Committer at iQiyi.
01 How are Apache Hadoop Committers selected?
For readers who are not familiar with the topic, we need to answer a few questions first. What is Apache Hadoop? How are Committers selected?
Apache Hadoop, a distributed computing framework based on the MapReduce and Google File System papers published by Google, is one of the most popular open source big data software frameworks in the world.
Each Apache project has a Project Management Committee (PMC) and a number of Committers. The election method and organizational structure are clearly defined. A Committer’s main responsibility is to sustain the project’s development, for example by developing new features and proposing new ones in line with where the field is heading. In addition to committing code directly, Committers can also reject any patch at any time.
As one of the world’s leading “data center operating systems”, Hadoop has high requirements for code quality. So to become an Apache Hadoop Committer, **you must first have a deep understanding of the community’s existing open source code and architectural design.** Because Hadoop involves many relatively complex concepts in distributed computing, it is not easy to fully understand the existing design and code.
In addition, a Committer must be a major contributor to at least one major feature area, which is reflected in the amount of code merged into the project.
As a foundational project at the bottom of the big data stack, Hadoop usually does not see a large volume of code changes. This means contributors not only need strong coding skills and expertise, but must also have solved real project problems in the community before the PMC will nominate them and give them a chance to be elected. Zhu Qi told us that more than 10,000 lines of his code had been merged by the community before he was selected as a Committer.
Currently, there are more than 230 Hadoop Committers worldwide, including more than 10 committers from China.
02 From a non-computing major to Hadoop Committer
Although still young, Zhu Qi already has solid experience working in the open source community. He majored in chemical materials as an undergraduate and became interested in the software industry after taking a programming language design course, so he switched directly to a computer-related major for his graduate studies. Although his major was not big data, he has always been very interested in big data, especially Hadoop.
Zhu Qi began working with Hadoop after joining iQiyi, where he took on big data work in the intelligent platform department. For his work he studied a great deal of the underlying source code of several large projects, and over time he built up the ability to read code.
[Photos: Zhu Qi in daily life]
“Input is very important. You have to have input before you can have output. As a contributor to the open source community, you have to understand the logic of the original code before you can develop and adapt new modules, and that is how you accumulate experience,” Zhu Qi said. “I started working with the open source community as a Contributor, beginning with small problems and getting the process straight. Once you understand the process, you can dive into the code.”
By actively participating in the open source community, Zhu Qi himself has learned a lot. “From being a participant in the development of a top-level project to now being a Committer, I’ve solved a lot of problems for other members of the community and studied a lot of code for the community’s development. Now I can better understand the code logic in other projects, and can also contribute to the company’s various big data projects.”
Within the company, Zhu Qi has also found that more and more colleagues are taking part in open source projects and gaining an in-depth understanding of each one together. “On the one hand, this feeds back to the company’s infrastructure teams, such as those working on big data, and strengthens their ability to maintain these systems. On the other hand, it helps with cost and performance optimization, which reduces the company’s actual spending, improves user experience, and helps promote the platform.”
03 iQiyi continues to build an open source culture and give back to the community
In recent years, iQiyi has attached great importance to developing an open source culture. It has established an open source working group and released a standardized process for open source projects. It is committed to building a standardized, orderly and secure open source framework, encouraging projects to be open-sourced both internally and externally, enhancing technical exchange and improving research and development efficiency.
iQiyi also actively feeds its own business practices and explorations back to the open source community, and has solved many project problems there. Taking Zhu Qi’s big data team as an example, iQiyi’s big data architecture and the open source community have always developed in step and to each other’s benefit.
“For example, iQiyi recently followed the community in upgrading its Hadoop-related architecture and testing the new Hadoop release. Some problems appear in this process. We report the problems iQiyi runs into back to the community, which on the one hand gives the community feedback and on the other hand helps solve the community’s own problems.”
Building on open source Hadoop, iQiyi has had improvements from its own business merged into the community. The merged Hadoop-related work includes: compatibility work for unifying the Fair Scheduler (FS) into the Capacity Scheduler (CS), global scheduling optimization, GPU resource-related optimization, event-driven optimization for large-scale clusters, elastically scaling queues, Proxy Server optimization, and Router Based Federation fixes and optimizations, among other features.
Like Zhu Qi, **many employees at iQiyi are active participants in the open source community.** They contribute code to the open source community, answer questions from other community participants, and keep the community running smoothly.
Zhu Qi feels strongly about this. “Apache has kept developing for more than ten years because its different users and developers maintain a healthy ecosystem and communicate and interact with one another. It’s a place where people can share experiences and solve their own problems, but also learn from other companies and developers to prevent future problems.”
04 From beginner to ultimate: a guide to advancing in open source
Having lived through the development of open source culture in China in recent years, Zhu Qi has strong feelings about it.
“In the past, open source was not so popular. People thought of open source as something you simply borrowed from, and they were not willing to contribute to it or participate in it. But staying closed caused many problems too. A lot of companies didn’t use open source projects; their code was closed source, the developers left, and the code carried a lot of historical baggage. Now open source has become a major trend. Internet companies at home and abroad, especially iQiyi, pay more and more attention to open source culture as they build their internal engineering culture. Through open source, not only do the company’s projects get better and better, but the contributors themselves also gain a lot from it.”
Finally, Zhu Qi offers some advice for people who want to get into open source, in three stages: beginner, advanced and ultimate.
1. Beginner
Become familiar with the contribution process for open source projects. Each project has documentation on how to contribute.
Start by contributing to small issues, such as spelling errors in code, project documentation improvements, and minor bug fixes.
2. Advanced
Be able to fix problems that arise in production or are reported in the community, and to optimize the project’s performance.
3. Ultimate
Understand, and even help write, the design documents for the project’s major features, then add sub-tasks and contribute development work in line with the project architecture.
“Overall, contributing to the open source community is an ongoing effort that takes time and energy. Start with a good understanding of one module and work your way up: contribute, participate in discussions and review code. I wish you all the best in the open source community.”