In addition, at the end of this article we introduce, for the first time, the related course “Kanban + Stand-up Meeting”, part of the 36-lesson series on improving R&D efficiency built by the Alibaba Agile coaching team. You are welcome to sign up.
Our discussion of R&D efficiency is essentially about improving the efficiency of collaboration across the whole technology organization. From an R&D perspective alone, the ultimate goal for a technical team is a flexible 7*24 hour release window that lets the business iterate faster.
Achieving a 7*24 hour release window is not simple; it is constrained by many factors. I break it down below.
1. Application layering
Let’s start with the basics. When a startup team has only a few people and one or two systems, R&D efficiency is not yet a concern. There are no dependencies between systems, the dependencies within each system are fully under control, and everything can be developed and debugged locally against a single Tomcat or Apache instance. In addition, frequent communication among team members means the team can essentially release whenever and wherever it wants.
As the business grows more complex, the development team expands to a dozen or so people. The first step in improving efficiency is to clarify the dependencies within the system and promote the specialization of roles. This is what MVC does: it layers the logic within the system by separating the model, the view, and the controller. Complex code logic is pushed down into the model layer, leaving the view layer to more specialized front-end developers.
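As a rough illustration of this layering, here is a minimal Python sketch. The order example and every class name in it are hypothetical and chosen only for illustration; they do not come from any Alibaba system.

```python
from dataclasses import dataclass, field


@dataclass
class OrderModel:
    """Model layer: holds the data and the business rules."""
    items: dict = field(default_factory=dict)  # item name -> price

    def add_item(self, name: str, price: float) -> None:
        if price < 0:
            raise ValueError("price must be non-negative")
        self.items[name] = price

    def total(self) -> float:
        return sum(self.items.values())


class OrderView:
    """View layer: only formats data for display, no business logic."""

    @staticmethod
    def render(items: dict, total: float) -> str:
        lines = [f"{name}: {price:.2f}" for name, price in items.items()]
        lines.append(f"Total: {total:.2f}")
        return "\n".join(lines)


class OrderController:
    """Controller layer: receives a request and wires model and view together."""

    def __init__(self, model: OrderModel, view: OrderView):
        self.model = model
        self.view = view

    def add_item(self, name: str, price: float) -> str:
        self.model.add_item(name, price)
        return self.view.render(self.model.items, self.model.total())


controller = OrderController(OrderModel(), OrderView())
page = controller.add_item("book", 12.5)
```

Because the view never touches the pricing rules, a front-end specialist could change `OrderView.render` without reading the model code, which is the kind of role specialization the paragraph above describes.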
Of course, there is still room for extension within a single system, such as modularization or splitting bundles by business line. But none of this breaks through the system’s own bottleneck: a single system can hardly scale beyond the limits of a single machine.
2. Architecture
When the technical team has grown to dozens or hundreds of people and the business can no longer scale horizontally with a single application, a distributed architecture is an effective solution. In 2007, Alibaba Group was promoting SOA: at both Taobao and Alipay, the original monolithic applications were continually split apart, and it was also at this time that middleware such as the message center flourished.
In this way, systems were decoupled, the productivity of engineers was unleashed, the system became more elastic, and service capacity could be expanded at low cost. However, because the call relationships become complex, a project that spans multiple applications undoubtedly faces higher integration cost and quality risk.
At the same time, if the number of applications is not planned and controlled, it will keep growing and drive up the overall cost of development and maintenance.
3. Configuration management
Five to ten years ago, Alibaba had a dedicated role called SCM, responsible for code management, configuration item management and application deployment within the technical team. Especially in the early days of servitization, developers’ coding productivity was greatly released, the number of applications exploded, and the demand for configuration management rose, which eventually led to the transformation of configuration management (by now, full-time SCM teams have been “eliminated”).
Before we talk about configuration management, let’s talk about code branching. This is also the starting point for many changes in R&D models. I’ll start with my own point of view: no code branch management mechanism is right or wrong (advanced or backward) in itself; there is only the model that does or does not suit your team now and in the future.
At a higher level, the problem we are solving is parallel development, where multiple projects or teams work on the same set of applications. With purely serial development, you don’t have to worry much about code management strategy.
1. Branch development, trunk release. The core idea is to use a fixed trunk as the integration branch. Development is done on branches, whose life cycle ends once they are merged into the trunk. Of course, there are also emergency release branches.
2. Branch development, branch release. After a successful release, the baseline is written back to the trunk so that the trunk stays up to date and stable. At the same time, releasing from a branch does not depend on one large integration, which keeps this mode highly flexible.
The process reflected in the project is as follows:
3. Other modes, such as trunk development with branch release. Because we rarely use them, we will skip them here.
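The difference between the first two modes can be sketched with a toy model. This is only an illustration under simplifying assumptions: commits are plain strings, a merge just appends missing commits, and the `Repo` class and its method names are hypothetical rather than part of Git or Aone. In particular, picking up the latest trunk baseline before a branch release is an assumption of this sketch.

```python
class Repo:
    """Toy repository: commits are strings, history is an ordered list."""

    def __init__(self):
        self.trunk = []      # stable baseline, oldest commit first
        self.branches = {}   # open branch name -> its commit list

    def create_branch(self, name):
        # A development branch starts from the current trunk baseline.
        self.branches[name] = list(self.trunk)

    def commit(self, name, change):
        self.branches[name].append(change)

    @staticmethod
    def _merge(target, source):
        # Append the commits from source that target does not have yet.
        for change in source:
            if change not in target:
                target.append(change)

    def merge_to_trunk(self, name):
        """Mode 1: the branch ends its life cycle once merged into the trunk."""
        self._merge(self.trunk, self.branches.pop(name))

    def release_trunk(self):
        """Mode 1: release everything that has been integrated on the trunk."""
        return list(self.trunk)

    def release_branch(self, name):
        """Mode 2: release from the branch, then write the baseline back."""
        branch = self.branches.pop(name)
        released = list(self.trunk)
        self._merge(released, branch)   # assumption: sync with the baseline first
        self.trunk = list(released)     # write-back after a successful release
        return released


# Mode 1: both branches are integrated on the trunk and released together.
mode1 = Repo()
for name in ("a", "b"):
    mode1.create_branch(name)
mode1.commit("a", "A1")
mode1.commit("b", "B1")
mode1.merge_to_trunk("a")
mode1.merge_to_trunk("b")
trunk_release = mode1.release_trunk()

# Mode 2: branch "a" ships on its own while "b" is still in development.
mode2 = Repo()
for name in ("a", "b"):
    mode2.create_branch(name)
mode2.commit("a", "A1")
mode2.commit("b", "B1")
first_release = mode2.release_branch("a")
second_release = mode2.release_branch("b")
```

In the first mode a release always carries whatever has been integrated on the trunk, while in the second mode each branch can ship on its own schedule, at the price of keeping the trunk baseline written back after every release.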
Related: How do we manage code branches at Alibaba
Platform support: In the early days the pipeline was driven by hand, which made code integration and deployment inefficient. Coordination between different roles was done by people, so a supporting PMO organization was needed. Against this historical background, Aone (Cloud Effect) was born to tackle the complex processes of R&D collaboration, build, integration and testing from a platform perspective.
To get a clearer picture of the pain points of that period, I dug up the Aone (Cloud Effect) blueprint from around 2009, which offers a glimpse of them (I did not experience this period personally; I only interviewed veterans from that time and collected some materials). My guess is that this forward-looking approach to solving problems is what led to the Aone platform.
4. Testing
When the technical team is small, the applications are few and the user base is small, you can get by without dedicated testing. Only when the business grows to the point where users tolerate quality problems less and less does a professional testing role need to be introduced. In the era of offline software delivery, the high cost of recalling software meant that no effort was spared in testing. But as online delivery took hold, whether the test team can quickly evaluate software quality and give feedback has become critical. It also determines whether the 7*24 hour continuous delivery channel can truly be realized once all the links above are connected.
Before we get to that, let’s go back to the previous section. The Aone (Cloud Effect) platform moved code development, configuration and application deployment online, leaving only one final step: testing. Since 2010, the B2B test team had wanted to integrate its layered automation platform with the Aone (Cloud Effect) R&D collaboration platform, so that tests could be triggered through system calls for quick validation and, ultimately, regression testing could run unattended.
This is very significant. After applications were servitized, technical risk actually converged: everyone could develop against services, achieving high cohesion and low coupling in the architecture, and application releases became more flexible. But for testing, it is a huge challenge:
1. There are more levels to test.
2. There are more rounds of testing. Every integration and every release can require a full regression.
The rollout of Aone (Cloud Effect) took over the role of SCM. The rapid development of the R&D platform and the business demand for 7*24 hour releases also put pressure on the ability to give rapid feedback after code integration. This is both a challenge and an opportunity. Without that ability, all of the productivity freed up in the earlier stages gets stuck in the final testing stage, and releases cannot be split into smaller packages (each split doubles the test effort). The efficiency of integration testing could then only be improved by piling more and more requirements into each integration.
After generations of colleagues on the 1688 test team worked on this, we are finally getting somewhere in this area. Through layered automation we have achieved unattended testing for more than 60% of releases, and we block failures on the order of hundreds (including pages, UI, etc.) throughout the year.
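To make the idea of a layered, unattended release gate more concrete, here is a minimal Python sketch. It is not the 1688 team's implementation: the layer names, the `run_release_gate` function and the stop-at-first-failing-layer rule are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple

Check = Callable[[], bool]          # a check returns True when it passes
Layer = Tuple[str, List[Check]]     # (layer name, checks in that layer)


def run_release_gate(layers: List[Layer]) -> Tuple[bool, List[str]]:
    """Run the layers in order and block the release at the first failing one."""
    log = []
    for name, checks in layers:
        failed = sum(1 for check in checks if not check())
        if failed:
            log.append(f"{name}: {failed} failed, release blocked")
            return False, log
        log.append(f"{name}: {len(checks)} passed")
    return True, log


# Hypothetical layers, cheapest first; the lambdas stand in for real test suites.
layers = [
    ("unit", [lambda: True, lambda: True]),
    ("interface", [lambda: True]),
    ("ui", [lambda: False]),
]
released, report = run_release_gate(layers)
```

Running the cheap layers first means a broken build is rejected before the slower page and UI checks start, and a release only needs a person when some layer reports a failure.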
Its implementation logic is as follows:
5. Culture
At this point, a continuous delivery channel that truly supports 7*24 hour releases for the business has been fully established.
<Aone/ Cloud Effect Flow chart >
Let’s review:
1. Layered application architecture: front end, back end and testing each do their own job, and specialization unleashes a round of productivity.
2. Service-oriented architecture lets engineers develop business features against services, achieving high cohesion and low coupling in the architecture and further releasing the vitality of a large technical team.
3. The R&D platform provides a continuous delivery channel, so that work passes quickly and accurately through the development and testing process.
4. Building on the R&D platform, environment deployment, application monitoring and code inspection are automated. This clears away the infrastructure chores of the R&D process and lets engineers focus on producing code.
5. A test automation verification system reduces the risk of system integration and raises the frequency of integration, finally making it possible to bring code online quickly.
About the author
Shi Xiang graduated from Nanchang University and now works as a senior expert in the CBU Technology Department of Alibaba’s New Retail Technology Business Group, where he is responsible for the quality technology, system stability and DevOps teams. He previously worked at ZTE, Alipay and other companies. He is good at solving quality problems in the R&D process through technical means, and has rich experience in system high availability, test tool development and R&D efficiency improvement.