Brief introduction: In three months, Xianfeng Fruit consolidated its R&D infrastructure and moved its R&D process onto a digital footing. This article looks at how the team did it.
Founded in 1997, Xianfeng Fruit has grown over 25 years into a global enterprise that combines new retail, smart cold chain logistics and a supply chain B2B platform, and it is one of the best-known fruit chains in China. It currently has more than 2,200 stores in China and 23 modern cold chain storage centers with a total area of 480,000 square meters.
As the external environment changed, Xianfeng Fruit's digital transformation accelerated again in early 2021. In just a few months the R&D team more than doubled in size, and several problems began to surface:
- The R&D infrastructure was incomplete and the team lacked specialists in the relevant fields, so a large investment of people and time produced results slowly.
- Many stages of the process felt problematic, but the team did not know how to observe them or what the best practices were. As the company invested more and more in product and R&D, the need to deliver business value faster and better became more urgent.
Improving the delivery quality and efficiency of the product and R&D teams simply and quickly had become essential to supporting business innovation. Let's look at how Xianfeng broke the ice step by step.
I. Sort out the process and find the problems
To solve a problem, you first need to know what the problem is.
Pi Xuefeng, head of R&D at Xianfeng Fruit, knew that the team lacked people with professional experience in R&D transformation, and that moving quickly would require outside help. After weighing cost, integration with cloud products, breadth of functionality and ease of use, he chose Alibaba Cloud's Yunxiao (Cloud Effect) platform. Through it he met the Yunxiao best practice team led by He Mian, a senior expert in R&D transformation, and invited them to carry out an end-to-end survey of Xianfeng Fruit's entire R&D process and help the team pin down the problems at each stage.
The transparent walls of Xianfeng Fruit’s office are covered with sticky notes detailing the development process
The Yunxiao best practice team and Pi Xuefeng's team sorted the problems into two categories.
1. End-to-end product and R&D collaboration
- High collaboration costs and data silos caused by fragmented collaboration tools.
Some product managers kept their PRD documents in Yuque, some in DingTalk documents, and some only on their local machines; developers used GitLab, while testers maintained test cases and test plans in XMind.
- Wasted delivery resources, unclear delivery progress and poor delivery quality caused by the lack of a unified, transparent collaboration process.
Product managers could not see in time how a requirement was progressing or whether development had hit a bottleneck; problems surfaced all at once after launch, and the rework rate was high.
2. Project delivery capability and delivery quality
First, a definition of "engineering problem": everything that happens after a development task is accepted, from coding, joint debugging, testing and integration through to deployment online, is called an application change, and the problems that arise during this change process are called engineering problems.
After analysis, Xianfeng's engineering problems fell mainly into three areas:
- The change process is not smooth, and there are many waiting and conflicts among various roles.
The test role and the development role focus on different branches. The management of branches depends on the manual operation of the development role. As the two sides are out of step, the cost of branch management and communication is high.
- Delivery quality relies heavily on manual verification by testers.
The existing CI/CD process has no built-in fast quality gates and depends on manual, offline verification by testers, so quality feedback is delayed.
- Deploying and operating the cloud-native application architecture depends on a small number of experts.
Xianfeng's applications have become fully stateless and its infrastructure fully cloud native. This places new demands on application deployment and operations skills, which today rest with a few experts. Xianfeng hopes to codify that practical experience so that every developer can deploy and operate applications.
II. "Three steps" to solve the problem
To address these key issues, Xianfeng Fruit followed the advice of the Alibaba Cloud Yunxiao best practice team and adopted a "three-step" strategy: define the team's efficiency improvement goals, establish the corresponding processes and mechanisms, and implement application-centric continuous delivery, so that R&D can move in small steps and iterate quickly.
The first step is to form cross-functional teams and reach consensus on a goal-feedback closed loop
Because the fragmented tool chain and opaque collaboration process had led to inefficient collaboration and slow delivery, Pi Xuefeng first set up cross-functional teams organized around business goals, covering product, design, development and testing, and made it clear that the efficiency goal of every cross-functional team was to improve delivery efficiency and quality. To give the teams a concrete direction and achieve a combined effect of "1+1>2", Pi Xuefeng, with the teams' agreement, set two phased goals:
- Delivery efficiency target: shorten the requirement development cycle so that 85% of the requirements handed to development go live within two weeks;
- Delivery quality target: define clear criteria for entering development and for handing work over to testing, and continuously reduce the number of defects and online issues by 20%.
Composition of Xianfeng's internal cross-functional teams
Once team membership was settled, the overall requirement delivery process was clarified further. From an efficiency point of view in particular, a closed-loop feedback mechanism for delivery efficiency had to be established.
After discussion, the mechanism was settled as follows: starting from alignment on business objectives, business planning is carried out regularly; requirement reviews and R&D scheduling follow from the business plan; and the team develops, tests and accepts requirements in weekly or bi-weekly iterations. On top of this, monthly planning, weekly scheduling and daily stand-up meetings keep plans, schedules and progress aligned.
Overall delivery process
The requirement cycles were also defined explicitly, as shown in the figure. The requirement delivery cycle runs from "selected" to "published", and the requirement development cycle runs from "development" to "release". In practice, the end of the development cycle is also set to "published", because that better reflects the business point of view.
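The two cycle definitions above can be sketched as a small calculation. This is only an illustration: the state names follow the text, while the `history` record, its dates and the `cycle_days` helper are hypothetical and are not part of Yunxiao.

```python
from datetime import date

# Hypothetical state history for one requirement: state name -> date it was entered.
history = {
    "selected": date(2022, 2, 1),
    "development": date(2022, 2, 3),
    "published": date(2022, 2, 11),
}

def cycle_days(history, start_state, end_state):
    """Number of days between entering start_state and entering end_state."""
    return (history[end_state] - history[start_state]).days

# Delivery cycle: "selected" -> "published".
delivery_cycle = cycle_days(history, "selected", "published")
# Development cycle, using "published" as the end point as described for practice.
development_cycle = cycle_days(history, "development", "published")

print(delivery_cycle, development_cycle)
```

Measuring both cycles to the same end state keeps the two numbers directly comparable: the difference between them is the time a requirement waited before development started.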
The second step is to identify processes and mechanisms based on consensus
1. Requirement flow mechanism and state consensus
After investigating the team's current situation and clarifying the problems in its collaboration, we designed the requirement flow states and flow mechanism and reached consensus on them with team members. The point of this consensus is to establish a shared understanding and a common language for communication.
2. Pull through and visualize the end-to-end business value stream
Once the requirement flow states and flow mechanism are clear, the mechanism and the consensus need to be implemented in Yunxiao. Driven by user value: every team collaborates around requirements, and every requirement must focus on customer value. On the one hand, it must be clear who the user is and what the goal is; on the other hand, requirements must be broken down to a small granularity (development and testing of one requirement completed within two weeks), and even a small requirement must be measurable and releasable.
Front-to-back functional alignment: across the whole flow, attention must be paid to the requirement, development, test and release stages. The whole process has to be connected end to end, and the roles in each stage brought together, so that collaboration is smooth and efficient.
Left-to-right module alignment: in development, requirements are broken down into development tasks. A requirement is often split into front-end and back-end tasks, and back-end tasks are sometimes split further by module. The development tasks under one requirement then need to align on interfaces and on the timing of joint debugging and hand-over to testing.
The business value stream implemented in Yunxiao
3. Clarify entry rules for each stage and build quality in
Once the requirement workflow is clear, the next step is to define the entry rules for each state, so that requirements flow not only smoothly but also with high quality. From a built-in quality perspective, the quality of a requirement is not guaranteed by a check at the final stage. Quality requirements have to be made explicit at the source so that every stage meets them, all the way to a high-quality final delivery.
We clearly defined the flow rules for each stage, especially for a requirement entering development and for development handing over before release, because these are the points where requirements are handed over between the product, development and test roles, and handovers are where problems occur most often.
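One way to picture entry rules is as a checklist that a requirement must satisfy before it may move into a state. The sketch below assumes this checklist model; the criteria names in `ENTRY_RULES` are invented for illustration, since the article does not list Xianfeng's actual rules.

```python
# Hypothetical entry criteria per target state (not Xianfeng's real rules).
ENTRY_RULES = {
    "development": ["prd_reviewed", "acceptance_criteria_defined"],
    "testing": ["self_test_passed", "test_cases_reviewed"],
}

def missing_criteria(requirement, target_state):
    """Return the entry criteria for target_state that the requirement has not met."""
    done = requirement.get("done", set())
    return [c for c in ENTRY_RULES.get(target_state, []) if c not in done]

def can_enter(requirement, target_state):
    """A requirement may enter a state only when no criteria are missing."""
    return not missing_criteria(requirement, target_state)

# Hypothetical requirement that has been reviewed but has no acceptance criteria yet.
req = {"title": "Coupon redesign", "done": {"prd_reviewed"}}

print(can_enter(req, "development"))
print(missing_criteria(req, "development"))
```

Returning the list of missing criteria, rather than only a yes or no, gives the person handing the requirement over a concrete list of what to fix.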
4. Define requirements prioritization mechanisms
Clarifying the requirement priority mechanism is particularly important when building team consensus, because a requirement's priority reflects its value, which ties directly to the goal. In practice, it turned out that the requirements the team scheduled into an iteration were all simply marked "urgent" rather than placed in a clear order of priority.
We need a strictly ordered priority list so that the highest-priority requirements are delivered first. The team should also find it easy to challenge requirements actively, so that the resulting priority list is as reasonable as possible.
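The difference between an "urgent" label and a strictly ordered list can be shown with a short sketch. It assumes each requirement carries a unique `rank` and a rough estimate; the backlog data and the `plan_iteration` helper are hypothetical.

```python
# Hypothetical backlog: rank is a unique position in the list, not a label such as "urgent".
backlog = [
    {"title": "Store inventory sync", "rank": 2, "estimate_days": 5},
    {"title": "Member coupon fix", "rank": 1, "estimate_days": 3},
    {"title": "Report export", "rank": 3, "estimate_days": 6},
]

def plan_iteration(backlog, capacity_days):
    """Take requirements strictly in rank order until the capacity is used up."""
    ranks = [r["rank"] for r in backlog]
    if len(ranks) != len(set(ranks)):
        raise ValueError("ranks must be unique so that the order is unambiguous")
    selected, used = [], 0
    for req in sorted(backlog, key=lambda r: r["rank"]):
        if used + req["estimate_days"] > capacity_days:
            break  # stop here: lower-ranked work must not jump the queue
        selected.append(req["title"])
        used += req["estimate_days"]
    return selected

print(plan_iteration(backlog, capacity_days=10))
```

Rejecting duplicate ranks is the design point: two requirements can both be labelled "urgent", but they cannot both be first.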
5. Assign an owner to each requirement once it enters development
The requirement owner is responsible for coordinating the breakdown of the requirement into tasks and for following it until it has been developed, handed over to testing, tested and released. On the one hand, this ensures that every requirement in development has someone accountable for it; on the other hand, it builds a sense of responsibility among team members.
6. Establish a rhythm of monthly planning, weekly scheduling and daily stand-up meetings
Establish an overall rhythm of monthly planning, weekly scheduling and daily stand-up meetings, each of which is closely tied to the state of the requirements.
After planning, a requirement's status is updated to "Selected". After scheduling, its status changes to "undeveloped" (awaiting development). After each daily stand-up, the status is updated to reflect the latest progress.
The third step is to practice continuous delivery with applications at the core
On the engineering side, given Xianfeng Fruit's situation, Pi Xuefeng decided to fully embrace engineering practices centered on cloud-native applications. There are two main points:
1. Adopt a feature-branch-based R&D model and build it into the application change process
To keep all roles working together efficiently during a change, Xianfeng decided to remove the test branch and adopt an R&D model similar to feature branching, with only one long-lived branch remaining. The branch model is similar to the following figure:
Based on this branch model, Xianfeng made the master branch a protected branch and used a Yunxiao pipeline, defined per application, to connect the whole process. This avoids manual deployment and manual branch management, and ensures that what is released is exactly what was tested. The application pipeline template is as follows:
The above process is implemented per application as a release pipeline in Yunxiao AppStack, similar to the following figure:
2. Bring orchestration, environments, monitoring and R&D processes together around the cloud-native application
Xianfeng began moving to a cloud-native application architecture over the past two years. The R&D team has only a few site reliability engineers (SREs), who are responsible for defining the overall R&D and operations rules, while deploying and operating applications is the job of front-line developers. What had been missing was a tool platform, designed from the developer's perspective, that gathers the resources and operations related to an application in one place. That is exactly what Yunxiao's AppStack application delivery platform is designed to do. When AppStack opened its public beta, Xianfeng started trialing it right away and gradually migrated all of its applications onto it.
As can be seen from the figure above, the R&D team does not operate cloud resources directly but works with resources through the application environments in AppStack. This fits the habits of cloud-native development better, and it is also safer.
Of course, tools are only part of the cloud native transformation, which includes three aspects: technical architecture, deployment architecture, and engineering practices.
2.1 In terms of technical architecture, each application can be deployed, verified, operated and maintained independently, and makes full use of cloud-native infrastructure to improve elasticity and resilience.
Xianfeng's R&D infrastructure runs entirely in the cloud, and applications are built on cloud resources and open standards, mainly using the following cloud products:
- Alibaba Cloud ACK: fully compatible with Kubernetes and free of operations and maintenance work; application containers for both production and test environments run on it;
- Alibaba Cloud RDS and other database products: follow open-source protocol standards (such as MySQL), allow seamless migration, are convenient to operate and maintain, and offer better performance;
- MSE Nacos: the commercial version of the open-source configuration center Nacos;
- Alibaba Cloud ARMS: a one-stop observability platform, used mainly for Kubernetes monitoring and application monitoring; it can also integrate RDS monitoring and is non-intrusive for Java applications.
When selecting products, Xianfeng gave full weight to the openness of the standards involved, so that applications can run on different cloud service providers without modification.
2.2 In terms of deployment architecture, each application has one orchestration that is applied to multiple environments; differences between environments are expressed through variables, so that images are separated from configuration.
Xianfeng's expectation for the deployment architecture is that each application defines a single deployment orchestration and that differences between environments are expressed through variables. One image can be deployed to multiple environments, and no environment-specific configuration is kept inside the image. To this end, Xianfeng adopted the following practices based on AppStack.
First, the SRE defines the enterprise's orchestration templates (for example, one Service and one Deployment).
Second, in each application, the application owner chooses such a template to define the application's own deployment orchestration, handling any differences between environments by defining variables.
Third, the application owner defines different variable groups to suit the different environments.
Fourth, the application owner binds the variable groups to the environments.
Finally, the R&D team deploys and operates directly in the environments.
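The idea of one orchestration, several variable groups and a single image can be sketched with Python's standard `string.Template`. This is not AppStack's actual template format: the template fields, the variable group contents, the application name and the registry URL below are all assumptions made for illustration.

```python
from string import Template

# Hypothetical orchestration template; ${...} placeholders are filled per environment.
DEPLOYMENT_TEMPLATE = Template(
    "name: ${app}\n"
    "image: ${image}\n"
    "replicas: ${replicas}\n"
    "config_namespace: ${config_namespace}\n"
)

# Hypothetical variable groups, one bound to each environment.
VARIABLE_GROUPS = {
    "test": {"replicas": "1", "config_namespace": "test"},
    "production": {"replicas": "3", "config_namespace": "prod"},
}

def render(app, image, environment):
    """Render the single template with the variable group bound to the environment."""
    variables = VARIABLE_GROUPS[environment]
    # substitute() raises KeyError if a placeholder has no value, so gaps fail loudly.
    return DEPLOYMENT_TEMPLATE.substitute(app=app, image=image, **variables)

# The same image is used for every environment; only the variables differ.
image = "registry.example.com/order-service:1.4.2"
print(render("order-service", image, "test"))
print(render("order-service", image, "production"))
```

Because the image string is identical in both renderings, anything that differs between test and production has to live in a variable group, which is the separation of image and configuration described above.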
2.3 In terms of engineering practice, developers release, operate and maintain their own applications, while the SRE configures and controls permissions and policies globally.
Xianfeng divides R&D roles into three categories: application owner, developer and tester. In addition there is an enterprise-level SRE role, which assigns permissions to each of the other roles.
The SRE defines which environments each role may operate. Developers and testers can deploy to and operate the development and test environments, but not the production environment. Only the application owner can deploy to and maintain the production environment.
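The permission rules above amount to a small role-to-environment matrix. The sketch below assumes three environment names (`dev`, `test`, `production`) and assumes that the application owner may also operate the non-production environments, which the text does not state; the role keys and the `can_operate` helper are illustrative only.

```python
# Role -> environments that the role may deploy to and operate.
# Environment names, and the owner's access to dev/test, are assumptions.
PERMISSIONS = {
    "developer": {"dev", "test"},
    "tester": {"dev", "test"},
    "application_owner": {"dev", "test", "production"},
}

def can_operate(role, environment):
    """Return True if the role is allowed to deploy to and operate the environment."""
    return environment in PERMISSIONS.get(role, set())

print(can_operate("developer", "test"))
print(can_operate("developer", "production"))
print(can_operate("application_owner", "production"))
```

Unknown roles fall through to an empty set, so the default is to deny access rather than to allow it.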
III. Efficiency improvement results
Shortened development cycle
After three months of implementation, the product and R&D team of Xianfeng Fruit reached the point where 85% of requirements go live within two weeks.
There were some hiccups on the way to this target.
When we first set the “85th percentile line” of the development cycle at two weeks, some colleagues asked, “Isn’t the delivery time of a requirement strongly related to its size?” It is, which is why we first reached consensus with the product and R&D team on what a requirement is. Our criteria are that a requirement can be delivered independently and can be acceptance-tested; on that basis, the smaller the granularity, the better.
The figure below shows the development cycle statistics for Xianfeng Fruit three months after the transformation. It shows that 85% of the requirements delivered by the pilot team in February were completed within 13 days, meeting the preset goal of two weeks.
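For readers who want to reproduce this kind of figure, the sketch below computes an 85th percentile with the nearest-rank method. The method is an assumption, since the article does not say how Yunxiao calculates the line, and the cycle times are made-up sample data, not Xianfeng's real measurements.

```python
def percentile_nearest_rank(values, pct):
    """Smallest value such that at least pct percent of the values are <= it.

    pct is an integer from 1 to 100; integer arithmetic avoids rounding surprises.
    """
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = (pct * len(ordered) + 99) // 100  # ceiling of pct * n / 100
    return ordered[rank - 1]

# Hypothetical development cycles in days for 20 requirements.
cycle_days = [3, 4, 5, 5, 6, 7, 7, 8, 9, 9, 10, 10, 11, 11, 12, 12, 13, 13, 15, 21]

p85 = percentile_nearest_rank(cycle_days, 85)
print(p85, p85 <= 14)
```

A percentile target tolerates a few long-running outliers, such as the 21-day requirement in the sample, while still holding the bulk of the work to the two-week limit.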
The chart also reveals some other problems, for example that requirements are delivered in batches rather than released continuously one at a time.
Relatively ideal demand delivery cycle diagram:
Average lead time: 10 days (less than 2 weeks)
Desired scatter distribution:
- Points concentrated lower on the vertical axis: better responsiveness and predictability;
- Higher scatter density: higher delivery efficiency;
- More even horizontal distribution: continuous delivery.
Delivery quality improvement
After three months of operation, the number of online issues for Xianfeng Fruit's product and R&D team had fallen by 20%, and the R&D model had changed fundamentally.
In the early stage, the team worked in a model similar to a mini-waterfall. Design and coding happened in large batches, and so did the introduction of defects, with no immediate integration or verification. Defects stayed hidden in the system until late in the project, when the team began integration and testing and the defects surfaced all at once. The later a defect is discovered, the harder and more expensive it is to fix.
After analyzing these issues, the team began to evolve toward a continuous delivery model. Through the "three-step" strategy above, "single application deployment, single requirement delivery" was largely achieved across the whole iteration. The team develops small-grained requirements, integrates and tests them continuously, and finds and fixes problems as they arise. The defect backlog stays under control and the system is always close to a releasable state. This mode is closer to continuous release and improves the team's ability to respond to outside demands.
IV. Suggestions on R&D transformation for traditional enterprises
After three months of practice, the product and R&D team of Xianfeng Fruit has digitized its R&D process and met its goal of improving R&D efficiency. There are still areas for the team to keep improving, such as closed-loop monitoring of the whole business starting from business requirements, and stronger test automation.
As a new retail company going through both the R&D transformation and the digital transformation typical of a "traditional industry", Xianfeng Fruit ran into problems that many similar enterprises have already met or will meet. Here is a brief summary that we hope will help enterprises facing similar problems:
- Team consensus is important. Throughout Xianfeng Fruit's implementation, reaching consensus across the whole team mattered a great deal, both when setting the indicators at the start and when defining processes and norms later. Why look at this indicator? What counts as a requirement? What is the definition of "done" for a requirement? Only when the team has a real consensus can the rest of the process proceed smoothly.
- Business drive is fundamental. The purpose of R&D is to realize business value, so the end-to-end delivery process should be connected through business requirements, with the development work of every function aligned to them, to ensure that the team is working for the "user" and that the final output is valuable.
- Embrace cloud native. The cloud-native technology stack is mature. As business grows rapidly, traditional ways of building infrastructure struggle to meet enterprise needs in resource utilization, labor cost, availability and response speed, so embracing cloud native in good time to improve business agility and responsiveness is becoming increasingly important.
This article is the original content of Aliyun and shall not be reproduced without permission.