Abstract: The ultimate goal of continuous delivery is to combine efficiency with trustworthiness.
1. Efficient and trustworthy continuous delivery
1.1 Purpose of software development
Continuous delivery is a topic that comes up constantly. Efficiency is usually seen as its purpose; at Huawei Cloud, the ultimate purpose of continuous delivery is to combine efficiency with trustworthiness.
Overall, the goal of **software development is to continuously and quickly deliver high-quality, valuable software to customers.** First, R&D is a fast and continuous delivery process. Second, R&D is customer-oriented: the software delivered must be of high quality and valuable to customers. Quality is multi-dimensional; beyond value, its measures include stability, security, reliability, scalability and so on.
1.2 Software life cycle
Seen from software life cycle management as a whole, a product starts as an idea, goes through development, deployment and release, and ends in operations and maintenance. This is a complete end-to-end process, and every stage is indispensable.
-
The development process includes many kinds of targeted tests, such as static, dynamic, black-box, white-box and gray-box tests. These tests are layered across different environments and carried out by different roles;
-
The release process includes the management of work in progress (WIP). What gets released is determined by the release plan made upstream;
-
Every stage involves deployment, and automated deployment links up with the development process upstream.
Looking one level deeper:
-
This process draws on many methodologies, such as design thinking, Lean Startup and digital transformation. Innovation here represents the demands of the business side: traditional businesses need new methods and technologies to achieve empowerment and transformation;
-
Innovative ideas are realized in R&D, where development, deployment and release are supported by CI/CD;
-
Finally, operations and maintenance are needed after launch, and so is operating the product as a cloud service. Because it is a service, a complete set of user-oriented, content-oriented and product-oriented operation systems has to be built around it.
Returning to the business side, map the product-to-R&D process above onto the entire product life cycle:
-
The earlier stages add value: product innovation is valuable, and so is writing code;
-
The later stages, such as testing, do not add value: they are necessary, but they add no value for the end customer.
For the value-adding part we pursue the final effect; for the non-value-adding part we pursue efficiency, that is, completing the process as efficiently as possible in order to reach the final result.
Process is what guarantees the effect, so process is essential too. However, these processes contain a lot of repetitive work, such as testing and deployment, which needs to be automated. That is what DevOps is about: automating processes and tools so that machines do the repetitive tasks.
Looking at the big picture, quality is tested, verified and fed back layer by layer throughout the entire software and product life cycle. The life cycle also includes a large number of security activities: security demands are considered at the product innovation stage, and black-box, white-box, static, dynamic and other tests ensure security during development.
As is well known, Huawei is in the communications industry, a field that bears on the national economy and people’s livelihood. It places very high requirements on security, and the products and services provided must be reliable and trustworthy.
2. Trusted engineering
2.1 Huawei Software Engineering Construction Process
Software engineering emerged in the 1940s and 1950s with the first mainframe computers and has developed ever since. Huawei was founded in 1987. At the beginning, work was done by individuals or small teams. Later, as the business grew, the teams kept expanding. To manage development at that scale, we introduced IPD (Integrated Product Development), which ushered in the first generation, IPD1.0. As Internet technology, agile development, CMMI, DevOps continuous delivery, cloud native and other methodologies and practices developed, we incorporated them into the IPD1.0 framework, making it flexible enough to adapt to actual software development needs.
In 2019 we started working on IPD2.0, and one of its key points is trustworthiness.
2.2 What is “Trusted”
Let’s define trustworthiness first.
Trustworthiness refers to the degree of certainty with which each system, beyond its business intent, provides resilience, security, privacy, safety, reliability and availability.
That is, in addition to the business intent (or functional demands), each system needs to have six features:
-
Resilience: can a service restart automatically after an outage? Can the system withstand a surge of concurrent requests?
-
Two English words are involved here: **Security and Safety.** Security focuses on data security and information security, while Safety focuses on environmental safety and the like.
-
The remaining features are privacy, reliability and availability.
Huawei’s trusted engineering takes these six features as its basic demands, connects them with business intent, achieves business objectives efficiently, and brings value to customers at the same time.
2.3 Trusted Value Stream framework
If you are familiar with practices like Lean, you know that R&D is really a value stream delivery process. **So what does a trusted value stream framework look like?** The trusted value stream framework is the end-to-end process of the product life cycle, and it begins with the end in mind: what we want is secure results, reliable results and a reliable process.
Breaking down the entire product life cycle:
-
At the level of security governance, we need trusted governance, including trust requirements, responsibilities and rights, hierarchical governance, talent and culture, etc.;
-
At the level of process trustworthiness, there should be deterministic prevention, deterministic response, confidentiality, integrity, consistency, two-way traceability, risk management, etc., so that the process is under overall control;
-
Finally, at the level of result trustworthiness, there are the six features mentioned above.
Refined further, the framework also covers the product definition, product implementation, product pre-release and product use stages. Each stage builds on traditional software engineering and adds specific actions to meet the overall trustworthiness requirements. For example, in the product definition stage, trust demands should be considered in planning and design. In the product implementation stage, we consider whether the code is secure, whether builds are repeatable, whether the package delivered to each customer is the correct one, whether tests are repeatable, and so on. There are also many policies for implementing trust during the pre-release and use stages.
On the technical level, trust can be built with modeling and simulation, cryptographic encryption and decryption protocols, trusted operating systems and other technologies.
This is a complete path from process trustworthiness to result trustworthiness, and we keep improving through measurement at every level.
2.4 Huawei Cloud HE2E DevOps Framework V2.0
The Huawei Cloud HE2E DevOps framework is an actionable end-to-end development methodology and tool chain, based on advanced concepts in the cloud industry and Huawei’s 30 years of R&D experience.
-
First, it is an end-to-end DevOps framework that runs from planning and design through iterative development and continuous testing to continuous delivery. We believe that covering only the engineering side is not enough to support the whole business; we must extend back to the business side to close the loop end to end.
-
Second, the whole process incorporates a large number of trustworthiness measures to support the entire DevOps process.
Huawei Cloud HE2E DevOps released version 1.0 in 2018 and was upgraded to version 2.0 this year. **Huawei Cloud HE2E DevOps framework V2.0 is divided into six phases: planning and design, development and integration, test and feedback, security and audit, deployment and release, and operations and monitoring.** This article focuses on the engineering side: development and integration, test and feedback, security and audit, and deployment and release.
3. Development and integration
3.1 Cloud Native practice
The cloud service itself is born in the cloud and grows in the cloud. We have therefore been practicing Cloud Native, continuously optimizing four areas (architecture, project delivery, full-function teams and the cloud environment) and continuously improving quality.
-
Architecture optimization: the overall architecture is transformed into microservices, the microservice framework is unified on Spring Cloud, businesses are split and the architecture decoupled, and communication is lightweight.
-
Full-function team: after the architecture is split, the teams need to be adjusted to match it. We piloted full-function, service-oriented teams, in which a small 2-Pizza team is responsible end to end for the development, testing, deployment, and operations and maintenance of a single service.
-
Project delivery: we practice the DevOps delivery mode and build an end-to-end delivery pipeline, so that each microservice is developed, built, tested, deployed, released, and operated on the live network independently.
-
All-cloud R&D: development relies on the resources of the cloud IaaS and PaaS layers and on mature basic services, and the tool chain and environment for cloud services are built on Huawei Cloud.
3.2 The iron triangle for building Cloud Native capability
From these four aspects it is not hard to derive the iron triangle needed to build Cloud Native capability: architecture, organization and engineering. I think value delivery should be added in the middle, which brings us back to the original statement: the purpose of R&D is to deliver high-quality, valuable software to customers continuously and quickly.
1) Architecture
-
Achieve complete decoupling with a service-oriented or microservice architecture. The system is divided into multiple independent, autonomous services or microservices that are functionally cohesive, appropriately sized and have clear business boundaries.
-
Use self-service, agile cloud infrastructure services. The system relies on the underlying cloud infrastructure for runtime resources and uses the cloud monitoring service to monitor its own running status. The cloud monitoring service can also trigger operations events based on that status, enabling elastic scaling and self-healing of faults.
-
Reuse the infrastructure and architectural capabilities of cloud native public services through APIs. For both internal and external services, the relevant interfaces and use cases can be generated automatically from the API definition. Open platforms and open banking in digital transformation are both supported by APIs. This is what the now-popular term “API economy” refers to; we will not expand on it here.
2) Engineering level
-
The system is decoupled in its environments, processes and configurations, and the teams are decoupled to match.
-
DevOps: operations and development collaborate closely and share responsibility.
-
Practice rapid iteration and continuous delivery, respond quickly to business change, and shorten time to market (TTM).
3) Organizational level
-
A full-function team includes the product, architecture, design, development, testing, and operations and maintenance roles. The team eats its own dog food: it must release and maintain its own product online. Everyone takes turns being responsible for the release, so that everyone knows how the product is launched and released. This has several benefits: skills are transferred, people become full-stack, and the team does not depend on any single person.
-
Cloud operations team: based on the monitoring and alerting capabilities of the cloud platform, a dedicated team is responsible for the quality of system operation, ensuring system availability and that service upgrades and rollbacks cause no interruption. The operations team mostly empowers others by building tool platforms and processes.
3.3 Objectives of moving to microservices
-
System decoupling and functional cohesion to improve the efficiency of requirement delivery. Splitting and decoupling the business makes the system agile.
-
Practice API First. Through service orientation, microservice APIs establish contracts between service providers and consumers.
-
A low-cost, scalable architecture. Flexible horizontal and vertical scaling and smooth migration to the cloud let the architecture support the rapid growth of the application market business.
-
Service autonomy. Combining online microservice governance with the cloud platform enables elastic scaling, circuit breaking and degradation of microservices.
-
Explore integrated service teams. Establish 2-Pizza teams and make the business truly agile by building full-function teams.
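The circuit breaking and degradation mentioned under service autonomy can be illustrated with a minimal sketch. The class name `CircuitBreaker`, its thresholds and the fallback mechanism are hypothetical choices for illustration; they are not taken from Huawei's microservice governance platform.

```python
import time


class CircuitBreaker:
    """Minimal circuit breaker: open after N consecutive failures, then degrade."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures   # consecutive failures before opening
        self.reset_after = reset_after     # seconds to stay open
        self.clock = clock                 # injectable clock, useful for testing
        self.failures = 0
        self.opened_at = None

    def call(self, func, fallback):
        # While the breaker is open, skip the real call and return the degraded result.
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return fallback()
            # Reset period elapsed: allow one trial call (half-open state).
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return fallback()
        self.failures = 0
        return result
```

A caller would wrap each remote request, for example `breaker.call(fetch_price, lambda: cached_price)`, where both names are placeholders: the fallback is the degraded response returned while the downstream service is unhealthy.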
3.4 Microservices practice – unified top-level design
In summary: operate as a large force, design from the top, align understanding, and empower the organization. The roles are divided into the R&D and operations team, the design, development and test backbone, and architects. What each role should do is defined, and each role is supported through training and empowerment.
3.5 Microservices 12 Design Principles
We have a one-stop microservices management platform to support the 12 design principles of microservices.
3.6 Clean Code
1) Clean Code mechanism
The industry agrees that Clean Code effectively reduces defects and system vulnerabilities, and that it is an important measure for achieving software trustworthiness. Internally we also put great emphasis on Clean Code, and we have put in place a complete, fast feedback mechanism that consistently produces good code.
Huawei has its own definition and culture of Clean Code. The related processes are supported by the Committer mechanism, gated check-in and extensive code sharing, and overall quality is controlled through developer testing. A three-layer model and six levels of standards support real-time code evaluation, periodic evaluation of delivered versions, and annual evaluation of the long-term evolution of products.
2) Clean Code evaluation
The industry’s evaluation criteria for Clean Code are: efficient, portable, secure, concise, reliable and testable. Huawei keeps evaluating code against the latest industry standards and, based on practice, has established a three-layer quality model and a six-level quality scoring standard.
Key initiatives include: building a rating scale against which different systems are rated; developing visualization tools; establishing guidance for regular assessment scenarios; and continuously updating the evaluation model in line with industry standards.
3.7 Huawei Committer Engineering Practice
The Committer role comes from the open source community. In open source collaboration, developers distributed around the world contribute jointly, and the ability, level and sense of responsibility of contributors vary. A Committer review must therefore take place before code is merged into the repository, to ensure quality. Huawei also has an open source mechanism, and it adopted this practice to form its own Committer mechanism.
The Huawei Committer mechanism is in effect a set of processes that ensures overall code quality. It involves three roles:
-
Developer: responsible for creating local code branches, local development, self-inspection, testing, etc.;
-
Reviewer: responsible for reviewing the code;
-
Committer: responsible for final quality control.
Underneath, the IT infrastructure supports compilation, deployment, testing and other processes, and automated tools reduce the repetitive work in reviews.
The Committer process has several core points:
-
The permissions to develop, submit and merge code should be separated, so that an attacker cannot use one employee’s permissions to implant malicious code.
-
Coach engineers through review and audit comments to improve the team’s software capability.
-
Review at merge time also drives code review earlier in the process, which helps Developers build the ability to write Clean Code.
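The separation of development, review and merge permissions can be expressed as a small policy check. This is only a sketch under assumed names: `MergeRequest`, `can_merge` and the specific rules are hypothetical and do not describe Huawei's actual gate implementation.

```python
from dataclasses import dataclass, field


@dataclass
class MergeRequest:
    author: str                                   # the Developer who wrote the change
    reviewers: set = field(default_factory=set)   # Reviewers who approved it
    committer: str = ""                           # the person asking to merge it


def can_merge(mr, committers):
    """Return (allowed, reason) under a simple separation-of-duties policy."""
    if not mr.reviewers:
        return False, "no reviewer has approved"
    if mr.author in mr.reviewers:
        return False, "author cannot review their own change"
    if mr.committer not in committers:
        return False, "merge must be done by a registered committer"
    if mr.committer == mr.author:
        return False, "author cannot merge their own change"
    return True, "ok"
```

With such a rule in place, compromising a single account is not enough to get code into the repository, because at least two other people have to look at the change first.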
4. Test and feedback
4.1 Huawei Cloud Full-scenario Test Service Framework
The Huawei Cloud full-scenario test service framework provides a one-stop, end-to-end solution for automated and intelligent testing. It helps enterprises build a test center and improve their testing professionalism and performance. The framework is divided into three parts (test design, test execution and test analysis) and covers test design, test case management, service interface testing, web user interface testing, terminal testing, performance testing, security testing, traffic diversion testing, online testing, and test analysis reports. Underneath, a complete test management platform supports all of this.
The full-scenario test service has three main points:
-
Test Left, that is, shifting testing left. Testing can be started and performed early in the development or product life cycle, and it is not the sole responsibility of testers: testing is an activity, not a separate role.
-
Automate everything possible. Test cases are built quickly for validation in the early stages of the business, and gradually all of them are automated, so that rapid business delivery can be supported again and again.
-
Test Right, that is, shifting testing right. A production-like environment is never the real production environment and cannot simulate every production scenario, so a lot of testing has to be done online.
Each stage of the process has its own testing actions. The development stage has acceptance tests, unit tests and functional tests, and code quality control during coding is also a form of testing. The testing stage includes regression, integration, performance, and security compliance testing, among others. Deployment involves A/B testing, comprehensive testing of the online and production environments, and eating your own dog food.
Testing is carried out at three levels: personal, service and product. The pipeline is layered accordingly, and each level of pipeline has its own quality activities:
-
The quality activities of the personal pipeline run from the local development environment to the Alpha environment and include code review, unit testing, build, security scanning, interface testing, etc., after which the branch moves into the Beta environment.
-
The quality activities of the service pipeline run from the Beta environment to the Gamma environment. In addition to the tests above, they include regression tests of existing features, browser compatibility tests, performance tests, etc.
-
The quality activities of the product pipeline run from the Gamma environment to the production environment. This level includes specialized testing, such as product-level performance testing, reliability testing, long-run stability testing and security testing.
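The three pipeline levels can be modeled as an ordered list of gates. The sketch below is illustrative only: the data layout and the `promote` function are assumptions, the activity names are shortened from the text, and for brevity each level lists only its own additional activities.

```python
# Hypothetical model of the three pipeline levels described above.
PIPELINE_LEVELS = [
    {"level": "personal", "from": "local", "to": "Alpha",
     "activities": ["code review", "unit test", "build", "security scan", "interface test"]},
    {"level": "service", "from": "Beta", "to": "Gamma",
     "activities": ["regression test", "browser compatibility test", "performance test"]},
    {"level": "product", "from": "Gamma", "to": "production",
     "activities": ["product-level performance test", "reliability test",
                    "long-run stability test", "security test"]},
]


def promote(results):
    """Walk the levels in order and stop at the first one with a failed or missing activity.

    `results` maps an activity name to True (passed) or False (failed).
    Returns the furthest environment the change is allowed to reach.
    """
    reached = "local"
    for level in PIPELINE_LEVELS:
        if not all(results.get(activity, False) for activity in level["activities"]):
            break
        reached = level["to"]
    return reached
```

Treating a missing result as a failure is a deliberate choice in this sketch: a change cannot advance just because a quality activity was skipped.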
4.2 Continuous quality activities in the microservice delivery process
The microservice delivery process passes through different environments, such as Alpha, Beta and Gamma. Each stage has its own quality gates and acceptance standards, and the live network has its own quality activities. These quality activities are undertaken by different roles:
-
The upfront design stage mainly involves architects and development engineers;
-
The development stage is mainly carried out by development engineers. Architects also develop, but they focus on key designs and cross-service development;
-
In the release stage, the development engineer performs the release, while the test engineer controls quality end to end and supports and empowers the development engineer;
-
In the production environment, development engineers and operations engineers support the system together, and test engineers run live network tests, traffic diversion tests, chaos tests, etc.
4.3 Test measurement index system
In addition to the process, we also have quality definitions and metrics. **Test metrics generally fall into two categories:** process metrics and result metrics. Process metrics include coverage, execution rate, test efficiency, etc. Result metrics cover functional testing, performance testing, security testing, reliability testing, etc.
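The process metrics named above can be computed with simple ratios. The article does not define them, so the formulas below are common working definitions adopted as assumptions, and the function names are hypothetical.

```python
def execution_rate(executed, planned):
    """Share of planned test cases that were actually run."""
    return executed / planned if planned else 0.0


def line_coverage(covered_lines, total_lines):
    """Share of executable lines exercised by the tests."""
    return covered_lines / total_lines if total_lines else 0.0


def cases_per_hour(executed, hours):
    """One possible reading of test efficiency: executed cases per hour of testing."""
    return executed / hours if hours else 0.0
```

Each function returns 0.0 when the denominator is zero, so a report generated before any tests are planned or run does not fail with a division error.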
5. Security and audit
5.1 DevSecOps values and practices
Security and trustworthiness are directly related: when trustworthiness is mentioned, people naturally think of security. The need to balance DevOps speed with existing security requirements gave rise to a model called DevSecOps.
What is DevSecOps? DevSecOps is based on the principle that “security is everyone’s responsibility”, and it emphasizes how application developers can build security checks into their integration and deployment pipelines. In a nutshell, it means incorporating security activities into every part of the DevOps development process. In Huawei Cloud’s practice, we pay particular attention to embedding security activities in the software development process to ensure that software production is reliable and stable.
5.2 Huawei Enterprise Security Engineering Practice
Again from the perspective of the software life cycle, Huawei’s enterprise security engineering practice follows the pipeline: planning, coding and build, verification, release, and operations each have corresponding security checkpoints and practices. Underneath, standards and specifications, technologies and capabilities, and tools and processes provide support, and the whole process safeguards network security by realizing secure design, secure coding, security testing, and secure operations and maintenance.
5.3 CodeCheck security
CodeCheck, an internal tool developed by Huawei, was built in response to stringent security requirements and ensures overall code quality and security. We look at CodeCheck in terms of capability, efficiency and ecosystem operation.
-
In terms of capability, there are three levels. Desktop level: embedded in the IDE, it performs quick checks during coding. Personal gate level: when code is submitted for merging, gated check-in and rule sets provide a quick inspection. Version level: during continuous integration and delivery, it provides the engineering capabilities of full checks and closed-loop handling of alarms.
-
In terms of efficiency, the desktop level and the personal gate level return results within seconds to minutes; at the version level, where many steps have to run, checks take hours.
-
In terms of ecosystem operation, the tool serves internal teams and also supports external customers.
5.4 Security rules
Our security rules come from two perspectives:
-
External perspective: we learn from external rules and distill industry standards and best practices into a knowledge base and a rule set for secure coding.
-
Internal perspective: Huawei has a large number of product lines and product forms. We defined the TOP10 quality and security issues, and distilled long-term experience in scenario analysis and security inspection into rules.
What the industry calls **static application security testing (SAST)** involves four layers: compilation, build, syntax analysis and semantic analysis. Security checks reach significantly deeper into the technology stack. Huawei provides full-stack capability, multi-language support and in-depth semantic analysis. Covering quality and security together requires rules and engines, and also intelligent techniques for automatic inspection and automatic repair.
**Huawei divides rule sets into layers: company level, product line level, product level, version level and engineering level.** Company-level rule sets include mandatory rule sets, for example rules with a low false-positive rate, which are scanned by default, and inspection rule sets, which can be checked as needed. Product-line-level and product-level rule sets are made mandatory by the product line administrators. Other rule sets can be selected based on product features.
**The tool is not a simple matter of installing and running it; it is an M×N matrix operation system.** In one dimension, rules are defined at different levels, such as company level, product line level and product level. In the other dimension, the time cycle, there is a design process that decides which rules go into the developer’s desktop IDE, which go into the pipeline, and which go into the build.
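The matrix idea (rule-set layers on one axis, check stages on the other) can be sketched as a nested mapping. Everything below is illustrative: the stage names, the rule identifiers such as `SEC-001`, and the `rules_for` function are hypothetical and do not reflect CodeCheck's real configuration format.

```python
# Rows: rule-set layers. Columns: stages at which checks run. All entries are made up.
CHECK_STAGES = ("ide", "pipeline", "build")

RULE_MATRIX = {
    "company": {
        "ide": {"SEC-001"},
        "pipeline": {"SEC-001", "SEC-002"},
        "build": {"SEC-001", "SEC-002", "SEC-003"},
    },
    "product_line": {
        "ide": set(),
        "pipeline": {"PL-010"},
        "build": {"PL-010", "PL-011"},
    },
    "product": {
        "ide": {"PR-100"},
        "pipeline": {"PR-100"},
        "build": {"PR-100"},
    },
}


def rules_for(stage, layers=("company", "product_line", "product")):
    """Union of the rule IDs that apply at one stage across the selected layers."""
    if stage not in CHECK_STAGES:
        raise ValueError(f"unknown stage: {stage}")
    selected = set()
    for layer in layers:
        selected |= RULE_MATRIX[layer][stage]
    return selected
```

Keeping few rules in the IDE column and the full set in the build column mirrors the efficiency trade-off described earlier: quick checks finish in seconds, while full checks are allowed to take longer.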
5.5 Enterprise-level expert services
At the security level, we offer enterprise-level expert services with a strategy similar to a hospital visit: for the common “fever and cold”, diagnosis, treatment and basic services; for “serious diseases”, professional automated examination and supporting engineering capabilities; for “difficult and complicated diseases”, consulting by experts.
6. Deployment and release
6.1 Core practices of continuous delivery
Continuous delivery is tied to continuous development and integration, automated testing, and so on. For example, layered fast closed-loop testing is carried out at different levels in order to get fast feedback and close the loop quickly.
Besides the layered fast closed loop, the core practices of continuous delivery include small iterations with a high delivery tempo, automated and visualized pipelines, automated continuous deployment, shorter time for each individual step, and efficient, standardized environments.
6.2 Release branching model and CI/CD pipeline
Behind automated deployment is a standardized environment, which requires a good match and mapping between the branch structure of the code and the overall CI/CD pipeline.
We create a branch from the main line and develop on it. A commit automatically triggers a CI pipeline that runs static code checks and an automated build, including preparation of the deployment package and so on. The change then passes through the personal-level pipeline and the later levels until it reaches the production environment, and the whole process is tied to the CI/CD pipeline.
6.3 Trust built into the pipeline
We build the concept of trust into the whole pipeline, with automatic inspection, verification and feedback along the way, so that the CI/CD process is both efficient and trusted.
At the code level, trust means that the code can be built repeatably and efficiently at build time. Trust-related checkpoints are also added to test activities. After launch, practices like Chaos Monkey are directly related to trust as well, for example the elasticity and stability that we often emphasize.
6.4 Everything as Code
Some may ask: how do I protect my environment from someone planting something harmful in it? In fact, both the environment and the process can be expressed as code and placed under version control, so that all modifications and changes are automated and those environments and processes can be security-audited.
We call this Everything as Code. Beyond infrastructure, orchestration, configuration, testing, databases and pipelines can all be expressed as code, enabling versioning, automation and standardization. The same goes for services: Service as Code is in effect the API that we keep emphasizing.
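Once an environment is described as code, auditing it can be as simple as comparing the running configuration with the version-controlled one. The following is a minimal sketch of that idea; the `fingerprint` and `audit` helpers and the sample keys are hypothetical, and a real setup would read both sides from the repository and the live environment.

```python
import hashlib
import json


def fingerprint(config):
    """Stable SHA-256 digest of a configuration dictionary (key order does not matter)."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def audit(versioned, running):
    """Return the keys whose running value differs from the version-controlled one."""
    keys = set(versioned) | set(running)
    return sorted(k for k in keys if versioned.get(k) != running.get(k))


# Example: an unreviewed change to the live environment shows up as drift.
versioned = {"image": "app:1.4.2", "replicas": 3}
running = {"image": "app:1.4.2", "replicas": 5, "debug": True}
drift = audit(versioned, running)
```

Comparing fingerprints gives a cheap yes-or-no answer, and `audit` then names the settings that were changed outside version control.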
6.5 Grayscale release policy drives automatic deployment and rollback
At release time we apply mechanisms at several release levels. First comes the level-one grayscale release: we pick some key MVP customers or internal users and release to the self-use environment. If there are no problems, the level-two grayscale release expands moderately and opens to about 10% of users across the whole network. Then comes the level-three grayscale release, and finally the full release.
If a problem occurs at any level during this process, it can be fixed or rolled back immediately. At the same time, we run online traffic diversion tests and combine them with A/B tests to explore at the business level. We always emphasize enabling the business through technical means.
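The staged release with immediate rollback can be sketched as a loop over traffic percentages. Only the 10% figure for the second stage comes from the text; the other percentages, the stage labels and the `roll_out` function are assumptions made for illustration.

```python
# Stage labels and percentages are illustrative; only "about 10%" is from the text.
RELEASE_STAGES = [
    ("gray-1: internal and key customers", 1),
    ("gray-2: about 10% of users", 10),
    ("gray-3: wider audience", 50),
    ("full release", 100),
]


def roll_out(is_healthy):
    """Advance through the stages and roll back to 0% as soon as a health check fails.

    `is_healthy` is a callable that takes the traffic percentage and returns a bool.
    Returns (final_percentage, history_of_actions).
    """
    history = []
    for name, percent in RELEASE_STAGES:
        history.append(f"deploy {name}")
        if not is_healthy(percent):
            history.append(f"rollback from {name}")
            return 0, history
    return 100, history
```

In practice `is_healthy` would query monitoring data such as error rates or latency for the users in the current stage; here it is left as a parameter so the policy itself stays independent of any particular monitoring system.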
7. Summary
The above is an introduction to the Huawei Cloud DevOps system. Besides the “efficient” and “continuous delivery” that we usually talk about, its core is “trusted”. We call this architecture the **“DevOps Architecture Framework for the Cloud and Life”**.
At the top are business decisions, which we plan and review regularly and adjust at a fixed cadence. Next come service-oriented organization, architecture decoupling, and putting development and operations into practice. Further down are tools and environment support, and at the bottom is service process support.
How do we build such a system? We should keep the overall goal of R&D efficiency in focus, select and apply the relevant R&D tools, and use those tools to carry new capabilities and to support efficient operation, continuous delivery, efficient collaboration, intelligent assisted development, and continuous feedback and improvement, thereby building the overall continuous delivery capability.
— — — — — — — — — — — — — — — — — — —
Content source: [2020 CSDI SUMMIT China Software Development and Management Industry Technology SUMMIT] Author: Yao Dong
Guest profile: Chief technical solution architect of Huawei Cloud Application Platform Department, senior DevOps and Lean/Agile expert, Huawei Cloud Sharing expert, core organizer of DevOps community in China, founder of IDCF (International DevOps Coaching Association)