In the Internet industry, front-end projects iterate very rapidly, which places high demands on infrastructure. At the same time, the peculiarities of front-end projects mean their infrastructure requirements differ from those of back-end and algorithm projects. Meanwhile, the industry's underlying technology keeps evolving, so against this backdrop of complex internal and external needs we keep exploring our own best practices for front-end development while absorbing the best of the industry's ecosystem.
Around 2019, the concept of DevOps was well established and its industry standards were widely accepted; according to the 2019 Accelerate State of DevOps Report, high-performing DevOps teams could be found everywhere. Front-end build-and-hosting platforms such as Gatsby and Zeit, along with other serverless architectures, had also become popular, and managed hosting had matured considerably in front-end friendliness and scalability.
At ByteDance, front-end CI/CD is now very convenient: after each code submission we can run preset tasks such as code style and quality checks, security checks, and build artifact testing, deploy to an integration environment, run various tests against that environment (compatibility tests, real-device tests, performance tests, and so on), and output the corresponding test reports. We can release, roll back, and switch versions in a matter of minutes.
These conveniences effectively safeguard the rapid iteration of many ByteDance products and provide users with more secure and stable services.
However, a few years back the experience was poor: after finishing a feature, I had to send the MR link to colleagues and ask them to review the code, then go to the compilation platform to kick off a build, then go to the cloud platform to launch a test release or a small-traffic release, verify that nothing was broken, and only then do the official release... The whole process was manual and driven by point-to-point communication, time-consuming and error-prone, like slash-and-burn farming.
As a witness to the whole evolution, I will try to explain, from my own experience, how this front-end engineering transformation happened at ByteDance.
Front-end deployment
Once upon a time, our service changes relied heavily on manual processes. One reason was that the infrastructure was not in place, which made engineering difficult. Another was that there was so much business, so much change, and such a fast pace that there was no time or resources to codify best practices. After this early slash-and-burn operation caused several production incidents, we decided to slow down product iteration and solve the tooling problem first. Requirements can never all be completed, but without good tools, completed requirements become hidden dangers.
We ran a very extensive survey, collecting the circumstances and pain points of each business team, and found that every team was consciously doing some engineering work of its own. Given how universal the underlying technology is, we formed a project team to solve front-end engineering problems company-wide.
The first thing to address was deployment. Initially, front-end and back-end services were not differentiated and were both deployed on ByteDance's internal private cloud. In the industry, back-end serverless offerings such as Azure Functions and AWS Lambda were only just moving from validation into rollout, and in the front-end field there was no obvious case study to follow; even the most progressive practitioners of the serverless concept, such as Zeit, were still busy rolling out image-based serverless architectures. But front-end services are fundamentally different from back-end services: most front-end projects (BFF aside) host static resources and involve no other expensive computing resources, so the simplest solution is little more than an Nginx instance. The internal cloud engine, however, had to stay general-purpose, and its initial design was biased towards back-end computing services, so releasing a service on it was complicated and time-consuming and could not meet the front end's need for rapid releases.
We considered building a dedicated front-end deployment platform outside the cloud engine, and initially assumed it would be nothing more than a layer over Nginx. A little digging, however, revealed that static hosting alone is clearly not enough for front-end HTTP services. In the end, given the need for complex route matching, small-traffic (canary) releases, and similar logic, as well as maintainability, we decided to implement the service in Golang. It can be used directly as a shared public service, or deployed separately so that the heavy traffic of one service does not affect the stability of others.
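To give a feel for why plain static hosting was not enough, the sketch below illustrates route matching with a small-traffic (canary) split. It is written in TypeScript for consistency with the rest of this article (the real service was implemented in Go), and every name in it is hypothetical:

```ts
// Minimal sketch of route matching with a small-traffic (canary) split.
// All names are hypothetical; the production service was written in Go.
interface RouteRule {
  pattern: RegExp;         // path pattern, e.g. /^\/app\//
  stableVersion: string;   // version served to most users
  canaryVersion?: string;  // version under small-traffic rollout
  canaryPercent?: number;  // 0-100: share of traffic sent to the canary
}

function resolveVersion(path: string, rules: RouteRule[]): string | undefined {
  for (const rule of rules) {
    if (!rule.pattern.test(path)) continue;
    // A real implementation would hash a stable user identifier instead of
    // rolling the dice, so each user consistently sees the same version.
    if (rule.canaryVersion && Math.random() * 100 < (rule.canaryPercent ?? 0)) {
      return rule.canaryVersion;
    }
    return rule.stableVersion;
  }
  return undefined; // no rule matched: fall back to a default page or 404
}
```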
After the deployment platform launched, the release time of front-end services dropped from minutes to seconds. It also provides small-traffic releases, route management, domain name management, and other functions that are widely used across all business lines. The platform initially offered only static page deployment, and gradually added capabilities such as micro-frontend deployment and Node.js BFF deployment. Today it is one of the most commonly used basic services for front-end work at the company.
Process concatenation
With deployment solved, each release still required merging code on GitLab, building on the compilation platform, and then selecting a version to release on the deployment platform. The overall process was still fragmented; the pieces had not been strung together. Streamlining went through roughly two stages. Around 2018, as the DevOps concept was fully adopted by enterprises, tools such as Jenkins and GitLab Runner spread rapidly. Drawing on our own expertise and the industry's evolution, we introduced and proposed comprehensive process tooling of our own.
CLI phase
We abstracted R&D activities into stages, such as a CI stage, a compile stage, a test stage, and a launch stage. We then categorized the company's basic services by stage, so that each basic service could be assigned to a particular stage.
Each underlying service has a different maintainer, so design concepts and interfaces differ. To concatenate them into a process they need a common interface, so we designed a set of interface specifications and developed an adapter for each service so that all of them could be chained. We treat each basic service as a job, and each job belongs to a certain stage. The data structure of a job is defined as follows:
```ts
type job = string;
// IStage is not shown in the article; this shape (a name plus an
// ordering index) is an assumption used by the examples below.
interface IStage {
  name: string;
  order: number;
}
interface IJob {
  name: job;
  stage: IStage;
  context: any;
}
```
Multiple jobs form a pipeline. A pipeline contains all the stages, and each stage may have zero or more jobs; the actual composition depends on how users assemble the pipeline. The pipeline data structure is defined as follows:
```ts
interface IPipeline {
  [key: job]: IJob;
}
```
We leave pipeline assembly entirely to the user for maximum flexibility, so we developed a command-line tool that assembles the pipeline through interactive questions and answers. The form is similar to the scaffolding tools common in the front-end domain (create-react-app, vue-cli, and so on) and is meant to keep the cost of adoption low.
Using this command-line tool, the user configures triggers, selects the basic services to concatenate, and supplies the necessary context for each one; with that, the pipeline is assembled.
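Assembled this way, a pipeline is just data. Below is a hypothetical example of what the question-and-answer session might produce; every stage and job name is made up, and only the IJob/IPipeline shapes come from the definitions above:

```ts
// A hypothetical pipeline as the CLI might assemble it; names are made up.
const pipeline: IPipeline = {
  lint: {
    name: 'lint',
    stage: { name: 'ci', order: 0 },
    context: { ruleset: 'recommended' },
  },
  build: {
    name: 'build',
    stage: { name: 'compile', order: 1 },
    context: { entry: 'src/index.ts' },
  },
  deploy: {
    name: 'deploy',
    stage: { name: 'launch', order: 2 },
    context: { traffic: 'small' },
  },
};
```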
When code is submitted, Git events are generated and delivered to the triggers, and the pipeline runs the configured jobs in the order defined by their stages.
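Driving such a pipeline then amounts to sorting jobs by stage and running them in order; a minimal sketch, where the hypothetical runJob parameter stands in for the adapter that invokes the underlying basic service:

```ts
// Run every job of the pipeline in stage order. runJob stands in for the
// adapter that actually invokes the underlying basic service.
async function runPipeline(
  pipeline: IPipeline,
  runJob: (j: IJob) => Promise<void>,
): Promise<void> {
  const jobs = Object.values(pipeline).sort(
    (a, b) => a.stage.order - b.stage.order, // stage order decides execution order
  );
  for (const j of jobs) {
    await runJob(j); // a throwing job aborts the rest of the pipeline
  }
}
```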
With this tool, code management, build services, testing, deployment, and other services can be chained together, eliminating unnecessary platform switching and greatly improving R&D efficiency.
Meanwhile, the front-end cloud gradually entered its serverless stage, with development models such as Vercel and Netlify emerging one after another, and our internal front-end cloud had accumulated some experience of its own. Some time after launching the tools above, we organized a large-scale follow-up survey to collect each team's usage. On the one hand, everyone affirmed the product and confirmed it really solved the business teams' problems; on the other hand, many suggestions were made. Based on that accumulated feedback and our own research, we summarized several key problems to solve next:
- The CLI is inconvenient for both input and displaying information; a GUI should be provided to lower the barrier to entry.
- Pipelines should be reusable internally, so that best practices can be codified further.
- More quality tools should be provided to gate the release process and improve release quality.
GUI phase
The root cause of the problems encountered with the solution above is that development models fully aligned with the serverless concept, such as AWS SAM, place higher demands on the integration of processes and R&D tools, just as the development and operations roles advocated by DevOps culture have merged and overlapped at every step. To adapt to this stage, we gradually introduced more advanced development strategies.
We investigated various DAG workflow engines, and during the same period a team inside the company built a process engine compatible with ASL (Amazon States Language). After discussing it, we concluded that it could meet our needs, so we decided to build a new service on this process engine to solve the problems collected during the follow-up survey.
We developed a new deployment platform that uses this ASL-compatible process engine to drive tasks. It provides an easy-to-use interface for orchestrating pipelines; users can assemble a pipeline directly on the platform by drag and drop.
We abstracted the underlying services into atomic services and defined uniform specifications and interfaces for them. Each atomic service exposes four lifecycle functions: prepare, execute, loop, and finish. For a pipeline, the platform invokes each atomic service according to the pipeline's orchestration, and a subsequent atomic service runs only if the previous one succeeded. Within an atomic service, the platform calls the four functions in sequence, and the next function is called only if the previous one succeeded.
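A minimal TypeScript sketch of that contract and its invocation order, assuming boolean-style results and an opaque context (the internal signatures are not shown in this article):

```ts
// A sketch of the atomic-service contract; `ctx` and the boolean
// results are assumptions, not the real internal signatures.
interface AtomicService {
  prepare(ctx: unknown): Promise<boolean>;
  execute(ctx: unknown): Promise<boolean>;
  loop(ctx: unknown): Promise<boolean>;   // e.g. polling a long-running task (assumed semantics)
  finish(ctx: unknown): Promise<boolean>;
}

// The platform calls the four lifecycle functions in order; any failure
// short-circuits the rest of the service and, below, the pipeline.
async function runAtomicService(svc: AtomicService, ctx: unknown): Promise<boolean> {
  for (const step of [svc.prepare, svc.execute, svc.loop, svc.finish]) {
    if (!(await step.call(svc, ctx))) return false;
  }
  return true;
}

// Pipeline level: a later atomic service runs only if the previous one succeeded.
async function runServices(services: AtomicService[], ctx: unknown): Promise<boolean> {
  for (const svc of services) {
    if (!(await runAtomicService(svc, ctx))) return false;
  }
  return true;
}
```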
With such a platform in place, our main task became providing more and better atomic services. The most effective approach is to establish a development specification and let developers build atomic services for their own business, so we built an open platform for atomic services. Developers can register their atomic services on the platform and fill in the lifecycle webhooks, and the platform automatically generates compliant atomic services that can then be used in pipelines.
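A registration on that open platform might then amount to little more than a name plus one webhook per lifecycle function; the payload shape below is purely hypothetical:

```ts
// Hypothetical registration payload for the atomic-service open platform:
// a name plus one webhook URL per lifecycle function.
interface AtomicServiceRegistration {
  name: string;
  webhooks: {
    prepare: string;
    execute: string;
    loop: string;
    finish: string;
  };
}

const registration: AtomicServiceRegistration = {
  name: 'perf-audit', // made-up service name
  webhooks: {
    prepare: 'https://example.com/hooks/prepare',
    execute: 'https://example.com/hooks/execute',
    loop: 'https://example.com/hooks/loop',
    finish: 'https://example.com/hooks/finish',
  },
};
```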
This resembles GitHub Actions, whose unit of orchestration is called a workflow: each workflow consists of one or more jobs, each job has multiple steps, and each step performs one or more actions in turn. Our pipeline is likewise composed of multiple jobs, except that each of our jobs has a fixed life cycle and every lifecycle function can be customized.
Front-end R&D platform
More recently, the DevOps concept has evolved into the more refined concept of SRE. Our exploration has not stopped either: we have begun a more integrated, systematic, and open effort at stability construction across the ecosystem.
The deployment platform described above solves process concatenation, project deployment, and similar problems, and is also open: developers can build their own atomic services for their actual business to supplement the platform's capabilities. But is that enough? We kept asking ourselves that question.
As front-end projects grow more complex, front-end R&D involves more and more links in the chain, and solving only deployment and routing is not enough. A complete R&D platform needs to cover every link of development activity, from project initiation through requirements management, development, testing, integration, building, release, verification, and monitoring. Building such a platform covering the whole project life cycle therefore became imperative.
There are many problems to solve, but we did not have to start from scratch. ByteDance has many internal basic services: various scaffolding tools, multiple interface management platforms, monitoring services spanning app and Web, different low-code/no-code platforms, and so on. We needed to design a unified specification to integrate and bridge these platforms, and finally serve developers from a single unified platform.
Just as we code-named this project "Universe System" at the beginning, the front-end R&D platform is expected to be a one-stop development platform with both depth and breadth: front-end oriented, information focused, and service converged. The platform is still under active iteration, and anyone interested is welcome to help build it.
Conclusion
The rapid development of Internet business gave birth to DevOps, and DevOps in turn accelerates business iteration. But once a business matures, more attention must be paid to its stability and scalability, which is what SRE cares about. Our path, from front-end deployment engineering, to front-end CI/CD processes, and finally to a front-end R&D platform, in fact mirrors the road from DevOps to SRE. We hope the experience shared here is inspiring.
About us
We are ByteDance's front-end infrastructure engineering team. We build the underlying infrastructure and frameworks, develop technical practices, build R&D infrastructure, facilitate knowledge sharing, promote technical solutions, and develop technical standards, enabling efficiency and security through R&D suites, services, and platforms.
The team's vision is to grow the front-end ecosystem, deepen our technology, ensure reliability and stability, strengthen our technical reserves, and advance the industry. Our research covers not only front end/UI/full stack, but also the technical challenges of high traffic and high concurrency, cutting-edge and foundational technology, and participation in technical communities and standards. Anyone interested is welcome to get in touch.
Contact: [email protected]; email subject: Name - Years of experience - Front-end infrastructure engineering.
Further reading
- 2018: The Year Of Enterprise DevOps
- 2018 in Review: A year of innovation
- The 2019 Accelerate State of DevOps
- How to autoscale continuous deployment with GitLab Runner on DigitalOcean
- Serverless computing wars: AWS Lambdas vs Azure Functions
- What is the AWS Serverless Application Model (AWS SAM)?
- Application delivery network
- Love DevOps? Wait until you meet SRE
- gatsbyjs.com
- vercel.com
Welcome to follow "Byte Front-end ByteFE".
Send your resume to: [email protected]