The author of this article is Zhenwei Wang, Research and Development Director at CODING. The full text runs to more than 5,000 words, with an expected reading time of 20 minutes.
The cloud evolved from the traditional IDC (Internet data center) machine room. It was originally designed to solve the problems of elastic computing and high availability in data centers. The public cloud has made it possible for thousands of enterprises to rent data center resources flexibly and on demand, and it has played a key role in the development of the digital society.
But it is too narrow to think of the cloud merely as on-demand resource rental. As cloud technology and industry standards have developed, the concept of cloud native has emerged. Cloud native is a break from traditional applications, which could be developed and run on a local computer and only afterwards handed over to the cloud, with the cloud acting merely as an elastic, highly available resource provider. Cloud native applications are different: traditional applications are programmed against the operating system, while cloud native applications are programmed against the cloud. As a result, cloud native applications are difficult to develop, debug, test, and launch in a non-cloud environment.
CODING is committed to serving developers. In the era of cloud development, CODING has been actively exploring cloud technologies and concepts to solve practical problems. As the company's business and technology have grown, CODING itself has adopted increasingly advanced development practices and moved steadily toward the cloud. The evolution of CODING's development environment has gone through six stages, eventually arriving at a five-star development experience.
Over these six stages, the development environment gradually moved to the cloud, from 0% to 100%. In the sixth stage, CODING used Nocalhost to move 100% of the development environment to the cloud. The following chart shows the experience rating of the development environment at each stage of that move.
The following sections explain each of the six stages.
Stage 1: High-spec laptop
We started CODING in 2014 with just a few programmers. Like most startups, we were able to develop the first version of CODING with one home broadband line, one gigabit wireless router, and a few laptops.
Typically, everyone had a MacBook Pro (i7 + 8 GB RAM + 256 GB SSD) and developed with a JetBrains IDE. After writing code, a developer would run or debug it from the IDE, which compiled and started the application automatically, so debugging was easy. The experience at this stage was in fact five-star, but as the business and technology grew, it gradually deteriorated until it became unbearable.
Basic conditions at this stage
The CODING development environment at that time
- Number of developers: around 10
- Application architecture: monolithic Java back end + Angular.js front end (front end and back end separated)
- IDE: IDEA + WebStorm
- Build method: manual
- Deployment mode: cloud host + Nginx + Tomcat
- Development environment: laptop + Tomcat
- Startup time of the CODING service: 10 seconds
Industry developments in cloud computing and technical architecture over the same period
- Cloud services such as cloud hosts and cloud object storage were gradually being accepted
- Docker was becoming known in China
- Kubernetes was open-sourced
- The concept of microservices was introduced
Existing problems
- There is no stable test environment
- Manual building, packaging, and deployment are inefficient
- The performance and availability of the monolithic back-end application are bottlenecks
Development experience score: 5 stars ⭐⭐⭐⭐⭐
Although there were some problems at this stage, developers were satisfied with every aspect of the development environment. The main indicator was how quickly a developer could get through the cycle of writing code, building, starting, debugging, and self-testing (called the "code-self-test feedback loop" below). All it took was the Debug button in the IDE, and each loop took about 10 seconds. Developers could therefore see their code running sooner and more often, instead of waiting around or doing meaningless mechanical work.
Stage 2: High-spec laptop + LAN server
At the beginning of 2015, developers were already frustrated by the mechanical work of building manually, and testers constantly complained about a test environment that ran on developers' computers and was unstable, with muddled versions and a poor experience. We decided to add a PC on the LAN as a shared server (i7 + 16 GB RAM + 500 GB SSD) for builds and for hosting the test environment. With Jenkins on the server, building and packaging could easily be automated, and the test environment had its own resources and dedicated maintenance.
Basic conditions at this stage
The CODING development environment at that time
- Number of developers: around 20
- Application architecture: multiple Java back ends + Angular.js front end (front end and back end separated)
- IDE: IDEA + WebStorm
- Build method: LAN server + Jenkins
- Deployment mode: cloud host + Nginx + Tomcat
- Development environment: laptop + Spring Boot process
- Startup time of the CODING service: 20 seconds
Industry developments in cloud computing and technical architecture over the same period
- Cloud services such as cloud hosts and cloud object storage were gradually being accepted
- Some teams in China began experimenting with Docker
- The CNCF was established
- Mesos, Kubernetes, OpenShift, and other cloud resource orchestration solutions were becoming popular
Existing problems
- Some back-end components have been split into separate services, making the local development environment difficult to set up
- Deployment is cumbersome and requires maintaining the connections and configuration files between multiple services
- One test environment is not enough
- Laptop performance is already stretched thin
Development experience score: 4 stars ⭐⭐⭐⭐
At this stage we had automated builds and a stable test environment, but the back end had been split into four services, the local environment was cumbersome to deploy, and a single test environment was not enough. In addition, the slower startup of the services themselves caused new problems. After a series of configuration steps, developers could still run CODING on their laptops fairly easily, but the code-self-test feedback loop took about 30 seconds.
Stage 3: High-spec desktop + LAN cabinet
By 2016 we could no longer bear the problems caused by having a single test environment. With more and more staff and an increasingly complicated business, we bought 10 second-hand Dell R710 servers to fill a cabinet. After upgrading their memory and solid-state disks, we added gigabit switches and a virtualization system. The cabinet was placed in the office machine room, and although it was noisy, it solved some of our development and testing resource problems. Meanwhile, individual developers were realizing that even top-tier laptops lacked the performance for a smooth CODING development experience, and many switched to desktop computers (i7 + 32 GB RAM + 1 TB SSD).
Basic conditions at this stage
The CODING development environment at that time
- Number of developers: around 40
- Application architecture: multiple Java/Ruby/Golang back ends + Angular.js front end (front end and back end separated)
- IDE: a variety of IDEs and editors
- Build method: LAN virtual machine + Jenkins + Docker
- Deployment mode: cloud host + Nginx + Docker + self-developed container system
- Development environment: desktop + docker-compose
- Startup time of the CODING service: 40 seconds
Industry developments in cloud computing and technical architecture over the same period
- Chinese cloud service vendors began to build PaaS products, represented by containers and elastic databases
- Some teams in China began experimenting with Docker
- Microservice frameworks, led by Spring Cloud, were beginning to be understood
Existing problems
- The cabinet is so noisy that, even though it sits in the office machine room, it still disturbs colleagues
- The servers in the cabinet are unstable and often crash because of poor heat dissipation, power overload, and aging hardware
- The number of microservices and the amount of configuration keep growing, making the local development environment harder to set up and painful for newcomers
- Developing across a local computer and a LAN virtual machine is awkward for lack of tool support (code is written locally but runs on the virtual machine)
Development experience score: 3 stars ⭐⭐⭐
At this stage, most developers used docker-compose for their development environment. Setting up the local environment was relatively easy, but after modifying code they still had to compile it and package a Docker image, and then run the docker-compose up -d command to recreate the container before they could see the effect of the change. The code-self-test feedback loop grew further, from 30 seconds to about 2 minutes.
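The cycle described above can be sketched as a small script. This is only an illustration, not CODING's actual tooling: the Maven build command, the image tag coding/backend:dev, and the function names are assumptions made for the example. Only the Docker image build and docker-compose up -d correspond to steps named in the text.

```python
import subprocess


def rebuild_cycle_commands(image_tag="coding/backend:dev",
                           build_cmd=("mvn", "package")):
    """Return the commands one code-self-test loop runs, in order.

    image_tag and build_cmd are hypothetical placeholders for illustration.
    """
    return [
        list(build_cmd),                            # 1. compile and package the application
        ["docker", "build", "-t", image_tag, "."],  # 2. bake the artifact into an image
        ["docker-compose", "up", "-d"],             # 3. recreate containers whose image changed
    ]


def run_cycle(commands, dry_run=True):
    """Print each command and, unless dry_run is set, execute it, stopping on failure."""
    for cmd in commands:
        print("$", " ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_cycle(rebuild_cycle_commands())  # dry run: only prints the commands
```

Every step runs in sequence after each code change, which is why the loop costs minutes rather than seconds.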
Stage 4: High-spec development computer + high-spec LAN server + LAN cabinet
In 2017, to cope with slow compilation and limited memory on laptops, the company provided each team with LAN desktop machines (AMD R7 + 64 GB RAM + 1 TB SSD) to support development. These LAN desktops were pooled into a virtual machine resource pool and carved into virtual machines for developers. The small machine room in the office had neither the space nor the air conditioning, sound insulation, or other facilities to house a second cabinet, so the only cabinet was reserved for the test environment and for pre-production drills.
Basic conditions at this stage
The CODING development environment at that time
- Number of developers: around 80
- Application architecture: multiple Java/Ruby/Golang back ends + React.js front end (front end and back end separated)
- IDE: a variety of IDEs and editors
- Build method: LAN virtual machine + Jenkins + Docker
- Deployment mode: cloud host + Nginx + Docker + self-developed container system
- Development environment: LAN virtual machine + docker-compose
- Startup time of the CODING service: 3 minutes
Industry developments in cloud computing and technical architecture over the same period
- Kubernetes was maturing and being accepted by more and more teams
- Many teams began to transform their monolithic applications with Spring Cloud, Dubbo, and similar frameworks
- The concept of Service Mesh was introduced
Existing problems
- Developers need a high level of skill (everyone has to know how to use and administer Linux)
- Resources are not elastic enough to cope with peaks and troughs in demand
- Development and test environments are very different from production environments
- Developing across a local computer and a LAN virtual machine is awkward for lack of tool support (code is written locally but runs on the virtual machine)
Development experience score: 2 stars ⭐⭐
This stage had the most problems and also lasted the longest. Developers wasted a great deal of energy building, maintaining, and updating their development environments. After writing code, they had to compile it, package a Docker image, push it to the image registry, pull it on the virtual machine, restart the container, and wait for startup to finish before they could see the code run. The code-self-test feedback loop rose to nearly 10 minutes, which forced developers to write a lot of code blindly before they could run and debug it even once.
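To see why a long loop pushes developers toward writing code blindly, it helps to convert the loop times reported for stages 1 to 4 into the number of loops that fit into one hour. The short calculation below uses only the durations quoted in this article. It is an upper bound that ignores the time spent actually writing code, and the names in it are made up for the example.

```python
# Feedback-loop durations in seconds, as reported for stages 1 to 4 above.
LOOP_SECONDS = {
    "Stage 1": 10,
    "Stage 2": 30,
    "Stage 3": 2 * 60,
    "Stage 4": 10 * 60,
}


def loops_per_hour(loop_seconds):
    """Upper bound on feedback loops in one hour, ignoring time spent coding."""
    return 3600 // loop_seconds


for stage, seconds in LOOP_SECONDS.items():
    print(f"{stage}: {seconds} s per loop, at most {loops_per_hour(seconds)} loops per hour")
```

At roughly 10 minutes per loop, a developer gets at most six chances per hour to see code run, compared with several hundred in stage 1.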
Stage 5: High-spec development computer + cloud host
Fast forward to 2019, when CODING began to use cloud hosts provided by Tencent Cloud for development and testing. The company got rid of many of its leftover cabinet servers and desktops, the office became quiet again, and the development and test environments became stable. But stability and quiet were the only benefits the cloud hosts brought; other aspects of the development experience changed little. As the business grew more complex, 32 GB of memory was no longer enough to run CODING, so i9 + 64 GB desktop computers appeared all over the office.
Basic conditions at this stage
The CODING development environment at that time
- Number of developers: around 120
- Application architecture: multiple Java/Python/Ruby/Golang/PHP back ends + React.js front end (front end and back end separated)
- IDE: a variety of IDEs and editors
- Build method: CODING CI + CODING artifact repository
- Deployment mode: Kubernetes (TKE)
- Development environment: cloud host + docker-compose/minikube
- Startup time of the CODING service: 40 minutes
Industry developments in cloud computing and technical architecture over the same period
- Kubernetes was becoming the de facto container orchestration standard
- Service Mesh was emerging
- CI/CD and canary (grayscale) releases were on the rise
- The concept of cloud native applications was proposed
Existing problems
- Developing across a local PC and a cloud host is awkward (code is written locally but runs on the cloud host, with issues around networking, interfaces, storage, and so on)
- Developers need a high level of skill (everyone has to know how to use and administer Linux)
- Resources are not elastic enough to cope with peaks and troughs in demand
- Development and test environments differ greatly from production environments (docker-compose or Minikube for development, TKE for production)
- Kubernetes YAML and Docker-compose configurations for a large number of services are difficult to manage
Development experience score: 1 star ⭐
Cloud hosts were significantly more stable than the office machine room, but in essence a more stable Linux server could neither help developers set up a CODING development environment quickly nor speed up the code-self-test feedback loop. Developers still had to write code, build, package an image, push the image, pull the image remotely, and restart the container to see the effect of what they had written. As the system grew more complex, the code-self-test feedback loop of some CODING services kept getting longer, even with more powerful and more stable cloud hosts.
For some parts of the business, the loop reached a staggering one hour.
In fact, this stage differed little from the previous one: cloud hosts simply replaced the LAN virtual machines. The main reason the feedback loop grew so much was the sharp increase in the number of microservices. CODING's 150 microservices depend on one another's startup order. If a service's dependencies have not started, the Pods of the downstream service fail to start, and each failure causes Kubernetes to lengthen the restart interval, so it takes a long time for all services to come up. If we intervene so that services start in dependency order rather than in arbitrary order, the overall startup time can be reduced significantly.
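The idea of starting services in dependency order can be illustrated with a topological sort. The sketch below is not how CODING or Nocalhost implements it: the service names and the dependency map are invented for the example. The back-off parameters (a 10-second initial delay that doubles after each failure, capped at five minutes) follow the container restart back-off described in the Kubernetes documentation and are taken as an assumption here.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical services; each key lists the services that must be up first.
DEPENDS_ON = {
    "mysql": [],
    "redis": [],
    "user-service": ["mysql", "redis"],
    "repo-service": ["mysql"],
    "gateway": ["user-service", "repo-service"],
}


def startup_waves(depends_on):
    """Group services into waves; each wave depends only on earlier waves."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


def backoff_wait(failed_restarts, base=10, cap=300):
    """Seconds spent waiting across N consecutive failed restarts.

    Assumes the delay starts at `base`, doubles after each failure,
    and never exceeds `cap`.
    """
    return sum(min(base * 2 ** i, cap) for i in range(failed_restarts))


if __name__ == "__main__":
    for number, wave in enumerate(startup_waves(DEPENDS_ON), start=1):
        print(f"wave {number}: {', '.join(wave)}")
    print("waiting after 6 failed restarts:", backoff_wait(6), "seconds")
```

Services in the same wave can start in parallel, and no service starts before its dependencies, so none of them spends time in restart back-off. Under the assumed parameters, a Pod that fails six times in a row has already waited about ten minutes in back-off alone.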
Stage 6: Development computer + cloud container cluster + Nocalhost
By 2020, CODING was determined to solve this problem once and for all. The result was Nocalhost, an open source product launched in December. Nocalhost is designed to make cloud native applications easier to develop and debug, and it currently supports rapid deployment, development, and debugging of Kubernetes-based applications. CODING's back-end developers now develop CODING with Nocalhost, and the underlying infrastructure is a large-scale container cluster on Tencent Cloud's TKE. CODING has moved its development environment to the cloud and achieved a five-star development experience.
In general, CODING maintains a large Kubernetes cluster (TKE) and uses Nocalhost to allocate a space in it to each developer, in which the developer can deploy CODING at any time. After deployment, developers select the microservice they want to work on and switch it to development mode. With the IDE connected directly to the cluster, they modify code and, through HotReload, see the result immediately. For dynamic languages such as PHP and Python, which naturally support fast HotReload, the code-self-test feedback loop drops to about 1 second: a change takes effect as soon as the file is saved.
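For a change to take effect on save, only the files that actually changed need to reach the container. The sketch below shows one generic way to work that out, by comparing content hashes of two snapshots of a source directory. It is an illustration of the general idea only and makes no claim about how Nocalhost synchronizes files; all function and file names are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def snapshot(root):
    """Map each file under root (relative POSIX path) to a SHA-256 of its content."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(before, after):
    """Return (to_upload, to_delete) needed to bring a remote copy up to date."""
    to_upload = sorted(p for p, digest in after.items() if before.get(p) != digest)
    to_delete = sorted(p for p in before if p not in after)
    return to_upload, to_delete


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        src = Path(workdir)
        (src / "app.py").write_text("print('v1')\n")
        (src / "old.py").write_text("# unused\n")
        before = snapshot(src)

        (src / "app.py").write_text("print('v2')\n")  # developer saves an edit
        (src / "old.py").unlink()                     # developer deletes a file
        (src / "new.py").write_text("# added\n")      # developer adds a file
        after = snapshot(src)

        print(changed_files(before, after))
```

Because only the difference is transferred, the cost of each save is proportional to the size of the edit rather than to the size of the image, which is what removes the build, push, and pull steps from the loop.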
Basic conditions at this stage
The current CODING development environment
- Number of developers: around 200
- Application architecture: hundreds of Java/Python/Ruby/Golang/PHP back ends + React.js front end (front end and back end separated)
- IDE: a variety of IDEs and editors
- Build method: CODING CI + CODING artifact repository
- Deployment mode: Kubernetes (TKE)
- Development environment: Nocalhost + Kubernetes (TKE)
- Startup time of the CODING service: 4 minutes
The current state of the cloud computing and technical architecture industry
- Istio has become the most popular Service Mesh solution
- Serverless Kubernetes is on the rise
- The complexity of Kubernetes is attracting attention across the industry, and people are starting to look at Kubernetes from a developer's point of view
- Serverless, as the next generation of cloud computing technology, is attracting attention
Existing problems
- Resources are still not elastic enough, so costs are slightly high. (In the future, we will try Serverless Kubernetes to save costs during the overnight trough)
- Shared services are hard to share, and deploying a full set of them for each person is wasteful. (Nocalhost will address this in the future with a Service Mesh solution)
Development experience score: 5 stars ⭐⭐⭐⭐⭐
This solution is highly practical. A team does not need to buy hardware, nor does it need to master complex machine-room networking, virtualization management software, or Kubernetes cluster maintenance. It only needs to create a Kubernetes cluster dedicated to development with a cloud service provider and install Nocalhost to move its development environment to the cloud.
The only drawback of this solution is the slightly higher cost, but we believe that as cloud technology develops and elasticity becomes finer-grained, costs will eventually come down, and the development environments of cloud native applications will also live in the cloud. This solution has greatly improved our team's development efficiency. Even for CODING, a behemoth of 150 microservices, any novice programmer can bring up the whole environment within 5 minutes and get a code-self-test feedback loop measured in seconds. For developers, that is a five-star experience. Nocalhost controls the service startup order to keep application deployment fast, and switches microservices inside the cluster directly into development mode to keep the environment consistent. Automatic code synchronization and HotReload greatly shorten the code-self-test loop.
The following figure shows an example of the CODING artifact product being developed in IDEA with Nocalhost.
Conclusion
The simplicity of a monolithic application stands in natural contrast to the complexity of a microservice application. As business, technology, and the industry develop, the move to microservices is an inevitable trend. In that process, the developer's working experience is often sacrificed for the sake of operational safety and stability. As a company committed to serving developers, CODING is serious about making development easier. As we continue to solve these problems, we are also actively giving back to the developer community. Nocalhost is an open source, vendor-neutral project, and your contributions are welcome.
- Nocalhost official website: nocalhost.dev
- Nocalhost GitHub: github.com/nocalhost
The Nocalhost team is always looking for talented people. If you are interested in serving developers and building a cloud native open source ecosystem, you can send your resume to [email protected]