Author | A Xiao   Source | Erda public account

Background story

In early 2017, we started building Terminus’s own PaaS platform on top of DC/OS (Mesos + Marathon). The core mission was to solve the company’s efficiency problems in software development, deployment and delivery, and to build technical platforms for R&D efficiency (DevOps), intelligent monitoring and operations (APM, Monitoring) and more. DC/OS actually works pretty well:

  • First, it is integrated into a complete container platform, so we do not need to pay much attention to individual modules such as networking.
  • By comparison, K8s at present relies mainly on Helm for this.

The biggest problem with DC/OS, however, is that its user base is still too small: when something goes wrong, we have to investigate and fix it ourselves. (We reckon our team is one of the most professional DC/OS operations and maintenance teams in China, and the denominator is no more than 3.)

Here is a story our team remembers vividly. One Saturday in the summer of 2018, we went into the office to work overtime in support of a business team’s launch, and the company’s entire DC/OS network went down! We had planned to go out for dinner once the launch was secured, but we ended up staying up all night instead. Through the night we tried everything we could find and everything we could think of, and none of it worked; there was still no way to restore the network. Eventually we decided to wipe the network metadata of the entire cluster and rebuild the network, and that really did work (in fact, it took a long time just to figure out how to “clean up all the metadata and rebuild the network”).

All right, back to business. The DC/OS-based PaaS completed its production deployment for the first official customer in April 2018 and has since been delivered to several more customers. After stumbling through 2018, we decided to make the transition to K8s in 2019.

The development of Erda

On the first day of the PaaS platform’s architectural design, we decided that the underlying container platform (DC/OS) should be hidden behind an abstraction. We therefore adopted the concept of Infrastructure as Code and defined a specification, dice.yaml (Dice is the name of the PaaS platform). The core of dice.yaml is to define and use infrastructure resources in an “application-centric, developer-centric” way. “App-centric” is not a new concept: Heroku came up with it and put it into practice. We just took it a little further.
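
One way to picture this decision is a small interface with interchangeable back ends. The sketch below is only an illustration and makes several assumptions: `AppSpec`, `ContainerPlatformPlugin`, the two plug-in classes and the `deploy` helper are hypothetical names, the fields do not reflect the real dice.yaml schema, and the plug-ins return descriptive strings instead of calling any platform API.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class AppSpec:
    """Hypothetical application-centric description (NOT the real dice.yaml schema)."""
    name: str
    image: str
    replicas: int = 1


class ContainerPlatformPlugin(ABC):
    """Hypothetical interface that hides the underlying container platform."""

    @abstractmethod
    def deploy(self, spec: AppSpec) -> str:
        """Translate the application spec into a platform-specific deployment."""


class MarathonPlugin(ContainerPlatformPlugin):
    # Illustrative only: a real plug-in would call the platform's API.
    def deploy(self, spec: AppSpec) -> str:
        return f"marathon app /{spec.name}: image={spec.image}, instances={spec.replicas}"


class KubernetesPlugin(ContainerPlatformPlugin):
    # Illustrative only: a real plug-in would create the platform's resources.
    def deploy(self, spec: AppSpec) -> str:
        return f"kubernetes Deployment {spec.name}: image={spec.image}, replicas={spec.replicas}"


# Registry of available back ends; supporting a new platform means adding an entry.
PLUGINS = {
    "dcos": MarathonPlugin(),
    "k8s": KubernetesPlugin(),
}


def deploy(spec: AppSpec, platform: str) -> str:
    """Deploy the same application spec to whichever platform is selected."""
    try:
        plugin = PLUGINS[platform]
    except KeyError:
        raise ValueError(f"no plug-in registered for platform {platform!r}") from None
    return plugin.deploy(spec)


if __name__ == "__main__":
    spec = AppSpec(name="shop", image="shop:1.0", replicas=2)
    print(deploy(spec, "dcos"))
    print(deploy(spec, "k8s"))
```

The point of the design is that the application description never mentions the platform, so changing the container platform only requires a new plug-in, not changes to every application.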

“Application-centric, developer-centric” is very important for Terminus. We have to let business developers focus on the business itself, not on questions such as: What is a container? What are K8s Deployment, StatefulSet, Service, Ingress, Pod and so on? What kind of network is the application running on? Nor should they have to worry about hooking applications up to monitoring, monitoring coverage, diagnostic analysis and the like. In fact, this matters for most enterprises.

The design of “abstracting and shielding the underlying container platform” not only greatly lowers the threshold for business development, it also brought great convenience to our own team. This showed most clearly in the move to K8s, which only required developing a dedicated K8s plug-in (a plug-in for our PaaS platform, not for K8s). In fact, before supporting K8s, we had already worked with Alibaba Cloud’s EDAS to develop an EDAS plug-in that supports deploying applications to EDAS, and this on-EDAS model has gradually been delivered to several customers. With the container platform turned into a plug-in, it is also very easy to connect to OpenShift, Rancher and so on, which is very friendly to many enterprise private cloud environments.

From supporting self-built K8s, to Alibaba Cloud container services, and then AWS and Azure, we have supported more and more customers along the way. We found that, even today, creating an enterprise IT environment (test, production) from scratch on a public cloud is still a lot of work and requires very specialized skills covering ECS, networking, storage, databases, SLB, middleware, container services, security and so on. For some of our customers, we helped create the cloud environment from zero. So we needed to improve efficiency and automate all of this, and we built the entire automation process on Terraform. As a product team, we productize everything we do, and from the start we built this Terraform-based automated cloud resource orchestration process into our cloud platform product.

As we went on, we developed more and more functions and covered more and more areas, including edge computing management, a fast data platform and mobile development. The original demand for these features came from Terminus’s own real scenarios; POS applications, for example, are an edge scenario.

Over the past four years, what has gratified us most is not how many product features we have built. It is that since mid-2018 we have released one version every month and upgraded every customer to the latest version, without ever maintaining a forked code branch. That includes the big migration and upgrade from DC/OS to K8s mentioned above (which I personally think was a great thing to do). To date, our platform manages applications across hundreds of K8s clusters, and one of our goals is to “become the platform that manages the largest number of K8s applications”.

Why do we open source?

At the beginning of 2021 (before Chinese New Year), while we were working on our plan for 2021, we began to realize that we had done a lot of things and built a strong product. But we had been so heads-down on product development that we had completely neglected other important things: sharing with more people, giving back to the community, and building influence. So, after the Spring Festival, our team went all out to open source the entire platform. In doing so, we were very clear that we wanted to open source completely: no so-called advanced (private) features held back, no internal code branches, and the whole team’s daily iterative development moved to GitHub. We want to be in this for the long term, spending three years, five years, maybe even longer to build a good enough open source community and a top open source project.

There are many great open source projects around the world, but most of them are tools: front-end JS libraries, development frameworks, basic server software and the like. Projects that open source a product directly are still rare. I personally think open source products will gradually become a new trend in this field. Society is sure to evolve toward higher efficiency and lower cost; tools and technologies will become true infrastructure, which most people will not need to focus on. Facing users directly and solving the direct (first-order) problems of the majority of users is what maximizes social value.

Open sourcing a product directly also brings us new troubles. We hope and welcome people to use the open source product for free, but we certainly do not want some company or organization to directly sell our open source product (including as a service), so we decided to open source under the AGPLv3 license. Our purpose in choosing AGPLv3 is not to restrict internal use by individuals or enterprises, but to prevent resale of the software or services.

We have been calling our PaaS platform Dice for four years, and it has reached version 4.0, so we were really torn about whether the first open source release should be 4.0 or 1.0, since it could look very strange to the community. In the end, we renamed Dice to Erda, in keeping with the naming of all our products. Erda is the nickname of the Earth in the novel Foundation; Terminus, Erda, Trantor, Gaia and so on are all derived from the names of planets in the novel. With the rename complete, the new name was a fresh start, so it was only natural for the open source releases to start from version 1.0.

The Erda open source project has a small vision: “To be able to build any kind of application on the Erda platform and continuously improve application development efficiency; to deploy and distribute applications to any cloud, anywhere; and to continuously monitor, diagnose and manage applications based on the application.”

A quick guide to the Erda public account

As an open source, one-stop cloud native PaaS platform, Erda aims to provide developers with a stable, reliable, comprehensive and ecosystem-compatible platform, together with best practices. We also hope to exchange ideas with developers and make progress together. That is the original intention behind the [Erda] public account: to provide a gathering point for technical communication among developers. Here, in addition to sharing ideas freely, you will regularly find the following content that we prepare:

  • Product updates
    • Latest technical progress
    • Industry trend analysis
    • Technical interpretation
    • Best practices
  • Technology sharing
    • Related technical interpretation
    • Illustrated recaps of industry conference talks
  • Event sharing
    • Event updates
    • Technical livestream announcements
  • · · · · · ·

If there is other content you would like to see, you are welcome to leave a message in the public account’s backend, or add our assistant on WeChat (Erda202106) to join the discussion group!

Welcome to open source

After four years of wind and rain, we have finally made it across the mountains and rivers. As an open source, one-stop cloud native PaaS platform, Erda has platform-level capabilities such as DevOps, microservice observability and governance, multi-cloud management and fast data governance. Click the links below to take part in the open source project, discuss and exchange ideas with other developers, and help build the open source community. You are welcome to follow the project, contribute code and give it a Star!

  • Erda GitHub: https://github.com/erda-project/erda
  • Erda Cloud website: https://www.erda.cloud/