Author | KubeVela project maintainers | Alibaba Cloud Native official account

As the implementation of OAM (Open Application Model) on Kubernetes, the KubeVela project evolved out of OAM-Kubernetes-Runtime less than half a year ago, yet its momentum has been remarkable. It has topped the GitHub Go trending list and reached the front page of Hacker News, and it has rapidly gained users around the world, including MasterCard, Springer Nature, Fourth Paradigm, SILOT, Upbound, and others, with commercial products such as Oracle Cloud and Napptive built on top of it. At the end of March 2021, the KubeVela community announced the release of v1.0 with all APIs stable, officially beginning its march toward enterprise-grade production readiness.

If you don’t follow the cloud native space closely, however, you may not know much about KubeVela yet. Don’t worry: this article takes the v1.0 release as an opportunity to walk you through the project’s development, interpret its core ideas and vision, and help you understand where this rising star among cloud native application management platforms is headed.

First of all, what is KubeVela?

In short, KubeVela is a “programmable” cloud-native application management and delivery platform.

But what is “programmable”? How does it relate to Kubernetes? What problems can it help us solve?

The “Capability Dilemma” of PaaS Systems

PaaS systems (Cloud Foundry, Heroku, etc.) have been praised since their inception for their simple, efficient application deployment experience. Yet today’s “cloud native” world is one where Kubernetes dominates. So what problems did PaaS (including Docker) ultimately run into?

In fact, anyone who has seriously tried to use a PaaS will have run into a fundamental flaw of these systems: the “capability dilemma” of PaaS.

Figure 1 – Capability dilemma for PaaS systems

As Figure 1 shows, a PaaS tends to deliver a very good experience at the beginning, solving problems just right. But over time a nasty situation emerges: the application’s demands begin to outgrow what the PaaS can provide. Worse, once this happens, users’ satisfaction with the PaaS falls off a cliff, because either redeveloping the platform to add features or modifying the application to fit the platform is a huge investment with low returns. Worse still, everyone starts to lose faith in the platform: who knows how soon the next overhaul of the system or the application will be needed?

This Achilles’ heel is arguably the main reason why PaaS failed to become mainstream despite having all the elements required for cloud native.

Kubernetes, by contrast, stands out precisely here. It has long been criticized as “complex,” but its benefits start to show as the complexity of your application grows, especially once you need capabilities delivered through CRDs and controllers. That is when you’ll be glad you chose Kubernetes.

The reason is that Kubernetes is, in essence, a powerful and robust platform for plugging in infrastructure capabilities, which is why it is called “the platform for platforms.” Its APIs and way of working are not naturally suited to direct use by humans, but they can tap into any infrastructure capability in a highly consistent manner, giving platform engineers unlimited ammunition for building upper-layer systems such as PaaS. This pluggable approach to infrastructure capabilities makes even the most sophisticated PaaS look like a restrictive toy, which is exactly what many large enterprises struggling to build in-house application platforms (the users PaaS vendors most want to win) have been waiting for.

Cloud native PaaS: Old wine in new bottles

The previous point matters: when a large enterprise decides to adopt a PaaS or Kubernetes, it is usually the platform team that makes the call. But the fact that the platform team’s opinion is important does not mean the end users’ opinion can be ignored. In any organization, it is the business teams that directly create value whose voice ultimately carries the most weight; it just tends to be heard a little later.

So in the vast majority of cases, a platform team that adopts Kubernetes will not ask the business teams to learn Kubernetes directly; instead, it will build a “cloud native” PaaS on top of Kubernetes and use that to serve the business side.

And so, after going around in circles, we are back where the story began. The only change is that today’s PaaS is built on Kubernetes, which admittedly makes it much easier to build.

But what about the reality?

The story of building a PaaS on Kubernetes sounds beautiful, but the process is inevitably a sad one. Although we call it PaaS development, 80% of the work goes into designing and building the UI, and the rest into installing and operating Kubernetes plug-ins. More regrettably, the PaaS built this way is not fundamentally different from the old ones: whenever users’ demands change, we have to spend a lot of time redesigning, modifying the front end, and scheduling a release. As a result, the ever-evolving Kubernetes ecosystem and its infinitely extensible nature end up “sealed away” beneath the PaaS we built ourselves. Finally, one day, the business side cannot help asking: what value does your platform team add on top of Kubernetes?

This dilemma, where a new PaaS introduced to escape the inherent limitations of the old one ends up imposing restrictions of its own, is a core problem many companies hit when adopting cloud native technology. Once again, we seem to be locking users into a fixed set of abstractions and capabilities. The benefit of cloud native is reduced to “it’s easier for us to build the platform ourselves,” which doesn’t mean much to business users.

To make matters more delicate, the introduction of cloud native and Kubernetes also puts operations engineers in an awkward position. Their knowledge of operational best practices used to be among the most important experience and assets in the whole company; after the enterprise goes cloud native, Kubernetes takes over much of that job. Many people say Kubernetes is putting “operations” out of business, which is an exaggeration, but it does reflect the anxiety this trend has created. And it raises another question: how should operational experience and best practices be applied in a cloud native context? Take a workload as simple as a Kubernetes Deployment object: which fields are exposed to users and which are hidden may be reflected in the PaaS UI, but it certainly should not be decided by front-end developers.

KubeVela: Next generation programmable application platform

Alibaba is one of the industry’s pioneers in cloud native technology, so the application-platform problems described above surfaced there relatively early. At the end of 2019, Alibaba’s infrastructure technology team, together with its R&D efficiency team, explored many approaches to this problem and eventually proposed the idea of a “programmable” application platform, presented to the community as the OAM and KubeVela open source projects. This system has quickly become the mainstream way application platforms are built inside Alibaba.

To put it simply, “programmability” means that when building the upper-layer platform we do not stack abstractions (or even a UI) on top of Kubernetes itself. Instead, we use the CUE template language to abstract, manage, and expose the capabilities the infrastructure provides, as code.

For example, one of Alibaba’s PaaS platforms provides users with a capability called Web Service: any service that needs to be accessed from outside is deployed as a Kubernetes Deployment plus a Service, with configuration items such as the image and port exposed to users.

In the traditional approach, we might implement a CRD called WebService and encapsulate the Deployment and Service in its controller. But this inevitably leads straight back to the PaaS “capability dilemma” described earlier:

  1. How many Service types should we expose to users? What if users later want other types?
  2. What if the fields that user A and user B need exposed are different? For example, user B is allowed to modify labels but user A is not. How should the PaaS be designed then?

In KubeVela, a user-facing capability like the one above can be described with a simple CUE template (here is a full example). Once the CUE file is written, it can be registered with a single kubectl apply:

$ kubectl apply -f web-service.yaml
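For reference, the contents of a file like web-service.yaml are a ComponentDefinition whose CUE template declares both the parameters exposed to users and the Kubernetes objects generated from them. The abridged sketch below is only illustrative (the default port, label keys, and description text are assumptions); consult the KubeVela documentation for the complete schema:

apiVersion: core.oam.dev/v1beta1
kind: ComponentDefinition
metadata:
  name: webservice
  annotations:
    definition.oam.dev/description: "Long-running service reachable from outside the cluster"
spec:
  workload:
    definition:
      apiVersion: apps/v1
      kind: Deployment
  schematic:
    cue:
      template: |
        // Parameters exposed to end users (and rendered as a form in the PaaS UI)
        parameter: {
          image: string
          port:  *80 | int
        }
        // The Kubernetes object rendered from those parameters
        output: {
          apiVersion: "apps/v1"
          kind:       "Deployment"
          spec: {
            selector: matchLabels: "app.oam.dev/component": context.name
            template: {
              metadata: labels: "app.oam.dev/component": context.name
              spec: containers: [{
                name:  context.name
                image: parameter.image
                ports: [{containerPort: parameter.port}]
              }]
            }
          }
        }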

More importantly, KubeVela automatically generates help documentation and a front-end form structure for the capability based on the content of the CUE template, so users immediately see in the PaaS how the WebService capability is used (its parameters, field types, and so on) and can use it directly, as shown in Figure 2 below:

Figure 2 – KubeVela automatically generates forms

In KubeVela, all of the platform’s capabilities, such as canary release, Ingress, and Autoscaler, are defined, maintained, and exposed to users in this way. This end-to-end link between the user-experience layer and the Kubernetes capability layer lets the platform team implement a PaaS, or any upper-layer platform (an AI PaaS, a big data PaaS), quickly and at very low cost, while responding efficiently to users’ continuously evolving needs.
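To give a sense of what a business user ultimately submits (whether through the generated form or directly as YAML), here is a sketch of a KubeVela Application that combines the webservice component above with operational capabilities attached as traits. The trait names and fields below follow examples from the v1.0 documentation and may differ in your installation; the image and domain are placeholders:

apiVersion: core.oam.dev/v1beta1
kind: Application
metadata:
  name: website
spec:
  components:
    - name: frontend
      type: webservice          # the capability registered by the platform team
      properties:
        image: nginx:1.20       # placeholder values
        port: 80
      traits:
        - type: ingress         # operational capabilities are attached as traits
          properties:
            domain: frontend.example.com
            http:
              "/": 80
        - type: cpuscaler
          properties:
            min: 1
            max: 5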

1. Not just Kubernetes native, platform-as-code

Most importantly, at the implementation layer KubeVela does not simply render CUE templates on the client side; it uses a Kubernetes controller to render the templates and maintain the generated API objects. There are three reasons for this:

  1. A Kubernetes controller is a natural fit for maintaining the mapping between user-level abstractions and underlying resources, and its reconcile loop continuously ensures consistency between the two, so the configuration drift that plagues typical IaC (Infrastructure-as-Code) systems does not occur.
  2. The CUE template written by the platform team is kubectl-applied to the cluster and becomes a Kubernetes Custom Resource representing an abstracted, modular platform capability. That capability can be reused by platform teams across the company and continue to evolve, and because it is a namespaced resource, different tenants of the platform can be given different templates with the same name without affecting each other, which completely solves the problem of different tenants wanting different behavior from the same capability.
  3. If, over time, users need new features from the platform, the platform team simply installs a new template and the new design takes effect immediately, without any changes, restarts, or redeployments of the platform itself; the new template is instantly rendered as a form and appears in the user’s UI (see the short illustration after this list).
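To illustrate the point, once applied these templates are just namespaced Custom Resources, so inspecting and evolving platform capabilities comes down to ordinary kubectl operations (the vela-system namespace below is the default installation namespace and is an assumption about your setup):

# List the capability templates currently registered on the platform
$ kubectl get componentdefinitions,traitdefinitions -n vela-system

# Evolve a capability: applying a new version of the template takes effect
# immediately, with no rebuild or redeployment of the platform itself
$ kubectl apply -f web-service.yaml -n vela-system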

Traditional IaC systems offer a pleasant user experience but have long been considered “unreliable” in production; KubeVela’s design fundamentally solves this, and in most cases shrinks the time it takes the platform to respond to user needs from weeks to hours, completely breaking down the barrier between cloud native technology and the end-user experience. Because it is implemented entirely in the Kubernetes-native way, the whole platform remains strictly robust, and any CI/CD or GitOps tool that supports Kubernetes supports KubeVela with no integration cost.

This system is popularly known as platform-as-code.

2. Don’t worry, KubeVela certainly supports Helm

When it comes to KubeVela and CUE templates, many people begin to ask: What is the relationship between KubeVela and Helm?

In fact, Helm, like CUE, is a way to encapsulate and abstract Kubernetes API resources; it just uses the Go template language instead. That makes it a natural fit for KubeVela’s platform-as-code design.

So in KubeVela v1.0, any Helm package can be deployed as an application component, and, more importantly, all of KubeVela’s capabilities apply to Helm components just as they do to CUE components. Delivering Helm packages through KubeVela therefore gives you some important capabilities that are hard to obtain with existing tools.

For example, most Helm packages come from third parties: a Kafka chart is probably made by the company behind Kafka. In general you can only consume such a chart as-is; if you change its templates, you end up maintaining the modified chart yourself.

In KubeVela, this problem is easily solved. Specifically, KubeVela provides an operational capability called patch, which lets you declaratively patch the resources encapsulated in a component to be delivered (such as a Helm package), regardless of whether the field in question is exposed by the chart’s templates. The patch is applied after Helm has rendered the resource objects and before they are submitted to the Kubernetes cluster, so the component instance is not restarted.
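As a rough sketch of what such a patch looks like in practice, the trait below (the name virtualgroup and the label key are invented for illustration) declaratively adds a label to the workload rendered by a component, whether that workload comes from a CUE template or a Helm chart. The exact schema may differ between KubeVela versions:

apiVersion: core.oam.dev/v1beta1
kind: TraitDefinition
metadata:
  name: virtualgroup            # hypothetical trait name, for illustration only
spec:
  appliesToWorkloads:
    - deployments.apps
  schematic:
    cue:
      template: |
        // Patch the rendered workload, even if the chart never exposes this field
        patch: {
          spec: template: metadata: labels: {
            "app.example.com/group": parameter.group
          }
        }
        parameter: {
          group: string
        }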

Another example: with KubeVela’s built-in progressive delivery capability (the AppRollout object), you can roll out a Helm package as a whole, in batches, regardless of the workload type inside it (KubeVela can run a grayscale rollout even if the chart contains an Operator), rather than being limited to a single Deployment workload as controllers such as Flagger are. In addition, if you integrate KubeVela with Argo Workflows, you can easily express more complex behavior, such as the order and topology in which multiple Helm packages are released.
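For reference, such a progressive rollout is driven by an AppRollout object roughly shaped like the sketch below. The field names follow the v1.0 rollout examples but should be treated as approximate, and the application revision and component names are placeholders:

apiVersion: core.oam.dev/v1beta1
kind: AppRollout
metadata:
  name: kafka-rollout
spec:
  sourceAppRevisionName: kafka-app-v1   # revision currently running (placeholder)
  targetAppRevisionName: kafka-app-v2   # revision being rolled out (placeholder)
  componentList:
    - kafka                             # the Helm-based component to roll out
  rolloutPlan:
    rolloutBatches:                     # release in two batches
      - replicas: 50%
      - replicas: 50%
    batchPartition: 0                   # pause after the first batch until raised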

So KubeVela v1.0 not only supports Helm; it aims to be the most powerful platform for delivering, releasing, and operating Helm charts. Some community members tried this feature out before this article was published; you can read about their experience in that article.

3. A fully self-service user experience, and operations in the cloud native era

Thanks to the platform-as-code design, a KubeVela-based application platform is naturally self-service for its users, as shown in Figure 3.

Figure 3 – KubeVela self-service capability delivery flowchart

Specifically, the platform team can maintain a large number of codified “capability templates” at minimal human cost. As the platform’s end users, business teams only need to pick a few templates on the PaaS UI according to their deployment requirements and fill in the parameters to complete a self-service delivery. No matter how complex the application, the learning cost for business users stays very low, and by default they follow the specifications defined in the templates. The deployment and operation of the application is then managed automatically by Kubernetes, lifting a great deal of mental burden from business users.

More importantly, this mechanism makes operations engineers a central role in the platform team once again. They design and write capability templates in CUE or Helm and install them into the KubeVela system for the business teams to use. If you think about it, this is a process in which operations engineers codify the business’s demands on the platform, together with the platform’s best practices, into reusable, customizable capability modules. In doing so, they do not need to perform complex Kubernetes customization or development; understanding the core concepts of Kubernetes is enough. These codified capability modules are highly reusable and easy to change and roll out, in most cases without additional development cost. This is arguably the most agile form of “cloud native” operations, and it lets the business truly feel the core cloud native value of integrating development, delivery, and operations for higher efficiency.

4. Multi-environment, multi-cluster, multi-version application delivery

Another major update in KubeVela v1.0 is an improved deployment architecture: the system now offers a control-plane mode, enabling versioned application delivery across multiple environments and clusters. A typical production KubeVela deployment now looks like Figure 4 below:

Figure 4 – KubeVela deployment in control plane mode

In this mode, KubeVela supports multi-environment application descriptions, placement strategies for applications, and a grayscale traffic model that, through Istio, allows multiple versions of an application to be deployed online at the same time. You can read more about it in this document.

After v1.0, KubeVela will continue to evolve around this architecture. One of the major work items is to migrate the KubeVela Dashboard, CLI, and Appfile so that they interact with the KubeVela control plane via gRPC, rather than talking directly to the target cluster as in previous versions. This work is still in progress, and anyone with experience building next-generation programmable developer experiences is welcome to join us. In parallel, Springer Nature, a leading European scientific publisher, is working on a smooth migration from Cloud Foundry to KubeVela.

Conclusion

If we summarize KubeVela’s design and capabilities today, it is not hard to see that it represents an almost inevitable path for cloud native application platforms:

  1. It is built entirely on Kubernetes, so it inherits Kubernetes’ integration power and universality; it fully embraces Kubernetes and its ecosystem’s capabilities rather than layering abstractions on top of them;
  2. It modularizes platform capabilities as code (X-as-code) and uses the OAM model to package, abstract, and assemble capabilities at very low cost, so the platform can respond to user needs quickly and agilely and provide a self-service, lock-in-free application management and delivery experience;
  3. It unpacks components and deploys applications through the Kubernetes controller pattern, ensuring the eventual consistency and robustness of the application delivery and operations process;
  4. Its built-in application-level rollout strategies and multi-environment, multi-cluster delivery strategies greatly complement the community’s current single-workload-centric rollout capabilities;
  5. No matter how complex the deployment, an application can be described with only one or two Kubernetes YAML files, which naturally fits and greatly simplifies GitOps workflows and dramatically lowers the barrier for end users getting started with cloud native and Kubernetes, without introducing any capability or abstraction lock-in.

More importantly, based on the platform-as-code idea, KubeVela also proposes a more reasonable way to organize a future cloud native application platform team:

  1. The platform SRE is responsible for the robustness of the Kubernetes cluster and components;
  2. Platform development engineers develop CRD controllers which, together with Kubernetes’ built-in capabilities, provide complete application management and operations infrastructure capabilities to the application layer;
  3. Business operations engineers, working from business demands, codify best practices into CUE or Helm templates, turning platform capabilities into modules;
  4. Business users use the platform’s modular capabilities to manage and deliver applications in a completely self-service manner, with low mental burden and high deployment efficiency.

Based on this system, the KubeVela application platform can also be used to build a powerful “undifferentiated” application delivery pipeline, achieving a completely environment-independent cloud application delivery experience:

  1. Component providers define the capabilities required for application delivery (workloads, operational behaviors, cloud services) as Helm packages or CUE packages and register them with the KubeVela system;
  2. Application deliverers use KubeVela to assemble these modular capabilities into a completely infrastructure-independent application deployment description, while leveraging KubeVela’s patch capability to customize and override the component providers’ configurations or to define complex deployment topologies;
  3. The multi-environment and multi-cluster delivery model defines application deployment modes and delivery policies in different environments, and configures traffic allocation policies for application instances of different versions.

The release of KubeVela v1.0 is the result of extensive validation of the OAM model against cloud native application delivery scenarios, and it represents not only stable APIs but also a mature usage paradigm. It is not an end, however, but a new beginning: it opens the door to a “programmable” application platform, an effective way to unleash the full potential of cloud native technology so that end users and software deliverers can enjoy it from day one. We hope this project fulfills its modest vision: make shipping applications more enjoyable!

To learn more

You can learn more about KubeVela and the OAM project in the following materials:

  1. Project code base: github.com/oam-dev/kub… Star/Watch/Fork welcome!
  2. Project official homepage and documentation: kubevela.io/; you are also welcome to join the “Cloud… KubeVela documentation Chinese localization work!
  3. Project DingTalk group: 23310022; Slack: CNCF #kubevela channel.