Why Do We Need Automation in the Cloud?


Directed acyclic graph (DAG)

A directed acyclic graph (DAG) is a directed graph that contains no cycles. It is often used to represent dependencies between events and to schedule tasks.

An example of a directed acyclic graph

Topological sorting of the nodes is the operation most often applied to a directed acyclic graph, and our system prototype is built on this theoretical basis. A topological sort determines an order for all elements that respects the DAG's dependencies. The specific algorithm is easy to find online or in textbooks, so it is not covered in detail here. Once the elements are sorted, the implementation initializes the lower-level elements first, then the upper-level ones, until all elements are initialized. This is the theoretical basis of our orchestration system model.
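As a minimal sketch of what a topological sort produces, the snippet below uses the `graphlib` module from the Python standard library (Python 3.9 or later). The element names and their dependencies are hypothetical and chosen only for illustration; they are not taken from the prototype.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency map: each key lists the elements it depends on.
depends_on = {
    "vm": ["subnet", "security_group"],
    "subnet": ["vpc"],
    "security_group": ["vpc"],
    "vpc": [],
}

# static_order() yields every element after all of its dependencies.
order = list(TopologicalSorter(depends_on).static_order())
print(order)
```

Any order in which "vpc" comes before "subnet" and "security_group", and "vm" comes last, is a valid result; the relative order of the two middle elements is not fixed.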

Orchestration system prototype

Here we assume that there is a system initialization process as follows:

To create all the elements in the desired order, we follow two rules:

  • Execution is parallel by default.

  • Elements with no dependencies are executed first.

In the implementation of the algorithm, we first turn the element start-up sequence into a directed graph, then traverse it to calculate the dependency count of each node, as follows:

Note: only direct dependencies (neighboring nodes) need to be counted.
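The counting step can be sketched as follows. The edges below are an assumed example, chosen only so that B and D start with a count of 0 as in the walkthrough; the function name `dependency_counts` is also hypothetical.

```python
# Assumed dependency map: element -> elements it directly depends on.
DEPENDS_ON = {
    "A": ["E"],
    "B": [],
    "C": ["D"],
    "D": [],
    "E": ["C", "F"],
    "F": ["B"],
}


def dependency_counts(depends_on):
    """Count direct (neighboring) dependencies only, not transitive ones."""
    return {element: len(set(deps)) for element, deps in depends_on.items()}


counts = dependency_counts(DEPENDS_ON)
# Elements whose count is 0 can be initialized immediately.
ready = sorted(e for e, n in counts.items() if n == 0)
print(counts, ready)
```

Note that A has a count of 1 even though it transitively depends on every other element, because only its neighbor E is counted.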

Following the two rules above, elements B and D have a dependency count of 0, so they can be initialized first. B and D are independent of each other and can be executed in parallel.

After an element finishes executing, the dependency count of every node that depends on it is reduced by one, which gives the updated dependency counts for all nodes:

The only elements that can be executed in this round are C and F, because their dependency counts are now 0. After these two elements are executed, the dependency count of each element that depends on them is reduced by one, giving the updated counts for all nodes again:

Repeat this logic until all elements have been executed and the workflow is complete. Because every element starts as soon as its dependencies are satisfied, the whole process takes as little time as the dependencies allow. As the workflow implementation shows, the power of orchestration lies less in flow control than in the richness of its elements and syntax. A good orchestration system makes it quick to develop new elements and therefore to offer orchestration for new services.
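The round-by-round execution described above can be sketched with a thread pool. This is a simplified illustration under stated assumptions, not the prototype's implementation: the graph is an assumed example consistent with the walkthrough (B and D first, then C and F), `run_in_waves` is a hypothetical name, and every dependency is expected to appear as a key in the map.

```python
from concurrent.futures import ThreadPoolExecutor


def run_in_waves(depends_on, initialize):
    """Run elements in parallel rounds; return the list of rounds executed."""
    counts = {e: len(set(deps)) for e, deps in depends_on.items()}
    dependents = {e: [] for e in depends_on}
    for element, deps in depends_on.items():
        for dep in set(deps):
            dependents[dep].append(element)

    ready = sorted(e for e, n in counts.items() if n == 0)
    waves = []
    with ThreadPoolExecutor() as pool:
        while ready:
            # Run the whole round in parallel and wait for it to finish.
            list(pool.map(initialize, ready))
            waves.append(ready)
            next_ready = []
            for done in ready:
                for element in dependents[done]:
                    counts[element] -= 1          # one dependency satisfied
                    if counts[element] == 0:
                        next_ready.append(element)
            ready = sorted(next_ready)

    if sum(len(w) for w in waves) != len(depends_on):
        raise ValueError("cycle detected: some elements never became ready")
    return waves


# Assumed example graph: element -> elements it directly depends on.
graph = {"A": ["E"], "B": [], "C": ["D"], "D": [], "E": ["C", "F"], "F": ["B"]}
waves = run_in_waves(graph, lambda element: None)
print(waves)
```

Waiting for a whole round before starting the next one keeps the sketch short. A scheduler that releases each dependent as soon as its last dependency finishes avoids waiting on slow siblings and gets closer to the shortest total time.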

Information transfer between elements

If each element had to record information about other elements in order to initialize, the elements would be coupled to one another in the implementation. To keep each element independent at execution time (that is, an element is initialized without knowing anything about other elements), the main framework keeps a global information store and, when an element is initialized, passes it exactly what it needs. The element has no idea what the other elements are, but it has all the information it needs.

As an example, the scheduling framework maintains a global record of the parameters each element needs for initialization. The green ones are provided by the user, and the red ones are obtained automatically once the object they depend on has been created. For example, creating a VM requires the VPC ID, which becomes known after the VPC is created.

So after D is initialized, C is ready to initialize. At this point, every argument needed to create C should have a valid value, so no information is missing when the C service's initialization API is called. In this way, the creation and destruction APIs of C are implemented independently and deal only with the C service itself.

As shown in the figure above, when developing a new service you only need to understand the new service itself. All the information it requires (whether requested directly from users or obtained through dependencies) is managed and delivered by the framework.

This is our plug-in framework. It makes adding a service very easy, because each service's driver is developed completely independently.
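The information transfer described in this section can be sketched as a small global store. Everything here is hypothetical and for illustration only: the class `InfoStore`, the key naming scheme, the `create_vpc` and `create_vm` stand-ins, and the fake IDs they return.

```python
class InfoStore:
    """Global record kept by the framework: user inputs plus created outputs."""

    def __init__(self, user_inputs):
        self._values = dict(user_inputs)      # provided by the user up front

    def publish(self, element, outputs):
        """Record values that become known after an element is created."""
        for key, value in outputs.items():
            self._values[f"{element}.{key}"] = value

    def resolve(self, required):
        """Return exactly the parameters an element asked for."""
        missing = [key for key in required if key not in self._values]
        if missing:
            raise KeyError(f"missing parameters: {missing}")
        return {key: self._values[key] for key in required}


def create_vpc(params):
    # Stand-in for a real cloud API call; the ID is made up.
    return {"id": "vpc-demo"}


def create_vm(params):
    # The VM driver only sees the values it asked for, never the VPC object.
    return {"id": "vm-demo", "vpc_id": params["vpc.id"]}


store = InfoStore({"vpc.cidr": "10.0.0.0/16", "vm.flavor": "small"})
vpc_out = create_vpc(store.resolve(["vpc.cidr"]))
store.publish("vpc", vpc_out)                 # the VPC ID is now known
vm_out = create_vm(store.resolve(["vm.flavor", "vpc.id"]))
print(vm_out)
```

Calling `store.resolve(["vpc.id"])` before the VPC has been published raises `KeyError`, which is how a framework could notice that an element was scheduled too early.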

Plug-in design

1. Element lifecycle

Each cloud service object is an element from the perspective of the orchestration system. When an element is added to the orchestration, it must provide basic operations such as create, delete, update, and query. The orchestration system's plug-in management framework invokes the element's API based on user actions, such as create or destroy.

With the element execution framework from the previous section in place, adding an orchestration object only requires implementing the drivers for that element's behaviors. For example, as long as there are methods (APIs) to create and destroy VMs, an EC2 service can be added as an orchestration element and used in templates. The scheduling framework simply treats it as an ordinary element.
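One way to picture the element lifecycle is a small driver interface plus a registry that the framework calls into. This is a sketch under assumptions: `ElementDriver`, `PluginRegistry`, and `InMemoryVmDriver` are hypothetical names, and the in-memory driver stands in for real cloud API calls.

```python
from abc import ABC, abstractmethod


class ElementDriver(ABC):
    """Behaviors every orchestration element must provide (assumed minimum)."""

    @abstractmethod
    def create(self, params):
        """Create the resource and return its outputs, e.g. {"id": ...}."""

    @abstractmethod
    def destroy(self, resource_id):
        """Destroy a previously created resource."""


class PluginRegistry:
    """Maps element types to drivers and dispatches user actions to them."""

    ACTIONS = {"create", "destroy"}

    def __init__(self):
        self._drivers = {}

    def register(self, element_type, driver):
        if not isinstance(driver, ElementDriver):
            raise TypeError("driver must implement ElementDriver")
        self._drivers[element_type] = driver

    def invoke(self, element_type, action, argument):
        if action not in self.ACTIONS:
            raise ValueError(f"unsupported action: {action}")
        return getattr(self._drivers[element_type], action)(argument)


class InMemoryVmDriver(ElementDriver):
    """Stand-in VM driver; a real one would call the cloud provider's API."""

    def __init__(self):
        self.instances = {}

    def create(self, params):
        resource_id = f"vm-{len(self.instances) + 1}"
        self.instances[resource_id] = dict(params)
        return {"id": resource_id}

    def destroy(self, resource_id):
        del self.instances[resource_id]


registry = PluginRegistry()
registry.register("vm", InMemoryVmDriver())
created = registry.invoke("vm", "create", {"flavor": "small"})
registry.invoke("vm", "destroy", created["id"])
print(created)
```

The registry never looks inside a driver, so supporting a new service means writing one new class and registering it.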

2. User-defined plug-ins

Because each element driver in the plug-in framework is independent, and because Kubernetes resources also support custom resource definitions, we can design an element plug-in that lets users define their own Kubernetes (K8S) orchestration objects. The “information” provided by the user is passed to the underlying API intact, and the underlying system interprets it. The orchestration system is reduced to flow control plus an information transfer channel.
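A pass-through plug-in of this kind can be sketched as a driver that forwards the user's definition without inspecting it. The names `PassThroughDriver` and `fake_backend` are hypothetical, and the backend is an injected callable rather than a real Kubernetes client.

```python
class PassThroughDriver:
    """Forwards a user-supplied definition to the backend unchanged."""

    def __init__(self, submit):
        # `submit` is any callable that accepts the definition (an assumption;
        # a real plug-in would wrap the underlying system's API here).
        self._submit = submit

    def create(self, user_definition):
        # No validation or interpretation: the backend owns the semantics.
        return self._submit(user_definition)


received = []


def fake_backend(definition):
    # Stand-in for the underlying system that interprets the definition.
    received.append(definition)
    return {"accepted": True}


driver = PassThroughDriver(fake_backend)
definition = {"kind": "MyCustomResource", "spec": {"replicas": 3}}
result = driver.create(definition)
print(result)
```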

3. Operation waiting & progress

As mentioned above, some cloud service operations are very time-consuming. Without intuitive feedback on overall progress, the user experience is poor and the whole execution appears to hang. So when writing an element driver, progress and wait feedback must be considered so that the orchestration framework is aware of execution progress. This lets the user know which element is currently executing and how far along it is, so the overall orchestration process can respond to the user in the most direct and friendly way.
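A simple way to expose waiting and progress is a polling loop with a callback. The sketch below assumes a driver can report a percentage through a `poll` callable; the function name `wait_with_progress` and its parameters are hypothetical.

```python
import time


def wait_with_progress(element, poll, on_progress, interval=0.1, timeout=60.0):
    """Poll a long-running operation and report progress until it hits 100%."""
    deadline = time.monotonic() + timeout
    while True:
        percent = poll()                  # driver-specific status query
        on_progress(element, percent)     # let the framework show feedback
        if percent >= 100:
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{element} did not finish within {timeout}s")
        time.sleep(interval)


# Stand-in for a slow cloud operation that reports 0%, 50%, then 100%.
fake_status = iter([0, 50, 100])
reports = []
wait_with_progress(
    "vm",
    lambda: next(fake_status),
    lambda name, percent: reports.append((name, percent)),
    interval=0.0,
)
print(reports)
```

The timeout keeps a stuck operation from blocking the whole workflow silently; the framework can surface the `TimeoutError` to the user instead.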

TOSCA model

With the scheduling framework and the plug-in framework in place, all that is left is the syntax of the configuration files. The main reference syntaxes are AWS CloudFormation and TOSCA. AWS CloudFormation (AWS-CFN) is centered on resource initialization. TOSCA is a specification that aims to standardize how we describe software applications and everything that is required for them to run in the “cloud”, so TOSCA is more application-oriented.

Given the popularity of container technology, more and more applications ship as stand-alone containers, which reduces the need for traditional VMs. We feel that using TOSCA for the template syntax is a good choice.

In fact, as the automation process shows, template syntax is not the key point. As long as the process can be automated, how the template is written makes little difference; what matters is the automation capability. It is like choosing between programming languages such as Java and Go: when writing a binary tree traversal, it does not matter whether the loop is a for or a while. The main difference between programming languages lies in their built-in functions and libraries, so the goal is to provide rich automation conveniences in the template syntax. AWS, which offers many built-in functions, is a good model to learn from.
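To illustrate what built-in template functions buy you, here is a toy evaluator for two functions, `get_input` and `concat`. The names are modeled on TOSCA's intrinsic functions, but this is a simplified sketch with assumed semantics, not an implementation of TOSCA or CloudFormation; the template content is made up.

```python
def evaluate(node, inputs):
    """Recursively replace built-in function calls in a template with values."""
    if isinstance(node, dict):
        if set(node) == {"get_input"}:
            # Look up a value supplied by the user at deployment time.
            return inputs[node["get_input"]]
        if set(node) == {"concat"}:
            # Join the evaluated parts into one string.
            return "".join(str(evaluate(part, inputs)) for part in node["concat"])
        return {key: evaluate(value, inputs) for key, value in node.items()}
    if isinstance(node, list):
        return [evaluate(item, inputs) for item in node]
    return node  # plain literal: leave unchanged


template = {
    "vm": {
        "name": {"concat": ["web-", {"get_input": "env"}]},
        "count": 2,
    }
}
resolved = evaluate(template, {"env": "prod"})
print(resolved)
```

The same template can then be reused across environments by changing only the inputs.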

In the cloud, automation is simply a necessity. Only once the automation foundation is in place can a complete cloud ecosystem be built. Orchestration, as an advanced automation capability, is what brings the cloud ecosystem to its full potential, and it is the hard currency by which a cloud vendor's strength is measured.

The Huawei PaaS team has years of exploration and experience in the cloud, especially in automation and orchestration on PaaS clouds. I hope to share this with the industry and help advance cloud orchestration, so that using the cloud becomes a better experience and cloud automation becomes as ubiquitous as the cloud itself.