Author | Xu Ming    Source | Alibaba Technology official account

I Introduction

Anyone who works in technology is familiar with APIs. Whether you are developing software on an operating system or building distributed network services, you cannot avoid calling APIs. Application developers build systems with programming languages, system calls, class libraries and programming frameworks. To improve development efficiency and consistency, API standards such as POSIX emerged. These standards let applications run on a variety of hardware and software platforms without much modification, which greatly benefited the whole IT ecosystem.

Cloud computing, however, has no API standard comparable to POSIX, and each major vendor largely goes its own way. Some mainstream industry standards, such as OAS, are supported by most cloud vendors, but the vendors’ own APIs vary widely for historical reasons and because of different technical routes. For example, the AWS OpenAPI is RPC style, Azure is WebService style, and GCP is mainly based on gRPC. The technical details have been discussed at length elsewhere. This article instead explains the importance of OpenAPI to the overall competitiveness of cloud computing from the perspectives of customer experience and R&D efficiency.

II Features of OpenAPI for cloud computing

If we compare the Alibaba Cloud Feitian (Apsara) operating system with a traditional operating system, it is likewise composed of a kernel layer, an interface layer, operation interfaces and business applications. Core products such as computing, storage and networking constitute the kernel. The API layer carries control and data communication with the kernel. The various consoles are equivalent to the Terminal or the Windows/macOS UI of an operating system, and the industry applications built on cloud computing run on top of this operating system.

Figure 1. The Feitian (Apsara) system

Alibaba Cloud differs from a traditional operating system, and its OpenAPI naturally differs from the API systems of other kinds of business, such as the Taobao and B2B open platforms. An open commerce platform exposes services built on business data in order to integrate a business ecosystem, whereas the Alibaba Cloud open platform exposes the control capabilities, data operation capabilities and other enterprise-level capabilities of the cloud operating system. The former serves a business model; the latter serves a technical foundation. The OpenAPI system of cloud computing should therefore center on technical developers and enterprise scenarios, keep the technical system sound and stable, connect closely with the industry’s technical ecosystem externally (such as open source tools and third-party vendors), and promote collaborative governance of the many cloud services internally.

Alibaba Cloud OpenAPI has the following characteristics:

  • Large number: Alibaba Cloud currently has more than 10,000 OpenAPIs, spread across nearly 300 products, with more than ten billion calls per day.
  • Fast growth: the business is developing quickly, and the number of APIs has grown by close to 100% per year in recent years.
  • Many types of APIs: OpenAPIs are generally divided into control APIs and data APIs. Control APIs come in RPC and ROA forms, and data APIs are divided into data streams and files. Many services also need formats of their own.
  • High product collaboration requirements: a single OpenAPI cannot meet user requirements. Scenario-based requirements need OpenAPIs from multiple products in combination, which places higher demands on API orchestration and on API collaboration between products. In terms of stability, for example, a problem in the OpenAPI of one product may cause an avalanche across the entire control path.
  • Strong demand for enterprise capabilities: OpenAPIs are mainly used to manage cloud resources or transfer data, and the objects they operate on are user assets. Beyond conventional identity and permission management, they must serve the operations, finance, legal, regulatory and other departments of an enterprise. Because many cloud products are involved, this places high demands on the completeness, accuracy and timeliness of the architecture and the underlying infrastructure.
  • Closely tied to industry technology trends: the cloud is global and serves a wide variety of scenarios as a platform, so it cannot be separated from industry standards and technical systems. The close combination of cloud computing and open source shows that technology cannot be built behind closed doors.
  • Larger stability risk: if the OpenAPI of a commerce open platform is unstable, it may affect only one business function or process on the customer side, but a problem in a cloud OpenAPI may affect the customer’s underlying technical system, so the blast radius is larger.
  • Concentrated call hotspots: call volume roughly follows the 80/20 rule, with 20% of cloud products carrying 80% of the calls. The experience of the core products determines how users perceive Alibaba Cloud, so keeping typical customer scenarios running is crucial.

These characteristics mean that, compared with a traditional open platform, OpenAPI on the cloud should focus on building technical capabilities while also taking customers’ enterprise-level scenarios into account, in order to deliver a good experience.

Figure 2. OpenAPI user demand hierarchy

III Management automation is the core demand of enterprise customers

So what is the core experience that cloud computing customers expect from OpenAPI? Analysis of a real customer case on Alibaba Cloud yields the following points:

  • Customers expect the entire process, from code submission to server deployment, to be automated through tooling.
  • Many customers want a hybrid cloud in which on-cloud and off-cloud systems are combined and the business system is tightly integrated with the cloud platform.
  • Several open source tools, such as Git, JFrog and Terraform, are widely used in the customer’s systems and are integrated into a complete automated pipeline.

In summary, the customer’s core demand is that its business systems can be integrated with the cloud platform in a highly automated way. This applies to more than customers: cloud vendors themselves often emphasize concepts such as elasticity and self-healing, which are also backed by highly automated architectures.

Highly automated integration places comprehensive requirements on the OpenAPI system. Consider POSIX by way of comparison: a standard, complete and high-quality API promotes interoperability between operating systems and keeps migration costs for upper-layer applications to a minimum. For cloud computing, which aspects must such a specification cover to meet customer needs? From practice, we summarize them as follows:

  • Style consistency: POSIX APIs are broadly consistent in style; the file-handling APIs, for example, share the same core error codes. Consistent style, terminology, errors and modes of operation lower the cost of understanding and make application developers more efficient. If the API design styles of different products are inconsistent, users face a high cost of understanding and inconvenience in use, and the professionalism of the cloud platform is called into question. The current Alibaba Cloud OpenAPIs have some of these problems: the same technical term is described differently in different products, different products return inconsistent attributes and data for the same resource, paging APIs take inconsistent forms, and even the capitalization of names differs.

  • Functional completeness: functional completeness is easy to understand but hard to define. Are 10 APIs enough for a cloud product, or does it need 100? Opinions differ, and the product keeps evolving. POSIX file handling covers a standard set of APIs, including creat/close/dup/dup2/fcntl/flock/fsync/lseek/mkstemp/open/read/sync/write, so that every operation a user may need on a file is available and files can be controlled at a fine grain. By the same logic, every management action on a cloud resource that matters from the customer’s perspective should be open, because customers need to automate management of the full resource life cycle. In practice, an entity-relationship model is generally used to design a set of APIs that work together; the set should not be put together ad hoc.

  • Service availability: the biggest problem in practice is that different teams hold their APIs to different SLAs. Some products require 99.99% availability, for example, while others consider 99% acceptable. In the extreme case, an OpenAPI that allows only one concurrent call offers the user no real quality of service, and automation can be interrupted by all kinds of exceptions. If restrictions such as rate limiting are necessary, enterprise (ToB) customers should be told about them; otherwise they have no way of knowing how to tune their call frequency.

  • A sound supporting system: customer experience is the customer’s overall impression across the whole journey from learning about a product to using it. Development on Linux/Mac is pleasant because the toolchain is mature and complete. Customers developing against OpenAPI on the cloud should likewise have professional, detailed tooling and technical support, just as Visual Studio has MSDN, Java development has IDEs, and every language has debugging tools. SDKs, documentation and debugging tools are must-haves, while features such as code samples and visualization of API calls are also valuable.
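
The cost of inconsistent style can be made concrete with a small sketch. The Python below shows the kind of adapter code a customer might end up writing when two products page their list APIs differently. Both response shapes, the field names (`PageNumber`, `TotalCount`, `nextToken` and so on) and the fake products are hypothetical, invented for illustration; they are not actual Alibaba Cloud API fields.

```python
from typing import Any, Callable, Dict, Iterator, List

Page = Dict[str, Any]


def iter_page_number_style(fetch: Callable[[int], Page]) -> Iterator[Any]:
    """Iterate a hypothetical product that pages by page number and total count."""
    page_number, seen = 1, 0
    while True:
        page = fetch(page_number)
        items = page["Items"]
        yield from items
        seen += len(items)
        if not items or seen >= page["TotalCount"]:
            return
        page_number += 1


def iter_token_style(fetch: Callable[[str], Page]) -> Iterator[Any]:
    """Iterate a hypothetical product that pages by token (note the different casing)."""
    token = ""
    while True:
        page = fetch(token)
        yield from page["items"]
        token = page.get("nextToken") or ""
        if not token:
            return


# Fake in-memory "products" standing in for real API calls (assumptions).
def fake_product_a(page_number: int) -> Page:
    data, size = ["a1", "a2", "a3"], 2
    start = (page_number - 1) * size
    return {"Items": data[start:start + size], "TotalCount": len(data)}


def fake_product_b(token: str) -> Page:
    pages = {"": (["b1", "b2"], "t1"), "t1": (["b3"], None)}
    items, next_token = pages[token]
    return {"items": items, "nextToken": next_token}


all_a: List[str] = list(iter_page_number_style(fake_product_a))
all_b: List[str] = list(iter_token_style(fake_product_b))
print(all_a, all_b)
```

With one paging convention across products, a single iterator would be enough, and customers would not need to maintain one adapter per product.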

In addition, the cloud’s internal systems also need to be highly automated through APIs. In typical scenarios such as proprietary cloud deployment, expansion into a new region, or expansion of a single product, the lack of automated deployment hurts the company’s overall labor efficiency. More importantly, implementation takes longer and the customer experience gets worse.

To solve these problems, the main difficulties lie in how to unify standards, how to build a comprehensive supporting platform, how to measure service quality, how to keep promoting the service standards, and how to evaluate customer experience.

IV Cloud computing requires resource-oriented programming

In the Linux/UNIX world, there is a famous saying that everything is a file. So can everything in the cloud be treated as a resource?

The external OpenAPI of Alibaba Cloud is based on HTTP, and the RESTful style already proposes resource-based design. In practice, however, few APIs adhere to this principle. Common questions from API designers include “What should be defined as a resource?”, “Doesn’t my API work just fine without a resource design?” and “I think in terms of resources when I design, but does the customer need that?”

However, the confusion on the customer side is real:

  • I want to build a resource management system. How can I get a list of all the resources I have on Alibaba Cloud?
  • If I get it through OpenAPI, how do I know which resources each API corresponds to, what operations a resource supports, and how resources relate to one another?
  • Why do different products return different properties for the same resource type?
  • I want to query the combined state of several resources from different products. Writing code for them one by one is too cumbersome. Is there a better way?
  • Working out the resource types behind so many APIs ourselves is too much work. Can you tell us how Alibaba Cloud does it internally?

To meet these customer demands, we need to answer several questions:

  • What is a resource? What resources should be managed? Should unmanaged services also be defined as resources?
  • Exactly what resources does Alibaba Cloud have, and where is the unified list? Can it be obtained automatically through OpenAPI?
  • What are the attributes of these resource types? What operations do they support, and which APIs correspond to them? What states can a resource be in? How are resources related? Can resource definitions be kept consistent?
  • How can resource-oriented programming reduce development costs?

In cloud computing scenarios, the lack of a resource model also hurts internal R&D efficiency. Enterprise customers differ from individual customers: relatively mature enterprises have strong demands for oversight of personnel, finances, assets, permissions and legal compliance. Facing challenges such as internal management, profitability and regulatory constraints, a mature IT governance system depends heavily on the concept of resources. Alibaba Cloud products such as RAM, ActionTrail, Config, RD and ResourceManager are examples. Without a resource model, each of these products has to define resources on its own and coordinate with every cloud product separately, so neither the schedule nor the quality of implementation can be kept consistent. The same goes for open source software such as Terraform, which is resource-oriented. Even many platform services, such as billing, need the concept of resources in order to be managed well.

Figure 3. Resource producers and consumers

If the resource model can be unified, it is as if customers and Alibaba Cloud shared one set of object-oriented Java classes or database tables. Every product that relies on the resource model benefits: the model is easier to understand, communication is consistent, and R&D can provide unified technical solutions that improve efficiency.
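
As a rough illustration of what such a shared model might look like, here is a minimal Python sketch of a resource catalog. Everything in it is an assumption made for the example: the class names, the fields, the type identifiers `ecs.instance` and `vpc.vswitch`, and the API names are hypothetical and do not describe the actual Alibaba Cloud metadata format.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class ResourceType:
    """Hypothetical metadata describing one kind of cloud resource."""
    type_id: str                 # illustrative identifier, e.g. "ecs.instance"
    attributes: List[str]        # property names every API should agree on
    states: List[str]            # lifecycle states of the resource
    operations: Dict[str, str]   # operation name -> API that performs it
    depends_on: List[str] = field(default_factory=list)  # related resource types


class ResourceCatalog:
    """One registry that answers the questions customers ask about resources."""

    def __init__(self) -> None:
        self._types: Dict[str, ResourceType] = {}

    def register(self, rtype: ResourceType) -> None:
        self._types[rtype.type_id] = rtype

    def list_types(self) -> List[str]:
        """Which resource types exist?"""
        return sorted(self._types)

    def api_for(self, type_id: str, operation: str) -> str:
        """Which API performs this operation on this resource type?"""
        return self._types[type_id].operations[operation]

    def dependents_of(self, type_id: str) -> List[str]:
        """Which resource types are related to (depend on) this one?"""
        return sorted(t.type_id for t in self._types.values() if type_id in t.depends_on)


catalog = ResourceCatalog()
catalog.register(ResourceType(
    "vpc.vswitch", ["id", "cidr"], ["Pending", "Available"],
    {"create": "CreateVSwitch", "delete": "DeleteVSwitch"},
))
catalog.register(ResourceType(
    "ecs.instance", ["id", "name", "status"], ["Pending", "Running", "Stopped"],
    {"create": "CreateInstance", "delete": "DeleteInstance"},
    depends_on=["vpc.vswitch"],
))
print(catalog.list_types())
print(catalog.api_for("ecs.instance", "create"))
print(catalog.dependents_of("vpc.vswitch"))
```

A governance product such as an audit or permission service could read the same catalog instead of negotiating its own resource definitions with every cloud product.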

Resource-oriented API design is therefore very important both for customers and for the cloud platform itself. If it is not considered early, it costs more later and is bound to affect the overall service quality of Alibaba Cloud.

V Cloud computing needs to accumulate unified OpenAPI/resource metadata

Metadata is data about data: it describes how data is organized, the data domains and the relationships between them. Metadata platforms are not new and are widely used in big data, for example. Since Alibaba Cloud has hundreds of products and tens of thousands of OpenAPIs, the number of resources is bound to be huge, and a unified platform is needed to manage resource information. Resources are only an abstraction, though; OpenAPI metadata is also needed to provide a complete service.

What is the value of unified OpenAPI/resource metadata?

1. Consistent product experience: each Alibaba Cloud product line develops independently, which leads to the awkward situation where the same resource looks different from one product to the next. Each product has its own understanding, which works against a unified customer experience.

2. Better communication: a unified model is like a standard database schema, letting the parties involved communicate in the same context.

3. Higher R&D efficiency: a structured, standard model lets programs rather than people handle formulaic data. Taking Terraform as an example, with resource metadata, scripts can be written to generate Terraform modules automatically, which improves the efficiency of onboarding cloud products by about 50% and saves Go development resources and joint debugging costs along the way.

Figure 4. Code-free automated generation based on API metadata

4. Better and continuously assured business quality: a pain point in software development is how to keep existing functions correct as the business iterates after the initial release of a cloud product. Taking the Alibaba Cloud RAM product as an example, if we put the resource metadata, API access logs and RAM policies together with the actual authentication logs of cloud products, we can check whether the authentication behavior of a product meets expectations by comparing what the metadata declares with what actually happened. An automated inspection mechanism built on data and code is more efficient and accurate than relying on people.

5. Enabling more business scenarios: Azure has a product called Resource Graph Explorer that manages all resources on the platform by resource dimension, including across regions, somewhat like the resource manager in Windows. Complete metadata management makes such products possible to build. One might ask whether it can be done without metadata. In theory it can, but only with twice the effort for half the result: it requires repeated coordination with each product at very high cost, and the result is hard to reuse. It does not compare with using one platform to carry a standardized production process.

The internal value of unified Alibaba Cloud OpenAPI/resource metadata management is therefore that much of the business development around resources becomes easier, in some cases requiring no code at all, which improves efficiency. Externally, customers get a business model consistent with the internal one, which improves user experience and makes integration more convenient.
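
The idea behind point 3 above, generating integration code from metadata instead of writing it by hand, can be sketched in a few lines of Python. The generator used by Alibaba Cloud is not shown in this article, so the following is only an assumed, simplified stand-in: it renders a Terraform-style HCL skeleton from a hypothetical attribute table, and the resource type name `examplecloud_instance` and its attributes are invented for the example.

```python
from typing import Dict, List


def render_terraform_module(resource_type: str, attributes: Dict[str, str]) -> str:
    """Render an HCL skeleton: one variable per attribute plus one resource block.

    `attributes` maps an attribute name to a Terraform type keyword
    such as "string" or "number" (hypothetical metadata format).
    """
    lines: List[str] = []
    for name, tf_type in attributes.items():
        lines.append(f'variable "{name}" {{')
        lines.append(f"  type = {tf_type}")
        lines.append("}")
        lines.append("")
    lines.append(f'resource "{resource_type}" "this" {{')
    for name in attributes:
        lines.append(f"  {name} = var.{name}")
    lines.append("}")
    return "\n".join(lines)


# Hypothetical metadata for one resource type.
module = render_terraform_module(
    "examplecloud_instance",
    {"instance_name": "string", "cpu_cores": "number"},
)
print(module)
```

Because the output is derived entirely from the metadata, adding an attribute to the model changes the generated module without anyone editing templates by hand.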

VI OpenAPI experience is one of the core experiences of cloud customers

If a cloud operating system wants to serve customers well, a well-designed API is a necessity. Otherwise, users will struggle to develop and deploy application services quickly and efficiently on top of the API layer, which seriously damages business competitiveness. Who would want to use an operating system that leads in concepts and capabilities but is difficult to manage?

  1. OpenAPI is a service contract for cloud products. The cloud platform must guarantee quality of service, and it must also treat bringing APIs online and taking them offline with great care. It is difficult for a product to forcibly shut down an API, even a risky one, and incompatible changes are unacceptable: they amount to a breach of contract and may bring down the customer’s business systems, followed by legal risk, reputational risk and customer loss.
  2. OpenAPI maturity is important. Customers shop around when choosing cloud services. Besides price, stability, functionality and other factors, the ability to integrate quickly and conveniently with the customer’s business systems is an important competitive factor. Alibaba Cloud has dealt with many large customers that have explicit requirements for API maturity.
  3. A good development experience and a rich service ecosystem may become the core competitiveness of cloud vendors. Windows came to dominate the consumer operating system market through a generational lead in user experience; Linux/Unix won the enterprise market through its enterprise development capabilities despite that dominance; and macOS carved out its own place under the same dominance through a good development experience. All of this suggests that the final victory rests on customer experience. As the core service capabilities of the vendors become homogeneous over time, differentiation comes down to price, experience and service. The experience advantage of overseas competitors is a moat built up over many years, and it cannot be matched without investing time and resources.
  4. The customer perspective is particularly important in cloud computing scenarios. The question is not which interfaces we are able to open, but which interfaces the customer needs us to open. Missing interfaces may interrupt the customer’s production process and seriously undermine the customer’s confidence.
  5. APIs that comply with industry standards are more easily compatible with industry tools and partners and more readily accepted by the community. It would be very hard, for example, to find an operating system that does not support the POSIX standards; incompatibility means no market. A poorly designed API makes it difficult for the business to work with the external ecosystem, and where integration is attempted anyway, it puts pressure on internal R&D resources, hurting business growth and competitiveness.
  6. OpenAPI is not just a matter of the API itself; the supporting service system must also be improved. Development on Linux involves more than system calls: it needs documentation, code samples and debugging tools, and many IDEs have grown out of that. On Alibaba Cloud, this kind of end-to-end service is still relatively weak. When customers run into problems, they either file support tickets repeatedly, which puts great pressure on the company’s service resources, or, when the service falls short, vote with their feet, which harms the reputation of Alibaba Cloud and the company’s long-term competitiveness.

Therefore, OpenAPI is not only a technical issue but also a product issue. It is an important part of the product experience and needs to be taken seriously.
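
Point 1 above treats an OpenAPI as a contract that must not change incompatibly. One way to guard such a contract is to compare two versions of an API description before release. The Python sketch below is a minimal illustration under stated assumptions: the description format (parameter name mapped to a type and a required flag) and the parameter names are hypothetical, and only request parameters are checked.

```python
from typing import Any, Dict, List

# Hypothetical API description: parameter name -> {"type": str, "required": bool}
ApiSpec = Dict[str, Dict[str, Any]]


def breaking_changes(old: ApiSpec, new: ApiSpec) -> List[str]:
    """Return the changes in `new` that would break callers written against `old`."""
    problems: List[str] = []
    for name, old_param in old.items():
        if name not in new:
            problems.append(f"parameter removed: {name}")
        elif new[name]["type"] != old_param["type"]:
            problems.append(f"parameter type changed: {name}")
        elif new[name]["required"] and not old_param["required"]:
            problems.append(f"parameter became required: {name}")
    for name, new_param in new.items():
        # A new optional parameter is compatible; a new required one is not.
        if name not in old and new_param["required"]:
            problems.append(f"new required parameter: {name}")
    return problems


old_spec: ApiSpec = {
    "InstanceId": {"type": "string", "required": True},
    "PageSize": {"type": "integer", "required": False},
}
new_spec: ApiSpec = {
    "InstanceId": {"type": "string", "required": True},
    "PageSize": {"type": "string", "required": False},
    "Region": {"type": "string", "required": True},
}
problems = breaking_changes(old_spec, new_spec)
print(problems)
```

A real check would also need to cover response fields, error codes and behavior, which a parameter comparison alone cannot capture.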

VII OpenAPI customer experience needs strong system support

Can the customer experience of OpenAPI be measured? The biggest problem the Alibaba Cloud open platform faced when it first launched OpenAPI was that both customers and internal colleagues complained that Alibaba Cloud APIs were not easy to use. The complaints were isolated problems, raised one at a time and driven by individual customer demands. There was no systematic way to measure customer experience, so problems could not be solved at scale.

Alibaba Cloud has more than 10,000 OpenAPIs, so isolated and vague feedback does little to solve the overall problem. Moreover, users think in terms of specific products and have only a weak notion of OpenAPI as such, so surveys carry a large subjective bias.

The Alibaba Cloud OpenAPI experience is an end-to-end problem, as shown below:

Figure 5. Typical OpenAPI usage paths

The path is long, and a weakness at any stage may lead to a poor user experience. Feedback is also varied, which makes it hard to extract useful information. Several core questions therefore need to be addressed in turn:

  1. Which customer feedback is clear and specific?
  2. How can we collect this feedback? What channels are there?
  3. How should the feedback be processed: how do we clean and classify it, locate root causes, and determine which stage of the path has the problem?
  4. How do we establish overall and fine-grained quantitative indicators, govern against them in a targeted way, and close the loop?
  5. How do we promote and operate the process?
  6. How do we finally verify the effect of governance?

Here’s how we do it:

Step 1: Quantification

  1. Specific user problems: information reflecting the specific problems of customers used to come mainly from support tickets, but users who file tickets are only the tip of the iceberg, and much more goes unseen. Phone calls, DingTalk group messages, website feedback and other sources should be included as well. Each piece of information is a specific problem, and many problems together form a valuable data set that lays the foundation for data-driven governance.
  2. Data acquisition: we tried to connect to the ticket system, internal platforms, DingTalk groups and other channels; the data from each platform has to be brought together.
  3. Data cleaning: customer feedback such as support tickets is unstructured and needs natural language processing. The Alibaba Cloud open platform cooperated with DAMO Academy to train an AI model, using methods such as keyword classification and tagging against specific targets, so that the model can automatically extract, classify and quantify large volumes of customer feedback.
  4. Business value: after the main customer problems are classified through root cause analysis, products are optimized in a targeted way, and the improvement is tested using the number of tickets per user for that problem as the metric. Some problems emerge from trends and call for product upgrades: when customers complain that they cannot find an SDK or API, for example, we need to improve API search. Others are known internal issues, such as API availability, for which mechanisms are already in place to push the business to improve.
  5. Measuring improvement: first of all there must be specific indicators, and currently the main one is the reduction in ticket volume. We think this is not enough, because tickets are only the tip of the iceberg: there are too few of them, they are not detailed enough, and they take a long time to circulate. We therefore also tried to consolidate the experience in the OpenAPI developer portal, which standardizes the product experience on the one hand and reaches end customers directly for the most detailed feedback on the other. Feedback there can be as detailed as an API failing when its page number is set to 0, a documentation detail being wrong, or a parameter being incorrectly described as required.

Through the above actions, we can automatically analyze the major pain points of OpenAPI users and build trend charts to track them.
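
The classification and the per-user metric described in this step can be illustrated with a deliberately simplified Python sketch. It swaps the trained AI model mentioned above for a plain keyword lookup, and the categories, keywords, sample tickets and user count are all hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List

# Hypothetical categories and keywords; a production system would use a trained model.
CATEGORY_KEYWORDS: Dict[str, List[str]] = {
    "discoverability": ["cannot find", "can't find", "where is"],
    "availability": ["timeout", "throttl", "unavailable"],
    "documentation": ["document", "example", "description"],
}


def classify(text: str) -> str:
    """Assign a ticket to the first category whose keyword appears in the text."""
    lowered = text.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return category
    return "other"


def tickets_per_thousand_users(tickets: Iterable[str], active_users: int) -> Dict[str, float]:
    """Normalize ticket counts per category by the number of active users."""
    counts = Counter(classify(ticket) for ticket in tickets)
    return {category: 1000 * n / active_users for category, n in counts.items()}


sample_tickets = [
    "I can't find the SDK for this API",
    "Request timeout when calling the API",
    "The parameter description in the document is wrong",
    "Where is the Java SDK download?",
    "Billing question",
]
metric = tickets_per_thousand_users(sample_tickets, active_users=2000)
print(metric)
```

Tracking this normalized figure per category over time gives the kind of trend chart mentioned above, and a drop in one category after a targeted fix is evidence that the fix worked.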

Step 2: Governance

  1. There are laws to follow: the Alibaba Cloud OpenAPI Specification has been developed and has iterated to version 1.3. It covers design style, quality of service, resource specifications, domain name specifications, documentation specifications and more, aligning everyone’s understanding at the level of standards.
  2. Strict law enforcement: text alone is not enough to make standards stick; there must be a supporting platform mechanism. All Alibaba Cloud API management is consolidated on one platform, and the core capabilities of the API management platform are continuously improved with a view to raising R&D efficiency and managing the whole life cycle systematically. The review rules from the specification are encoded in a rules engine, so that problems are scanned for automatically while an API is being developed and the API is sent back if it fails. For standards that cannot be automated, review and approval processes are established, raising the cost of producing non-compliant APIs, and the share of automated review is increased steadily. API quality is managed by the numbers: the API quality of each business unit’s products is scored, and improvements are regularly tracked and communicated.
  3. Violations must be pursued: we work with the Alibaba Cloud user experience department to drive fixes, and verify the effect by checking the reduction in user-side tickets.

All of these capabilities are built into the internal management platform, where internal cloud products can manage their APIs in one place and take part in API governance through a defined process.
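
A rules engine of the kind described in point 2 can be sketched as a list of small check functions run against an API definition. The Python below is a minimal illustration only: the three rules, the API definition format and the `NextToken` parameter name are hypothetical and are not taken from the Alibaba Cloud OpenAPI Specification.

```python
import re
from typing import Any, Callable, Dict, List, Optional

ApiDef = Dict[str, Any]
Rule = Callable[[ApiDef], Optional[str]]  # returns a violation message, or None

PASCAL_CASE = re.compile(r"^[A-Z][a-zA-Z0-9]*$")


def rule_pascal_case_parameters(api: ApiDef) -> Optional[str]:
    bad = [p for p in api["parameters"] if not PASCAL_CASE.match(p)]
    return f"parameters not in PascalCase: {bad}" if bad else None


def rule_list_apis_paginate(api: ApiDef) -> Optional[str]:
    if api["name"].startswith("List") and "NextToken" not in api["parameters"]:
        return "list API lacks the NextToken paging parameter"
    return None


def rule_has_description(api: ApiDef) -> Optional[str]:
    return None if api.get("description") else "missing description"


RULES: List[Rule] = [rule_pascal_case_parameters, rule_list_apis_paginate, rule_has_description]


def review(api: ApiDef) -> List[str]:
    """Run every rule; an empty result means the API passes automated review."""
    violations: List[str] = []
    for rule in RULES:
        message = rule(api)
        if message is not None:
            violations.append(message)
    return violations


draft_api = {"name": "ListInstances", "parameters": ["regionId", "PageSize"], "description": ""}
fixed_api = {"name": "ListInstances", "parameters": ["RegionId", "NextToken"], "description": "Lists instances."}
print(review(draft_api))
print(review(fixed_api))
```

Keeping each rule as an independent function makes it straightforward to grow the automated share of the review over time, one rule at a time.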

Step 3: Productization

The goal is to consolidate the product experience for external users. In the past, the various OpenAPI functions were scattered and fragmented. In FY21, we integrated all OpenAPI-related functions into one product, the OpenAPI developer portal. Besides improving user experience through centralized product development, we added troubleshooting, search and code sample modules for the stages of the usage path.

Through the above three steps, the overall picture of OpenAPI experience governance is as follows:

Figure 6. Closed loop of user experience governance

Problems found while operating this system are addressed by continuously improving the API management platform and the API developer portal, so that user experience improves step by step. Support ticket data for 2020 shows a clear decline.

VIII Summary

Alibaba Cloud is a huge distributed operating system, and OpenAPI is the key channel through which the user layer operates the kernel layer. Its stability, functionality, performance and experience bear on the customer’s core experience, the company’s image and the growth of the ecosystem, as well as on internal product quality and R&D efficiency. The access layer must be deep, thick and strong if internal and external customers are to be served well. Our team’s two years of practice show that only data-driven, systematic infrastructure can make the OpenAPI experience better. This article has discussed several important concepts of OpenAPI on the cloud in the hope of drawing interest and attention to the value of the OpenAPI system.

This article is original content from Alibaba Cloud and may not be reproduced without permission.