APIs are at the core of software systems, and managing complexity is the most important factor in the success of a large-scale system. A system, however, is rarely overwhelmed by a single issue; it is dragged down by the accumulated complexity of many, many small design decisions at the system design level, and especially at the API design level (what John Ousterhout calls "complexity is incremental" [8]). A successful system is not the product of a few bright spots, but the accumulation of sustained design effort.
So this article tries to think through two questions: what makes an API design a good design? And how can good design be achieved in practice?
API design challenges vary so widely that no single rule applies everywhere: whatever principles and best practices we name, there will always be scenarios where they fit and scenarios where they do not. Below, therefore, we try not only to make suggestions but also to analyze when each suggestion applies, so that deliberate exceptions can be made where appropriate.
Scope
This article focuses on general API design; most of the discussion applies equally to remote calls (RPC or HTTP/RESTful APIs), but RESTful-specific issues are not covered here.
In addition, this article assumes that the client interacts directly with the remote server's API. At Alibaba, for a variety of reasons, remote services are more often accessed indirectly through a client SDK. The issues specific to SDKs are not discussed here, but the discussion still applies if you treat the methods an SDK provides as proxies for remote APIs.
API Design Guidelines: What is a good API
In this section, we try to summarize the features, or design principles, of a good API, focusing on the most fundamental ones: principles that, if followed well, let an API evolve without most design problems.
A good API should be:

- Easy to understand and use: APIs exist for interaction between programs, but how an API is used, and how the API itself is maintained, depends on its maintainers and users sharing a clear, consistent understanding of it. That is harder to achieve than it sounds.
- As simple as possible: "Make things as simple as possible, but no simpler." In practical systems, especially as requirements grow over time, the problem we see most often is designs that are too complicated, not too simple. It is therefore usually appropriate to emphasize simplicity.
- Able to support multiple implementations: this principle sounds concrete, but it is one I particularly like, and one often cited by Sanjay Ghemawat. Discussions of API design usually invoke decoupling or loose coupling; this principle is more workable than the loose-coupling principle. If an API can have multiple completely different implementations, it generally has a good enough abstraction to be independent of any one concrete implementation, and so it is unlikely to be too tightly coupled to an external system. In that sense this principle is a little more fundamental.
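As a small illustration of this last principle (a Python sketch with hypothetical names, not a prescribed design), consider one narrow interface with two completely different implementations behind it:

```python
from abc import ABC, abstractmethod


class Store(ABC):
    """A deliberately small API: it says nothing about where bytes live."""

    @abstractmethod
    def write(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def read(self, key: str) -> bytes: ...


class MemoryStore(Store):
    """One implementation: an in-process dict."""

    def __init__(self):
        self._data = {}

    def write(self, key, data):
        self._data[key] = data

    def read(self, key):
        return self._data[key]


class AppendLogStore(Store):
    """A completely different implementation: an append-only log,
    where read scans backwards for the latest record."""

    def __init__(self):
        self._log = []

    def write(self, key, data):
        self._log.append((key, data))

    def read(self, key):
        for k, d in reversed(self._log):
            if k == key:
                return d
        raise KeyError(key)
```

Any code written against `Store` runs unchanged against both implementations; that is usually a good sign the interface is not coupled to either one.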
Best practices
This section attempts to discuss some more detailed, concrete suggestions that can make it easier to design an API that meets the basic principles described earlier.
Consider an excellent example of API design: the POSIX File API
If only one API design practice could be listed, it would be this one, because it is probably the most helpful and actionable. This article could equally be titled "Experiencing API Design Best Practices through the File API."
So the whole best practice can be summed up in one sentence: “Think about how the File API was designed.”
First, let's review the main interfaces of the File API (using C as an example; most of these are POSIX APIs, and we restrict ourselves to the relatively simple I/O interfaces [1]):
```c
int     open(const char *path, int oflag, ... /* mode_t mode */);
int     close(int filedes);
int     remove(const char *fname);
ssize_t write(int fildes, const void *buf, size_t nbyte);
ssize_t read(int fildes, void *buf, size_t nbyte);
```
Why is the File API a classic good API design?
- The File API has been around for decades (nearly 40 years since 1988), and the core of the API has remained stable despite the evolution of hardware and software systems over several generations. This is remarkable.
- The API provides a very clear conceptual model. Anyone can quickly understand the basic concepts behind it: what a file is, and the operations on it (open, close, read, write).
- It supports many different file system implementations, even for wildly different types of devices: disks, block devices, pipes, shared memory, networks, terminals, and so on. Some of these devices support random access and some only sequential access; some are persistent and some are not. Yet every device and every file system implementation can sit behind the same interface, so upper layers need not care about differences in the underlying implementation. This is the source of the API's lasting vitality.
For example, completely different underlying implementations all open files through the same interface; different paths plus the mount mechanism allow them to be supported simultaneously. Procfs, pipes, and others work the same way.
```c
int open(const char *path, int oflag, ... /* mode_t mode */);
```
For another example, CephFS and a local file system have completely different underlying implementations, yet upper-layer clients can treat them identically and operate through the same interface, distinguishing them only by path.
For these reasons, it is clear why the File API has been so successful. Indeed, it was so successful that in today's *nix operating systems, "everything is a file."
Although we have a very good example of the File API, it is extremely difficult to design an API that is stable over the long term, so a good reference is not enough. Let’s try to expand into some more details.
Document well
Write detailed documentation and keep it up to date. I don’t need to go into detail on this point, but the reality is that many API designers and maintainers don’t take documentation seriously.
In today's service-oriented / microservice architectures, an application depends on a large number of services, and every service API evolves continuously. Accurately documenting every field and every method, and keeping that documentation up to date, is crucial to overall R&D efficiency.
Carefully define the “resource” of your API
Where appropriate, model the API as "resources" plus operations on them. Many APIs today can be defined with this abstract pattern, which has many benefits and suits the design of HTTP RESTful APIs. But an important prerequisite when designing an API is to define the resource itself properly. What definition makes sense? A resource is an abstraction of the core object that a set of API operations acts upon.
Abstraction is the process of removing details. When we design, if the real-world process or object being modeled is concrete, choosing the abstract object may not be so difficult, but deciding which details to include takes real thought. The File API, for example, abstracts a file as "a data record that can be uniquely identified by a string." This definition strips away the details of how a file is identified (left to each file system implementation) and how it is organized in storage (again left to the storage system).
While we want the API to be simple, it matters more to choose the right entities to model. In low-level system design we tend to prefer simpler abstractions, but in some systems the domain model itself is not so simple, and how resources are defined deserves more careful thought. In general, the concepts in a domain model are abstract, and a model is easier to understand when it maps roughly onto people's real-world experience; finding that mapping is often the key to choosing the right entity. For domain model design, refer to related articles such as Teacher Bai's series [2].
Choose the right level of abstraction
Closely related to the previous issue is the need to choose the appropriate Level of abstraction when defining the object. Different concepts are often interrelated. Again, take the File API as an example. When designing such an API, there are several possible options for choosing the level of abstraction, such as:
- Mixed text and image objects
- “Data block” abstraction
- File abstraction
These different levels of abstraction may describe the same thing, but they are different conceptual choices. When designing an API for client data access, "file" is the appropriate abstraction; when designing an API for a file system or device driver, "data block" or "block device" may be appropriate; and a document editing tool may work at the "mixed text and image object" level.
For example, in database-related APIs, the lowest-level abstraction may target the storage structure of the data; the middle, database-logic layer defines the various objects and protocols for data interaction; and the view layer has a different abstraction again [3].
Different data models are recommended for different layers
This is closely related to the previous one, but emphasizes that the models differ from layer to layer.
In a service-oriented architecture, data objects pass through multiple layers of processing; View → Logic Model → Storage is a typical layering. Our recommendation is to use different data structures for different layers. In his book [8], John Ousterhout puts it even more directly: different layer, different abstraction.
The seven-layer network model, each layer with its own protocol and abstraction, is a classic example. The File API above is a logic-layer model, while the different file storage implementations use their own independent models (block devices, memory file systems, and disk file systems each have their own storage APIs).
When an API design uses the same model for different layers (for example, a service sharing one model with its back-end storage service; see the figure below), it may be a sign that the service's own responsibilities are not clearly defined: should the functionality actually sink into the layer below?
Sharing a data structure across layers also hurts API evolution and maintenance. As a system evolves, it may need to swap out the back-end storage, or split off a cache for performance. At that point you may find that binding the two layers' data together (or even storing front-end JSON directly in the back end) created unnecessary coupling that now blocks the evolution.
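The point can be sketched in a few lines (Python, with hypothetical model names): each layer keeps its own data structure, and the translation between them lives in one explicit place:

```python
import time
from dataclasses import dataclass


@dataclass
class UserView:
    """View-layer model: shaped for display."""
    display_name: str
    joined: str  # human-readable date


@dataclass
class UserRecord:
    """Storage-layer model: shaped for persistence."""
    first: str
    last: str
    joined_epoch: int


def to_view(rec: UserRecord) -> UserView:
    """Explicit translation between layers. The coupling lives here,
    in one place, instead of one shared model leaking everywhere."""
    day = time.strftime("%Y-%m-%d", time.gmtime(rec.joined_epoch))
    return UserView(display_name=f"{rec.first} {rec.last}", joined=day)
```

With this shape, the storage model can change (say, splitting the name field differently) without touching any code that consumes `UserView`.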
Naming and Identification of the resource
When an API defines a resource object, it generally needs to provide naming and identification. Naming/ID usually comes down to two choices (for identifiers exposed to the user, not internal system IDs):
- Use free-form string as ID (string nameAsId)
- Naming /ID expressed in structured data
Which to choose requires case-by-case analysis. A free-form string leaves maximum freedom to the system's concrete implementation; the cost is that any inherent structure in the name (such as a path) is not part of the API's contract but an implementation detail, and if the name does carry structure, clients need their own logic to extract it. It is a balance to be struck.
For example, the File API uses a free-form string as the file name, and the structure of that name is left to each file system implementation. This lets Windows use "D:\Documents\file.jpg" while Linux uses "/etc/init.d/file.conf". If instead the file name were defined as the structured data
```
{
  disk: string,
  path: string
}
```
then this structure, exposing "disk" and "path" as structured data, might suit Windows's file organization but not other file systems; in other words, it would leak implementation details.
If the abstract model of a resource naturally contains structured identity information, a structured identifier simplifies the client's interaction logic and strengthens the conceptual model; some flexibility of the identifier is sacrificed for these advantages. For example, a bank transfer account might be expressed as
```
{
  account: number
  routing: number
}
```
Such a structured identifier consists of two parts: the account number and the inter-bank routing number. The design encodes some business logic, but that logic describes the system's external contract rather than an implementation detail, and it can simplify implementations and avoid some of the security problems of unstructured string IDs. A structured identifier is therefore probably the better fit here.
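The two identifier styles can be contrasted in a short sketch (Python; the names and validation rules are illustrative assumptions, not part of any real banking API):

```python
from dataclasses import dataclass


# Choice 1: free-form string. Maximum freedom for implementations;
# any structure hidden inside the name is an implementation detail,
# and clients that need it must parse it themselves.
def open_by_name(name: str) -> str:
    return f"opened {name}"


# Choice 2: structured identifier. The structure is part of the API
# contract, so validation and meaning can live in the API itself.
@dataclass(frozen=True)
class AccountId:
    account: int
    routing: int

    def __post_init__(self):
        # Once structure is explicit, basic checks become possible here.
        if self.account <= 0 or self.routing <= 0:
            raise ValueError("account and routing must be positive")
```

The trade-off described above shows directly: `open_by_name` accepts anything, while `AccountId` rejects malformed identifiers before they ever reach the server.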
A related question is: when should an API provide a numeric unique ID? This comes up often. Several things are worth considering:
- Is there already a structured or string identifier that uniquely and stably identifies the object? If so, a numeric ID may be unnecessary;
- Is the 64-bit integer range sufficient?
- Numeric IDs may not be user-friendly. Will they actually help users?

If these questions have clear answers and raise no obstacles, a numeric ID is fine; otherwise, introduce one with caution.
What operations on this resource are conceptually meaningful?
Once the resources/objects are identified, we need to define which operations to support. The emphasis here is on "conceptually reasonable": operation + resource should sound natural together (assuming the resource itself is properly named, which is itself a big "if" that is not easy to get right). Operations are not always CRUD (create, read, update, delete).
For example, for an API that operates on quotas, the following sound natural:

- update quota (update the amount)
- transfer quota (atomically transfer an amount)

But create quota does not sound natural, because a quota conceptually represents an amount, which does not need to be "created." That prompts a further question worth asking: does this object really need a create operation at all? What do we actually need to do?
For update operations, prefer idempotency whenever feasible
"Idempotence is the property of certain operations in mathematics and computer science whereby they can be applied multiple times without changing the result beyond the initial application." [3]
Idempotency clearly has many benefits in system design: for example, it makes client retries safe, which in turn makes complex processes much easier to implement correctly. But idempotency is not always easy to achieve.
- Create operations: repeated calls can easily create duplicate objects. The common way to achieve idempotency is a client-side generated deduplication token, a unique ID the client reuses on retries so that the server can recognize and collapse duplicates.
- Update operations: to be idempotent, update-type APIs should avoid "delta" semantics. For update operations, consider two simplified designs:
  - incremental semantics, e.g. IncrementBy(3) (increase or decrease by an amount);
  - set-new-total semantics, e.g. SetNewTotal(100) (set the new total).

IncrementBy is hard to make safe under error retries, whereas SetNewTotal is naturally idempotent. To be fair, IncrementBy has its own advantage: concurrent increments from multiple clients compose naturally, whereas concurrent SetNewTotal calls overwrite (or must block) each other. The two semantics thus have different strengths and weaknesses, to be weighed per scenario. If concurrent updates must take priority, use incremental semantics and restore idempotency with a deduplication token.
- Delete operations: once an object has been deleted, a second delete typically fails because the object can no longer be found. Though not strictly idempotent, this behavior is generally acceptable because it has no side effects. If strict idempotency is required, it can be achieved with staged deletion (an Archive → Purge lifecycle) or by persisting a log of purged objects.
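The create/update patterns above can be sketched together (a toy in-memory Python server; the names are hypothetical, and a real service would persist its token table):

```python
class QuotaService:
    """Toy server illustrating idempotency patterns for updates."""

    def __init__(self):
        self.total = 0
        self._seen_tokens = {}  # dedup token -> recorded result

    def set_new_total(self, value: int) -> int:
        # Naturally idempotent: retrying yields the same end state.
        self.total = value
        return self.total

    def increment_by(self, delta: int, token: str) -> int:
        # Delta semantics made retry-safe with a client-generated
        # deduplication token: replays return the recorded result
        # instead of applying the delta a second time.
        if token in self._seen_tokens:
            return self._seen_tokens[token]
        self.total += delta
        self._seen_tokens[token] = self.total
        return self.total
```

A client that times out can simply retry either call with impunity: `set_new_total` by its semantics, `increment_by` thanks to the token.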
Compatibility
API changes need to be compatible, compatible, compatible! It is important enough to say three times. Compatibility here means backward compatibility, defined as "does not break clients": an old client can access a newer version of the server (within the same major version) without erroneous behavior. This is especially important for remote APIs (HTTP/RPC). Compatibility has been well summarized elsewhere, for example in [4].
Common incompatible changes include (but are not limited to):

- removing a method, a field, or an enum value;
- renaming a method or field;
- keeping method and field names unchanged but changing their semantics or behavior: this category is easy to overlook.

For more specific descriptions, see [4].
Another important compatibility question is: how do you make an incompatible API change when one is truly needed? In general, an incompatible change requires a deprecation process, carried out in steps across major version releases. The deprecation process itself is not described here; in essence, old and new fields/methods/semantics must be supported side by side, remaining compatible with past versions while giving clients enough time to upgrade. This process is time-consuming, which is exactly why up-front API design matters so much.
Sometimes, for internal APIs, developers favor efficiency and adopt a "synchronized release" pattern for incompatible changes: notify all known clients, make the incompatible change to the service API, and have everyone update and switch to the new interface at the same time. This approach is highly undesirable, for several reasons:
- We often do not know every client that uses the API;
- release processes take time, so there is never a true "simultaneous update";
- regardless of compatibility, if the new API has problems and must be rolled back, things get very messy: such a plan usually has no rollback scheme, and clients may be unable to follow.
Therefore, we strongly advise against the synchronized-upgrade pattern for incompatible changes to any API already serving production clusters.
Batch mutations
How to design batch updates is another common API design decision. There are two common patterns, as shown in the figure below:

- batching implemented on the client side, or
- batching implemented on the server side.
API designers may be tempted to offer a server-side batch update capability, but we recommend avoiding it wherever possible. Unless the scenario genuinely requires atomic, transactional batches, server-side batching has many disadvantages while client-side batching has advantages:
- Server-side batch updates add complexity to both API semantics and implementation: for example, what are the semantics and state representation when only part of a batch succeeds?
- Even if we want to support batch transactions, consider whether different back-end implementations can support transactionality
- Batch updates often challenge server performance and are easily abused by clients
- Implementing batching on the client side makes it easier to spread the load among different servers (see figure)
- Client batch allows the client to determine the retry policy more flexibly
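A minimal sketch of client-side batching (Python; the chunking and retry policy shown are illustrative assumptions): the server keeps a simple single-item API while the client decides how to chunk, spread, and retry:

```python
def client_batch_update(items, update_one, chunk_size=2, max_attempts=3):
    """Client-side batching: the server only ever sees single updates,
    so the API stays simple; the client owns chunking and the retry
    policy, and could route each chunk to a different server.
    Returns the items that still failed after all retries."""
    failed = []
    for i in range(0, len(items), chunk_size):
        for item in items[i:i + chunk_size]:
            for attempt in range(max_attempts):
                try:
                    update_one(item)  # one simple server call per item
                    break
                except Exception:
                    if attempt == max_attempts - 1:
                        failed.append(item)
    return failed
```

Note how partial failure needs no special server-side semantics here: the client simply knows which individual items failed.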
Be aware of the risks in full replace
Full replacement means replacing an entire old object/resource with a new one in a mutation API. The API looks something like this:
```c
UpdateFoo(Foo newFoo);
```
This is a very common Mutation design pattern. However, such a pattern has some potential risks that API designers must be aware of.
Consider what happens under full replacement: the object Foo on the server may have gained a new member that a not-yet-updated client knows nothing about. Adding a member on the server is normally a compatible change, but if that member was set by one client that knows about it, a full replace from another client that does not know about it may silently overwrite the member.
A safer way to update is to use an update mask: explicit parameters in the API design that indicate which members should be updated.
```c
UpdateFoo {
  Foo newFoo;
  bool update_field1;  // update mask
  bool update_field2;  // update mask
}
```
Alternatively, the update mask can be expressed as repeated field paths, such as "a.b.c.d".
However, because APIs in this style are more complex to maintain and implement, not many adopt the pattern. Hence this section's title, "be aware of the risks," rather than a requirement to use update masks.
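For concreteness, applying a repeated-field-path update mask might look like this (a Python sketch under the "a.b.c.d" convention mentioned above, not any particular framework's implementation):

```python
import copy


def apply_update_mask(current: dict, new: dict, mask: list) -> dict:
    """Return a copy of `current` where only the fields named in `mask`
    are copied over from `new`; every other field, including members the
    sender does not know about, is left untouched."""
    result = copy.deepcopy(current)
    for path in mask:
        src, dst = new, result
        parts = path.split(".")
        for p in parts[:-1]:
            src = src[p]
            dst = dst.setdefault(p, {})
        dst[parts[-1]] = src[parts[-1]]
    return result
```

This is exactly the overwrite protection the section describes: a client that only intends to change `quota.limit` cannot clobber sibling fields it never heard of.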
Do not attempt to create your own error codes or error return mechanisms
API designers sometimes want to invent their own error codes, or their own mechanisms for returning errors, because each API has a wealth of detail the designer wants to surface to users, on the theory that "the user might need it." In practice, this usually just makes the API more complex and harder to use.
Error handling is a very important part of using an API. To make APIs easier to use, best practice is to adopt standard, uniform error codes rather than invent new ones per API. For example, HTTP has standard error codes [7], and the Google Cloud APIs are designed around a uniform set of error codes [5].
Why not create your own Error code mechanism?
- Error handling is the client's job, and the client cannot realistically attend to so many fine-grained error details; it typically distinguishes two or three cases at most. What the client cares about most is whether to retry the call or to propagate the error onward, not the fine distinctions between errors. A proliferation of bespoke error code mechanisms only complicates this.
- Some feel that more custom error codes convey more information, but that information is meaningless unless the client handles each code distinctly. If the goal is merely to pass information along, fields inside the error message serve just as well.
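To illustrate why standard codes suffice, here is a sketch of the retry decision a client typically makes (Python; the classification shown is a common convention for HTTP status codes, not a universal rule):

```python
# Standard HTTP status codes the client classifies; any finer detail
# rides along in the error body rather than in a bespoke code space.
RETRYABLE = {429, 503, 504}      # throttled / unavailable / gateway timeout
CALLER_ERROR = range(400, 500)   # fix the request; retrying won't help


def should_retry(status: int) -> bool:
    """The two-or-three-case decision a real client actually makes."""
    if status in RETRYABLE:
        return True
    if status in CALLER_ERROR:
        return False
    # Other 5xx server errors: usually safe to retry idempotent calls.
    return status >= 500
```

Notice that the whole decision fits in a few lines precisely because the codes are standard; a custom per-API code space would force every client to maintain its own mapping.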
More
For more design patterns, see the Google Cloud API design guide [5] and Microsoft's API design best practices [6]. Many of the issues above are also covered in those references, which additionally discuss common specification concerns such as versioning, pagination, and filtering; we do not repeat them here.
References

[1] Computer file, Wikipedia: en.wikipedia.org/wiki/Comput…
[2] Teacher Bai, domain model design series: yq.aliyun.com/articles/63…
[3] Idempotence, Wikipedia: en.wikipedia.org/wiki/Idempo…
[4] Compatibility, Google Cloud API design guide: cloud.google.com/apis/design…
[5] API design patterns, Google Cloud: cloud.google.com/apis/design…
[6] API design best practices, Microsoft: docs.microsoft.com/en-us/azure…
[7] HTTP status codes, Wikipedia: en.wikipedia.org/wiki/List_o…
[8] A Philosophy of Software Design, John Ousterhout