What is API Everything?



First, a brief introduction: an API here means the service interface through which front ends, such as the Web, access the back end. There is an isolation layer in the middle that adapts back-end services for external access. The isolation exists for security reasons, and also to handle protocol conversion.

Of course, there are many other considerations on top of this. In the early days of Ele.me, most of our Web API layers were handwritten: each application's back-end team wrote its own Web API, deployed it independently, and exposed HTTP APIs for the front end to call.



At that time, the business was developing rapidly. In order to respond quickly, some business logic was placed in the Web API layer, and the Web API layer would even access the database and perform database operations directly.

Accessing the database directly at the API layer causes security issues and should not be allowed. Another question: what style should the HTTP API between front end and back end use?

There are RESTful and JSON-RPC styles, and the choice needs to be unified; otherwise front-end developers face an inconsistent experience.
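As a purely illustrative contrast (the paths, method names, and payloads below are made up, not Ele.me's actual APIs), the same operation looks quite different in the two styles:

```text
# RESTful style: resource in the URL, verb in the HTTP method
GET /users/42/orders?status=paid HTTP/1.1

# JSON-RPC style: one endpoint, method name and params in the body
POST /api HTTP/1.1
Content-Type: application/json

{"method": "order.getUserOrders", "params": {"userId": 42, "status": "paid"}, "id": 1}
```

Mixing the two across teams forces front-end developers to switch mental models per service, which is why a unified choice matters.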

In addition, HTTP API documentation easily becomes outdated and stops reflecting the implementation: the code changes, but the documentation does not.

Finally, front-end and back-end developers work in parallel, but at different paces. For example, the front end may be in active development while the back end gets pulled onto a more urgent project and cannot deliver the current project's API in time.

As a result, front-end development may be delayed and the two sides fall out of sync, with front-end and back-end resources waiting on each other, leading to a poor development experience and low efficiency.

Requirements Research – API Everything


Based on these situations, the company considered whether a unified API framework could be built, so we surveyed various departments and found that they all needed one; several had already written their own Web API layers independently.

The following figure shows some of the requirements gathered in the survey:



From the figure, we can see that a very important function is mapping HTTP requests to SOA services, along with authentication and authorization, such as the company's SSO, Ele.me's user authentication, and API deployment and operations.

API orchestration calls multiple APIs, takes parts of the result returned by each, assembles them into a new result, and returns it to the front end.

With orchestration, one front-end API call can actually invoke multiple back-end APIs and assemble a single result for the front end, reducing the number of round trips the front end makes to the back end and improving the front-end user experience.

In addition, because the back end already has many basic service interfaces, new business development often does not require the back end to provide new interfaces; it only needs to combine and trim the existing ones.
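The orchestration idea can be sketched roughly as follows (the service names, fields, and `fetch` helpers are hypothetical, invented for illustration, not Ele.me's real APIs): one front-end call fans out to two back-end APIs and assembles a trimmed result.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

public class OrchestrationSketch {
    // Hypothetical back-end call; a real one would be an SOA/HTTP client.
    static CompletableFuture<Map<String, Object>> fetchUser(long id) {
        return CompletableFuture.supplyAsync(() -> {
            Map<String, Object> user = new HashMap<>();
            user.put("id", id);
            user.put("name", "alice");
            user.put("internalFlags", 7); // a field the front end never needs
            return user;
        });
    }

    // Second hypothetical back-end call.
    static CompletableFuture<Map<String, Object>> fetchLastOrder(long userId) {
        return CompletableFuture.supplyAsync(() -> {
            Map<String, Object> order = new HashMap<>();
            order.put("orderId", 1001L);
            order.put("status", "delivered");
            return order;
        });
    }

    // One front-end API: call both back-end APIs, keep only the needed fields.
    public static Map<String, Object> userHomePage(long userId) {
        Map<String, Object> user = fetchUser(userId).join();
        Map<String, Object> order = fetchLastOrder(userId).join();
        Map<String, Object> result = new HashMap<>();
        result.put("name", user.get("name"));               // part of API 1's result
        result.put("lastOrderStatus", order.get("status")); // part of API 2's result
        return result;
    }

    public static void main(String[] args) {
        System.out.println(userHomePage(42));
    }
}
```

The front end makes one request and receives one compact payload instead of calling the back end twice and merging results in the browser.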

There are also API documentation, API testing, API mocking, rate limiting, anti-crawling (when interfaces are exposed to the public, crawlers will scrape them, so how do we protect them?), and gray releases.

In short, API Everything makes back-end SOA services safely and reliably available to the various ends (Web/App) through APIs.

Product and technical design principles


After the research, we needed to decide what principles should guide the product and system design.



API Everything is a base framework, and the stability of the base framework is a priority.

Even if not all functional requirements can be met immediately, as long as the framework is stable, the applications connected to it will not break; the service can then be improved, and new functions and features added, on a stable foundation. In other words, stability is paramount.

Then comes performance, including throughput and response speed; high performance saves hardware resources. High availability means avoiding single points of failure. Fault tolerance means considering external dependencies: what to do when they fail, and how to degrade gracefully.

Also, API Everything sits in front of back-end application systems, so external traffic must not be allowed to overwhelm them. We have to consider how to make the system more robust, how it protects itself, and how it protects the application systems behind it.

Secondly, on this basis, we considered how to handle DevOps: providing self-service DevOps for the connecting teams, including metric dashboards, alert monitoring, error troubleshooting, log/trace/exception viewing, and self-service scaling.

Ultimately, we hope this layer can be transparent to the connecting teams and scale automatically. Problems caused by the API Everything framework are solved by us; problems arising in the connected applications are solved by those teams, with very clear boundaries.

Common problems can be solved automatically, incident context and logs can be captured automatically, self-healing becomes possible, and the connecting team knows what happened.

In addition, how do we make our R&D and application developers "lazier"? By automating the things they do all the time.

For example, connecting an application used to require various configurations, which were tedious and error-prone. Could we make onboarding and configuration automatic?

We mentioned earlier that code and documentation fall out of sync, so instead of writing separate documents, we put the documentation inside the code.

For example, when we write Javadoc comments in Java code, we extract them as part of the API documentation. In addition, we provide annotations to help complete the document.
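The annotation side of this can be sketched like so (the `@ApiDoc` annotation and the `OrderService` interface are invented for illustration; the framework's real annotations are not documented here): a doc generator reads annotations off the compiled interface via reflection, so the documentation cannot drift from the code.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

public class DocExtractionSketch {
    // Hypothetical documentation annotation, retained at runtime for reflection.
    @Retention(RetentionPolicy.RUNTIME)
    public @interface ApiDoc {
        String value();
    }

    // A service interface whose annotations double as its API documentation.
    public interface OrderService {
        @ApiDoc("Returns the current status of an order.")
        String getOrderStatus(long orderId);
    }

    // Walk the interface and turn each annotated method into a doc line.
    public static List<String> extractDocs(Class<?> service) {
        List<String> docs = new ArrayList<>();
        for (Method m : service.getDeclaredMethods()) {
            ApiDoc doc = m.getAnnotation(ApiDoc.class);
            if (doc != null) {
                docs.add(m.getName() + ": " + doc.value());
            }
        }
        return docs;
    }

    public static void main(String[] args) {
        System.out.println(extractDocs(OrderService.class));
    }
}
```

Javadoc comments are not available via runtime reflection, so in practice they would be pulled out at build time from the source files; annotations cover the runtime-visible part of the document.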

User experience is also a factor, because technical products are usually built directly by engineers who, in the pursuit of completing features, rarely consider the user experience, resulting in awkward operation.

Therefore, we tried to learn from these lessons and consider the user's experience. If one click is enough, don't require two. We should not expose technical complexity for users to understand and operate; users should simply enjoy the product.

The framework involves a lot of configuration scattered across different systems. Our idea is to do it all in one place, without making users understand which part of API Everything manages what, or which system to operate on.

The other principle is meeting different functional requirements, such as supporting different protocols. These are the principles behind the whole product design.

The life cycle



From the API's perspective, the life cycle starts with API development, which includes documentation and mocking. After development comes API management: authorizing who can access which APIs, and gray-release control.

API management is followed by the API gateway service, which is the running service: it performs protocol conversion for the API, for example converting HTTP into SOA calls to back-end SOA services. Finally comes API operations and maintenance, which covers monitoring, deployment, and scaling of APIs.

Product planning


Based on the product design principles and the API life cycle, we planned the following products, as shown below:



Development is supported by API Portal, and runtime by Stargate Cluster. There is also quality assurance, API Robot, which ensures API quality through automated regression testing.

In addition, it is important to consider how the front and back ends stay in sync during development. How do we mock API data so the two sides can work independently? This led to Mock Server, giving us the four products as planned.

But what is the relationship between these four products? Consider it in terms of how the systems interact.



As shown in the figure above, starting from the bottom, front-end applications (such as the circle front end) send HTTP requests to the Nginx Cluster, which forwards them on to the SOA services.

The gray paths lead to the circle management back-end application and the circle service; there is also the red path, where the front end accesses the Mock Server by adding a query string to the URL.

When the Stargate Cluster sees this query string, it does not send the request to the back-end SOA service but routes it to the specified Mock Server, so the front end receives mocked data.
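The routing decision can be sketched roughly like this (the query-string key `mock_server` and the target names are assumptions made up for illustration; the gateway's real parameter is not documented here):

```java
public class MockRoutingSketch {
    // Decide where to forward a request: if the query string names a
    // mock server, route there instead of the real back-end SOA service.
    public static String route(String queryString, String soaBackend) {
        if (queryString != null) {
            for (String pair : queryString.split("&")) {
                String[] kv = pair.split("=", 2);
                if (kv.length == 2 && kv[0].equals("mock_server")) {
                    return kv[1]; // hypothetical parameter naming the mock target
                }
            }
        }
        return soaBackend; // normal path: the real SOA service
    }

    public static void main(String[] args) {
        System.out.println(route("a=1&mock_server=mock-01", "soa-order-service"));
        System.out.println(route("a=1", "soa-order-service"));
    }
}
```

Because the switch lives in the URL, the front end can flip between real and mocked back ends without any redeployment.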

The API Portal is responsible for API documentation; each document corresponds to the environment it is deployed in and is displayed on the API Portal.

There are several environments in Ele.me:

  • Alpha test environment, provided for development use.
  • Beta environment, provided for testing to verify readiness for release; only changes that pass Beta testing can go live. The Beta environment is also used for joint debugging with other teams.
  • Prod environment for online production. On the API Portal, you can see exactly which version of an application is currently deployed and which API documents it exposes; the API Portal obtains this by reading the Stargate Cluster's deployment information.

API Robot obtains the API definitions from the API Portal and sends test requests to the Stargate Cluster.

Ele.me's internal services use an SOA architecture; the services depend on one another, so testing requires coordination.

Sometimes our SOA service depends on another team's SOA service that has not been finished yet. What do we do if we want to test our own service?

At this point we can use Mock Server to mock the other side's SOA services, write Mock Cases, and carry out our own development and testing.



Code is documentation, so we standardize how code is written; once comments and annotations are in place, the documentation can be extracted from them automatically into an API document.

The Web API layer no longer needs to be written manually. Currently we automatically generate the Web API code and then automatically deploy it.

Deployment works by listening for SOA service deployment messages; when one is received, the Web API code is automatically generated and deployed.

Mocks are also generated automatically: when you create a Mock Case, the corresponding data is generated automatically and can be modified by hand. There is also API monitoring and alerting, with automatic link monitoring for every connected application.

Stargate Cluster technology architecture



This is the technical architecture of one of the products, Stargate Cluster, shown above.

In the diagram above, ELESS is our build system; we listen for its build messages and, when a build arrives, call the base.stargate_core service.

The implementation of this service is very simple: the information Stargate Cluster needs is stored in MaxQ, an MQ product developed by Ele.me that is now widely used internally.

Finally, the Stargate Cluster operations management service takes messages from MaxQ for processing.

Why this design? Because the Stargate Cluster operations management service is under heavy iterative development, with features (such as multi-site active-active) constantly being added, requiring frequent deployments; it is relatively unstable.

We wanted build and deployment messages never to be lost, so we created the stargate_core service: it is very simple, does not iterate on features, stays unchanged, and is therefore more reliable. All it does is put build and deployment messages into MaxQ.

Even while Stargate operations management is being redeployed and restarted, the data in MaxQ is not lost and can be consumed and processed again after the restart. The system's reliability comes from isolating what changes from what does not.
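The isolation idea can be sketched with a queue standing in for MaxQ (all names here are illustrative, and a `BlockingQueue` substitutes for the real durable, external MQ): the tiny, never-changing producer only enqueues, while the frequently redeployed consumer can restart and simply drain what accumulated in the meantime.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class IsolationSketch {
    // Stand-in for MaxQ; in reality the queue is durable and external,
    // so messages survive consumer restarts.
    static final BlockingQueue<String> maxq = new LinkedBlockingQueue<>();

    // stargate_core's whole job: accept a build/deploy message and enqueue it.
    // It never changes, so it is the reliable part of the system.
    public static void stargateCore(String buildMessage) {
        maxq.add(buildMessage);
    }

    // The ops-management consumer: iterates frequently and may restart at
    // any time; on coming back up it drains whatever is waiting.
    public static int consumeAll() {
        int processed = 0;
        while (maxq.poll() != null) {
            processed++; // the corresponding Web API would be deployed here
        }
        return processed;
    }

    public static void main(String[] args) {
        stargateCore("build: order-service v42");
        stargateCore("build: user-service v7");
        // the consumer was "down" while both messages arrived; after restart:
        System.out.println(consumeAll());
    }
}
```

The design choice is that reliability lives in the piece that never changes; the piece that changes constantly is allowed to fail because the queue absorbs the gap.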

Stargate Cluster is deployed based on Docker



Why can deployment be automatic? It's actually quite interesting: we use a Docker environment here.

Starting from the bottom: when an SOA service is deployed through ELESS, we receive the deployment message and put it into MaxQ.

Then we take the message out of MaxQ, call AppOS (the Docker platform developed by Ele.me), and start the corresponding Docker instances.

Each instance runs the Web API code corresponding to the SOA service. When a Docker instance starts, it calls Navigator (the Nginx management platform developed by Ele.me) to register the instance's IP with Nginx, so external traffic reaches these Docker instances.

After the new Docker instances start successfully, the Stargate Cluster calls AppOS to destroy the previous version's instances, and Navigator removes the corresponding IPs from Nginx. That completes an automatic deployment.



This figure shows some information from our automatic deployment. In the upper left corner is the PushSeq of the SOA service deployment, followed by the PushSeq used by the client.

When these two PushSeq values match, it means that during the SOA service deployment, Stargate Cluster automatically deployed the corresponding Web API side, and both sides are running the same version. We also save version information for each deployment.

API Portal – Automated documentation



With the Stargate Cluster covered, let's look at the API Portal.

During system deployment, API Portal obtains the deployed source files, automatically parses the code, and generates an API document from the comments and annotations, stored in Swagger format.

At first, we displayed the documentation with Swagger's native UI. Front-end developers were not comfortable with it, so we switched to what the front end preferred: displaying the documents as tables. The table view above was developed for this reason.

What about Swagger's native UI? API Portal has a feature called Try It Out, which back-end developers use to see what their API actually returns; if the result is not right, they modify the API until they are satisfied.

The front end also uses this feature to see what an API looks like before real data exists. Try It Out uses Swagger's native UI, and since the back end prefers that page, it stays. In this way the user experience of both sides is satisfied, and everyone gets what they need.

Here is a Swagger document interface:



The Mock Server process



What is a Mock Server?

Take a look at the scenario in which the SOA Client tests the Service Provider.

The Service Provider (the server under test) depends on external services, such as Server Cluster 1, but those services are unavailable for various reasons. We use Mock Server to simulate Server Cluster 1.

With the dependency problem resolved, the SOA Client can test the Service Provider normally.

Mock Server can also be used when a dependent service needs to return data for a specific scenario that is hard to set up for real. It is much easier to write different Mock Cases in Mock Server to produce those scenarios.

What if an SOA environment has many dependencies, but the interfaces are unfamiliar or the environment is unreliable?

We use Mock Server to mask all of these services so that we only need to test the service we actually care about; this is very effective for resolving dependency issues in SOA environments.

Mock Server – Automatic parsing



Ele.me has a private Maven repository, and SOA services call one another through the API artifacts published there.

The diagram above shows our Mock Server interface. On the left is the Maven dependency input for the services to be mocked. The operation essentially continues the SOA invocation flow described earlier: fill in the external dependency in the box above and specify the dependency's interface in the Mock Server.

Mock Server then pulls these dependencies from the private Maven repository, automatically analyzes which classes they contain, and displays all the methods in each class.

The methods on the right of the image are the result of this automatic analysis; the one highlighted in yellow is the method to mock. Clicking the plus sign on the right creates a Mock Case, named here "test arbitrary arguments".

Automatically generate Mock cases



A Mock Case works as follows: when the Mock Server receives data matching the Input of a Mock Case, it returns that Mock Case's Output as the response.

I just created a Mock Case called "test arbitrary parameters". Its values are generated automatically based on the analyzed Model (the data definition).

For example, when Input declares "type": Integer and we leave out enum[234567], the Mock Case will match any integer in the request and return the Output.

If enum[234567] is added, the Mock Case will only match when the request parameter is 234567.

Output supports functions, and the Preview above shows the result of evaluating the Output expression. When the Mock Case is hit, the Preview result is returned as the response.
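The matching rule can be sketched like so (simplified to a single integer parameter; the real Mock Server's data model is richer and the class here is invented): with no enum constraint any integer hits the case, while with enum[234567] only that exact value does.

```java
import java.util.Optional;

public class MockCaseSketch {
    final Integer enumValue; // null means "match any integer"
    final String output;     // the response returned on a hit

    public MockCaseSketch(Integer enumValue, String output) {
        this.enumValue = enumValue;
        this.output = output;
    }

    // Return the Output if the request's integer parameter hits this case.
    public Optional<String> match(int requestValue) {
        if (enumValue == null || enumValue == requestValue) {
            return Optional.of(output);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        MockCaseSketch any = new MockCaseSketch(null, "{\"ok\":true}");
        MockCaseSketch only234567 = new MockCaseSketch(234567, "{\"ok\":true}");
        System.out.println(any.match(5).isPresent());             // any integer hits
        System.out.println(only234567.match(5).isPresent());      // miss
        System.out.println(only234567.match(234567).isPresent()); // hit
    }
}
```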

Front-end and back-end development separation



Back-end developers complete the API definition by adding annotations and comments to the code interfaces according to the PRD.

After the code is checked in, the build system detects the change and notifies the API Portal, which automatically regenerates the API document. The back end then notifies the front end on the API Portal, and the two sides discuss and confirm the API document there.

Based on this document, the front end develops against the mocks provided on the API Portal. More complex interactions require the back-end APIs to produce different data.

The front end handles these complex flows by constructing different Mock Cases. Once its development is complete, it moves on to other work; when the back-end API is ready, the back end notifies the front end to do joint debugging.

In the past, without front/back-end separation, the two sides would wait for each other: front-end development could only proceed after the back-end API started returning data, which made for a poor front-end development experience. Now the front end can finish its development without waiting for the back end.

Once the back end has finished as well, the two sides agree on a time to joint-debug. That is the whole front/back-end separated development process.

The application practice



This process has been applied in our delivery-scope projects. Here are statistics from one iteration, which show what happened during development.

For example, the original estimate was 5 days of work, but it was finished in 3. The back end also saved time: previously, back-end developers would either wait for front-end calls to tell them what the API returned, or write test scripts to inspect the data.

Now, having written the API, they can see on the API Portal whether the back-end API returns the correct data, so back-end development time is reduced.

Also, the back end used to have Web API code to write and deploy; now it no longer has to write or deploy it.

In the past, joint debugging took a long time, about two days, because development time leaked into it: the back-end developer was not sure whether the data was correct, so some handling was left unwritten, and extra development happened only after seeing the real data.

Now joint debugging takes only half an hour, just to confirm everything works. The whole development experience is good, and overall development time has been reduced by about 50%.

Is the problem really solved?



With these products in place, let's see whether all the problems mentioned earlier have been solved. When we write business logic today, it enters the system in a unified way and sinks into the SOA layer, so business logic is no longer scattered everywhere.

When we defined APIs, we also provided a default API generation method using JSON-RPC, with method signatures and so on automatically mapped to URLs.
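The default URL generation can be sketched via reflection (the `/service/method` pattern and the `UserService` interface are assumptions for illustration, not Ele.me's actual convention):

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

public class JsonRpcUrlSketch {
    // A sample SOA interface; its method signatures become URLs.
    public interface UserService {
        String getUserName(long userId);
    }

    // Derive one JSON-RPC style URL per method: /<service>/<method>.
    public static List<String> urlsFor(Class<?> service) {
        List<String> urls = new ArrayList<>();
        for (Method m : service.getDeclaredMethods()) {
            urls.add("/" + service.getSimpleName() + "/" + m.getName());
        }
        return urls;
    }

    public static void main(String[] args) {
        System.out.println(urlsFor(UserService.class)); // one URL per method
    }
}
```

Because the URL is derived mechanically from the signature, adding a method to the SOA interface yields a callable endpoint without anyone hand-writing a route.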

Mock services are also provided to help separate front-end and back-end development, along with the code-as-documentation piece. But are these problems really solved?

Well, far from it; there is more to do:

  • There are many hops in the whole call chain. How do we locate problems quickly?
  • When a failure occurs: can the scene be preserved automatically? Can basic analysis be performed and its results saved? Can the system heal itself?
  • Can we take snapshots of background services and data, and automate regression tests using a Docker environment and previously recorded traffic?
  • Should we use async web frameworks to improve performance? Should we use Go?
  • When a business requirement arrives: can existing APIs be combined to fulfill it, or is new development needed? Do we need intelligent analysis of each API's business attributes? Do we need search and recommendation for business development?


On Team Thinking



We’ve been thinking about how to provide a better service for development, test, and operations roles. That is, how to be lazier and more automated.

For example, in API operations and maintenance, can the connected teams be left completely unaware of the work? And for our API Everything team, facing multiple connecting teams at the same time, how do we handle and automate that?

In fact, we don’t want too many repetitive things to be repeated, but to think about how to free ourselves to do more automatic things through tools. First of all, we can free ourselves to save users.

Finally, we also want to think about problems from the root and solve them there.