Author: Xiliu | Alibaba Cloud Function Compute expert
Spring Boot is a suite built on the Java Spring framework that comes preloaded with Spring components, allowing developers to create standalone applications with minimal configuration. In a cloud native environment, Spring Boot applications can run on a number of platforms, such as virtual machines and containers. One of the most attractive options, however, is to run them on a Serverless platform.
In this series of articles, I analyze the pros and cons of running Spring Boot applications on a Serverless platform from five aspects: architecture, deployment, monitoring, performance, and security. To make the analysis more representative, I chose Mall, an e-commerce application with more than 50K stars on GitHub, as the example. This is the fourth article in the series, and it shows you how to tune the performance of a Serverless application.
Instance startup speed optimization
In the previous tutorials, you have already experienced the appeal of Serverless: simply upload a code package or image, and you can launch an elastic, highly available web application.
However, the first startup still suffers from cold start latency. The Mall application takes about 30 seconds to start, so users will experience a long cold start delay, which is hardly a good experience in today’s “real-time era”. (A “cold start” occurs when no running instance is available to serve an invocation request: if a function receives no requests for a period of time, the Serverless platform reclaims its instances; when the next request arrives, the system has to pull up an instance again on the fly, and this process is called a cold start.)
Before optimizing the cold start, we should first analyze how much time each stage of a cold start takes.
First, enable link tracing on the service configuration page of the Function Compute (FC) console.
Make a request to the mall-admin service. After the request succeeds, check the FC console to see the corresponding request information. Note that “View function errors only” should be turned off so that all requests are displayed. Metrics and trace data are collected with some delay; if they do not show up, refresh after a short while. Locate the request marked with the cold start flag and click Request Details under More.
The trace shows the time spent in each stage of the cold start. A cold start includes the following steps:
- PrepareCode: mainly downloading the code package or image. Since we have enabled image acceleration, the full image does not need to be downloaded, so this step takes very little time.
- Runtime initialization: from starting the function until the Function Compute (FC) system detects that the application port is ready. This includes the application startup time. Run s mall-admin logs on the command line to check the corresponding log timestamps; the logs also show that starting the Spring Boot application accounts for most of this time.
- Application initialization: Function Compute provides an Initializer interface in which you can place initialization logic.
- Invocation delay: the time spent processing the request itself, which is very short.
The trace above shows that instance startup time is the bottleneck, and we can optimize it in a number of ways.
Using reserved instances
Java applications generally start slowly. During initialization an application needs to interact with many external services, which takes time; such steps are required by the business logic and their latency is hard to optimize away. Therefore, Function Compute provides reserved instances. A reserved instance is started and stopped entirely under the user’s control; it stays alive even when there are no requests, so it has no cold start problem. Of course, the user pays for the entire time the instance is running, even if it handles no requests.
In the Function Compute console, we can configure reserved instances for a function on the “Elastic Scaling” page.
In the console you configure the minimum and maximum number of instances: the platform keeps the minimum number of instances reserved, and the maximum number is the upper limit of instances for this function. You can also set scheduled reservation rules and metric-based reservation rules.
After a reservation rule is created, the system creates the reserved instances. Once they are in place, there is no cold start when the function is accessed again.
Optimize instance startup speed
Lazy initialization
In Spring Boot 2.2 and later, you can turn on a global lazy initialization flag. This speeds up startup at the cost of potentially longer latency on the first requests, because the components they touch are initialized on first use.
The following environment variable can be configured for the relevant application in s.yaml:
SPRING_MAIN_LAZY_INITIALIZATION=true
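The same behavior can also be enabled in the application’s own configuration instead of through an environment variable. A minimal application.yml sketch (the property is spring.main.lazy-initialization, available since Spring Boot 2.2):

spring:
  main:
    lazy-initialization: true   # initialize beans lazily, on first use

Either approach works; the environment variable has the advantage of requiring no change to the application code or packaging.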
Turn off the optimizing compiler
By default, the JVM uses multiple tiers of JIT compilation. While the higher tiers improve the efficiency of an application over time, they also increase memory usage and startup time. For a short-running Serverless application, consider turning this optimization off, trading long-term efficiency for a shorter startup time.
The following environment variable can be configured for the relevant application in s.yaml:
JAVA_TOOL_OPTIONS="-XX:+TieredCompilation -XX:TieredStopAtLevel=1"
Example of setting environment variables in s.yaml:
As shown in the figure below, configure the environment variables for the mall-admin function, then run sudo -e s mall-admin deploy to redeploy.
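For reference, here is a minimal sketch of what the relevant fragment of s.yaml might look like. It assumes the Serverless Devs fc component (devsapp/fc) and its environmentVariables field; the project name, region, and service name below are placeholders, so adapt them to the structure of your existing s.yaml:

edition: 1.0.0
name: mall-app                           # placeholder project name
services:
  mall-admin:
    component: devsapp/fc
    props:
      region: cn-hangzhou                # placeholder region
      service:
        name: mall-service               # placeholder service name
      function:
        name: mall-admin
        environmentVariables:
          SPRING_MAIN_LAZY_INITIALIZATION: "true"
          JAVA_TOOL_OPTIONS: "-XX:+TieredCompilation -XX:TieredStopAtLevel=1"

After updating s.yaml, redeploy for the changes to take effect.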
Log in to the instance to check whether environment variables are correctly configured:
Locate the request in the request list on the function details page of the console, and click the Instance Details link under More.
On the Instance Details page, click Login Instance.
Run the echo command in the shell, for example echo $JAVA_TOOL_OPTIONS, to check whether the environment variables are set correctly.
Note: For non-reserved instances, the Function Compute system automatically reclaims an instance after it has received no requests for a period of time, and the instance can then no longer be logged in to (the Login Instance button on the Instance Details page is grayed out). So make the call and log in promptly, before the instance is reclaimed.
Set proper instance parameters
After choosing an instance specification for the application, such as 2C4G or 4C8G, we want to know how many requests a single instance can handle, so that resources are fully used while performance is maintained, and so that the system can quickly scale out new instances when the number of requests exceeds that threshold, keeping the application running smoothly.
Instance load can be measured along several dimensions, for example QPS exceeding a certain threshold, or instance CPU/memory/network/load exceeding a threshold. Function Compute uses instance concurrency as the measure of an instance’s load and as the basis for instance scaling.
Instance concurrency is the number of requests an instance is allowed to execute at the same time. For example, setting the instance concurrency to 20 means that an instance can execute at most 20 requests at any one moment.
Note: do not confuse instance concurrency with QPS. Roughly, per-instance QPS ≈ instance concurrency / average request latency; for example, with a concurrency of 20 and an average latency of 100 ms, a single instance can serve at most about 200 QPS.
Using instance concurrency to measure load has the following advantages:
- The system can obtain the instance concurrency metric quickly enough to scale out and in. Instance-level metrics such as CPU, memory, network, and load are collected in the background and take tens of seconds to aggregate before scaling can act on them, which cannot meet the elastic scaling requirements of online applications.
- The instance concurrency metric reflects the system load stably under a variety of conditions. If request latency were used as the metric instead, the system would find it hard to tell whether latency rose because the instances were overloaded or because a downstream service became the bottleneck. A typical web application, for example, accesses a MySQL database; if the database becomes the bottleneck and request latency rises, scaling out at that point is not only pointless but makes things worse by putting even more pressure on the database. QPS is tied to request latency and suffers from the same problem.
Despite the above advantages, users often do not know what instance concurrency to set. I recommend the following process to determine a reasonable value:
1. Set the maximum number of instances of the function to 1, so that you are measuring the performance of a single instance.
2. Use a load testing tool to put pressure on the application and watch metrics such as TPS and request latency.
3. Increase the instance concurrency step by step; if performance is still good, keep increasing it; if performance falls below expectations, turn the concurrency back down.
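Once a reasonable value has been found, it can be written back into the function configuration. The following is a minimal, hypothetical s.yaml fragment, again assuming the devsapp/fc component; the instanceConcurrency field follows that component's function properties, so verify the exact field name against the component documentation for your version:

services:
  mall-admin:
    component: devsapp/fc
    props:
      function:
        name: mall-admin
        instanceConcurrency: 20   # example value determined through load testing

Redeploy after the change, then rerun the load test to confirm that throughput and latency still meet expectations.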
Related links
1) Spring Boot:
spring.io/projects/sp…
2) Mall:
github.com/macrozheng/…
3) Serverless Devs installation documentation:
serverlessdevs.com/zhcn/docs/i…
4) Function Compute:
www.aliyun.com/product/fc