Deep learning is now widely applied, and serving AI inference online is one of its most important production scenarios. This article walks through best practices for deploying a deep learning AI inference service on Function Compute: installing third-party dependencies and deploying with one click using the Fun tool, local debugging, and load testing. Along the way it demonstrates Function Compute's agile development experience, automatic elastic scaling, freedom from server operations, and complete monitoring facilities.
1.1 Demo overview
Upload a photo of a cat or a dog, and the service identifies which animal is in the picture.
- DEMO Example Entry: sz.mofangdegisn.cn
- DEMO Project Address: github.com/awesome-fc/…
Activate the services:
- Function Compute: free to activate, billed pay-as-you-go, with a generous free tier
- NAS (file storage): free to activate, billed by usage
1.2 Solution
As shown above, when many users access the inference service through the provided URL, even hundreds or thousands of requests per second are not a problem: the Function Compute platform scales automatically, providing enough execution instances to respond to user requests, while also offering complete monitoring facilities for your functions and their operation.
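Concretely, the inference endpoint is an HTTP-triggered function. On Function Compute's Python runtime, an HTTP function uses a WSGI-style signature; below is a minimal sketch of what such a handler might look like. The `predict` helper is a hypothetical placeholder standing in for the project's real Keras inference code:

```python
# Minimal sketch of an FC Python HTTP-trigger handler (WSGI-style signature).
# `predict` is a hypothetical placeholder for the real Keras inference code.

def predict(image_bytes):
    # A real implementation would decode the uploaded photo and run the
    # loaded Keras model to classify it as a cat or a dog.
    return "cat"

def handler(environ, start_response):
    # The uploaded photo arrives as the raw HTTP request body.
    try:
        body_size = int(environ.get("CONTENT_LENGTH", 0))
    except (TypeError, ValueError):
        body_size = 0
    image_bytes = environ["wsgi.input"].read(body_size)

    label = predict(image_bytes)

    start_response("200 OK", [("Content-Type", "text/plain")])
    return [label.encode("utf-8")]
```

The actual handler in the demo repository is authoritative; this sketch only shows the request/response shape.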
1.3 Comparison between the Serverless solution and a traditional self-built service
1.3.1 Excellent engineering efficiency
| | Self-built service | Function Compute (Serverless) |
|---|---|---|
| Infrastructure | Must be procured and managed by the user | None |
| Development efficiency | Besides the necessary business logic, you must build an identical online runtime environment: software installation, service configuration, security updates, and so on | Focus only on business-logic development; Fun orchestrates and deploys resources in one click |
| Learning cost | May involve Kubernetes or Elastic Scaling (ESS), each with many products, terms, and parameters to understand | Just write the function code in the corresponding language |
1.3.2 Elastic Scaling Without O&M
| | Self-built service | Function Compute (Serverless) |
|---|---|---|
| Elastic high availability | Requires a self-managed Server Load Balancer (SLB); scaling out and in is slower than FC | FC scales elastically at millisecond granularity, rapidly expanding the underlying resources to absorb peak load, with no O&M needed |
| Monitoring, alerting, and queries | ECS-level metrics only | Fine-grained per-invocation metrics such as execution latency and logs, plus a more complete alerting mechanism |
1.3.3 Lower cost
- Function Compute (FC) has built-in automatic scaling and load balancing, so users do not need to purchase Server Load Balancer (SLB) or Elastic Scaling separately.
- For workloads with pronounced peaks and troughs (for example, requests arrive only during part of the day and not at all otherwise), pay-as-you-go billing means you pay only for the compute resources actually used. For such spiky or sparse traffic, costs stay low while elasticity is preserved; as the business grows there is no technology switch-over cost, and adding prepaid capacity keeps the growth of the bill smooth.
- For the portion of traffic that is steady and sustained, the higher pay-as-you-go unit price can be addressed with prepaid capacity; see the Function Compute cost-optimization best-practices documentation.
Assume an online compute service. Since it is CPU-intensive, we take average CPU utilization as the core cost metric. Consider a one-month period with peak compute power equivalent to 10 C5 ECS instances and average utilization of about 30%. The CPU resource usage of each solution is as follows:
The following billing model can be estimated from the figure above:
- Function Compute prepaid, 3 CU for one month: 246.27 yuan, compute power roughly equivalent to one ECS compute-optimized C5
- ECS compute-optimized C5 (2 vCPU, 4 GB) + cloud disk: 219 yuan per month subscription; 446.4 yuan pay-as-you-go
- 10 Mbps SLB: 526.52 yuan per month (a certain traffic assumption is made here); Elastic Scaling itself is free
- At saturation, Function Compute pay-as-you-go costs roughly twice as much as a pay-as-you-go C5 ECS
| | Average CPU utilization | Compute cost (yuan) | SLB (yuan) | Total (yuan) |
|---|---|---|---|---|
| FC mixed billing | >= 80% | 738 + X (246.27 × 3 + X) | None | <= 738 + X |
| ECS reserved for peak | <= 30% | 2190 (10 × 219) | 526.52 | >= 2716.52 |
| ECS auto scaling, latency-sensitive | <= 50% | 1314 (10 × 219 × 3/5) | 526.52 | >= 1840.52 |
| ECS auto scaling, cost-sensitive | <= 70% | 938.57 (10 × 219 × 3/7) | 526.52 | >= 1465.09 |
Notes:
- We assume the function logic incurs no public-network downstream traffic cost, so that cost is left out of the comparison for now
- Latency-sensitive: scaling must start when CPU utilization reaches about 50%, or expansion cannot keep up with the peak
- Cost-sensitive: scaling starts when CPU utilization is about 80%, tolerating a certain rate of timeouts or 5XX errors
In the table above, X in the FC mixed-billing row is the cost of the pay-as-you-go portion. Assume pay-as-you-go covers 10% of the total compute and runs at 100% CPU utilization; per the table, that corresponds to the compute power of 3 ECS instances. The FC pay-as-you-go cost is therefore X = 3 × 446.4 × 10% × 2 = 267.84 (FC pay-as-you-go is 2× the ECS pay-as-you-go price), giving an FC mixed-billing total of about 1005.8 yuan. By this model, as long as the FC pay-as-you-go share stays below 20% of total compute, FC retains an advantage even if SLB is ignored and only compute cost is considered.
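The arithmetic behind the table and the X calculation can be reproduced in a few lines (all prices are taken from the figures above; the 2× pay-as-you-go ratio and the 10% overflow share are the assumptions stated in the text):

```python
# Cost model from the text above; all amounts in yuan per month.
ECS_SUBSCRIPTION = 219.0   # one C5 (2 vCPU, 4 GB) + cloud disk, subscription
ECS_PAYG = 446.4           # one C5, pay-as-you-go
FC_PREPAID_3CU = 246.27    # FC prepaid 3 CU, roughly one C5's compute power
SLB = 526.52               # 10 Mbps SLB (traffic assumption from the text)

# FC mixed billing: 3 prepaid units plus pay-as-you-go overflow X.
# X assumes 10% of total compute at 100% utilization = 3 ECS-equivalents,
# billed at 2x the ECS pay-as-you-go rate.
prepaid = 3 * FC_PREPAID_3CU            # 738.81 (the text rounds to 738)
x = 3 * ECS_PAYG * 0.10 * 2             # 267.84
fc_total = prepaid + x                  # ~1006, the text's ~1005.8

# Self-built alternatives (10 C5 at peak, plus SLB).
ecs_peak = 10 * ECS_SUBSCRIPTION + SLB               # 2716.52
ecs_latency = 10 * ECS_SUBSCRIPTION * 3 / 5 + SLB    # 1840.52
ecs_cost = 10 * ECS_SUBSCRIPTION * 3 / 7 + SLB       # ~1465.09

print(fc_total, ecs_peak, ecs_latency, ecs_cost)
```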
1.3.4 Summary
The main advantages of running CPU-intensive AI inference on Function Compute:
- Easy to get started; you focus only on business-logic development, greatly improving project efficiency. By contrast, a self-built solution carries much more learning and configuration cost, for example tuning different ESS parameter configurations for different scenarios, plus system environment maintenance and upgrades
- No O&M required, with fine-grained monitoring and alerting on function execution
- Millisecond-level elastic scaling guarantees elastic high availability and covers both latency-sensitive and cost-sensitive workloads
- In CPU-intensive compute scenarios, a sensible combination of billing modes yields a cost advantage when:
  - traffic has pronounced peaks and troughs, with no requests at all at other times
  - there is a stable base load of requests, plus periods of sharply fluctuating request volume
2. Package the code into a ZIP and deploy the function
Short video tutorial on Fun operations
2.1 Install third-party packages locally and upload them to the NAS
2.1.1 Install the latest Fun
- Install Node.js, the latest release of 8.x, 10.x, or 12.x
- Install funcraft
2.1.2 Clone the project and install third-party libraries locally with one Fun command
git clone https://github.com/awesome-fc/cat-dog-classify.git
- Copy the .env_example file to .env and change the values in .env to your own information
- Run `fun install -v`; Fun installs the related dependency packages according to the logic defined in the Funfile
```
root@66fb3ad27a4c: ls .fun/nas/auto-default/classify
model  python
root@66fb3ad27a4c: du -sm .fun
697     .fun
```
According to the Funfile's definition:
- third-party libraries are downloaded into the `.fun/nas/auto-default/classify/python` directory
- the local model directory is moved into the `.fun/nas/auto-default/model` directory
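The Funfile itself uses a Dockerfile-like syntax to describe these install steps. The repository's actual Funfile is authoritative; the fragment below only illustrates the shape, and the package names are assumptions rather than the project's real contents:

```
RUNTIME python3
RUN fun-install pip install tensorflow
RUN fun-install pip install keras
```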
After the installation, the code package for the function has reached almost 700 MB, far beyond the 50 MB code-package limit. The solution is NAS: mount a NAS volume and load the dependencies from there. Fortunately, Fun also solves NAS configuration and file upload in one click.
2.1.3 Upload the downloaded third-party code packages and model files to the NAS
```
fun nas init
fun nas info
fun nas sync
fun nas ls nas://classify:/mnt/auto/
```
Run these commands in sequence to upload the third-party code packages and model files under .fun/nas/auto-default to the NAS.
- `fun nas init`: initializes the NAS based on the information in your .env (reusing an existing NAS that meets the criteria, or creating an available one in the same region)
- `fun nas info`: shows the local NAS directory; for this project it is $(pwd)/.fun/nas/auto-default/classify
- `fun nas sync`: uploads the local NAS content (.fun/nas/auto-default/classify) to the classify directory on the remote NAS
- `fun nas ls nas:///mnt/auto/`: checks whether the files were uploaded to the NAS correctly
Log in to the NAS console (nas.console.aliyun.com) and the VPC console (vpc.console.aliyun.com) to verify that the NAS device and the corresponding VPC were created successfully in the specified region.
2.2 Local debugging functions
In template.yml, the function is declared as HTTP type, so we follow Fun's tips:
```
Tips for next step
======================
* Invoke Event Function: fun local invoke
* Invoke Http Function: fun local start
* Build Http Function: fun build
* Deploy Resources: fun deploy
```
Running `fun local start` launches a local HTTP server that simulates the function's execution. We can then test it with Postman, curl, or a browser, as in this example:
2.3 Deploying Functions on the FC Platform
Once local debugging is OK, we deploy the function to the cloud:
Modify the LogConfig in template.yml to use a Project name that is not already taken, then run:
fun deploy
Note: the commented-out section in template.yml is the custom domain name configuration. If you want `fun deploy` to set that up as well:
- First configure DNS resolution: in this example, the domain sz.mofangdegisn.cn is resolved to 123456.cn-hangzhou.fc.aliyuncs.com; replace the domain name, account ID, and region with your own
- Uncomment the section in template.yml and change it to your own domain name
- Run `fun deploy`
If you do not have a custom domain name and open the HTTP trigger URL directly in a browser, e.g. https://123456.cn-shenzhen.fc.aliyuncs.com/2016-08-15/proxy/classify/cat-dog/, the response will be forced to download as an attachment.
Reason: https://help.aliyun.com/knowledge_detail/56103.html#HTTP-Trigger-compulsory-header
Log in to the console (fc.console.aliyun.com); you can see that the service and function were created successfully and that the service is configured correctly.
At this point, we notice that the first page request takes a very long time, because the execution environment instance must cold start. For an online AI inference service that is sensitive to response time, such cold-start latency spikes are unacceptable. Next, this article explains how to eliminate the negative effects of cold starts with Function Compute's reserved mode.
3. Use reserved mode to eliminate cold-start latency spikes
Function Compute scales dynamically: based on the number of concurrent requests, it automatically and elastically expands the pool of execution environments. In this typical deep learning example, `import keras` alone takes a long time:
```python
import time

# Time how long just importing keras takes inside the function instance.
start = time.time()
from keras.models import model_from_json
print("import keras time = ", time.time() - start)
```
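Because importing keras and loading the model are so slow, the usual pattern is to do that work once at module scope, so every request handled by an already-warm (or reserved) instance reuses the loaded model and only a cold start pays the price. Below is a minimal sketch of the pattern, with a stand-in `load_model` (a simulated delay, not the project's real loading code) so it stays self-contained:

```python
import time

def load_model():
    # Stand-in for the expensive part (import keras, model_from_json,
    # loading weights); the sleep simulates the one-off cold-start cost.
    time.sleep(0.1)
    return lambda image_bytes: "cat"

# Module scope: runs once per container instance, not once per request.
MODEL = load_model()

def handler(environ, start_response):
    # Warm requests reuse MODEL; only a cold start re-runs load_model().
    label = MODEL(b"")
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [label.encode("utf-8")]
```

Reserved instances keep this module-level state loaded permanently, which is what removes the spikes discussed below.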
3.1 Setting a reservation in Function Compute
Short video tutorial on reservation operations
- On the FC console, publish a version, create an alias prod from that version, and configure the reservation against the alias prod. For details, see help.aliyun.com/document_de…
- Configure the HTTP trigger and the custom domain name settings against the prod alias of the function
Load-test results:
As the figure above shows, incoming function invocations are preferentially scheduled onto the reserved instances; these are already warm, so there is no cold start and the requests show no latency spikes. As the test pressure grows (peak TPS reaches 1184), the reserved instances can no longer satisfy all invocations, so Function Compute automatically scales out on-demand instances to execute the function, and those invocations do incur cold starts; for a latency-sensitive web API, that cold-start latency is still unacceptable.
Conclusion
- Function Compute scales out automatically and quickly
- Reserved mode eliminates cold-start latency spikes
- Fun is an easy-to-use one-click deployment tool that lets you focus on your code logic
- Function Compute provides good monitoring facilities, letting you visualize your function's performance: execution time, memory usage, and so on
If you have any questions, please contact us
"Alibaba Cloud Native focuses on technical fields such as microservices, Serverless, containers, and Service Mesh, following cloud-native technology trends and large-scale cloud-native adoption in practice, aiming to be the technical community that best understands cloud-native developers."