Introduction: This article is a best-practice guide to the lossless offline and service warm-up capabilities provided by Alibaba Cloud Microservices Engine (MSE) during application release.
The demo assumes an application architecture consisting of a Zuul gateway in front of back-end Spring Cloud microservice instances. The back-end call chain is shopping-cart application A -> transaction-center application B -> inventory-center application C, and the services register with and discover each other through a Nacos registry.
Prerequisites
Enable MSE Microservice Governance:
- A Kubernetes cluster has been created. For details, see Create a Managed Kubernetes Cluster.
- MSE Microservice Governance Professional Edition has been activated. For details, see Enable MSE Microservice Governance.
Background information
Application systems with a large user base and high concurrency usually choose to release in the middle of the night, when traffic is low, to avoid losing traffic during the release. This works, but the release problems that crop up at that hour leave the R&D and operations staff behind it anxious and exhausted. To address this, Alibaba Cloud MSE provides a full set of lossless release capabilities for the application lifecycle: when an instance goes offline it actively deregisters itself in real time, when it comes online its readiness checks and lifecycle are aligned, and microservice traffic to new instances is warmed up gradually. With these capabilities, R&D and operations staff can release applications even during the day and stay calm while doing it.
Preparations
Note that the Agent used in this practice is still in gray release, so the Agent must be upgraded to the gray-release version. The upgrade document is at help.aliyun.com/document_d…
To deploy the application in other regions (only regions in mainland China are supported), use the corresponding Agent download address: http://arms-apm-cn-[regionId].oss-cn-[regionId].aliyuncs.com/2.7.1.3-mse-beta/, replacing [regionId] with the Alibaba Cloud region ID.
For example, the Agent address for the Beijing region is: arms-apm-cn-beijing.oss-cn-beijing.aliyuncs.com/2.7.1.3-mse…
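As a quick sanity check, the per-region address can be assembled on the command line; this is only a minimal sketch, and the region value shown is just an example:

# Replace REGION_ID with your own region, e.g. beijing, shanghai, hangzhou.
REGION_ID=beijing
AGENT_URL="http://arms-apm-cn-${REGION_ID}.oss-cn-${REGION_ID}.aliyuncs.com/2.7.1.3-mse-beta/"
echo "${AGENT_URL}"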
Application deployment and traffic architecture diagram
Traffic pressure source
In the spring-cloud-zuul application, each pod continuously sends 10 concurrent HTTP requests to /A/a on the local Zuul port (127.0.0.1:20000). The concurrency can be configured through the environment variable demo.qps.
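The load generator is built into the demo image, so nothing extra needs to be deployed. If you want to fire a few requests by hand from inside a spring-cloud-zuul pod to see the call chain, a rough manual equivalent is sketched below (assuming wget is available in the image, as the preStop hook in the manifest suggests):

# Send a handful of requests to the local Zuul port by hand;
# the demo already generates this traffic automatically.
kubectl exec deploy/spring-cloud-zuul -- sh -c \
  'for i in $(seq 1 10); do wget -q -O- http://127.0.0.1:20000/A/a; echo; done'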
Deploy the Demo application
Save the following manifest to a file, for example mse-demo.yaml, and run kubectl apply -f mse-demo.yaml to deploy the application to the Kubernetes cluster created earlier (the deployment and verification commands are shown after the manifest). Note that the demo contains CronHPA tasks, so first install the ack-kubernetes-cronhpa-controller component in the test cluster from Container Service for Kubernetes -> Marketplace -> App Catalog. Applications A, B, and C each deploy a baseline version and a gray version. Application B disables the lossless offline capability in the baseline version and enables it in the gray version. Application C enables the service warm-up capability with a warm-up duration of 120 seconds.
# Nacos Server
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nacos-server
  name: nacos-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nacos-server
  template:
    metadata:
      labels:
        app: nacos-server
    spec:
      containers:
        - env:
            - name: MODE
              value: standalone
          image: registry.cn-shanghai.aliyuncs.com/yizhan/nacos-server:latest
          imagePullPolicy: Always
          name: nacos-server
          resources:
            requests:
              cpu: 250m
              memory: 512Mi
      dnsPolicy: ClusterFirst
      restartPolicy: Always
# Nacos Server Service configuration
---
apiVersion: v1
kind: Service
metadata:
  name: nacos-server
spec:
  ports:
    - port: 8848
      protocol: TCP
      targetPort: 8848
  selector:
    app: nacos-server
  type: ClusterIP
# Entry Zuul gateway application
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spring-cloud-zuul
spec:
  replicas: 1
  selector:
    matchLabels:
      app: spring-cloud-zuul
  template:
    metadata:
      annotations:
        msePilotCreateAppName: spring-cloud-zuul
      labels:
        app: spring-cloud-zuul
    spec:
      containers:
        - env:
            - name: JAVA_HOME
              value: /usr/lib/jvm/java-1.8-openjdk/jre
            - name: LANG
              value: C.UTF-8
          image: registry.cn-shanghai.aliyuncs.com/yizhan/spring-cloud-zuul:1.0.1
          imagePullPolicy: Always
          name: spring-cloud-zuul
          ports:
            - containerPort: 20000
# Application A, base version: enable full-link tag pass-through by machine dimension
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: spring-cloud-a
  name: spring-cloud-a
spec:
  replicas: 2
  selector:
    matchLabels:
      app: spring-cloud-a
  template:
    metadata:
      annotations:
        msePilotCreateAppName: spring-cloud-a
        msePilotAutoEnable: "on"
      labels:
        app: spring-cloud-a
    spec:
      containers:
        - env:
            - name: LANG
              value: C.UTF-8
            - name: JAVA_HOME
              value: /usr/lib/jvm/java-1.8-openjdk/jre
            - name: profiler.micro.service.tag.trace.enable
              value: "true"
          image: registry.cn-shanghai.aliyuncs.com/yizhan/spring-cloud-a:0.1-SNAPSHOT
          imagePullPolicy: Always
          name: spring-cloud-a
          ports:
            - containerPort: 20001
              protocol: TCP
          resources:
            requests:
              cpu: 250m
              memory: 512Mi
          livenessProbe:
            tcpSocket:
              port: 20001
            initialDelaySeconds: 10
            periodSeconds: 30
# Application A, gray version: enable full-link tag pass-through by machine dimension
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: spring-cloud-a-gray
  name: spring-cloud-a-gray
spec:
  replicas: 2
  selector:
    matchLabels:
      app: spring-cloud-a-gray
  strategy:
  template:
    metadata:
      annotations:
        alicloud.service.tag: gray
        msePilotCreateAppName: spring-cloud-a
        msePilotAutoEnable: "on"
      labels:
        app: spring-cloud-a-gray
    spec:
      containers:
        - env:
            - name: LANG
              value: C.UTF-8
            - name: JAVA_HOME
              value: /usr/lib/jvm/java-1.8-openjdk/jre
            - name: profiler.micro.service.tag.trace.enable
              value: "true"
          image: registry.cn-shanghai.aliyuncs.com/yizhan/spring-cloud-a:0.1-SNAPSHOT
          imagePullPolicy: Always
          name: spring-cloud-a-gray
          ports:
            - containerPort: 20001
              protocol: TCP
          resources:
            requests:
              cpu: 250m
              memory: 512Mi
          livenessProbe:
            tcpSocket:
              port: 20001
            initialDelaySeconds: 10
            periodSeconds: 30
# Application B, base version: disable the lossless offline capability
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: spring-cloud-b
  name: spring-cloud-b
spec:
  replicas: 2
  selector:
    matchLabels:
      app: spring-cloud-b
  strategy:
  template:
    metadata:
      annotations:
        msePilotCreateAppName: spring-cloud-b
        msePilotAutoEnable: "on"
      labels:
        app: spring-cloud-b
    spec:
      containers:
        - env:
            - name: LANG
              value: C.UTF-8
            - name: JAVA_HOME
              value: /usr/lib/jvm/java-1.8-openjdk/jre
            - name: micro.service.shutdown.server.enable
              value: "false"
            - name: profiler.micro.service.http.server.enable
              value: "false"
          image: registry.cn-shanghai.aliyuncs.com/yizhan/spring-cloud-b:0.1-SNAPSHOT
          imagePullPolicy: Always
          name: spring-cloud-b
          ports:
            - containerPort: 8080
              protocol: TCP
          resources:
            requests:
              cpu: 250m
              memory: 512Mi
          livenessProbe:
            tcpSocket:
              port: 20002
            initialDelaySeconds: 10
            periodSeconds: 30
# Application B, gray version: the lossless offline capability is enabled by default
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: spring-cloud-b-gray
  name: spring-cloud-b-gray
spec:
  replicas: 2
  selector:
    matchLabels:
      app: spring-cloud-b-gray
  template:
    metadata:
      annotations:
        alicloud.service.tag: gray
        msePilotCreateAppName: spring-cloud-b
        msePilotAutoEnable: "on"
      labels:
        app: spring-cloud-b-gray
    spec:
      containers:
        - env:
            - name: LANG
              value: C.UTF-8
            - name: JAVA_HOME
              value: /usr/lib/jvm/java-1.8-openjdk/jre
          image: registry.cn-shanghai.aliyuncs.com/yizhan/spring-cloud-b:0.1-SNAPSHOT
          imagePullPolicy: Always
          name: spring-cloud-b-gray
          ports:
            - containerPort: 8080
              protocol: TCP
          resources:
            requests:
              cpu: 250m
              memory: 512Mi
          lifecycle:
            preStop:
              exec:
                command:
                  - /bin/sh
                  - '-c'
                  - >-
                    wget http://127.0.0.1:54199/offline 2>/tmp/null; sleep
                    30; exit 0
          livenessProbe:
            tcpSocket:
              port: 20002
            initialDelaySeconds: 10
            periodSeconds: 30
# Application C, base version
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: spring-cloud-c
  name: spring-cloud-c
spec:
  replicas: 2
  selector:
    matchLabels:
      app: spring-cloud-c
  template:
    metadata:
      annotations:
        msePilotCreateAppName: spring-cloud-c
        msePilotAutoEnable: "on"
      labels:
        app: spring-cloud-c
    spec:
      containers:
        - env:
            - name: LANG
              value: C.UTF-8
            - name: JAVA_HOME
              value: /usr/lib/jvm/java-1.8-openjdk/jre
          image: registry.cn-shanghai.aliyuncs.com/yizhan/spring-cloud-c:0.1-SNAPSHOT
          imagePullPolicy: Always
          name: spring-cloud-c
          ports:
            - containerPort: 8080
              protocol: TCP
          resources:
            requests:
              cpu: 250m
              memory: 512Mi
          livenessProbe:
            tcpSocket:
              port: 20003
            initialDelaySeconds: 10
            periodSeconds: 30
# HPA configuration
---
apiVersion: autoscaling.alibabacloud.com/v1beta1
kind: CronHorizontalPodAutoscaler
metadata:
  labels:
    controller-tools.k8s.io: "1.0"
  name: spring-cloud-b
spec:
  scaleTargetRef:
    apiVersion: apps/v1beta2
    kind: Deployment
    name: spring-cloud-b
  jobs:
    - name: "scale-down"
      schedule: "0 0/5 * * * *"
      targetSize: 1
    - name: "scale-up"
      schedule: "10 0/5 * * * *"
      targetSize: 2
---
apiVersion: autoscaling.alibabacloud.com/v1beta1
kind: CronHorizontalPodAutoscaler
metadata:
  labels:
    controller-tools.k8s.io: "1.0"
  name: spring-cloud-b-gray
spec:
  scaleTargetRef:
    apiVersion: apps/v1beta2
    kind: Deployment
    name: spring-cloud-b-gray
  jobs:
    - name: "scale-down"
      schedule: "0 0/5 * * * *"
      targetSize: 1
    - name: "scale-up"
      schedule: "10 0/5 * * * *"
      targetSize: 2
---
apiVersion: autoscaling.alibabacloud.com/v1beta1
kind: CronHorizontalPodAutoscaler
metadata:
  labels:
    controller-tools.k8s.io: "1.0"
  name: spring-cloud-c
spec:
  scaleTargetRef:
    apiVersion: apps/v1beta2
    kind: Deployment
    name: spring-cloud-c
  jobs:
    - name: "scale-down"
      schedule: "0 2/5 * * * *"
      targetSize: 1
    - name: "scale-up"
      schedule: "10 2/5 * * * *"
      targetSize: 2
# Zuul gateway SLB Service, exposing the demo page
---
apiVersion: v1
kind: Service
metadata:
  name: zuul-slb
spec:
  ports:
    - port: 80
      protocol: TCP
      targetPort: 20000
  selector:
    app: spring-cloud-zuul
  type: ClusterIP
# Kubernetes Services exposed by application A
---
apiVersion: v1
kind: Service
metadata:
  name: spring-cloud-a-base
spec:
  ports:
    - name: http
      port: 20001
      protocol: TCP
      targetPort: 20001
  selector:
    app: spring-cloud-a
---
apiVersion: v1
kind: Service
metadata:
  name: spring-cloud-a-gray
spec:
  ports:
    - name: http
      port: 20001
      protocol: TCP
      targetPort: 20001
  selector:
    app: spring-cloud-a-gray
# Nacos Server SLB Service
---
apiVersion: v1
kind: Service
metadata:
  name: nacos-slb
spec:
  ports:
    - port: 8848
      protocol: TCP
      targetPort: 8848
  selector:
    app: nacos-server
  type: LoadBalancer
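Once the manifest above is saved as mse-demo.yaml, deploying it and confirming that everything came up is standard kubectl usage; a minimal sketch:

# Deploy the demo and check that the Deployments, pods, and Services are ready.
kubectl apply -f mse-demo.yaml
kubectl get deployments
kubectl get pods
kubectl get service zuul-slb nacos-slb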
Result verification 1: lossless offline capability
Since scheduled HPA is enabled for both the spring-cloud-b and spring-cloud-b-gray applications, the simulation scales them in and out every 5 minutes.
Log in to the MSE console and go to Microservice Governance Center -> Application List -> spring-cloud-a -> Application Details. The application monitoring curves show the traffic of the spring-cloud-a application:
For the gray traffic, the number of request errors stays at 0 while the pods scale in and out, so no traffic is lost. For the untagged (baseline) version, because the lossless offline capability is disabled, 20 requests sent from spring-cloud-a to spring-cloud-b fail during pod scale-in and scale-out, so request traffic is lost.
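To watch the simulated scale-in and scale-out from the cluster side while the MSE console shows the traffic curves, plain kubectl is enough. The CronHPA resource name below assumes the ack-kubernetes-cronhpa-controller registers it as cronhorizontalpodautoscalers (often shortened to cronhpa); the exact name may vary with the component version:

# Watch the baseline and gray pods of application B scale every 5 minutes.
kubectl get pods -l 'app in (spring-cloud-b, spring-cloud-b-gray)' -w
# Inspect the CronHPA schedules and their recent scaling events.
kubectl describe cronhorizontalpodautoscalers spring-cloud-b spring-cloud-b-gray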
Result verification 2: service warm-up capability
For the spring-cloud-c application, scheduled HPA is also enabled to simulate the application startup process, scaling every 5 minutes: it scales in to 1 node at minute 2, second 0 of each cycle and scales back out to 2 nodes at minute 2, second 10.
Enable the service warm-up function for spring-cloud-b, the consumer side of the warmed-up service.
Enable the service warm-up function for spring-cloud-c, the provider side of the warmed-up service, and set the warm-up duration to 120 seconds.
The traffic on the newly started node increases slowly, and the node's warm-up start and end times, as well as the related events, are also visible.
As the figure above shows, once the warm-up function is enabled, the traffic sent to a restarted application instance ramps up gradually over time. For applications that start slowly because they need to build connection pools, caches, and other resources during startup, enabling service warm-up lets the instance finish preparing these resources before it takes full traffic, protecting the application throughout the startup phase.
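As a rough mental model, assuming a simple linear ramp (an illustration only; the actual curve MSE applies may differ), a provider node that registered t seconds ago within the 120-second warm-up window would receive about min(t / 120, 1) of its normal share of traffic: roughly 25% at 30 seconds, 50% at 60 seconds, and full traffic once the 120 seconds are up.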
Original link
This article is original content from Alibaba Cloud and may not be reproduced without permission.