Over the past two weeks we have optimized our continuous deployment pipeline and achieved remarkable results; this post records the process and shares it.
Background
Back then the company was growing rapidly and launching new projects frequently. Every launch meant applying for a new batch of machines, initializing them, and deploying the dependent service environment.
Projects were in full swing: project A's traffic surged and its machines had to be scaled out immediately, while project B shipped new features that needed rolling out just as fast.
Working overtime day and night, I was run off my feet.
That was when I learned Docker could save me, so I decided to fight for glory (and my hairline).
To land it quickly and minimize the impact on the overall CI/CD process, Docker was added to our release pipeline with as few changes as possible. The process change is shown in the figure below.
At the time, container orchestration was still in chaos and K8S was not yet popular. Given our limited time, energy, and technical strength, we did not dare rush orchestration into production. We simply ran Docker on the existing hosts, mainly to solve two problems: environment deployment and scaling out and in. After launch, Docker did solve both, with extra perks such as keeping the development and online environments consistent.
But Docker is not free of cost. Packaging the code into an image and updating containers lengthened our release time, and because configuration files differ across environments, we failed to achieve the goal of building one image and sharing it everywhere. This article describes how we optimized these two problems.
Python multithreading
Analyzing the deployment logs showed that most of the time in a full deployment went to downloading images and restarting containers.
The deployer is written in Python; its core idea is to use the Paramiko module to execute SSH commands remotely. Before Docker, a release meant syncing code with rsync and doing a single-threaded rolling restart of the service. After Docker landed, the deployer's logic barely changed: instead of syncing code and restarting the service, it now downloads the image and restarts the container. The code looks like this:
```python
import os

import paramiko

# paramiko.util.log_to_file("/tmp/paramiko.log")

filepath = os.path.split(os.path.realpath(__file__))[0]


class Conn:
    def __init__(self, ip, port=22, username='ops'):
        self.ip = ip
        self.port = int(port)
        self.username = username
        self.pkey = paramiko.RSAKey.from_private_key_file(
            filepath + '/ssh_private.key'
        )

    def cmd(self, cmd):
        ssh = paramiko.SSHClient()
        try:
            ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
            ssh.connect(self.ip, self.port, self.username, pkey=self.pkey, timeout=5)
        except Exception as err:
            data = {"state": 0, "message": str(err)}
        else:
            try:
                stdin, stdout, stderr = ssh.exec_command(cmd, timeout=180)
                _err_list = stderr.readlines()
                if len(_err_list) > 0:
                    data = {"state": 0, "message": _err_list}
                else:
                    data = {"state": 1, "message": stdout.readlines()}
            except Exception as err:
                data = {"state": 0, "message": '%s: %s' % (self.ip, str(err))}
            finally:
                ssh.close()
        return data


if __name__ == '__main__':
    # Demo code is much simplified; the overall logic remains the same
    hostlist = ['10.82.9.47', '10.82.9.48']
    image_url = 'ops-coffee:latest'
    for i in hostlist:
        print(Conn(i).cmd('docker pull %s' % image_url))
        # Update the container after the image has been downloaded
```
Everything ran single-threaded. You might think the efficiency must be low, so why not use multithreading? The main reason was service availability: updating one server at a time and only moving on when it finishes, a single-threaded rolling update keeps the service available to the greatest extent. If all servers were updated simultaneously, none of them could serve traffic during the restart and the system would risk an outage. Projects were small at the time, so we tolerated the extra time; as they grew larger and larger, this trade-off had to be rethought.
Introducing multithreading was imperative, but how? With overall service availability in mind, we split the work into two operations: downloading the image and restarting the container. Downloading an image does not affect serving traffic, so it can be fully multithreaded, greatly shortening download time. The optimized code is as follows:
```python
import threading

from conn import Conn  # the Conn class from the previous example


class DownloadThread(threading.Thread):
    def __init__(self, host, image_url):
        threading.Thread.__init__(self)
        self.host = host
        self.image_url = image_url
        self.alive_host = None

    def run(self):
        Conn(self.host).cmd('docker login -u ops -p coffee hub.ops-coffee.cn')
        r2 = Conn(self.host).cmd('docker pull %s' % self.image_url)
        if r2.get('state'):
            self.alive_host = self.host
            print('---->%s image download completed' % self.host)
        else:
            print('---->%s image download failed, details: %s' % (self.host, r2.get('message')))

    def get_result(self):
        return self.alive_host


if __name__ == '__main__':
    # Demo code is much simplified; the overall logic remains the same
    hostlist = ['10.82.9.47', '10.82.9.48']
    image_url = 'ops-coffee:latest'

    threads = []
    for host in hostlist:
        t = DownloadThread(host, image_url)
        threads.append(t)
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Keep only the hosts whose download succeeded
    alive_host = [t.get_result() for t in threads if t.get_result()]

    # Multithreaded image download finished
    print('----> this project has %d hosts in total, %d downloaded the image successfully' % (len(hostlist), len(alive_host)))
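As an aside, the same per-host fan-out could also be written with the standard library's `concurrent.futures` instead of hand-rolled `Thread` subclasses. A minimal sketch, with a hypothetical `pull_image` stand-in replacing the real `Conn(host).cmd('docker pull ...')` call:

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for Conn(host).cmd('docker pull ...') -- assumed to return the
# same {"state": ..., "message": ...} dict the deployer's Conn.cmd returns
def pull_image(host):
    return {"state": 1, "message": [host]}

hostlist = ['10.82.9.47', '10.82.9.48']

# One worker per host, equivalent to starting one DownloadThread per host
with ThreadPoolExecutor(max_workers=len(hostlist)) as pool:
    results = dict(zip(hostlist, pool.map(pull_image, hostlist)))

# Keep only the hosts whose pull reported success
alive_host = [h for h in hostlist if results[h].get('state')]
print(alive_host)  # -> ['10.82.9.47', '10.82.9.48']
```

The executor handles thread start, join, and result collection for you, which removes the boilerplate `start()`/`join()` loops.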
Restarting containers cannot simply be multithreaded across all hosts at once; as mentioned above, restarting everything simultaneously risks taking the service down. All online servers are redundant, so while we cannot restart them all at once, can we restart them in batches? After analyzing the traffic, we settled on a rule: if a project has no more than 8 hosts, do a single-threaded rolling restart, since that does not take too long; if it has more than 8, use the number of hosts divided by 8, rounded up, as the number of concurrent restart threads. This way the vast majority of a project's hosts (about 80% or more) are serving traffic at all times, reducing the risk of the service becoming unavailable. The optimized code is as follows:
```python
import threading
from math import ceil

from conn import Conn  # the Conn class from the previous example


class DeployThread(threading.Thread):
    def __init__(self, thread_max_num, host, project_name, environment_name, image_url):
        threading.Thread.__init__(self)
        self.thread_max_num = thread_max_num
        self.host = host
        self.project_name = project_name
        self.environment_name = environment_name
        self.image_url = image_url
        self.smile_host = []

    def run(self):
        # The semaphore caps how many hosts restart concurrently
        with self.thread_max_num:
            Conn(self.host).cmd('docker stop %s && docker rm %s' % (self.project_name, self.project_name))
            r5 = Conn(self.host).cmd(
                'docker run -d --env ENVT=%s --env PROJ=%s --restart=always --name=%s -p 80:80 %s' % (
                    self.environment_name, self.project_name, self.project_name, self.image_url)
            )
            if r5.get('state'):
                self.smile_host.append(self.host)
                print('---->%s image update completed' % self.host)
            else:
                print('---->%s server failed to execute docker run, details: %s' % (self.host, r5.get('message')))
            # Checking container status after restart, and rolling back on failure, is omitted

    def get_result(self):
        return self.smile_host


if __name__ == '__main__':
    # Demo code is much simplified; the overall logic remains the same
    alive_host = ['10.82.9.47', '10.82.9.48']
    image_url = 'ops-coffee:latest'
    project_name = 'coffee'
    environment_name = 'prod'

    # len(alive_host) / 8, rounded up, is the maximum number of concurrent threads
    thread_max_num = threading.Semaphore(ceil(len(alive_host) / 8))

    threads = []
    for host in alive_host:
        t = DeployThread(thread_max_num, host, project_name, environment_name, image_url)
        threads.append(t)
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    smile_host = []
    for t in threads:
        smile_host.extend(t.get_result())

    print('----> %d hosts updated successfully' % len(smile_host))
```
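The batching rule described above can be sanity-checked in isolation. A small sketch of the arithmetic (the helper name is mine, not the deployer's):

```python
from math import ceil

def max_concurrent_restarts(host_count):
    # No more than 8 hosts: single-threaded rolling restart;
    # otherwise ceil(hosts / 8) containers may restart at once
    if host_count <= 8:
        return 1
    return ceil(host_count / 8)

for n in (4, 8, 9, 28, 100):
    c = max_concurrent_restarts(n)
    print('%3d hosts -> %2d concurrent restarts, %.0f%% still serving' % (n, c, 100 * (n - c) / n))
```

For a 28-host project this gives 4 concurrent restarts, so at least 24 of 28 hosts keep serving at any moment.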
After the above optimizations, a project with 28 hosts that previously took about 10 minutes to release now takes only about 2 minutes, an efficiency improvement of roughly 80%.
Configuration file processing in multiple environments
We adopted the scheme of packaging project code into an image, but the configuration files for the develop, test, staging, and production environments differ. So even the same project had to go through a separate build-and-release process for each environment, baking each environment's configuration into a different image. This was cumbersome and unnecessary, and it also significantly increased our release time.
One option is to define a configuration mount for each container, attached automatically at startup, so that a single image works across environments. But who puts the right configuration file on each host in the first place? A configuration center is essential; the earlier article "Details of landing configuration center for small and medium-sized teams" introduced our configuration center scheme in detail.
The overall idea for handling different configurations is to pass two environment variables, ENVT and PROJ, when the container starts; together they identify which project and which environment the container belongs to. The container's startup script then uses the confd service to fetch the corresponding configuration from the configuration center automatically and write it to the right local location, so configuration files no longer need to be packaged into the image.
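The key-resolution step can be sketched on its own. This is an illustrative toy, not the real startup path (the actual work is delegated to confd watching etcd): it shows how ENVT and PROJ turn a generic config-center key into a project- and environment-specific one.

```python
import os

# Hypothetical helper: rewrite the placeholder key segment /project/env/
# using the PROJ and ENVT environment variables, the same substitution
# the startup script performs with sed
def config_key(template='/project/env/nginx.conf'):
    proj = os.environ.get('PROJ', 'project')
    envt = os.environ.get('ENVT', 'env')
    return template.replace('/project/env/', '/%s/%s/' % (proj, envt))

os.environ['PROJ'] = 'coffee'
os.environ['ENVT'] = 'prod'
print(config_key())  # -> /coffee/prod/nginx.conf
```

With PROJ=coffee and ENVT=prod, every container of the same image resolves its own configuration path.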
Take a purely static project that needs only the nginx service as an example.
Dockerfile is as follows:
```dockerfile
FROM nginx:base

COPY conf/run.sh /run.sh
COPY webapp /home/project/webapp

CMD ["/run.sh"]
```
The run.sh script is as follows:
```bash
#!/bin/bash

/etc/init.d/nginx start && \
sed -i "s|/project/env/|/${PROJ}/${ENVT}/|g" /etc/confd/conf.d/conf.toml && \
sed -i "s|/project/env/|/${PROJ}/${ENVT}/|g" /etc/confd/templates/conf.tmpl && \
confd -watch -backend etcd -node=http://192.168.107.101:2379 -node=http://192.168.107.102:2379 || exit 1
```
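For context, the files run.sh rewrites are a confd template-resource config plus its template. A hypothetical pair of the kind involved might look like this (the key path and the nginx destination are illustrative assumptions, not our exact files):

```toml
# /etc/confd/conf.d/conf.toml (before run.sh substitutes /project/env/)
[template]
src        = "conf.tmpl"
dest       = "/etc/nginx/conf.d/project.conf"
keys       = ["/project/env/nginx"]
reload_cmd = "/etc/init.d/nginx reload"
```

confd watches the listed keys in etcd, renders `conf.tmpl` to `dest` whenever they change, and runs `reload_cmd` to pick up the new configuration.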
Docker startup command:
```python
'docker run -d --env ENVT=%s --env PROJ=%s --restart=always --name=%s -p 80:80 %s' % (
    self.environment_name, self.project_name, self.project_name, self.image_url)
```
An image is now built once and shared across all environments. A release no longer needs to go through compiling and packaging again; it only needs to update the image and restart the container, which significantly improves efficiency.
In closing
- A container without orchestration has no soul; continuing to advance our use of orchestration tools will be a major focus in 2019
- In fact, after the Docker rollout stabilized, we deployed a K8S cluster for the internal development and test environments, and it has now been running stably for more than a year
- Online projects run in a multi-cloud environment; some already use K8S-based container orchestration, while others still use the pure Docker setup introduced above
If you find this article helpful, please share it with more people. If you want to read more, try the following:
- Details of landing configuration center for small and medium-sized teams
- Varian: Elegant release deployer