1. What is Supervisor
Supervisor is a process monitoring tool for Linux/Unix systems. It is a general-purpose process manager written in Python. It can manage and monitor processes on Linux, turn an ordinary command-line process into a background daemon, monitor process status, and automatically restart a process that exits abnormally. Like daemontools, however, it cannot monitor processes that daemonize themselves.
See the official Supervisor website for more details.
2. Why use Supervisor
- Simple. Supervisor provides a unified way to start, stop, and monitor your processes, individually or as a group, and it can be controlled locally or remotely from the command line or the web interface. On Linux, many programs need to run continuously. The usual approach is to write a script for each one that implements start/stop/restart/reload and put it under /etc/init.d/. This has several downsides. First, a similar script has to be written for every program. Second, Linux does not automatically restart a process when it dies, so a separate monitor-and-restart script is needed as well. Supervisor solves both problems. It starts each managed process as its own child process through fork/exec, so all we need to do is write the path of the program's executable into the Supervisor configuration file. Because the managed process is a child of supervisord, the parent is reliably notified when the child dies and can restart it automatically. Whether it actually restarts the child depends on whether autorestart=true is set in the configuration file. Supervisor is also easy to learn: it is configured through an INI-format file and offers many per-process options, which makes it easy to restart processes or rotate logs automatically.
- Centralized. Supervisor manages processes and process groups centrally in INI files, and it can be administered locally or remotely. It also provides a web interface for monitoring and managing processes. Local, remote, and web management all go through Supervisor's XML-RPC interface, which is covered later. Supervisor can also manage process groups: the processes to be managed are placed in a group, and the group is then started, stopped, or restarted as a single object. Linux itself has no such facility; to stop several processes you either stop them one by one or write a script to stop them in batches.
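The restart mechanism described above (start the program as a child, get notified when it exits, start it again) can be sketched in a few lines of Python. This only illustrates the idea and is not Supervisor's actual implementation; the function name run_supervised and the max_restarts parameter are hypothetical.

```python
import subprocess
import sys


def run_supervised(cmd, max_restarts=3):
    """Run cmd as a child process and restart it whenever it exits abnormally.

    Returns the list of exit codes observed, one per run.
    """
    exit_codes = []
    for _ in range(max_restarts + 1):
        # On Linux, Popen forks and execs the command, so it becomes our child.
        child = subprocess.Popen(cmd)
        # wait() returns once the child has exited and gives us its exit code.
        code = child.wait()
        exit_codes.append(code)
        if code == 0:
            break  # normal exit: nothing to restart
    return exit_codes


# A child that always fails is started once and then restarted twice.
failing = [sys.executable, "-c", "import sys; sys.exit(1)"]
print(run_supervised(failing, max_restarts=2))  # prints [1, 1, 1]
```

Because the parent waits on its own child, it learns the exit code directly, which is the property the paragraph above relies on.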
3. Supervisor components
- supervisord: the server process. It creates the specified number of subprocesses for each application according to the configuration file, manages the entire life cycle of those subprocesses, restarts crashed processes, and sends event notifications when process state changes. It also has a built-in web server and XML-RPC interface, which make process management easy. The server's configuration file is /etc/supervisor/supervisord.conf.
- supervisorctl: the command-line client. It provides a shell-like interface that can connect to different supervisord processes and manage their subprocesses, communicating with the server over a UNIX socket or TCP. From the command line the user can view process status, load configuration files, start and stop processes, view a process's standard output and error output, and operate supervisord remotely. The server can also require the client to authenticate before it proceeds.
- Web Server: Supervisor provides a web server through which processes can be controlled from a browser (enabled through the [inet_http_server] configuration section).
- XML-RPC Interface: served by the same HTTP server that provides the web UI, and used to control Supervisor and the programs it runs.
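As a rough illustration of the XML-RPC interface, the sketch below uses Python's standard xmlrpc.client to ask supervisord for its process list. The host, port 9001, and the user/123 credentials are assumptions taken from the configuration used later in this article, and the helper names rpc_url and list_processes are hypothetical.

```python
from urllib.parse import quote
from xmlrpc.client import ServerProxy


def rpc_url(host, port, username=None, password=None):
    """Build the URL of supervisord's XML-RPC endpoint (/RPC2)."""
    auth = ""
    if username is not None:
        # Credentials are URL-quoted and sent as HTTP basic authentication.
        auth = "%s:%s@" % (quote(username, safe=""), quote(password or "", safe=""))
    return "http://%s%s:%d/RPC2" % (auth, host, port)


def list_processes(url):
    """Return (name, state) pairs for every process supervisord manages."""
    server = ServerProxy(url)
    # supervisor.getAllProcessInfo is part of Supervisor's XML-RPC API.
    return [(p["name"], p["statename"]) for p in server.supervisor.getAllProcessInfo()]


print(rpc_url("127.0.0.1", 9001, "user", "123"))
# prints http://user:123@127.0.0.1:9001/RPC2
```

With a running supervisord whose [inet_http_server] is enabled, list_processes(rpc_url("127.0.0.1", 9001, "user", "123")) would return the managed processes and their states.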
4. Installation, configuration, and use
Supervisor is written in Python and can be installed with easy_install or pip. When I used pip on my CentOS machine, I ran into trouble when generating the configuration file, so I installed it with easy_install supervisor instead. You can also download the source code and install from it, for example:
wget pypi.python.org/packages/so… --no-check-certificate
tar zxvf supervisor-3.1.3.tar.gz
cd supervisor-3.1.3
sudo python setup.py install
After installation, run supervisord directly to verify that it installed successfully. If errors are reported, resolve them one by one. For example, a meld3 version problem may be reported. Here are the installation steps:
wget http://effbot.org/media/downloads/elementtree-1.2.7-20070827-preview.zip
unzip elementtree-1.2.7-20070827-preview.zip && cd elementtree-1.2.7-20070827-preview
python setup.py install
Or download this version:
wget http://www.plope.com/software/meld3/meld3-0.6.5.tar.gz
tar -xf meld3-0.6.5.tar.gz && cd meld3-0.6.5
python setup.py install
If the installation is successful, you can proceed to the next step: setting up the configuration file.
Generate the configuration file and place it in the /etc directory
echo_supervisord_conf > /etc/supervisord.conf
To avoid putting every application's settings into one file, create a new directory and give each application its own configuration file, isolated from the others:
mkdir /etc/supervisord.d/
### Modify the configuration file
vim /etc/supervisord.conf
Add the following configuration information
[include]
files = /etc/supervisord.d/*.conf
To view and manage processes through the web interface, also enable the [inet_http_server] section:
[inet_http_server]
port=9001
username=user
password=123
Start supervisord:
# supervisord -c /etc/supervisord.conf
Check that it is listening on port 9001:
# lsof -i:9001
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
superviso 14685 root 4u IPv4 20155719 0t0 TCP *:etlservicemgr (LISTEN)
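The same check can be done from Python when lsof is not available. This is a minimal sketch that simply tries to open a TCP connection; the function name is_listening is hypothetical.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0
```

For the setup above, is_listening("127.0.0.1", 9001) should return True once supervisord is running.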
You can now open the Supervisor web interface at http://ip:9001/ (the user name and password configured above are user and 123). No programs have been added for monitoring yet.
Let’s write a simple Python script to verify the Supervisor’s monitoring effect.
# cat /root/temp/test_http.py    ### contents of the test_http.py script
#!/usr/bin/env python
# coding=utf-8
import sys
import BaseHTTPServer
from SimpleHTTPServer import SimpleHTTPRequestHandler

HandlerClass = SimpleHTTPRequestHandler
ServerClass = BaseHTTPServer.HTTPServer
Protocol = "HTTP/1.0"

if __name__ == "__main__":
    # The port is the first command-line argument, defaulting to 10000.
    if sys.argv[1:]:
        port = int(sys.argv[1])
    else:
        port = 10000
    server_address = ('0.0.0.0', port)
    HandlerClass.protocol_version = Protocol
    httpd = ServerClass(server_address, HandlerClass)
    sa = httpd.socket.getsockname()
    print "Serving HTTP on", sa[0], "port", sa[1], "..."
    # Serve in the foreground so that Supervisor can monitor the process.
    httpd.serve_forever()
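The script above targets Python 2, where the BaseHTTPServer and SimpleHTTPServer modules exist. On Python 3 both were merged into http.server, so an equivalent might look like the sketch below. The split into make_server and main is my own choice for illustration, not part of the original script.

```python
import sys
from http.server import HTTPServer, SimpleHTTPRequestHandler


def make_server(port=10000, host="0.0.0.0"):
    """Create (but do not start) an HTTP server for the current directory."""
    SimpleHTTPRequestHandler.protocol_version = "HTTP/1.0"
    return HTTPServer((host, port), SimpleHTTPRequestHandler)


def main(argv):
    """Serve on the port given as the first argument, defaulting to 10000."""
    port = int(argv[1]) if argv[1:] else 10000
    httpd = make_server(port)
    host, bound_port = httpd.socket.getsockname()[:2]
    print("Serving HTTP on", host, "port", bound_port, "...")
    # serve_forever blocks in the foreground, which is what Supervisor needs.
    httpd.serve_forever()
```

To use it as a script, call main(sys.argv) at the bottom of the file and point the command= line of the program section at a Python 3 interpreter.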
Add a configuration file that the Supervisor can use to monitor the test_http.py program.
# cat /etc/supervisord.d/supervisor_test_http.conf    ### contents of the configuration file
[program:test_http]
command=python /root/temp/test_http.py 9999    ; command that starts the monitored process
directory=/root/temp           ; directory to cd into before running the command
priority=1                     ; start order: lower values start first and stop last
numprocs=1                     ; number of processes to start
autostart=true                 ; start automatically when supervisord starts
autorestart=true               ; restart automatically when the process exits
startretries=10                ; number of start attempts before giving up
exitcodes=0                    ; expected exit code (only consulted when autorestart=unexpected)
stopsignal=KILL                ; signal used to stop the process
stopwaitsecs=10                ; wait time before sending SIGKILL
redirect_stderr=true           ; redirect stderr to stdout
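To sanity-check a program section like this one before handing it to Supervisor, it can be read with Python's standard configparser. This is only a sketch: it uses configparser rather than Supervisor's own parser, the SAMPLE text is a shortened copy of the section above, and the function name load_programs is hypothetical.

```python
import configparser

SAMPLE = """
[program:test_http]
command=python /root/temp/test_http.py 9999 ; command to run
directory=/root/temp
autostart=true
autorestart=true
startretries=10
"""


def load_programs(text):
    """Map each program name to its options from Supervisor-style INI text."""
    # RawConfigParser leaves %(...)s expressions alone; ';' starts a comment.
    parser = configparser.RawConfigParser(inline_comment_prefixes=(";",))
    parser.read_string(text)
    programs = {}
    for section in parser.sections():
        if section.startswith("program:"):
            programs[section[len("program:"):]] = dict(parser.items(section))
    return programs


programs = load_programs(SAMPLE)
print(programs["test_http"]["command"])
# prints python /root/temp/test_http.py 9999
```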
Restart supervisord or reload the configuration file:
supervisorctl reload
#### or
supervisorctl -c /etc/supervisord.conf
If you open the Supervisor web page now, you will see that the test_http.py program is being monitored and has been started automatically.
You can also access HTTP services provided by the test_http.py program, such as http://ip:9999.
Note: Supervisor can only monitor programs that run in the foreground. A program that forks and daemonizes itself cannot be managed; if you try, supervisor> status reports: BACKOFF Exited too quickly (process log may have details). Services such as Apache and Tomcat start in daemon mode by default, so instead of running a script such as service httpd start directly, you need a wrapper start/stop script that keeps the process in the foreground. For example, for a script that starts and stops Tomcat under Supervisor, see “Controlling Tomcat with Supervisor” or “supervisor-tomcat.conf”.
Supervisor can also be started at boot. Linux executes the script /etc/rc.local during startup, so simply add the supervisord command there:
# On Ubuntu, add:
/usr/local/bin/supervisord -c /etc/supervisord.conf
# On CentOS, add:
/usr/bin/supervisord -c /etc/supervisord.conf
6. Managing Supervisor
Supervisor can be managed in two ways: with the supervisorctl command-line tool, or through the web interface.
### list the commands that supervisorctl supports
# supervisorctl help
default commands (type help <topic>):
=====================================
add exit open reload restart start tail
avail fg pid remove shutdown status update
clear maintail quit reread signal stop version
View a list of currently running processes
# supervisorctl status
test_http RUNNING pid 28087, uptime 0:05:17
The most commonly used commands are:
- update: applies new configuration to supervisord
- reload: reloads all configuration files and starts and manages all processes according to the new configuration (this restarts programs that were already running)
- start XXX: starts a process
- restart XXX: restarts a process
- stop XXX: stops a process. XXX is the name configured in [program:theprogramname]
- stop groupworker: stops all processes in the group named groupworker; the trailing colon is part of the command (start and restart work the same way)
- stop all: stops all processes. Note: start, restart, and stop do not load the latest configuration file
- reread: re-reads the configuration files without applying them; run it, for example, when a service is changed from automatic to manual startup
Note: if a program needs extra arguments or setup at startup, write the start command into a shell script and have Supervisor run that script.
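When supervisorctl is driven from a script, the status output shown above usually has to be parsed. The sketch below handles only RUNNING lines in the layout of the sample output; the regular expression and the function name parse_running_line are my own assumptions, not something Supervisor provides.

```python
import re

STATUS_LINE = re.compile(
    r"^(?P<name>\S+)\s+(?P<state>[A-Z]+)\s+pid (?P<pid>\d+), uptime (?P<uptime>.+)$"
)


def parse_running_line(line):
    """Parse one RUNNING line of `supervisorctl status` into a dict, or None."""
    match = STATUS_LINE.match(line.strip())
    if match is None:
        return None  # e.g. STOPPED or BACKOFF lines, which use another layout
    info = match.groupdict()
    info["pid"] = int(info["pid"])
    return info


print(parse_running_line("test_http RUNNING pid 28087, uptime 0:05:17"))
# prints {'name': 'test_http', 'state': 'RUNNING', 'pid': 28087, 'uptime': '0:05:17'}
```

For anything beyond quick scripts, the XML-RPC interface returns the same information in structured form and avoids text parsing.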
7. Supervisor configuration parameters
The supervisord configuration file is made up of several sections, and the items in each section are key/value pairs.
- [unix_http_server] configuration block
The parameters in this block configure an HTTP server that listens on a UNIX domain socket. If the [unix_http_server] block is missing from the configuration file or commented out, the socket-based HTTP server is not started. The parameters of this block are:
  - file: path of the UNIX domain socket on which the HTTP/XML-RPC server listens
  - chmod: mode to which the UNIX domain socket is changed at startup
  - chown: owner of the socket file
  - username: user name required to authenticate to the HTTP server
  - password: password required for authentication
- [inet_http_server] configuration block
The parameters in this block configure an HTTP server that listens on TCP. If the [inet_http_server] block is missing from the configuration file or commented out, the TCP-based HTTP server is not started. The parameters of this block are:
  - port: the TCP address and port (ip:port) on which the HTTP/XML-RPC server listens
  - username: user name required to authenticate to the HTTP server
  - password: password required for authentication
For example:
[inet_http_server]         ; inet (TCP) server disabled by default
port=0.0.0.0:9001          ; (ip_address:port specifier, *:port for all iface)
username=user ; (default is no username (open server))
password=123 ; (default is no password (open server))
This makes the server listen on port 9001; the access address is http://127.0.0.1:9001.
- [supervisord] configuration block
This block holds the global settings of the supervisord process itself. The parameters of this block are:
  - logfile: path of the log file
  - logfile_maxbytes: size at which the log file is rotated; the unit can be KB, MB, or GB. If set to 0, the log file size is unlimited
  - logfile_backups: number of rotated log backups to keep; the default is 10
  - loglevel: one of critical, error, warn, info, debug, trace, blather
  - pidfile: path of the pid file
  - umask: umask value; the default is 022
  - nodaemon: if set to true, supervisord runs in the foreground instead of as a daemon
  - minfds: minimum number of file descriptors that must be available before supervisord starts successfully; the default is 1024
  - minprocs: minimum number of process descriptors that must be available before supervisord starts successfully; the default is 200
  - directory: directory that supervisord changes to when it runs as a daemon
  - strip_ansi: strips ANSI escape sequences from child log files
  - environment: a list of key/value pairs
The parameters of this block can usually be used unchanged, but can also be modified as needed.
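Size settings such as logfile_maxbytes accept a number with a KB, MB, or GB suffix. The sketch below shows how such a value could be converted to bytes, assuming binary units (1 KB = 1024 bytes), which is how I understand Supervisor reads them. The function name parse_size is hypothetical.

```python
_UNITS = {"KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}


def parse_size(value):
    """Convert a size such as '50MB' to bytes; plain digits are already bytes."""
    text = value.strip().upper()
    for suffix, factor in _UNITS.items():
        if text.endswith(suffix):
            return int(text[: -len(suffix)]) * factor
    return int(text)


print(parse_size("50MB"))  # prints 52428800
```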
- [program:x] configuration block
This block configures a program that we want Supervisor to monitor. The section header has a fixed format: the keyword program, followed by a colon, followed by the program name, for example [program:foo]. This is the name that supervisorctl and the web interface display for the program. The parameters of this block are:
  - command: the command that starts the program
  - process_name: a Python string expression for the name of the supervised process; the default is %(program_name)s
  - numprocs: number of instances of the program that Supervisor starts. If numprocs > 1, the process_name expression must contain %(process_num)s. The default is 1
  - numprocs_start: an integer offset from which process_num starts counting
  - priority: weight that controls the order in which programs are started and stopped. The lower the value, the earlier the program is started and the later it is stopped. The default is 999
  - autostart: if set to true, the program starts automatically when supervisord starts
  - autorestart: one of false, true, or unexpected. false: the process is never restarted automatically. unexpected: the process is restarted when it exits with an exit code that is not listed in exitcodes. true: the process is restarted unconditionally when it exits
  - startsecs: number of seconds the program must keep running after startup to be considered successfully started
  - startretries: number of times supervisord tries to start the program; the default is 3
  - exitcodes: the expected exit codes; the default is 0,2
  - stopsignal: signal sent to the program when a stop is requested. The default is TERM; HUP, INT, QUIT, KILL, USR1, and USR2 are also allowed
  - stopwaitsecs: number of seconds to wait for the operating system to return SIGCHLD to supervisord after the stop signal has been sent; after that, SIGKILL is sent
  - stopasgroup: if set to true, Supervisor sends the stop signal to the whole process group
  - killasgroup: if set to true, SIGKILL is sent to the whole process group, so child processes are affected as well
  - user: if supervisord runs as root, the user account under which the program is run
  - redirect_stderr: if set to true, the process's standard error is sent back to supervisord on its standard output file descriptor
  - stdout_logfile: file to which the process's standard output is written. If it is not set or is set to AUTO, Supervisor chooses a file location automatically
  - stdout_logfile_maxbytes: size at which the standard output log is rotated; the unit can be KB, MB, or GB
  - stdout_logfile_backups: number of rotated standard output log backups to keep. The default is 10; if set to 0, no backups are kept
  - stdout_capture_maxbytes: maximum number of bytes written to the capture FIFO when the process is in stdout capture mode; the unit can be KB, MB, or GB
  - stdout_events_enabled: if set to true, a PROCESS_LOG_STDOUT event is emitted when the process writes to its stdout file descriptor
  - stderr_logfile: file to which the process's error output is written, unless redirect_stderr is set to true
  - stderr_logfile_maxbytes: size at which the error log is rotated; the unit can be KB, MB, or GB
  - stderr_logfile_backups: number of rotated error log backups to keep. The default is 10; if set to 0, no backups are kept
  - stderr_capture_maxbytes: maximum number of bytes written to the capture FIFO when the process is in stderr capture mode; the unit can be KB, MB, or GB
  - stderr_events_enabled: if set to true, a PROCESS_LOG_STDERR event is emitted when the process writes to its stderr file descriptor
  - umask: sets the umask of the process
  - serverurl: the URL that child processes use to communicate with the internal HTTP server. If set to AUTO, Supervisor constructs the URL automatically
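The three autorestart modes and their interaction with exitcodes can be summarized as a small decision function. This is a sketch of the rule as described in the parameter list, not Supervisor's own code; the function name should_restart is hypothetical, and the default exitcodes of 0,2 is taken from the list above.

```python
def should_restart(autorestart, exit_code, exitcodes=(0, 2)):
    """Decide whether an exited process is restarted.

    autorestart is one of "false", "true", or "unexpected".
    """
    if autorestart == "true":
        return True  # restart unconditionally
    if autorestart == "unexpected":
        # Restart only when the exit code is not one of the expected ones.
        return exit_code not in exitcodes
    return False  # "false": never restart automatically


print(should_restart("unexpected", 1))  # prints True
print(should_restart("unexpected", 0))  # prints False
```

This also shows why exitcodes has no effect in a section that sets autorestart=true: the exit code is never consulted in that mode.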
For example, the following option block represents monitoring a program called test_http:
[program:test_http]
command=python test_http.py 10000    ; command that starts the monitored process
directory=/root/               ; directory to cd into before running the command
priority=1                     ; start order: lower values start first and stop last
numprocs=1                     ; number of processes to start
autostart=true                 ; start automatically when supervisord starts
autorestart=true               ; restart automatically when the process exits
startretries=10                ; number of start attempts before giving up
exitcodes=0                    ; expected exit code (only consulted when autorestart=unexpected)
stopsignal=KILL                ; signal used to stop the process
stopwaitsecs=10                ; wait time before sending SIGKILL
redirect_stderr=true           ; redirect stderr to stdout
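If numprocs in a section like this were raised above 1, process_name would have to include %(process_num)s so that every instance gets a distinct name. The sketch below shows how such names expand using ordinary Python string formatting; the function name expand_process_names and the example pattern are assumptions for illustration.

```python
def expand_process_names(program_name, numprocs=1, numprocs_start=0,
                         process_name="%(program_name)s"):
    """Return the process names for numprocs instances of one program."""
    names = []
    for process_num in range(numprocs_start, numprocs_start + numprocs):
        # process_name is a Python %-style expression evaluated per instance.
        names.append(process_name % {"program_name": program_name,
                                     "process_num": process_num})
    return names


print(expand_process_names("test_http", numprocs=3,
                           process_name="%(program_name)s_%(process_num)02d"))
# prints ['test_http_00', 'test_http_01', 'test_http_02']
```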
8. Cluster management
Supervisor does not support monitoring processes across machines: one supervisord can only monitor programs on its own machine, which limits Supervisor's usefulness.
However, because Supervisor supports XML-RPC, there are multi-machine process management tools built on top of it, such as:
- Django-Dashvisor Web-based dashboard written in Python. Requires Django 1.3 or 1.4.
- Nodervisor Web-based dashboard written in Node.js.
- Supervisord-Monitor Web-based dashboard written in PHP.
- SupervisorUI Another Web-based dashboard written in PHP.
- cesi: a web interface for managing multiple supervisors from a single place.
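The idea these tools share can be sketched briefly: query each machine's supervisord over XML-RPC and merge the answers. The node URL below reuses the host and credentials from this article's examples, and the function name collect_status and the proxy_factory parameter (which makes the function testable without a live server) are hypothetical.

```python
from xmlrpc.client import Error as RpcError
from xmlrpc.client import ServerProxy

# Hypothetical node list: one XML-RPC URL per machine running supervisord.
NODES = {"local": "http://user:123@192.168.14.8:9001/RPC2"}


def collect_status(nodes, proxy_factory=ServerProxy):
    """Ask every supervisord node for its process list; record errors per node."""
    report = {}
    for name, url in nodes.items():
        try:
            infos = proxy_factory(url).supervisor.getAllProcessInfo()
            report[name] = {p["name"]: p["statename"] for p in infos}
        except (OSError, RpcError) as exc:
            # An unreachable or misconfigured node should not hide the others.
            report[name] = {"error": str(exc)}
    return report
```

Calling collect_status(NODES) against live servers returns one dictionary per node, mapping each process name to its state.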
I have not tried all of these, since trying them one by one is tedious. I only tried the last one, cesi: I know a little Python, and I managed to get it installed, if with some effort.
For cesi installation instructions, refer to its original README. Here is a quick summary:
git clone https://github.com/Gamegos/cesi
cd cesi && mkdir pack
python setup.py build
python setup.py install
sqlite3 /your/own/path/userinfo.db < userinfo.sql    ### choose your own path for the database
cp cesi.conf /etc/cesi.conf    ### then edit the contents of cesi.conf
cd cesi && python web.py    ### start cesi
cesi.conf settings:
[node:local]                   ; one section for each machine to be monitored
username = user
password = 123
host = 192.168.14.8
port = 9001

;[node:<node_name2>]           ; if there are more machines, add them one by one
;username = <username>
;password = <password>
;host = <hostname>
;port = <port>

;[environment:<environment_name>]
;members = <node_name>, <node_name2>

[cesi]
database = /root/temp/cesi/userinfo.db     ; path of the db
activity_log = /root/temp/cesi/cesi.log    ; path of the log
host = 0.0.0.0
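If everything goes well, you can open http://ip:5000 and log in; the user name and password are both admin. The final result looks like this:
[Screenshot: the cesi dashboard]
Example from the cesi repo:
[Screenshot: example dashboard from the cesi repository]
Try it out to get familiar with how it is used.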
Reprinted from: www.cnblogs.com/smail-bao/p…