GitLab Workhorse
In the previous article, we introduced GitLab's basic functions and architecture, but we only described the responsibilities of each component, not how a user's request is actually processed. This section briefly introduces what GitLab Workhorse does.
Let's start with a recap: GitLab uses Nginx to proxy front-end HTTP/HTTPS requests to GitLab Workhorse, which then forwards them to the Unicorn web server. By default, GitLab Workhorse and the front end (Nginx) communicate over a Unix domain socket, but forwarding requests over TCP is also supported. GitLab uses the Unicorn web server to serve dynamic web pages and the API.
1. The Nginx entry point
As you can see from the architecture diagram, the first stop for HTTP/HTTPS requests entering GitLab is Nginx.
The Nginx configuration files are located in ${gitlab-ce root directory}/lib/support/nginx.
GitLab redirects HTTP requests to HTTPS by default:
## Redirects all HTTP traffic to the HTTPS host
server {
  listen 0.0.0.0:80;
  listen [::]:80 ipv6only=on default_server;
  server_name YOUR_SERVER_FQDN;
  server_tokens off;
  return 301 https://$http_host$request_uri;
  access_log /var/log/nginx/gitlab_access.log gitlab_ssl_access;
  error_log /var/log/nginx/gitlab_error.log;
}
In the HTTPS server block, location / contains proxy_pass http://gitlab-workhorse;, which shows that Nginx passes almost all HTTP/HTTPS requests to GitLab Workhorse over a Unix socket. The exceptions are a few static error pages, which Nginx serves itself.
A Unix domain socket is an inter-process communication (IPC) mechanism that uses the socket API. It belongs to the AF_UNIX address family (also called AF_LOCAL) and only connects processes on the same host. Because the data never passes through the network protocol stack, there is no packet packing and unpacking or checksum calculation, and access to the socket can be restricted with ordinary file permissions. For the same reason, the socket is addressed by a file path rather than by an IP address and port, as in the upstream definition:
upstream gitlab-workhorse {
  # GitLab socket file,
  # for Omnibus this would be: unix:/var/opt/gitlab/gitlab-workhorse/socket
  server unix:/home/git/gitlab/tmp/sockets/gitlab-workhorse.socket fail_timeout=0;
}
...
## HTTPS host
server {
  listen 0.0.0.0:443 ssl;
  listen [::]:443 ipv6only=on ssl default_server;
  server_name YOUR_SERVER_FQDN;
  server_tokens off;
  ssl on;
  ssl_certificate /etc/nginx/ssl/gitlab.crt;
  ssl_certificate_key /etc/nginx/ssl/gitlab.key;
  ssl_ciphers "ECDHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES128-SHA256:ECDHE-RSA-AES256-SHA:ECDHE-RSA-AES128-SHA:ECDHE-RSA-DES-CBC3-SHA:AES256-GCM-SHA384:AES128-GCM-SHA256:AES256-SHA256:AES128-SHA256:AES256-SHA:AES128-SHA:DES-CBC3-SHA:!aNULL:!eNULL:!EXPORT:!DES:!MD5:!PSK:!RC4";
  ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
  ssl_prefer_server_ciphers on;
  ssl_session_cache shared:SSL:10m;
  ssl_session_timeout 5m;
  real_ip_recursive off; ## If you enable 'on'
  access_log /var/log/nginx/gitlab_access.log gitlab_ssl_access;
  error_log /var/log/nginx/gitlab_error.log;
  location / {
    client_max_body_size 0;
    gzip off;
    proxy_read_timeout 300;
    proxy_connect_timeout 300;
    proxy_redirect off;
    proxy_http_version 1.1;
    proxy_set_header Host $http_host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-Ssl on;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection $connection_upgrade_gitlab_ssl;
    proxy_pass http://gitlab-workhorse;
  }
  error_page 404 /404.html;
  error_page 422 /422.html;
  error_page 500 /500.html;
  error_page 502 /502.html;
  error_page 503 /503.html;
  location ~ ^/(404|422|500|502|503)\.html$ {
    root /home/git/gitlab/public;
    internal;
  }
}
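To make the Unix socket hop between Nginx and Workhorse more concrete, here is a minimal Go sketch of HTTP served over a Unix domain socket. It is not Workhorse's code: the function names (serveOnUnixSocket, getOverUnixSocket, roundTrip), the temporary socket path, and the placeholder host name are all assumptions made for illustration. The point is only that both sides address each other by a file path rather than by IP and port.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"net/http"
	"os"
	"path/filepath"
)

// serveOnUnixSocket starts an HTTP server bound to a Unix domain socket.
// The listen address is a filesystem path, not an IP address and port.
func serveOnUnixSocket(socketPath string) (*http.Server, error) {
	ln, err := net.Listen("unix", socketPath)
	if err != nil {
		return nil, err
	}
	srv := &http.Server{Handler: http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "backend saw %s", r.URL.Path)
	})}
	go srv.Serve(ln) // returns http.ErrServerClosed after srv.Close()
	return srv, nil
}

// getOverUnixSocket plays the role of the front proxy: it speaks ordinary
// HTTP, but every connection is dialed to the socket file.
func getOverUnixSocket(socketPath, urlPath string) (string, error) {
	client := &http.Client{Transport: &http.Transport{
		DialContext: func(ctx context.Context, _, _ string) (net.Conn, error) {
			var d net.Dialer
			return d.DialContext(ctx, "unix", socketPath)
		},
	}}
	// The host name is a placeholder; the custom dialer ignores it.
	resp, err := client.Get("http://backend" + urlPath)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

// roundTrip wires both halves together using a throwaway socket file.
func roundTrip(urlPath string) (string, error) {
	dir, err := os.MkdirTemp("", "uds-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	socketPath := filepath.Join(dir, "demo.socket")
	srv, err := serveOnUnixSocket(socketPath)
	if err != nil {
		return "", err
	}
	defer srv.Close()

	return getOverUnixSocket(socketPath, urlPath)
}

func main() {
	body, err := roundTrip("/users/sign_in")
	if err != nil {
		panic(err)
	}
	fmt.Println(body)
}
```

Swapping "unix" and the socket path for "tcp" and a host:port pair is the only change needed to use TCP instead, which mirrors the two transport options mentioned in the recap.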
2. GitLab Workhorse
So what is GitLab Workhorse? It is a smart reverse proxy for GitLab that handles heavyweight HTTP requests such as file uploads and downloads, Git push/pull, and Git archive downloads. The simplified picture looks like this, although in practice the flow is more complicated:
+-------+ +------------------+ +---------+
| | | | | |
| NGINX +->| gitlab-workhorse +->| Unicorn |
| | | | | |
+-------+ +------------------+ +---------+
In the list below, "the Rails component" means the GitLab Rails application running on the Unicorn web server. Workhorse's capabilities and constraints are as follows:
- Workhorse can handle some requests without involving the Rails component at all, such as requests for static JS/CSS resource files.
- Workhorse can modify responses from the Rails component. For example, if the Rails component uses send_file, GitLab Workhorse opens the file on disk and returns its contents to the client as the response body.
- Workhorse can take over a request after asking the Rails component for permission. For example, when handling git clone, Workhorse first asks the Rails component to confirm the current user's permissions, and only then takes over the git clone request.
- Workhorse can modify a request before it is sent to the Rails component. For example, when handling a Git LFS upload, Workhorse first asks the Rails component whether the current user has permission, then stores the request body in a temporary file, and finally sends the Rails component a modified request containing the path to the temporary file.
- Workhorse manages long-lived WebSocket connections on behalf of the Rails component.
- Workhorse does not connect to the database directly; it communicates only with the Rails component and, optionally, Redis.
- All requests to Workhorse are forwarded by an upstream proxy server (Nginx).
- Workhorse does not accept HTTPS connections.
- Workhorse does not clean up idle client connections.
- All requests to the Rails component go through Workhorse.
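The "ask for permission, then take over" behavior described in the list can be sketched in Go as follows. This is an illustration under stated assumptions, not Workhorse's actual protocol: the /authorize path, the bearer token check, the fake Rails server, and the names newFakeRails, preAuthorize and tryClone are all hypothetical. The sketch only shows the division of labor, where the application decides who may proceed and the proxy does the expensive work.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// newFakeRails stands in for the Rails application. It only answers
// permission questions and never streams repository data itself.
// The token check is a made-up rule for the demo.
func newFakeRails() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("Authorization") == "Bearer good-token" {
			w.WriteHeader(http.StatusOK)
			return
		}
		w.WriteHeader(http.StatusForbidden)
	}))
}

// preAuthorize wraps an expensive handler: it first asks the backend
// (hypothetical /authorize endpoint) whether the caller may proceed, and
// only then lets the proxy handle the request on its own.
func preAuthorize(railsURL string, heavy http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		authReq, err := http.NewRequestWithContext(r.Context(), http.MethodGet, railsURL+"/authorize", nil)
		if err != nil {
			http.Error(w, "bad gateway", http.StatusBadGateway)
			return
		}
		// Forward the caller's credentials so the backend can decide.
		authReq.Header.Set("Authorization", r.Header.Get("Authorization"))

		resp, err := http.DefaultClient.Do(authReq)
		if err != nil {
			http.Error(w, "bad gateway", http.StatusBadGateway)
			return
		}
		resp.Body.Close()

		if resp.StatusCode != http.StatusOK {
			http.Error(w, "forbidden", resp.StatusCode)
			return
		}
		heavy.ServeHTTP(w, r) // permission granted: the proxy takes over
	})
}

// tryClone simulates one clone-like request with the given token and
// returns the status code and body the client would see.
func tryClone(token string) (int, string) {
	rails := newFakeRails()
	defer rails.Close()

	heavy := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, "packfile bytes")
	})

	req := httptest.NewRequest(http.MethodGet, "/group/project.git/info/refs", nil)
	req.Header.Set("Authorization", "Bearer "+token)
	rec := httptest.NewRecorder()
	preAuthorize(rails.URL, heavy).ServeHTTP(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	fmt.Println(tryClone("good-token"))
	fmt.Println(tryClone("bad-token"))
}
```

The design choice this illustrates is that the application server stays on the request path for only a short authorization exchange, so a long transfer does not tie up one of its workers.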
Unicorn, for example, is relatively inefficient at serving static resource files, so Workhorse handles them instead. To keep this article short, only one example is covered here: gzip-compressed resource files.
The function func (s *Static) ServeExisting in ${gitlab-workhorse root}/internal/staticpages/servefile.go defines how Workhorse serves static resource files.
Suppose we request the relative URL /assets/locale/zh_CN/app-3396bd500e53f89d971d8c31ba7275f1c9ae2899062d4a7aeef14339084f44bd.js. Because of the /assets/ prefix, Workhorse treats the request as a request for a static resource file, as the following code shows:
// ${gitlab-workhorse root}/internal/upstream/routes.go
// Serve assets
route(
	"", `^/assets/`,
	static.ServeExisting(
		u.URLPrefix,
		staticpages.CacheExpireMax,
		NotFoundUnless(u.DevelopmentMode, proxy),
	),
	withoutTracing(), // Tracing on assets is very noisy
),
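The idea behind such a route table can be sketched in a few lines of Go: an ordered list of regular expressions, where the first match decides which handler serves the request. The types and names here (route, router, named, handledBy) are assumptions for illustration and are much simpler than Workhorse's own routing code. The handlers only report which component would have handled the request.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"regexp"
)

// route pairs a URL pattern with the handler responsible for it.
type route struct {
	pattern *regexp.Regexp
	handler http.Handler
}

// named returns a stub handler that just reports which component
// would have handled the request.
func named(name string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, name)
	})
}

// router tries its routes in order; the first matching pattern wins.
type router struct{ routes []route }

func (rt router) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	for _, ro := range rt.routes {
		if ro.pattern.MatchString(r.URL.Path) {
			ro.handler.ServeHTTP(w, r)
			return
		}
	}
	http.NotFound(w, r)
}

// newRouter serves /assets/ itself and sends everything else onward.
func newRouter() router {
	return router{routes: []route{
		{regexp.MustCompile(`^/assets/`), named("static")},
		{regexp.MustCompile(``), named("rails-proxy")}, // empty pattern matches any path
	}}
}

// handledBy returns the name of the component that served the path.
func handledBy(path string) string {
	rec := httptest.NewRecorder()
	newRouter().ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Body.String()
}

func main() {
	fmt.Println(handledBy("/assets/locale/zh_CN/app.js"))
	fmt.Println(handledBy("/users/sign_in"))
}
```

Because the catch-all entry comes last, adding a more specific route is just a matter of inserting it earlier in the slice.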
Below, the JS static resource file /assets/locale/zh_CN/app-3396bd500e53f89d971d8c31ba7275f1c9ae2899062d4a7aeef14339084f44bd.js is requested with gzip enabled.
If the request header includes Accept-Encoding: gzip, Workhorse reads the pre-compressed .gz file for the requested static resource and sends the compressed content to the browser (not directly, but via the Nginx server). On receiving the response, the browser checks whether the content is compressed and, if so, decompresses it. If the request header does not indicate gzip support, Workhorse reads the uncompressed source file instead.
// ${gitlab-workhorse root}/internal/staticpages/servefile.go
// ...
file := filepath.Join(s.DocumentRoot, prefix.Strip(r.URL.Path))
// ...
// Serve pre-gzipped assets
if acceptEncoding := r.Header.Get("Accept-Encoding"); strings.Contains(acceptEncoding, "gzip") {
	content, fi, err = helper.OpenFile(file + ".gz")
	if err == nil {
		w.Header().Set("Content-Encoding", "gzip")
	}
}
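For readers who want to try the same technique outside Workhorse, the following self-contained Go sketch serves a pre-compressed .gz sibling when the client accepts gzip and falls back to the source file otherwise. It is a simplified stand-in, not the Workhorse implementation: servePreGzipped and demo are hypothetical names, the temporary directory and the app.js content are made up, and the whole file is read into memory for brevity.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"mime"
	"net/http"
	"net/http/httptest"
	"os"
	"path/filepath"
	"strings"
)

// servePreGzipped serves files under root. If the client accepts gzip and
// a "<file>.gz" sibling exists, the compressed bytes are sent as-is.
func servePreGzipped(root string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// Cleaning a rooted path removes any "../" before joining with root.
		file := filepath.Join(root, filepath.Clean("/"+r.URL.Path))
		if ct := mime.TypeByExtension(filepath.Ext(file)); ct != "" {
			w.Header().Set("Content-Type", ct)
		}
		w.Header().Set("Vary", "Accept-Encoding")

		if strings.Contains(r.Header.Get("Accept-Encoding"), "gzip") {
			if data, err := os.ReadFile(file + ".gz"); err == nil {
				w.Header().Set("Content-Encoding", "gzip")
				w.Write(data)
				return
			}
		}
		data, err := os.ReadFile(file)
		if err != nil {
			http.NotFound(w, r)
			return
		}
		w.Write(data)
	})
}

// demo writes app.js and app.js.gz to a temporary directory, requests
// /app.js with the given Accept-Encoding value, and reports the response's
// Content-Encoding, the body size, and the size of the uncompressed source.
func demo(acceptEncoding string) (encoding string, bodyLen, sourceLen int, err error) {
	root, err := os.MkdirTemp("", "assets-demo")
	if err != nil {
		return "", 0, 0, err
	}
	defer os.RemoveAll(root)

	source := []byte(strings.Repeat("console.log('hello');\n", 100))
	var gz bytes.Buffer
	zw := gzip.NewWriter(&gz)
	if _, err = zw.Write(source); err != nil {
		return "", 0, 0, err
	}
	if err = zw.Close(); err != nil {
		return "", 0, 0, err
	}
	if err = os.WriteFile(filepath.Join(root, "app.js"), source, 0o644); err != nil {
		return "", 0, 0, err
	}
	if err = os.WriteFile(filepath.Join(root, "app.js.gz"), gz.Bytes(), 0o644); err != nil {
		return "", 0, 0, err
	}

	req := httptest.NewRequest(http.MethodGet, "/app.js", nil)
	if acceptEncoding != "" {
		req.Header.Set("Accept-Encoding", acceptEncoding)
	}
	rec := httptest.NewRecorder()
	servePreGzipped(root).ServeHTTP(rec, req)
	return rec.Result().Header.Get("Content-Encoding"), rec.Body.Len(), len(source), nil
}

func main() {
	fmt.Println(demo("gzip, deflate"))
	fmt.Println(demo(""))
}
```

Compressing once at build time and serving the stored .gz file avoids spending CPU on compression for every request, which is why the check is a simple file lookup rather than an on-the-fly encoder.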
As the following figure shows, the compressed .gz file is less than 1/3 the size of the source file, which significantly reduces the server's network bandwidth usage.
Note also that requests for static resource files are not forwarded to the Unicorn web server; Workhorse handles them itself. This is the main value of the Workhorse component: it compensates for the weaknesses of the Unicorn web server. This brings to mind a famous quote:
Any problem in computer science can be solved by another layer of indirection
To conclude: the Workhorse component was originally designed to address Git-over-HTTP/HTTPS timeouts. Put another way, Workhorse handles the requests that the Unicorn server is not good at. Requests such as dynamic page rendering are proxied by Workhorse to the Unicorn server, because Unicorn handles those well. The front-most Nginx server is mainly used for HTTPS configuration and similar purposes.
Appendix
Reference links
GitLab Workhorse official repository