GitLab Workhorse

Last time, we introduced the basic functions and architecture of GitLab and the responsibilities of each component, but we did not explain how a user's request is actually processed. This section briefly introduces what GitLab Workhorse does.



Let's start with a recap: GitLab uses Nginx to proxy front-end HTTP/HTTPS requests to GitLab Workhorse, which then forwards them to the Unicorn web server. By default, GitLab Workhorse and the front end (Nginx) communicate over a Unix domain socket, but forwarding requests over TCP is also supported. GitLab uses the Unicorn web server to serve dynamic web pages and the API.

1. The Nginx entry point

As you can see from the architecture diagram, the first stop for HTTP/HTTPS requests entering GitLab is Nginx.

The Nginx configuration files are located in ${gitlab-ce root directory}/lib/support/nginx.



GitLab redirects HTTP requests to HTTPS requests by default:

## Redirects all HTTP traffic to the HTTPS host
server {
  listen 0.0.0.0:80;
  listen [::]:80 ipv6only=on default_server;
  server_name YOUR_SERVER_FQDN;
  server_tokens off;
  return 301 https://$http_host$request_uri;
  access_log  /var/log/nginx/gitlab_access.log gitlab_ssl_access;
  error_log   /var/log/nginx/gitlab_error.log;
}

In the HTTPS server block, location / contains proxy_pass http://gitlab-workhorse;, which shows that Nginx passes almost all HTTP/HTTPS requests to the GitLab Workhorse component (over a Unix socket). The only exceptions are a few static error pages.

A Unix domain socket is an inter-process communication (IPC) mechanism built on the socket API. Because the data never leaves the host, it does not pass through the network protocol stack, so there is no packet encapsulation and decapsulation and no checksum calculation or verification. This makes it both efficient and reliable for local communication. Unix domain sockets belong to the AF_UNIX (also called AF_LOCAL) address family. Since they are used only for local IPC, a socket is identified by a file path instead of an IP address and port.
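To make the file-path addressing concrete, here is a minimal Go sketch of a server and a client talking over a Unix domain socket. It is only an illustration, not code from GitLab or Workhorse: the function name echoOverUnixSocket and the socket file name demo.socket are made up for this example, and the socket lives in a throwaway temporary directory.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"os"
	"path/filepath"
	"strings"
)

// echoOverUnixSocket starts a one-shot echo server on a Unix domain socket,
// sends msg to it from a client, and returns the echoed reply.
// Note that both sides address the socket by file path, not by IP and port.
func echoOverUnixSocket(msg string) (string, error) {
	// Hypothetical socket location inside a temporary directory.
	dir, err := os.MkdirTemp("", "uds-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	sockPath := filepath.Join(dir, "demo.socket")

	// "unix" selects the AF_UNIX address family.
	ln, err := net.Listen("unix", sockPath)
	if err != nil {
		return "", err
	}
	defer ln.Close()

	// Server side: accept one connection and echo one line back.
	go func() {
		conn, err := ln.Accept()
		if err != nil {
			return
		}
		defer conn.Close()
		line, err := bufio.NewReader(conn).ReadString('\n')
		if err != nil {
			return
		}
		fmt.Fprint(conn, line)
	}()

	// Client side: dial the same file path.
	conn, err := net.Dial("unix", sockPath)
	if err != nil {
		return "", err
	}
	defer conn.Close()
	if _, err := fmt.Fprintln(conn, msg); err != nil {
		return "", err
	}
	reply, err := bufio.NewReader(conn).ReadString('\n')
	if err != nil {
		return "", err
	}
	return strings.TrimSuffix(reply, "\n"), nil
}

func main() {
	reply, err := echoOverUnixSocket("hello workhorse")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(reply)
}
```

The only difference from a TCP version is the network name ("unix" instead of "tcp") and the address (a path instead of host:port), which mirrors how the Nginx upstream points at a socket file.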

upstream gitlab-workhorse {
  # GitLab socket file,
  # for Omnibus this would be: unix:/var/opt/gitlab/gitlab-workhorse/socket
  server unix:/home/git/gitlab/tmp/sockets/gitlab-workhorse.socket fail_timeout=0;
}

...

## HTTPS host
server {
  listen 0.0.0.0:443 ssl;
  listen [::]:443 ipv6only=on ssl default_server;
  server_name YOUR_SERVER_FQDN;
  server_tokens off;

  ssl on;
  ssl_certificate /etc/nginx/ssl/gitlab.crt;
  ssl_certificate_key /etc/nginx/ssl/gitlab.key;

  ssl_ciphers "ECDHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES128-SHA256:ECDHE-RSA-AES256-SHA:ECDHE-RSA-AES128-SHA:ECDHE-RSA-DES-CBC3-SHA:AES256-GCM-SHA384:AES128-GCM-SHA256:AES256-SHA256:AES128-SHA256:AES256-SHA:AES128-SHA:DES-CBC3-SHA:!aNULL:!eNULL:!EXPORT:!DES:!MD5:!PSK:!RC4";
  ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
  ssl_prefer_server_ciphers on;
  ssl_session_cache shared:SSL:10m;
  ssl_session_timeout 5m;
  real_ip_recursive off;    ## If you enable 'on'
  access_log  /var/log/nginx/gitlab_access.log gitlab_ssl_access;
  error_log   /var/log/nginx/gitlab_error.log;

  location / {
    client_max_body_size 0;
    gzip off;
    proxy_read_timeout      300;
    proxy_connect_timeout   300;
    proxy_redirect          off;

    proxy_http_version 1.1;

    proxy_set_header    Host                $http_host;
    proxy_set_header    X-Real-IP           $remote_addr;
    proxy_set_header    X-Forwarded-Ssl     on;
    proxy_set_header    X-Forwarded-For     $proxy_add_x_forwarded_for;
    proxy_set_header    X-Forwarded-Proto   $scheme;
    proxy_set_header    Upgrade             $http_upgrade;
    proxy_set_header    Connection          $connection_upgrade_gitlab_ssl;

    proxy_pass http://gitlab-workhorse;
  }

  error_page 404 /404.html;
  error_page 422 /422.html;
  error_page 500 /500.html;
  error_page 502 /502.html;
  error_page 503 /503.html;

  location ~ ^/(404|422|500|502|503)\.html$ {
    root /home/git/gitlab/public;
    internal;
  }
}

2. GitLab Workhorse

So what is GitLab Workhorse? It is a smart reverse proxy for GitLab that handles heavy HTTP requests such as file uploads and downloads, Git push/pull, and Git archive downloads. Real deployments can be more complicated, but the simplest arrangement looks like this:

+-------+  +------------------+  +---------+
|       |  |                  |  |         |
| NGINX +->| gitlab-workhorse +->| Unicorn |
|       |  |                  |  |         |
+-------+  +------------------+  +---------+

In the list below, "Rails components" refers to the GitLab Rails application running on the Unicorn web server:

  1. Workhorse can handle requests that do not call Rails components at all, such as static JS/CSS resource files.
  2. Workhorse can modify responses from Rails components. For example, if a Rails component uses send_file, Workhorse opens the file on disk and returns its contents to the client as the response body.
  3. Workhorse can take over a request after asking Rails components for permission. For example, when handling git clone, Workhorse first asks the Rails component to confirm the current client's permissions and, once confirmed, continues to handle the git clone request itself.
  4. Workhorse can modify a request before it is sent to Rails components. For example, when handling a Git LFS upload, Workhorse first asks the Rails component whether the current user has permission, then stores the request body in a temporary file, and finally sends the Rails component a modified request that contains the path to the temporary file.
  5. Workhorse manages long-running WebSocket connections that communicate with Rails components.
  6. Workhorse cannot connect directly to a database; it can only communicate with Rails components and, optionally, Redis.
  7. All requests to Workhorse are forwarded by the upstream proxy server (NGINX).
  8. Workhorse does not accept HTTPS connections.
  9. Workhorse does not clean up idle client connections.
  10. All requests to Rails components go through Workhorse.
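The "ask for permission first, then take over" pattern from item 3 can be sketched in Go as follows. This is a simplified illustration under stated assumptions, not Workhorse's actual implementation: the /authorize endpoint, the bearer-token check, the names newBackend, preAuthorize and fetchStatus, and the request path are all hypothetical, and a test server stands in for the Rails application.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// newBackend is a stand-in for the Rails application. It only decides
// whether a request is allowed; the token check is purely illustrative.
func newBackend() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("Authorization") == "Bearer secret" {
			w.WriteHeader(http.StatusOK)
			return
		}
		w.WriteHeader(http.StatusForbidden)
	}))
}

// preAuthorize wraps next: it first asks the backend for permission,
// forwarding the client's Authorization header, and only when the backend
// answers 200 does the proxy itself serve the (potentially slow) request.
func preAuthorize(backendURL string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		authReq, err := http.NewRequest(http.MethodGet, backendURL+"/authorize", nil)
		if err != nil {
			http.Error(w, "bad gateway", http.StatusBadGateway)
			return
		}
		authReq.Header.Set("Authorization", r.Header.Get("Authorization"))

		resp, err := http.DefaultClient.Do(authReq)
		if err != nil {
			http.Error(w, "bad gateway", http.StatusBadGateway)
			return
		}
		resp.Body.Close()

		if resp.StatusCode != http.StatusOK {
			http.Error(w, "forbidden", http.StatusForbidden)
			return
		}
		next.ServeHTTP(w, r) // the proxy takes over from here
	})
}

// fetchStatus sends one request through the proxy handler and returns the
// resulting status code and body.
func fetchStatus(token string) (int, string) {
	backend := newBackend()
	defer backend.Close()

	// Stand-in for the expensive work the proxy does itself.
	heavy := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, "pack data")
	})

	req := httptest.NewRequest(http.MethodGet, "/group/project.git/info/refs", nil)
	req.Header.Set("Authorization", "Bearer "+token)
	rec := httptest.NewRecorder()
	preAuthorize(backend.URL, heavy).ServeHTTP(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	for _, token := range []string{"secret", "wrong"} {
		code, body := fetchStatus(token)
		fmt.Printf("%s -> %d %q\n", token, code, body)
	}
}
```

The point of the design is that the application server answers only a short yes/no question, while the long-lived transfer stays in the proxy and does not tie up an application worker.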

Unicorn, for example, is relatively inefficient at serving static resource files, so Workhorse handles them instead. To keep this article short, only one example is examined here: gzip-compressed resource files.

The function func (s *Static) ServeExisting in ${gitlab-workhorse root}/internal/staticpages/servefile.go defines how Workhorse handles static resource files.

Suppose we request the relative URL /assets/locale/zh_CN/app-3396bd500e53f89d971d8c31ba7275f1c9ae2899062d4a7aeef14339084f44bd.js. Because of the /assets/ prefix, Workhorse treats the request as a static resource file, as the route definition shows:

// ${gitlab-workhorse root}/internal/upstream/routes.go
// Serve assets
route(
    "", `^/assets/`,
    static.ServeExisting(
        u.URLPrefix,
        staticpages.CacheExpireMax,
        NotFoundUnless(u.DevelopmentMode, proxy),
    ),
    withoutTracing(), // Tracing on assets is very noisy
),

The request for the static JS file /assets/locale/zh_CN/app-3396bd500e53f89d971d8c31ba7275f1c9ae2899062d4a7aeef14339084f44bd.js is served using gzip.



If the request header Accept-Encoding includes gzip, Workhorse reads the .gz file that corresponds to the requested static resource (a copy pre-compressed on the server) and sends the compressed content to the browser (not directly, but through the Nginx server). On receiving the response, the browser checks whether the content is compressed and, if so, decompresses it. If the request header does not indicate gzip support, Workhorse reads the source file instead.



// ${gitlab-workhorse root}/internal/staticpages/servefile.go
// ...
file := filepath.Join(s.DocumentRoot, prefix.Strip(r.URL.Path))
// ...
// Serve pre-gzipped assets
if acceptEncoding := r.Header.Get("Accept-Encoding"); strings.Contains(acceptEncoding, "gzip") {
    content, fi, err = helper.OpenFile(file + ".gz")
    if err == nil {
        w.Header().Set("Content-Encoding", "gzip")
    }
}
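The excerpt above depends on Workhorse internals, so it cannot be run on its own. As a rough, self-contained approximation of the same idea, the sketch below serves a pre-compressed sibling file when the client accepts gzip and falls back to the plain file otherwise. It uses only the Go standard library; the names servePreGzipped and demo, the sample file app.js, and its contents are assumptions made for this illustration.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"net/http"
	"net/http/httptest"
	"os"
	"path"
	"path/filepath"
	"strings"
)

// servePreGzipped serves files from root. When the client accepts gzip and a
// sibling "<name>.gz" exists, the pre-compressed bytes are sent unchanged
// with Content-Encoding: gzip; otherwise the plain file is served.
func servePreGzipped(root string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// Cleaning the rooted path keeps ".." segments from escaping root.
		file := filepath.Join(root, filepath.FromSlash(path.Clean("/"+r.URL.Path)))
		if strings.Contains(r.Header.Get("Accept-Encoding"), "gzip") {
			if f, err := os.Open(file + ".gz"); err == nil {
				defer f.Close()
				if fi, err := f.Stat(); err == nil {
					w.Header().Set("Content-Encoding", "gzip")
					// Tell caches that the body depends on Accept-Encoding.
					w.Header().Add("Vary", "Accept-Encoding")
					// The original name is passed so Content-Type is derived
					// from ".js" rather than ".gz".
					http.ServeContent(w, r, filepath.Base(file), fi.ModTime(), f)
					return
				}
			}
		}
		http.ServeFile(w, r, file)
	})
}

// demo writes a sample asset and its .gz sibling into a temporary directory,
// then returns the response body sizes without and with gzip, plus the
// Content-Encoding header of the gzip response.
func demo() (int, int, string, error) {
	dir, err := os.MkdirTemp("", "assets-demo")
	if err != nil {
		return 0, 0, "", err
	}
	defer os.RemoveAll(dir)

	src := bytes.Repeat([]byte("console.log('hello');\n"), 200)
	var gz bytes.Buffer
	zw := gzip.NewWriter(&gz)
	if _, err := zw.Write(src); err != nil {
		return 0, 0, "", err
	}
	if err := zw.Close(); err != nil {
		return 0, 0, "", err
	}
	if err := os.WriteFile(filepath.Join(dir, "app.js"), src, 0o644); err != nil {
		return 0, 0, "", err
	}
	if err := os.WriteFile(filepath.Join(dir, "app.js.gz"), gz.Bytes(), 0o644); err != nil {
		return 0, 0, "", err
	}

	h := servePreGzipped(dir)
	get := func(acceptEncoding string) *httptest.ResponseRecorder {
		req := httptest.NewRequest(http.MethodGet, "/app.js", nil)
		if acceptEncoding != "" {
			req.Header.Set("Accept-Encoding", acceptEncoding)
		}
		rec := httptest.NewRecorder()
		h.ServeHTTP(rec, req)
		return rec
	}
	plain := get("")
	zipped := get("gzip, deflate")
	return plain.Body.Len(), zipped.Body.Len(), zipped.Header().Get("Content-Encoding"), nil
}

func main() {
	plainLen, gzLen, enc, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("plain: %d bytes, %s: %d bytes\n", plainLen, enc, gzLen)
}
```

Because the file is compressed ahead of time, the server spends no CPU on compression per request; it only chooses which file to send.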

As you can see from the following figure, the compressed .gz file is less than one third the size of the source file, which saves a considerable amount of the server's network bandwidth.



Note also that a request for a static resource file is not forwarded to the Unicorn web server at all; Workhorse handles it. This is the main purpose of the Workhorse component: to compensate for the weaknesses of the Unicorn web server. This brings to mind a famous quote:

Any problem in computer science can be solved by another layer of indirection


To conclude: the Workhorse component was originally designed to address Git-over-HTTP/HTTPS timeouts. Put another way, Workhorse takes care of the requests that the Unicorn server is not good at handling. Requests such as dynamic page rendering are proxied by Workhorse to the Unicorn server, because Unicorn is good at handling them. The Nginx server at the very front is mainly used for HTTPS configuration and similar purposes.

Appendix

Reference links

GitLab Workhorse official repository