This article is based on the Youpaiyun team's hands-on experience with private modules. It explains how to use private modules and what the go get tool does behind the scenes, including how it fetches the right source code from a private GitLab instance and how authentication works. The article was compiled from an Open Talk live broadcast given by Liu Yunpeng, senior development engineer at Youpaiyun. To watch the replay, click “Read the original article” at the end.

About Open Talk: a comprehensive technology salon initiated by Youpaiyun. True to its original aim of “making entrepreneurship easier”, it offers technical developers substance-only knowledge sharing across technology, operations and maintenance, products, and entrepreneurship, helping company members improve their professional skills and helping companies grow better and faster.

Research and development background

Go introduced modules in version 1.11. Version 1.13 added module checksum verification, which improved module security, and module mode has been the default since version 1.16. In a recent blog post, the Go team announced that GOPATH support will be removed in 1.17. If you are not using Go modules yet, now is a good time to try them.

The main difference between Go modules and GOPATH lies in how private modules are used. Public modules work the same way in both: they are fetched directly with go get. With GOPATH, the code of a private module can simply be dropped into the GOPATH directory. Go modules do not allow this; they have their own way of managing code, which is briefly introduced below.

How Go gets a module

Go modules are usually fetched with the go get tool. go get currently supports two ways of fetching:

The first is to pull the code from a traditional VCS hosting platform, mainly Git, but also SVN, Mercurial (hg), and others.

The second is the GOPROXY protocol, supported since version 1.12, in which Go downloads a code archive from a GOPROXY server.

Since version 1.13 Go also performs checksum verification (go.sum) on every module after it is downloaded. It compares the hash of the downloaded module with the hash recorded in Google's online checksum database to detect tampering. Only a module that passes verification can be installed and used.

How to obtain a VCS module

Go supports several version control tools, so the first step is to decide which one to use to fetch the module. The detection methods fall into three categories: two static matching methods and one dynamic matching method.

Static matching mode

**Prefix matching:** The prefixes of code hosting platforms such as GitHub, Bitbucket, Apache, and OpenStack are built into the go get toolchain, and when a module path matches one of these prefixes the corresponding version control tool is used. For example, the module github.com/eamaple/pkg matches the GitHub prefix, and go get knows that GitHub uses Git.

**Regular expression matching:** This method works by adding a suffix to the module path. The suffix can be that of any of the five supported version control tools (Git, SVN, Mercurial (hg), Bazaar (bzr), and Fossil) and is matched with a regular expression. For example, a path containing the .git suffix is matched by the regular expression, whose VCS capture group then identifies the module as one managed with Git.
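To make the two static rules concrete, here is a minimal Go sketch. It is not the toolchain's actual code: the prefix table is a small illustrative subset, the regular expression is a simplification, and the name detectVCS is an assumption made for this example.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// knownHosts maps a few hosting prefixes to their VCS.
// Illustrative subset only, not the toolchain's real table.
var knownHosts = map[string]string{
	"github.com/":    "git",
	"bitbucket.org/": "git",
}

// vcsSuffix finds a VCS suffix such as ".git" at the end of a path element.
// Simplified compared with the pattern used by the go command.
var vcsSuffix = regexp.MustCompile(`\.(bzr|fossil|git|hg|svn)(/|$)`)

// detectVCS returns the VCS found by static matching, or "" when neither
// rule applies and dynamic matching would be needed.
func detectVCS(modulePath string) string {
	for prefix, vcs := range knownHosts {
		if strings.HasPrefix(modulePath, prefix) {
			return vcs
		}
	}
	if m := vcsSuffix.FindStringSubmatch(modulePath); m != nil {
		return m[1]
	}
	return ""
}

func main() {
	for _, p := range []string{
		"github.com/eamaple/pkg",   // prefix match
		"example.com/repo.git/sub", // suffix match
		"example.com/pkg",          // no static match
	} {
		fmt.Printf("%-26s -> %q\n", p, detectVCS(p))
	}
}
```

An empty result corresponds to the case where go get has to fall back to dynamic matching.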

Dynamic matching

When neither the prefix nor the regular expression matches, go get detects the VCS dynamically. It sends an HTTP request to the module path, prefixed with the protocol scheme and carrying the query parameter go-get=1, and expects the server to return information about the module that tells it how to proceed. Go sends HTTPS requests by default. If the server only offers plain HTTP, this can be allowed through the GOINSECURE environment variable, which lists the module path patterns that Go may fetch over an insecure connection.

go get expects the response body to be an HTML document. Only the meta tag whose name attribute is “go-import” is meaningful to Go; its content attribute tells Go how to fetch the module.

The content consists of three parts. The first, root-path, is the module name. The second, vcs, is the version control tool to use, such as Git or SVN. The third, repo-url, is the repository where the module's source code is stored, written as a protocol scheme followed by the repository address.

For example, when curl is used to send the same request that go get sends to golang.org/x/net, the server returns an HTML document containing a go-import meta tag. The first part of its content is the Go module name, golang.org/x/net. The second part is git, which means the source code must be fetched with Git. The third part is the repository address, which shows that the module is hosted at go.googlesource.com/net. Note that the meta tag must be placed inside the head: go get parses the document from the beginning and stops when it encounters the closing head tag or the opening body tag.
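The parsing rule described here can be sketched in a few lines of Go. This is an illustration rather than the go command's own source: the names goImport and parseGoImport and the sample page are assumptions, and the sketch returns only the first matching tag.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// goImport holds the three parts of a go-import content attribute.
type goImport struct {
	RootPath, VCS, RepoURL string
}

// page is a hand-written sample response, shaped like the one described above.
const page = `<!DOCTYPE html>
<html>
<head>
<meta name="go-import" content="golang.org/x/net git https://go.googlesource.com/net">
</head>
<body>ignored</body>
</html>`

// attr returns the value of the named attribute, or "" if it is absent.
func attr(e xml.StartElement, name string) string {
	for _, a := range e.Attr {
		if strings.EqualFold(a.Name.Local, name) {
			return a.Value
		}
	}
	return ""
}

// parseGoImport scans the document from the start and stops at </head> or <body>.
func parseGoImport(html string) (goImport, bool) {
	d := xml.NewDecoder(strings.NewReader(html))
	d.Strict = false // HTML is not well-formed XML
	for {
		tok, err := d.RawToken()
		if err != nil {
			return goImport{}, false // end of input or unreadable markup
		}
		switch e := tok.(type) {
		case xml.EndElement:
			if strings.EqualFold(e.Name.Local, "head") {
				return goImport{}, false
			}
		case xml.StartElement:
			if strings.EqualFold(e.Name.Local, "body") {
				return goImport{}, false
			}
			if !strings.EqualFold(e.Name.Local, "meta") || attr(e, "name") != "go-import" {
				continue
			}
			if f := strings.Fields(attr(e, "content")); len(f) == 3 {
				return goImport{f[0], f[1], f[2]}, true
			}
		}
	}
}

func main() {
	imp, ok := parseGoImport(page)
	fmt.Println(ok, imp.RootPath, imp.VCS, imp.RepoURL)
}
```

A tag placed inside the body is never reached, which matches the note about the head above.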

How go get uses Git

Git supports both HTTP and SSH. By default Go only uses HTTP when it invokes Git, and it disables Git's interactive prompts. For example, cloning a private repository over HTTP requires a user name and password, but when Git is invoked by Go they cannot be typed in interactively, so fetching the module fails. Prompting is controlled by the environment variable GIT_TERMINAL_PROMPT; setting it to 1 manually re-enables the prompt so that the user name and password can be entered by hand.

How can a user name and password be passed to Git without a prompt? When Git uses HTTP, they can be supplied through the .netrc file in the HOME directory. The file accepts two kinds of entries:

  • The first kind names a server and defines the user name and password to use for that server.

  • The second kind names no server and defines a default user name and password for all servers.

In the example, the first entry configures gitlab.com with the user name root and the password admin. When Git clones a private repository on gitlab.com, root and admin are passed to it automatically, so it never prompts for a password. The second entry sets the default credentials, with the user name guest and the password 123456: whenever any server other than gitlab.com requires authentication, guest and 123456 are passed to the program as the user name and password.
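The lookup order of the two entry kinds can be sketched in Go as follows. The file content is reconstructed from the values named in the text; the lookup function is a hypothetical helper written for this illustration, not how Git itself reads the file, and it only understands the machine, default, login, and password keywords.

```go
package main

import (
	"fmt"
	"strings"
)

// sampleNetrc reconstructs the two entries described in the text.
const sampleNetrc = `
machine gitlab.com login root password admin
default login guest password 123456
`

type credential struct{ Login, Password string }

// lookup returns the credentials for host: a matching "machine" entry wins,
// otherwise the "default" entry is used if there is one.
func lookup(netrc, host string) (credential, bool) {
	entries := map[string]*credential{} // the key "" holds the default entry
	var cur *credential                 // entry currently being filled in
	f := strings.Fields(netrc)
	for i := 0; i < len(f); i++ {
		if f[i] == "default" {
			cur = &credential{}
			entries[""] = cur
			continue
		}
		if i+1 == len(f) {
			break // keyword without a value
		}
		key, val := f[i], f[i+1]
		i++
		switch key {
		case "machine":
			cur = &credential{}
			entries[val] = cur
		case "login":
			if cur != nil {
				cur.Login = val
			}
		case "password":
			if cur != nil {
				cur.Password = val
			}
		}
	}
	if c, ok := entries[host]; ok {
		return *c, true
	}
	if c, ok := entries[""]; ok {
		return *c, true
	}
	return credential{}, false
}

func main() {
	for _, host := range []string{"gitlab.com", "github.com"} {
		c, ok := lookup(sampleNetrc, host)
		fmt.Println(host, ok, c.Login, c.Password)
	}
}
```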

Go can also invoke Git over SSH, but it does not do so by default. SSH can only be used when it is specified explicitly during dynamic matching. With static matching (prefix or regular expression matching), only HTTPS can be used to fetch the module.

In this example the module is example.com/pkg and the repository address is gitlab.example.com/example/pkg. The content of the meta tag carries the complete module information. The first part is the module name, as defined earlier. The second part is git, which means the code is fetched with Git. The last part is the repository address, which explicitly specifies the SSH protocol together with the git user name and the SSH server port.

Git SSH authentication is based on a key pair. If you do not have one, you can generate it with ssh-keygen from the SSH tool suite. Its commonly used -t parameter specifies the key type. RSA keys are probably the most widely used, but I prefer ED25519. Its obvious advantage is that the keys are very short: the public key and the private key are only 32 bytes each, yet the security is comparable to that of a roughly 3000-bit RSA key. That combination of security and short keys is why ED25519 is often chosen.
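The key sizes mentioned here can be checked with Go's standard crypto/ed25519 package. This is only a sketch to illustrate the sizes; it does not write SSH key files, and keySizes is a name chosen for the example.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"fmt"
)

// keySizes generates a fresh ED25519 key pair and reports the byte length of
// the public key and of the private seed.
func keySizes() (pub, seed int, err error) {
	publicKey, privateKey, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		return 0, 0, err
	}
	return len(publicKey), len(privateKey.Seed()), nil
}

func main() {
	pub, seed, err := keySizes()
	if err != nil {
		panic(err)
	}
	fmt.Printf("public key: %d bytes, private seed: %d bytes\n", pub, seed)
}
```

Go's ed25519.PrivateKey value itself is 64 bytes because it stores the 32-byte seed followed by the public key; the secret part is the seed.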

Once the key pair is generated, the key files, both private and public, are written to the .ssh folder under HOME. The file ending in .pub is the public key, which needs to be configured on the code hosting platform, such as GitLab or GitHub. In the GitLab screenshot, the key shown is in ED25519 format, and it is indeed very short.

Obtaining modules through GOPROXY

Go modules can also be obtained through GOPROXY. The protocol is based on HTTP: it uses only HTTP GET requests and standard HTTP status codes. A GOPROXY server has no user name or password by default. If you set up a private one, however, it can support HTTP basic authentication, with the user name and password configured through the .netrc file as before. GOPROXY has two other advantages:

  • First, GOPROXY can fetch modules faster than a direct VCS clone, for reasons explained later.

  • Second, it solves the problem of unreachable modules. For example, when the golang.org domain cannot be accessed, such modules can still be downloaded through a proxy server run by a third party.

Using GOPROXY

GOPROXY is configured through the GOPROXY environment variable, whose value is the URL of the proxy server. Multiple proxy URLs can be configured, separated by commas or pipe characters; the difference is explained in the examples below.

A URL can be replaced by one of the fixed strings off and direct. off prevents modules from being downloaded from any source: with GOPROXY set to off, only modules already available locally can be used, whether they come from GitLab, GitHub, or elsewhere. direct means pulling directly from the VCS and is usually used as a fallback.

Two examples are shown here:

  • The first uses the Linux syntax for environment variables, set with export. It configures proxy.golang.org, Google's official GOPROXY server, followed by the fallback direct after a comma. The server returns status code 404 or 410 when a module cannot be found, and only in that case does go get try the fallback, which here means downloading from the version control platform.

  • The second uses a different syntax, go env -w, which ships with Go and has been supported since Go 1.13. It works across platforms: Go-related environment variables can be configured this way on Windows, Linux, and macOS alike. In this example, goproxy.cn is a proxy commonly used in China. The pipe character specifies the fallback, which means go get tries the fallback whatever error the proxy returns, including a 500 from a failing GOPROXY server or a non-HTTP failure such as a network error.
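The difference between the comma and the pipe separator can be sketched in Go. This is a simplified model, not the go command's implementation: parseGOPROXY and shouldFallBack are hypothetical names, and the sketch assumes that "module not found" is signalled by HTTP status 404 or 410, as the GOPROXY protocol defines.

```go
package main

import (
	"fmt"
	"strings"
)

// proxyEntry is one element of the GOPROXY list.
type proxyEntry struct {
	URL            string // proxy URL, or the keyword "direct" or "off"
	FallBackOnAny  bool   // true when the entry is followed by "|"
}

// parseGOPROXY splits a GOPROXY value on "," and "|" and records which
// separator follows each entry.
func parseGOPROXY(value string) []proxyEntry {
	var entries []proxyEntry
	for value != "" {
		var url string
		var anyErr bool
		if i := strings.IndexAny(value, ",|"); i >= 0 {
			url, anyErr, value = value[:i], value[i] == '|', value[i+1:]
		} else {
			url, value = value, ""
		}
		if url = strings.TrimSpace(url); url != "" {
			entries = append(entries, proxyEntry{url, anyErr})
		}
	}
	return entries
}

// shouldFallBack reports whether the next entry is tried after a failure.
// status is the HTTP status code, or 0 for a network-level error.
func shouldFallBack(e proxyEntry, status int) bool {
	if e.FallBackOnAny {
		return true // "|": fall back on any error
	}
	return status == 404 || status == 410 // ",": only when the module is not found
}

func main() {
	for _, v := range []string{"https://proxy.golang.org,direct", "https://goproxy.cn|direct"} {
		first := parseGOPROXY(v)[0]
		fmt.Printf("%s: on 404 -> %v, on 500 -> %v\n",
			first.URL, shouldFallBack(first, 404), shouldFallBack(first, 500))
	}
}
```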

GOPROXY implementation

The implementation of GOPROXY is simple, with only five interfaces officially defined.

The meanings of the three variables in the URL are as follows:

  • base is the URL of the GOPROXY server.

  • module is the name of the module to fetch.

  • version is the version of the module.

Case encoding

Some systems, such as case-insensitive file systems, do not distinguish upper and lower case, so confusion can arise when module or version contains capital letters. To avoid this problem, both are case-encoded: every uppercase letter is replaced by an exclamation mark followed by the corresponding lowercase letter.
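A minimal Go sketch of this encoding follows. The function name caseEncode is chosen for the example, and the sketch skips the validity checks that a complete implementation would perform on module paths and versions.

```go
package main

import (
	"fmt"
	"strings"
)

// caseEncode replaces every ASCII uppercase letter with "!" followed by the
// corresponding lowercase letter.
func caseEncode(s string) string {
	var b strings.Builder
	for _, r := range s {
		if 'A' <= r && r <= 'Z' {
			b.WriteByte('!')
			b.WriteRune(r + 'a' - 'A')
		} else {
			b.WriteRune(r)
		}
	}
	return b.String()
}

func main() {
	fmt.Println(caseEncode("github.com/Azure/azure-sdk-for-go"))
	fmt.Println(caseEncode("v1.0.0-RC1"))
}
```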

  • The first interface returns the list of all versions.

  • The second interface returns information about the specified version.

  • The third interface returns the go.mod file of the specified version of the module.

  • The fourth interface returns the latest version of the module. It is optional: GOPROXY still works if this interface is not implemented.

  • The last interface downloads the zip file of the specified version of the module.
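As a sketch, the five request URLs can be assembled from base, module, and version like this. The URL layout follows the public GOPROXY protocol; the function name proxyEndpoints and the version v0.3.0 are illustrative assumptions, and module and version are expected to be case-encoded already.

```go
package main

import "fmt"

// endpoints holds the request URLs of the five GOPROXY interfaces.
type endpoints struct{ List, Info, Mod, Latest, Zip string }

// proxyEndpoints builds the URLs for one module version.
// module and version must already be case-encoded.
func proxyEndpoints(base, module, version string) endpoints {
	prefix := base + "/" + module
	return endpoints{
		List:   prefix + "/@v/list",                // all known versions
		Info:   prefix + "/@v/" + version + ".info", // metadata of one version
		Mod:    prefix + "/@v/" + version + ".mod",  // go.mod of one version
		Latest: prefix + "/@latest",                 // latest version (optional)
		Zip:    prefix + "/@v/" + version + ".zip",  // source archive of one version
	}
}

func main() {
	e := proxyEndpoints("https://proxy.golang.org", "golang.org/x/text", "v0.3.0")
	for _, u := range []string{e.List, e.Info, e.Mod, e.Latest, e.Zip} {
		fmt.Println(u)
	}
}
```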

In this example, proxy.golang.org is the address of the proxy server, golang.org/x/text is the name of the module to fetch, @v is a fixed string, and list is the list interface being invoked. The interface returns all versions of the text package. After obtaining all versions, Go can infer the latest version of the module from semantic versioning.

The info and latest interfaces return content in the same format. Version is a fixed field holding the version number as a string. Time is an optional string in RFC 3339 format giving the commit time of the version.
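A short Go sketch of decoding such a response is shown below. The JSON body is a hypothetical sample written for this illustration, not a captured response, and versionInfo and parseInfo are names chosen here.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// versionInfo mirrors the two fields described above.
type versionInfo struct {
	Version string    // version number
	Time    time.Time // commit time in RFC 3339 format; optional
}

// parseInfo decodes the JSON body returned by the info or latest interface.
func parseInfo(body string) (versionInfo, error) {
	var info versionInfo
	err := json.Unmarshal([]byte(body), &info)
	return info, err
}

func main() {
	// Hypothetical sample body.
	info, err := parseInfo(`{"Version":"v0.3.0","Time":"2020-01-02T03:04:05Z"}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(info.Version, info.Time.Format(time.RFC3339))
}
```

When the Time field is missing, the decoded value is simply the zero time.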

Finally, the mod and zip interfaces. The mod interface returns the go.mod file of the specified version. In the example, we fetch the go.mod file of the latest version, which shows that the text package depends only on the tools module. The zip interface returns the zip file of the specified version of the module, in which all source files of that version are packed. This is the interface through which go get finally downloads that version of the module.

As mentioned above, fetching source code through GOPROXY is faster than through a VCS. A zip download contains only the files of the requested version, without any history, whereas cloning the repository through a VCS such as Git fetches the entire version history. The file obtained through the GOPROXY zip interface is therefore smaller and downloads faster. Note that GOPROXY limits both the size of a module zip file and the total uncompressed size of its files to 500 MiB, and limits the go.mod file and the LICENSE file to 16 MiB each.

Module verification

Go 1.13 introduced the module checksum verification mechanism. By default, after a module is downloaded, Go verifies that its hash matches the one recorded online (default: sum.golang.org; in China: sum.golang.google.cn).

The verification process is controlled by the environment variables GOSUMDB and GONOSUMDB. GOSUMDB specifies the address of the online database to use. Because the default sum.golang.org cannot be accessed in China, the example configures the mirror for China run by Google. It can also be set to off to disable verification, in which case the hashes of downloaded modules are not checked at all; I do not recommend this. GONOSUMDB lists modules that should not be verified, such as private modules, which cannot be verified. GONOSUMDB works by prefix matching: with gitlab.com configured as in the example, Go skips the checksum check for every package whose path starts with gitlab.com.

Here’s a look at common variables:

  • GONOPROXY works by prefix matching. With gitlab.com specified, as in the example, no code on gitlab.com is fetched from the GOPROXY server; it is pulled directly from the source code server through the traditional VCS.

  • GONOSUMDB lets modules whose prefix matches skip the checksum check.

  • GOPRIVATE combines the two variables above: configuring GOPRIVATE is equivalent to configuring both GONOPROXY and GONOSUMDB.

  • GOVCS, added in Go 1.16, specifies which VCS may be used for which modules.
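The prefix matching used by GONOPROXY, GONOSUMDB, and GOPRIVATE can be sketched in Go as follows. This is a simplified illustration under the assumption that each pattern is a glob compared against the leading path elements of the module path; matchesAny is a hypothetical helper, not the function the go command uses.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// matchesAny reports whether modulePath matches one of the comma-separated
// patterns. A pattern with N path elements is compared with the first N
// elements of the module path.
func matchesAny(patterns, modulePath string) bool {
	elems := strings.Split(modulePath, "/")
	for _, glob := range strings.Split(patterns, ",") {
		if glob == "" {
			continue
		}
		n := strings.Count(glob, "/") + 1
		if len(elems) < n {
			continue // module path is shorter than the pattern
		}
		prefix := strings.Join(elems[:n], "/")
		if ok, _ := path.Match(glob, prefix); ok {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(matchesAny("gitlab.com", "gitlab.com/lyp256/pkg"))
	fmt.Println(matchesAny("gitlab.com", "github.com/lyp256/pkg"))
	fmt.Println(matchesAny("*.example.com", "git.example.com/team/pkg"))
}
```

Matching works on whole path elements, so the pattern gitlab.com does not match a host such as gitlab.community.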

Youpaiyun business practice

Use of private packages

Here is how private modules are used. Companies commonly run a self-hosted GitLab service, and GitLab itself can answer the HTTP requests sent by go get. When a package is fetched with go get, the client sends an HTTP request to the GitLab server, which returns a response containing the meta tag. The tag tells the client that the module's source code is to be fetched with Git over HTTP; GitLab uses HTTPS by default. With this response the client can use Git to pull the module's source code correctly. After the download there is still the checksum check. You can add gitlab.com to the GOPRIVATE variable to tell Go that modules under gitlab.com are private and that the checksum check should be skipped.

In Youpaiyun's internal practice the situation is somewhat different. All HTTP services used inside Youpaiyun require Google authentication. Every request to the internal GitLab server is first checked for a Google authorization header, and a request without one is rejected with a 403 error. As a result, plain HTTP requests are blocked before they reach the GitLab server. The HTTP requests sent by Go are rejected as well, so Go cannot obtain the module information. Although the source code on the server could be cloned directly over SSH, go get never learns this, and the fetch fails. This is why the gray line in the figure indicates a request that is never actually sent.

So how is this solved? The approach is to use an additional HTTP service to handle go get's HTTP requests. This service performs no verification, so the request passes through and go get obtains the meta information it needs. The meta tag must specify the SSH protocol: the GitLab HTTP service sits behind a second layer of authentication that no unauthenticated request can pass, so only SSH can be used. Permissions are then checked through the SSH key pair, which allows authorization without any prompt. The HTTP service that guides go get does not handle authorization at all; all authorization, including that for private modules, is left to GitLab.

Guiding the request

How does guiding go get with an additional service work? It requires a change in how module packages are named, based on GitLab's naming rules.

gitlab.com/lyp256/pkg

Domain name, repository name

A complete module path consists of several parts: gitlab.com is the domain name, lyp256 is the owner, and pkg is the project name of the module. For a single GitLab platform the last two parts, which identify the module owner and the project, are the important ones. The domain name is fixed and can be ignored.

Based on these rules I implemented a simple little service to handle go get's HTTP requests. The code is as follows:
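The original listing is not reproduced in this text. What follows is only a minimal sketch of what such a service might look like, not the author's code. The module domain go.example.com, the SSH endpoint gitlab.example.com on port 22, and all function names are hypothetical placeholders.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"regexp"
	"strings"
)

// Hypothetical placeholders: replace with the real module domain and GitLab SSH endpoint.
const (
	moduleHost = "go.example.com"
	sshBase    = "ssh://git@gitlab.example.com:22"
)

// validName restricts owner and project names to characters that are safe to embed in HTML.
var validName = regexp.MustCompile(`^[A-Za-z0-9_.-]+$`)

// goImportMeta builds the go-import meta tag for a request path such as
// "/lyp256/pkg/sub". Only the first two elements (owner, project) are used.
func goImportMeta(urlPath string) (string, bool) {
	parts := strings.Split(strings.Trim(urlPath, "/"), "/")
	if len(parts) < 2 || !validName.MatchString(parts[0]) || !validName.MatchString(parts[1]) {
		return "", false
	}
	owner, project := parts[0], parts[1]
	root := moduleHost + "/" + owner + "/" + project
	repo := sshBase + "/" + owner + "/" + project + ".git"
	return fmt.Sprintf(`<meta name="go-import" content="%s git %s">`, root, repo), true
}

// handler answers go get's discovery request and leaves authorization to GitLab.
func handler(w http.ResponseWriter, r *http.Request) {
	meta, ok := goImportMeta(r.URL.Path)
	if !ok || r.URL.Query().Get("go-get") != "1" {
		http.NotFound(w, r)
		return
	}
	fmt.Fprintf(w, "<!DOCTYPE html><html><head>%s</head><body></body></html>", meta)
}

func main() {
	// Exercise the handler in-process. A real deployment would register it
	// with http.Handle and serve it over HTTPS instead.
	req := httptest.NewRequest("GET", "/lyp256/pkg?go-get=1", nil)
	rec := httptest.NewRecorder()
	handler(rec, req)
	fmt.Println(rec.Code, rec.Body.String())
}
```

The service only maps a module path to an SSH repository address; whether the caller may actually clone the repository is still decided by GitLab through the SSH key.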

GitLab CI practice

GitLab CI starts from an empty container; the example uses the golang Alpine image. This image contains nothing but Go, so we need to install dependencies and inject the SSH authentication material. The script section defines the following steps:

Step 1: Use mkdir -p to create a directory under the cache. This cache lives on our CI machine: a space on the physical disk is mounted to store data, and it is used to cache Go modules and reduce module downloads.

Step 2: Install the basic environment and tool packages. The example installs git, g++, and openssh. g++ is a dependency needed for Go compilation, and openssh is the SSH toolchain that Git uses.

Step 3: Set up the SSH keys. There are two parts: trusting the GitLab server key and importing the private key used for authentication. The GitLab server key is obtained with ssh-keyscan and saved to the known_hosts file. The private key that can access the Git project is provided through the GitLab CI configuration in the environment variable DEPLOY_SSH_KEY; its content is written to the corresponding SSH private key file, which is then given the correct permissions.

Finally, configure the GOPRIVATE variable so that all modules under go.HoldCloud.com are treated as private modules that bypass both the proxy and the checksum check.

At this point all the preparation is essentially complete. The go test that follows is the normal CI test logic and can be written to fit the actual situation.

Conclusion

  • Go will remove support for GOPATH in version 1.17. It is recommended to migrate to Go modules as soon as possible.

  • Go's checksum check can detect code changes and improves security and availability. It is recommended not to turn it off.

  • It is recommended to retain vendor to prevent dependent modules from being deleted.
