Go introduces go.sum alongside go.mod to secure dependencies. The go.sum file records the hash of each of the project's dependencies so that any modification can be detected.
Looking at the go.sum file of a real project shows that, for each dependency, go.sum records not only the hash of its go.mod file but also the hash of the whole module.
The main purpose of recording the go.mod hash separately is that sub-dependencies can be discovered from a verified go.mod without first downloading the entire module, so that multiple dependencies can be downloaded in parallel.
At first, I thought the hash recorded in go.sum was simply the SHA256 of the content, Base64-encoded. But the Base64 value I computed by hand never matched the one recorded in go.sum. So I looked at the Go source (the code under /usr/local/go/src/cmd/go references the package /usr/local/go/src/cmd/vendor/golang.org/x/mod/sumdb/dirhash, which is the core of the algorithm behind go.sum) and found that Golang's hash of files and of entire modules is not a plain SHA256 followed by Base64 encoding.
Case analysis
cloud.google.com/go/firestore v1.1.0/go.mod h1:ulACoGHTpvq5r8rxGJ4ddJZBZqakUQqClKRT5SZwBmk=
#The general format is
<module> <version>/go.mod h1:<sha256hash+base64>
#The first field is the module path of the dependency
#The second field is the version, optionally followed by the specific file (/go.mod)
#The third field is the SHA256-based hash of the content, Base64-encoded
#h1 stands for SHA256 + Base64
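To make the three fields concrete, here is a minimal sketch in Go that splits one go.sum line into its parts. The `sumEntry` type and the `parseSumLine` function are names I made up for this illustration; they are not part of the Go toolchain, and the sketch only handles the `h1:` form described above.

```go
package main

import (
	"fmt"
	"strings"
)

// sumEntry is a hypothetical type for this sketch, not a Go toolchain type.
type sumEntry struct {
	Module  string // module path, e.g. cloud.google.com/go/firestore
	Version string // version with any "/go.mod" suffix removed
	GoMod   bool   // true when the hash covers only the go.mod file
	Hash    string // full hash field, including the "h1:" prefix
}

// parseSumLine splits one go.sum line into its three space-separated fields.
func parseSumLine(line string) (sumEntry, error) {
	fields := strings.Fields(line)
	if len(fields) != 3 {
		return sumEntry{}, fmt.Errorf("want 3 fields, got %d", len(fields))
	}
	if !strings.HasPrefix(fields[2], "h1:") {
		return sumEntry{}, fmt.Errorf("unsupported hash field %q", fields[2])
	}
	e := sumEntry{Module: fields[0], Version: fields[1], Hash: fields[2]}
	if strings.HasSuffix(e.Version, "/go.mod") {
		e.Version = strings.TrimSuffix(e.Version, "/go.mod")
		e.GoMod = true
	}
	return e, nil
}

func main() {
	line := "cloud.google.com/go/firestore v1.1.0/go.mod h1:ulACoGHTpvq5r8rxGJ4ddJZBZqakUQqClKRT5SZwBmk="
	e, err := parseSumLine(line)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", e)
}
```

A line without the `/go.mod` suffix parses the same way, with `GoMod` left false, which corresponds to the hash of the whole module.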
Special hash
Special hash of go.mod
#Input: path of the go.mod file
#Steps:
#1. Open the go.mod file, read its content, and compute the SHA256 hash to obtain sha256hash (as a hex string)
#2. Construct a new string base64IN = "sha256hash  go.mod\n"; the separator is two spaces and the string must end with a newline
#3. Compute the SHA256 hash of base64IN and Base64-encode the raw digest bytes to obtain base64encode
#4. Concatenate the strings to get h1:base64encode, which is the value recorded in go.sum
The hash of go.mod can be reproduced with shell commands, but the hash of the whole module cannot. The following shell commands simulate the process above:
$ sha256sum go.mod
5a93925e1efdeecd8b5755d089fdba6dfb3c04eb85447e8dec8b31cdb44203ab  go.mod #sha256hash
$ vim base64in.txt
5a93925e1efdeecd8b5755d089fdba6dfb3c04eb85447e8dec8b31cdb44203ab  go.mod #base64in string; the trailing newline must not be omitted, otherwise the result differs from Golang's
$ sha256sum base64in.txt | cut -d ' ' -f 1 | xxd -r -p | base64 #cut keeps only the hex digest, so the file name cannot be mistaken for hex digits
+DbmgtsW3Ksw3QccfHlswRDLj07woKf4ku0C0xYA7u0= #base64encode
#After string concatenation the final result is h1:+DbmgtsW3Ksw3QccfHlswRDLj07woKf4ku0C0xYA7u0=
#When writing to go.sum, the full line is <module> <version>/go.mod h1:+DbmgtsW3Ksw3QccfHlswRDLj07woKf4ku0C0xYA7u0=
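The same go.mod steps can also be sketched in Go using only the standard library. `hashGoMod` is a hypothetical helper name of mine, and the sample go.mod content in `main` is invented for the demonstration; this is a sketch of the steps described here, not a copy of the code in dirhash.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
)

// hashGoMod (hypothetical helper) applies the go.mod steps to raw file content.
func hashGoMod(content []byte) string {
	// Step 1: SHA256 of the file content.
	fileSum := sha256.Sum256(content)
	// Step 2: hex digest, two spaces, the file name, and a trailing newline.
	line := fmt.Sprintf("%x  go.mod\n", fileSum[:])
	// Step 3: SHA256 of that line, then Base64 of the raw digest bytes.
	total := sha256.Sum256([]byte(line))
	// Step 4: prepend the h1: marker.
	return "h1:" + base64.StdEncoding.EncodeToString(total[:])
}

func main() {
	// Invented sample content; for a real file, read it with os.ReadFile("go.mod").
	sample := []byte("module example.com/demo\n\ngo 1.16\n")
	fmt.Println(hashGoMod(sample))
}
```

The key detail is that Base64 is applied to the 32 raw digest bytes of the second SHA256, not to the hex text, which is why the result is always 44 Base64 characters after `h1:`.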
Special hash for the entire module
When computing the hash of the entire module, the packed ZIP archive is not hashed directly. Instead, the files it contains are traversed and one combined hash is computed over them. This avoids inconsistent hashes caused by byte-level differences that can arise when the same files are packed into a ZIP archive.
#Input: the module directory and the module prefix (the module path joined with the version, for example github.com/spf13/cobra@v1.1.3)
#Steps:
#1. Traverse all files in the module
#Only files are considered, not directories
#Ignore all files in the .git directory
#Join the prefix with each file's relative path
#For example, with the prefix github.com/spf13/cobra@v1.1.3, the file command.go in the module becomes github.com/spf13/cobra@v1.1.3/command.go
#Store the results of the traversal in a list for the later hash calculation
#2. Sort the list from the previous step (sorting guarantees a consistent hash result)
#3. Iterate over the sorted list: for each file, compute its SHA256 hash and build the line "sha256hash  github.com/spf13/cobra@v1.1.3/command.go\n" (two spaces, trailing newline), then append the next file's line after it, and so on, ending with one long string that holds the hash lines of all files
#4. Compute the SHA256 hash of the long string and Base64-encode the raw digest to get base64encode
#5. Write go.sum as follows:
github.com/spf13/cobra v1.1.3 h1:xghbfqPkxzxP3C/f3n5DdpAbdKLj4ZE4BWQI362l53M=
github.com/spf13/cobra v1.1.3/go.mod h1:pGADOWyqRD/YMrPZigI/zbliZ2wVD/23d+is3pSWzOo=
#The first line is the hash of the entire module
#The second line is the hash of go.mod
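The whole-module steps can be sketched in Go as well. This is my own minimal version using only the standard library, not the code from dirhash. `hashFiles`, `readModuleDir`, and the `modhash` command name are hypothetical names. I am also assuming that the prefix passed in is the module path joined with `@version`, and I skip `.git` as described in the steps.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"sort"
)

// hashFiles (hypothetical) hashes files keyed by slash-separated relative path.
func hashFiles(prefix string, files map[string][]byte) string {
	rels := make([]string, 0, len(files))
	for rel := range files {
		rels = append(rels, rel)
	}
	// Every name shares the same prefix, so sorting the relative paths
	// gives the same order as sorting the full names.
	sort.Strings(rels)
	total := sha256.New()
	for _, rel := range rels {
		sum := sha256.Sum256(files[rel])
		// One line per file: hex digest, two spaces, full name, newline.
		fmt.Fprintf(total, "%x  %s/%s\n", sum[:], prefix, rel)
	}
	return "h1:" + base64.StdEncoding.EncodeToString(total.Sum(nil))
}

// readModuleDir (hypothetical) loads every regular file under dir, skipping .git.
func readModuleDir(dir string) (map[string][]byte, error) {
	files := make(map[string][]byte)
	err := filepath.WalkDir(dir, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() {
			if d.Name() == ".git" {
				return filepath.SkipDir
			}
			return nil
		}
		rel, err := filepath.Rel(dir, path)
		if err != nil {
			return err
		}
		data, err := os.ReadFile(path)
		if err != nil {
			return err
		}
		files[filepath.ToSlash(rel)] = data
		return nil
	})
	return files, err
}

func main() {
	if len(os.Args) != 3 {
		fmt.Println("usage: modhash <module-dir> <module@version>")
		return
	}
	files, err := readModuleDir(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(hashFiles(os.Args[2], files))
}
```

Reading every file into memory keeps the sketch short; for large modules, streaming each file through its own SHA256 hasher would be the more economical choice.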
The process above can be found in the Golang source code, and a developer on GitHub has reimplemented this particular hash: hub.fastgit.org/vikyd/go-ch…