In the last article, we talked about Golang’s native resource embedding solution. In this article, we’ll take a look at one of the top open source implementations: Go-BinData.
I'm covering this solution first because, although it is not currently the most popular, its overall influence over time has been relatively large. For historical reasons it also has the most hard forks, which makes its situation the most complex in both implementation and use.
The origins of open source projects
Let’s talk about the origins of these open source projects. Currently, four go-bindata projects are in common use:
- (1500+ stars) github.com/go-bindata/…
- (840+ stars) github.com/elazarl/go-…
- (630+ stars) github.com/jteeuwen/go…
- (280+ stars) github.com/kevinburke/…
The common origin of these projects is the jteeuwen/go-bindata project, whose first line of code was committed ten years ago, in June 2011.
However, on February 7, 2018, the author deleted all the repositories he had created for some reason, and the account was subsequently deactivated. A well-meaning user then took to Twitter to warn other users.
This, of course, set off the same kind of awful chain reaction as the recent deletion by the faker.js author and the earlier removal of left-pad from npm, leaving a lot of software unable to build properly.
In some legacy projects, we can clearly see when this happened, such as Twitter’s fork of Go-BinData.
On February 8, other members of the open source community applied to take over the account name and restored to it the code as it was before the deletion. To make clear that the repository exists only for recovery, these kind people set it to read-only (archived) and added a statement saying so.
In the years since, the repository has gone without its original owner’s maintenance. But Go and its community ecosystem are still thriving, and the need for static resource embedding remains strong, which led to the other three open source repositories mentioned above, as well as lesser-known ones that I haven’t mentioned yet.
Differences between versions of software
Having covered the origins of each open source project, let’s take a look at some of the differences between the repositories.
Of these repositories, go-bindata/go-bindata is the best-known version. elazarl/go-bindata-assetfs provides the FS wrapper for net/http that the original software lacked. Remember the FS interface implementation mentioned in the last article? That is mainly what this project does. In addition, with the explosion of front-end technology over the past few years, especially SPA-style front-end applications, elazarl/go-bindata-assetfs, which focuses on single-file distribution of SPAs, has found plenty of use. So if you have similar needs, you can still use this repository to package your front-end SPA project into an executable for quick distribution.
Of course, development in the open source community often cross-pollinates: not long after elazarl/go-bindata-assetfs provided the FS wrapper, go-bindata/go-bindata added an -fs flag that lets embedded resources work with net/http. So if you want minimal dependencies and want embedded resources to work with net/http, consider using this repository alone.
In addition, developers who care a great deal about code quality created a new fork, kevinburke/go-bindata. Its code is much more robust than both the original and go-bindata/go-bindata, it fixes some of the issues the community reported to go-bindata/go-bindata, and it adds some features the community had been asking for. However, like the original, it does not include the FS wrapper required to work with net/http. So if you want to serve static resources handled by this program through net/http, you need to pair it with elazarl/go-bindata-assetfs or write a simple FS wrapper yourself.
Differences between these software and official implementations
Go-bindata has some additional features compared to the official implementation:
- Allows users to read static resources using two different modes (using reflection and unsafe.Pointer, or using Go program variables for data interaction)
- Relatively low resource storage footprint in some scenarios (based on GZip compression at build time)
- The ability to dynamically adjust or preprocess a reference path to a static resource
- More open resource import mode, supporting resource import from parent directory (official implementation only supports current directory)
Of course, compared to the official implementation in the previous article, go-bindata’s approach is “dirty”: it packs static resources into a Go source file, and before the program can run, we need to perform a resource generation step. The official implementation, by contrast, is “additive-free and pollution-free”: a single go run or go build command takes care of everything.
Let’s talk about basic usage and performance of Go-BinData.
Basic usage: go-bindata with default settings
As in the previous article, let’s finish writing the basic functionality before looking at the performance differences.
mkdir basic-go-bindata && cd basic-go-bindata
go mod init solution-embed
There is a small detail here: because the latest 3.1.3 version of go-bindata/go-bindata has not been formally released, installing a build that includes the latest fixes has to be done in the following way:
# go get -u -v github.com/go-bindata/go-bindata@latest
go get: added github.com/go-bindata/go-bindata v3.1.2+incompatible
In the previous article, to use the official go:embed feature for resource embedding, our implementation looked something like this:
package main

import (
	"embed"
	"log"
	"net/http"
)

// Embed the whole assets directory into the binary at build time.
//go:embed assets
var assets embed.FS

func main() {
	mux := http.NewServeMux()
	// Serve the embedded files from the root path.
	mux.Handle("/", http.FileServer(http.FS(assets)))
	err := http.ListenAndServe(":8080", mux)
	if err != nil {
		log.Fatal(err)
	}
}
With Go-bindata, since we need to use an extra generated program file, we need to change the program to something like this and add the go:generate directive:
package main

import (
	"log"
	"net/http"

	"solution-embed/pkg/assets"
)

// Generate pkg/assets/assets.go from ./assets before building.
//go:generate go-bindata -fs -o=pkg/assets/assets.go -pkg=assets ./assets

func main() {
	mux := http.NewServeMux()
	// AssetFile comes from the generated file (enabled by the -fs flag).
	mux.Handle("/", http.FileServer(assets.AssetFile()))
	err := http.ListenAndServe(":8080", mux)
	if err != nil {
		log.Fatal(err)
	}
}
Here we use a go:generate directive to declare the commands that need to be executed before the program runs. Besides programs available globally in the environment, it can also run executables installed via go get. If you have used npx (npm) in the Node.js ecosystem, this will feel familiar; unlike npx, however, the directive lives in the source files themselves, so different programs can declare different commands, which makes it more context-specific.
After generation, a program file appears at pkg/assets/assets.go in the project, containing the resources we need. Because the go-bindata implementation encodes data with escape sequences such as \x00, the generated code swells to four to five times the size of the original static resources, but this does not affect the size of the compiled binary (which is consistent with the official implementation).
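To see where roughly four-fold growth of the source file can come from, here is a minimal sketch, assuming every byte is written as a `\xNN` escape inside a string literal. The `escapeBytes` helper is hypothetical and only illustrates the encoding; it is not go-bindata's actual writer.

```go
package main

import (
	"fmt"
	"strings"
)

// escapeBytes renders data the way a byte-per-escape string literal would
// look in generated source code: every input byte becomes four characters.
func escapeBytes(data []byte) string {
	var b strings.Builder
	b.Grow(len(data) * 4)
	for _, c := range data {
		fmt.Fprintf(&b, `\x%02x`, c)
	}
	return b.String()
}

func main() {
	raw := []byte{0x00, 0x1f, 0x8b, 0xff}
	lit := escapeBytes(raw)
	fmt.Println(lit)
	fmt.Printf("%d bytes -> %d characters of source\n", len(raw), len(lit))
}
```

The compiler turns each escape back into a single byte, which is why the growth shows up in the generated source file but not in the compiled binary.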
du -hs *
17M assets
4.0K go.mod
4.0K go.sum
4.0K main.go
83M pkg
Whether we use go run main.go or go build main.go, once the program is running, visit http://localhost:8080/assets/example.txt to verify that it works properly.
The code is available at github.com/soulteary/a… if you are interested.
In addition, the official implementation does not support using resources outside the current program directory (you need something like go generate with cp -r ../originPath ./destPath as a workaround), whereas go-bindata can generate directly from external resources. Before serving them, use the -prefix parameter to adjust the reference paths in the generated resource file.
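The effect of stripping a prefix from resource paths can be sketched as follows. This is only an illustration of the idea, assuming the prefix is removed from the front of each source path; the `assetName` helper and the example paths are hypothetical and do not come from go-bindata itself.

```go
package main

import (
	"fmt"
	"strings"
)

// assetName derives the lookup name for a source file by removing a
// leading prefix and any slash left over at the front.
func assetName(sourcePath, prefix string) string {
	name := strings.TrimPrefix(sourcePath, prefix)
	return strings.TrimPrefix(name, "/")
}

func main() {
	// A resource from a parent directory ends up with a clean name.
	fmt.Println(assetName("../shared/assets/example.txt", "../shared"))
}
```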
Test preparation: go-bindata with default settings
The test code is similar to the one in the previous section, and can be used with a few adjustments:
package main

import (
	"log"
	"net/http"
	"net/http/pprof"
	"runtime"

	"solution-embed/pkg/assets"
)

//go:generate go-bindata -fs -o=pkg/assets/assets.go -pkg=assets ./assets

// registerRoute serves the generated assets from the root path.
func registerRoute() *http.ServeMux {
	mux := http.NewServeMux()
	mux.Handle("/", http.FileServer(assets.AssetFile()))
	return mux
}

// enableProf turns on mutex and block profiling and exposes the pprof handlers.
func enableProf(mux *http.ServeMux) {
	runtime.GOMAXPROCS(2)
	runtime.SetMutexProfileFraction(1)
	runtime.SetBlockProfileRate(1)
	mux.HandleFunc("/debug/pprof/", pprof.Index)
	mux.HandleFunc("/debug/pprof/cmdline", pprof.Cmdline)
	mux.HandleFunc("/debug/pprof/profile", pprof.Profile)
	mux.HandleFunc("/debug/pprof/symbol", pprof.Symbol)
	mux.HandleFunc("/debug/pprof/trace", pprof.Trace)
}

func main() {
	mux := registerRoute()
	enableProf(mux)
	err := http.ListenAndServe(":8080", mux)
	if err != nil {
		log.Fatal(err)
	}
}
Performance test: go-bindata with default settings
Except for the main program and the test program, the rest of the project can reuse the code from the previous article. After executing the benchmark.sh script, you get the same kinds of performance sampling data as in the previous article.
Back in the previous article, none of our test samples took long to execute:
=== RUN TestSmallFileRepeatRequest
--- PASS: TestSmallFileRepeatRequest (0.04s)
PASS
ok solution-embed 0.813s
=== RUN TestLargeFileRepeatRequest
--- PASS: TestLargeFileRepeatRequest (1.14s)
PASS
ok solution-embed 1.331s
=== RUN TestStaticRoute
--- PASS: TestStaticRoute (0.00s)
=== RUN TestSmallFileRepeatRequest
--- PASS: TestSmallFileRepeatRequest (0.04s)
=== RUN TestLargeFileRepeatRequest
--- PASS: TestLargeFileRepeatRequest (1.12s)
PASS
ok solution-embed 1.509s
After running the sampling script against go-bindata in this article, you can see that the overall test time is much longer:
=== RUN TestSmallFileRepeatRequest
--- PASS: TestSmallFileRepeatRequest (1.47s)
PASS
ok solution-embed 2.260s
=== RUN TestLargeFileRepeatRequest
--- PASS: TestLargeFileRepeatRequest (29.43s)
PASS
ok solution-embed 29.808s
I uploaded the code used in this section to github.com/soulteary/a… in case you need it.
Performance of embedded large files
Here we again use go tool pprof -http=:8090 cpu-large.out to show the resource consumption of the program’s call process (because there are many calls, we only look at the most directly relevant part). Open http://localhost:8090/ui/ in a browser and you will see a call graph similar to the one below:
In the official go:embed implementation, the embed functions consumed only 0.07s and io.Copy only 0.88s. With go-bindata, embed processing took 12.99 ~ 13.08s and io.Copy took 26.06 ~ 27.03s. The cost of the former increased by more than 180 times, and the latter by nearly 30 times.
go tool pprof -http=:8090 mem-large.out
As you can see, in both the complexity of the call chain and the amount of resources used, go-bindata’s consumption is quite extreme. After the same one hundred quick calls, a total of 19180 MB of memory has been used, which is 3 times that of the official implementation and more than 1000 times the size of the original resource. On average, each request costs about 10 times the size of the original file, which is very uneconomical.
Therefore, it is not difficult to draw a simple conclusion: do not embed excessively large resources in Go-bindata, which will cause a serious waste of resources. If you have such a need, you can use the official solution mentioned in the previous article to solve the problem.
Resource usage of embedded small files
After looking at large files, let’s also look at resource usage for small files. After executing go tool pprof -http=:8090 cpu-small.out, you can see a pretty spectacular call graph. (Given how simple our code is, this call complexity is rather absurd.)
In the official implementation, no embed-related function calls appear among the top calls. In go-bindata, a large number of data reading and memory copy operations consume 0.88 ~ 0.95s. In addition, GZip decompression of the resources consumes 0.85s in total.
Note, however, that this test is based on thousands of small file fetches, so the average time spent per fetch is actually acceptable. Of course, if you have similar requirements, the native implementation is more efficient.
Next, look at memory usage. Compared to the official implementation, go-bindata consumes about four times as much, and compared to the original file, about six times as much. If there are too many small files or too many requests, go-bindata is unlikely to be the best solution. But for temporary needs or a handful of small files, occasional use is fine.
Throughput tests with wrk
As in the previous article, we first execute go build main.go to get the built program, and then execute ./main to start the service and test the throughput of small files:
# wrk -t16 -c 100 -d 30s http://localhost:8080/assets/vue.min.js
Running 30s test @ http://localhost:8080/assets/vue.min.js
  16 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    89.61ms   73.12ms 701.06ms   74.80%
    Req/Sec    74.17     25.40   210.00     68.65%
  35550 requests in 30.05s, 3.12GB read
Requests/sec:   1182.98
Transfer/sec:    106.43MB
Compared with the official implementation in the previous article, throughput has shrunk to nearly one twentieth. Still, the program sustains more than 1000 requests per second, which is no problem for small projects.
Let’s look at large file throughput:
# wrk -t16 -c 100 -d 30s http://localhost:8080/assets/chip.jpg
Running 30s test @ http://localhost:8080/assets/chip.jpg
  16 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     0.00us    0.00us   0.00us     nan%
    Req/Sec     1.66      2.68    10.00     91.26%
  106 requests in 30.10s, 1.81GB read
  Socket errors: connect 0, read 0, write 0, timeout 106
Requests/sec:      3.52
Transfer/sec:     61.46MB
In contrast, the official implementation can handle nearly 300 requests per second. With go-bindata, only about 3.5 requests per second are handled, which further confirms the earlier conclusion that go-bindata is not recommended for large files.
Performance test: go-bindata with GZip compression disabled and memory usage reduced
By default, go-bindata enables GZip compression (using Go’s default compression level). Will performance improve if we disable GZip? And if we also turn on the memory footprint reduction based on reflection and unsafe.Pointer, will the program’s performance improve further?
To turn off GZip and enable memory reduction, simply add the following switches to the go:generate directive:
-nocompress -nomemcopy
After re-running go generate, we check the size of the generated file and see that it is even smaller than with GZip enabled (some resources do not suit GZip compression):
du -hs *
17M assets
4.0K benchmark.sh
4.0K go.mod
4.0K go.sum
24M main
4.0K main.go
68M pkg
After adjusting the test program accordingly, we ran benchmark.sh again and saw a qualitative change in execution time, which now comes close to the official implementation (differing by only 0.01s and 0.07s).
bash benchmark.sh
=== RUN TestSmallFileRepeatRequest
--- PASS: TestSmallFileRepeatRequest (0.05s)
PASS
ok solution-embed 1.246s
=== RUN TestLargeFileRepeatRequest
--- PASS: TestLargeFileRepeatRequest (1.19s)
PASS
ok solution-embed 1.336s
Next, let’s see what changed in the program’s call profile.
I uploaded the code for this section to github.com/soulteary/a… in case you want to experiment with it yourself.
Performance of embedded large files
Running go tool pprof -http=:8090 cpu-large.out shows that the call complexity of resource handling here is almost the same as in the official implementation. With memory footprint reduction enabled and GZip compression disabled, the program even parallelizes its work better than the official implementation in the previous article.
This explains why, although the call complexity is similar and the 0.91s spent here is more than double the official 0.42s, the overall service response time differs very little.
go tool pprof -http=:8090 mem-large.out
Compared with the previous article, with memory footprint reduction enabled, go-bindata’s memory usage is even 3MB lower than the official implementation’s. Of course, even at roughly the same consumption as the official implementation, each request still costs about 3.6 times the size of the original file.
Resource usage of embedded small files
At first glance, the test results for small files look very similar to the official implementation, so I won’t spend much space on them here. Let’s go straight to the stress test to see how much load the program can handle.
Throughput tests with wrk
As in the previous article, we first execute go build main.go to get the built program, then execute ./main to start the service, and first test the throughput of small files:
# wrk -t16 -c 100 -d 30s http://localhost:8080/assets/vue.min.js
Running 30s test @ http://localhost:8080/assets/vue.min.js
  16 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     4.22ms    2.55ms  47.38ms   70.90%
    Req/Sec     1.46k   128.35     1.84k    77.00%
  699226 requests in 30.02s, 61.43GB read
Requests/sec:  23292.03
Transfer/sec:      2.05GB
The results were amazing, with several hundred more responses per second than the official implementation. Let’s take a look at the capacity for large files:
# wrk -t16 -c 100 -d 30s http://localhost:8080/assets/chip.jpg
Running 30s test @ http://localhost:8080/assets/chip.jpg
  16 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   340.98ms  138.47ms   1.60s    81.04%
    Req/Sec    18.24      9.33    60.00     73.75%
  8478 requests in 30.10s, 141.00GB read
Requests/sec:    281.63
Transfer/sec:      4.68GB
The test results for large files were virtually indistinguishable from the official implementation, differing by only a few requests per second.
Other
Due to space limitations, customized (“homebrew”) forks of go-bindata are not covered here; interested readers can test them following the approach in this article.
In addition to the implementation mentioned above, there are actually some interesting implementations, although they are not well known:
- github.com/kataras/bin…
  - A web-oriented customization optimized for Iris; both stored data and output are processed with GZip, for several times the performance of the original.
- github.com/conku/binda…
  - A go-bindata based open source repository focusing on embedded page templates.
- github.com/wrfly/binda…
  - An optimized version that is simpler in implementation.
Finally
At this point, we can draw a simple conclusion about go-bindata. If you want to avoid or minimize the use of reflection and unsafe.Pointer, go-bindata is fine for a small number of files, none of them large.
Once the data volume grows, the official implementation is recommended. Of course, if you are comfortable with reflection and unsafe.Pointer, go-bindata can deliver performance comparable to the official go:embed implementation, along with more customization options.
This article is published under the Attribution 4.0 International (CC BY 4.0) license.
Author: Su Yang
Created: January 16, 2022
Word count: 12144 words
Reading time: 25 minutes
Link to this article: soulteary.com/2022/01/16/…