Sample code can be found at: github.com/wenjianzhan…
Common analysis indicators
- Wall Time: the elapsed (wall-clock) time the program takes to run
- CPU Time: the time the CPU spends executing the program
- Block Time: the time spent blocked, waiting
- Memory allocation
- GC count and time spent in GC
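Several of these indicators can be read directly from the Go runtime. The following is a minimal sketch, not part of the original sample code: the `Metrics` type and the `measure` helper are hypothetical names, and the sketch relies only on `time.Since` and `runtime.ReadMemStats` from the standard library.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// Metrics holds a few of the indicators listed above for one function call.
// It is a hypothetical type used only for this illustration.
type Metrics struct {
	Wall    time.Duration // elapsed wall-clock time
	Mallocs uint64        // heap allocations made during the call
	Bytes   uint64        // bytes allocated on the heap during the call
	NumGC   uint32        // garbage collections completed during the call
}

// measure runs fn once and reports how the runtime counters changed around it.
func measure(fn func()) Metrics {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	start := time.Now()
	fn()
	wall := time.Since(start)
	runtime.ReadMemStats(&after)
	return Metrics{
		Wall:    wall,
		Mallocs: after.Mallocs - before.Mallocs,
		Bytes:   after.TotalAlloc - before.TotalAlloc,
		NumGC:   after.NumGC - before.NumGC,
	}
}

// sink keeps the result alive so the compiler cannot discard the work.
var sink string

func main() {
	m := measure(func() {
		s := ""
		for i := 0; i < 1000; i++ {
			s += "x" // each concatenation builds a new string
		}
		sink = s
	})
	fmt.Printf("wall=%v mallocs=%d bytes=%d gc=%d\n", m.Wall, m.Mallocs, m.Bytes, m.NumGC)
}
```

The counters are process-wide, so other goroutines can inflate them; for precise per-function figures the profiler used in the rest of this article is the better tool.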
The sample code
structs.go
package profiling
type Request struct {
	TransactionID string `json:"transaction_id"`
	PayLoad       []int  `json:"payload"`
}

type Response struct {
	TransactionID string `json:"transaction_id"`
	Expression    string `json:"exp"`
}
optmization.go
package profiling
import (
"encoding/json"
"strconv"
"strings"
)
func createRequest() string {
	payload := make([]int, 100, 100)
	for i := 0; i < 100; i++ {
		payload[i] = i
	}
	req := Request{"demo_transaction", payload}
	v, err := json.Marshal(&req)
	if err != nil {
		panic(err)
	}
	return string(v)
}

func processRequest(reqs []string) []string {
	reps := []string{}
	for _, req := range reqs {
		reqObj := &Request{}
		json.Unmarshal([]byte(req), reqObj)
		ret := ""
		for _, e := range reqObj.PayLoad {
			ret += strconv.Itoa(e) + ","
		}
		repObj := &Response{reqObj.TransactionID, ret}
		repJson, err := json.Marshal(&repObj)
		if err != nil {
			panic(err)
		}
		reps = append(reps, string(repJson))
	}
	return reps
}
optmization_test.go
package profiling
import "testing"
func TestCreateRequest(t *testing.T) {
	str := createRequest()
	t.Log(str)
}

func TestProcessRequest(t *testing.T) {
	reqs := []string{}
	reqs = append(reqs, createRequest())
	reps := processRequest(reqs)
	t.Log(reps[0])
}

func BenchmarkProcessRequest(b *testing.B) {
	reqs := []string{}
	reqs = append(reqs, createRequest())
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		// processRequestOld is the pre-optimization version defined later in
		// this article; call processRequest instead to benchmark the improved one.
		_ = processRequestOld(reqs)
	}
	b.StopTimer()
}
The test script
$ go test -bench=.
goos: darwin
goarch: amd64
pkg: github.com/wenjianzhang/golearning/src/ch45
BenchmarkProcessRequest-4    200000    6814 ns/op
PASS
ok   github.com/wenjianzhang/golearning/src/ch45   1.745s
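In the benchmark line, the first number is how many iterations were run (`b.N`) and the second is the average time per iteration. The same figures can also be obtained in code through `testing.Benchmark`. The sketch below is only an illustration: `buildExpression` is a hypothetical stand-in for the work under test, not a function from the sample repository.

```go
package main

import (
	"fmt"
	"strconv"
	"testing"
)

// buildExpression is a hypothetical workload that joins the numbers
// 0..n-1 with commas, similar in spirit to the loop in processRequest.
func buildExpression(n int) string {
	ret := ""
	for i := 0; i < n; i++ {
		ret += strconv.Itoa(i) + ","
	}
	return ret
}

// sink keeps the result alive so the compiler cannot discard the work.
var sink string

// runBenchmark runs the workload under the standard benchmark driver
// and returns the collected result.
func runBenchmark() testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs() // also record allocations per operation
		for i := 0; i < b.N; i++ {
			sink = buildExpression(100)
		}
	})
}

func main() {
	res := runBenchmark()
	fmt.Printf("%d iterations, %d ns/op, %d allocs/op\n",
		res.N, res.NsPerOp(), res.AllocsPerOp())
}
```

On the command line, adding `-benchmem` to `go test -bench=.` prints the allocation columns as well.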
Optimization
Performance analysis
First, generate cpu.prof
$ go test -bench=. -cpuprofile=cpu.prof
The output
goos: darwin
goarch: amd64
pkg: github.com/wenjianzhang/golearning/src/ch45
BenchmarkProcessRequests-4    100000    18050 ns/op
PASS
ok   github.com/wenjianzhang/golearning/src/ch45   2.501s
Use the pprof tool to analyze the profile and enter its interactive console
$ go tool pprof cpu.prof
The output
Type: cpu
Time: Nov 17, 2019 at 12:24am (CST)
Duration: 3.17s, Total samples = 2.84s (89.66%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof)
List the functions that take the most time
(pprof) top
The output
(pprof) top -cum
Sorting result
(pprof) list processRequest
The results of the analysis
. 540ms 46: json.Unmarshal([]byte(req), reqObj)
The code above uses Go's built-in JSON serialization and deserialization (encoding/json), which is implemented with reflection and is therefore relatively slow. Next, we optimize it with easyjson. First, use the easyjson tool to generate the marshaling code for the structs
$ ls
$ easyjson -all structs.go
$ ls
This generates the structs_easyjson.go file
Code to improve
Code 1
// json.Unmarshal([]byte(req), reqObj)
reqObj.UnmarshalJSON([]byte(req))
Code 2
// repJson, err := json.Marshal(&repObj)
repJson, err := repObj.MarshalJSON()
The improved function is named processRequest, and the original version is renamed processRequestOld
optmization.go
func processRequest(reqs []string) []string {
	reps := []string{}
	for _, req := range reqs {
		reqObj := &Request{}
		reqObj.UnmarshalJSON([]byte(req))
		var buf strings.Builder
		for _, e := range reqObj.PayLoad {
			buf.WriteString(strconv.Itoa(e))
			buf.WriteString(",")
		}
		repObj := &Response{reqObj.TransactionID, buf.String()}
		repJson, err := repObj.MarshalJSON()
		if err != nil {
			panic(err)
		}
		reps = append(reps, string(repJson))
	}
	return reps
}

func processRequestOld(reqs []string) []string {
	reps := []string{}
	for _, req := range reqs {
		reqObj := &Request{}
		json.Unmarshal([]byte(req), reqObj)
		ret := ""
		for _, e := range reqObj.PayLoad {
			ret += strconv.Itoa(e) + ","
		}
		repObj := &Response{reqObj.TransactionID, ret}
		repJson, err := json.Marshal(&repObj)
		if err != nil {
			panic(err)
		}
		reps = append(reps, string(repJson))
	}
	return reps
}
After the modification, we run the test again; the result is the same as before
=== RUN   TestProcessRequest
--- PASS: TestProcessRequest (0.00s)
    optmization_test.go:14: {"transaction_id":"demo_transaction","exp":"0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,"}
PASS
Process finished with exit code 0
Let’s do the benchmark again
$ go test -bench=.
The output
goos: darwin
goarch: amd64
pkg: github.com/wenjianzhang/golearning/src/ch45
BenchmarkProcessRequests-4    100000    11818 ns/op
PASS
ok   github.com/wenjianzhang/golearning/src/ch45   1.870s
Enter the pprof console
$ go test -bench=. -cpuprofile=cpu.prof
$ go tool pprof cpu.prof
ROUTINE ======================== github.com/wenjianzhang/golearning/src/ch45.processRequest in /Users/zhangwenjian/Code/golearning/src/ch45/optmization.go
      10ms      210ms (flat, cum) 13.04% of Total
         .          .     20:
         .          .     21:func processRequest(reqs []string) []string {
         .          .     22:	reps := []string{}
         .          .     23:	for _, req := range reqs {
         .          .     24:		reqObj := &Request{}
         .       30ms     25:		reqObj.UnmarshalJSON([]byte(req))
         .          .     26:
         .          .     27:		ret := ""
         .          .     28:		for _, e := range reqObj.PayLoad {
      10ms      140ms     29:			ret += strconv.Itoa(e) + ","
         .          .     30:		}
         .          .     31:		repObj := &Response{reqObj.TransactionID, ret}
         .       30ms     32:		repJson, err := repObj.MarshalJSON()
         .          .     33:		if err != nil {
         .          .     34:			panic(err)
         .          .     35:		}
         .       10ms     36:		reps = append(reps, string(repJson))
         .          .     37:	}
         .          .     38:	return reps
         .          .     39:}
         .          .     40:
         .          .     41://func processRequest(reqs []string) []string {
(pprof)
The JSON decoding call that previously took 540ms now takes 30ms. The code can actually be optimized further…
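One place that looks worth examining is the `ret += strconv.Itoa(e) + ","` line, which accounts for 140ms of cumulative time in the listing above. Each `+=` builds a new string, while `strings.Builder` appends to a growing buffer. The sketch below compares the two approaches; `joinConcat` and `joinBuilder` are hypothetical helper names, and the allocation counts it prints will vary by Go version.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"testing"
)

// joinConcat mirrors the original loop: every += copies the
// accumulated string into a newly allocated one.
func joinConcat(payload []int) string {
	ret := ""
	for _, e := range payload {
		ret += strconv.Itoa(e) + ","
	}
	return ret
}

// joinBuilder produces the same text, but appends into a single
// buffer that grows as needed.
func joinBuilder(payload []int) string {
	var buf strings.Builder
	for _, e := range payload {
		buf.WriteString(strconv.Itoa(e))
		buf.WriteByte(',')
	}
	return buf.String()
}

// sink keeps results alive so the compiler cannot discard the work.
var sink string

func main() {
	payload := make([]int, 100)
	for i := range payload {
		payload[i] = i
	}

	fmt.Println("same output:", joinConcat(payload) == joinBuilder(payload))

	// AllocsPerRun reports the average number of allocations per call.
	concatAllocs := testing.AllocsPerRun(100, func() { sink = joinConcat(payload) })
	builderAllocs := testing.AllocsPerRun(100, func() { sink = joinBuilder(payload) })
	fmt.Printf("allocs per call: concat=%.0f builder=%.0f\n", concatAllocs, builderAllocs)
}
```

Re-running the benchmark and the `list processRequest` command after such a change is the way to confirm whether it actually helps.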