
This article continues from the earlier piece on Go memory analysis (benchmarking and profile generation).

Pprof tools

With the profile file and the test binary in place, we can use the pprof tool to analyze the profile.

go tool pprof -alloc_space memcpu.test mem.out 

Using the -alloc_space option instead of the default -inuse_space option shows where every allocation occurred, whether or not that memory was still in use when the profile was taken. To view the algOne function, type list algOne at the pprof prompt.

(pprof) list algOne
Total: 335.03MB
ROUTINE ======================== ... /memcpu.algOne in code/go/src/... /memcpu/stream.go
 335.03MB   335.03MB (flat, cum)   100% of Total
        .          .     78:
        .          .     79:// algOne is one way to solve the problem.
        .          .     80:func algOne(data []byte, find []byte, repl []byte, output *bytes.Buffer) {
        .          .     81:
        .          .     82: // Use a bytes Buffer to provide a stream to process.
 318.53MB   318.53MB     83: input := bytes.NewBuffer(data)
        .          .     84:
        .          .     85: // The number of bytes we are looking for.
        .          .     86: size := len(find)
        .          .     87:
        .          .     88: // Declare the buffers we need to process the stream.
  16.50MB    16.50MB     89: buf := make([]byte, size)
        .          .     90: end := size - 1
        .          .     91:
        .          .     92: // Read in an initial number of bytes we need to get started.
        .          .     93: if n, err := io.ReadFull(input, buf[:end]); err != nil || n < end {
        .          .     94:       output.Write(buf[:n])
(pprof) _
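To make the difference between -alloc_space and -inuse_space more concrete, here is a minimal sketch using runtime.MemStats as an analogy. It is not what pprof reads internally: TotalAlloc (cumulative bytes ever allocated) plays the role of -alloc_space, and HeapAlloc (bytes currently held by heap objects) plays the role of -inuse_space. The measure function and the sink variable are hypothetical names introduced only for this illustration.

```go
package main

import (
	"fmt"
	"runtime"
)

// sink forces each buffer onto the heap by storing it in a package-level
// variable (hypothetical helper for this sketch).
var sink []byte

// measure allocates n buffers of size bytes, drops them all, forces a
// garbage collection, and returns how much the cumulative counter and the
// live-heap counter grew.
func measure(n, size int) (cumulative, live uint64) {
	var before, after runtime.MemStats

	runtime.GC()
	runtime.ReadMemStats(&before)

	for i := 0; i < n; i++ {
		sink = make([]byte, size) // each iteration overwrites the previous buffer
	}
	sink = nil // nothing from the loop is reachable any more

	runtime.GC()
	runtime.ReadMemStats(&after)

	// TotalAlloc only ever increases, like an -alloc_space view.
	cumulative = after.TotalAlloc - before.TotalAlloc

	// HeapAlloc falls again once garbage is collected, like an -inuse_space view.
	if after.HeapAlloc > before.HeapAlloc {
		live = after.HeapAlloc - before.HeapAlloc
	}
	return cumulative, live
}

func main() {
	cumulative, live := measure(1000, 64*1024)
	fmt.Printf("cumulative growth: %d bytes\n", cumulative)
	fmt.Printf("live growth:       %d bytes\n", live)
}
```

The cumulative figure should cover every buffer the loop created, while the live figure should stay small because nothing remains reachable. That is why short-lived allocations such as the ones in algOne are easier to find in an -alloc_space view.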

From this profile, we now know that input and the backing array of the buf slice are being allocated on the heap. Since input is a pointer variable, the profile is really saying that the bytes.Buffer value that input points to is being allocated. Let's first focus on the input allocation and understand why it ends up on the heap.

We might assume it is allocated because the call to bytes.NewBuffer shares the bytes.Buffer value it creates up the call stack. However, the presence of a value in the flat column (the first column of the pprof output) indicates that the value is allocated on the heap because the algOne function itself shares it in a way that causes it to escape.

We do not yet know why the bytes.Buffer value is allocated on the heap. This is where the -gcflags "-m -m" option of the go build command comes in handy. The profiler can only tell us which values escape; the compiler can tell us why.

Compiler information

go build -gcflags "-m -m"
./stream.go:83: inlining call to bytes.NewBuffer func([]byte) *bytes.Buffer { return &bytes.Buffer literal }

./stream.go:83: &bytes.Buffer literal escapes to heap
./stream.go:83:   from ~r0 (assign-pair) at ./stream.go:83
./stream.go:83:   from input (assigned) at ./stream.go:83
./stream.go:83:   from input (interface-converted) at ./stream.go:93
./stream.go:83:   from input (passed to call[argument escapes]) at ./stream.go:93

The first line is interesting because it confirms that the bytes.Buffer value does not escape by being shared up the call stack. bytes.NewBuffer is never actually called: the compiler inlined the body of the function. Take the following line:

input := bytes.NewBuffer(data)

Because the compiler chose to inline the bytes.NewBuffer function, the line above is effectively converted to the following:

input := &bytes.Buffer{buf: data}

This means that the algOne function constructs the bytes.Buffer value directly. So now the question becomes what causes this value to escape from the algOne stack frame, and the answer lies in the other five lines of the report.

./stream.go:83: &bytes.Buffer literal escapes to heap
./stream.go:83:   from ~r0 (assign-pair) at ./stream.go:83
./stream.go:83:   from input (assigned) at ./stream.go:83
./stream.go:83:   from input (interface-converted) at ./stream.go:93
./stream.go:83:   from input (passed to call[argument escapes]) at ./stream.go:93

The input variable is converted to an interface value at line 93 because io.ReadFull accepts an io.Reader, and that conversion causes the bytes.Buffer value to escape. So passing a value through an interface-typed function parameter can cause it to escape.
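The effect of the interface conversion can be sketched in isolation. The example below is a simplified illustration, not the article's algOne code: readViaInterface is a hypothetical stand-in for io.ReadFull, marked noinline so the compiler cannot see the concrete type behind the interface, and countAllocs is a hypothetical helper that uses testing.AllocsPerRun to count heap allocations for each style of call.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"testing"
)

// readViaInterface is a hypothetical stand-in for io.ReadFull: it takes the
// reader as an io.Reader interface value. The noinline directive keeps the
// compiler from inlining the call and discovering the concrete type.
//
//go:noinline
func readViaInterface(r io.Reader, p []byte) (int, error) {
	return r.Read(p)
}

// countAllocs reports the average number of heap allocations per call for
// the interface-based read and for the direct method call.
func countAllocs() (viaInterface, direct float64) {
	data := []byte("abcdefgh")
	buf := make([]byte, 4) // allocated once, outside the measured functions

	viaInterface = testing.AllocsPerRun(1000, func() {
		input := bytes.NewBuffer(data)
		readViaInterface(input, buf) // input is converted to io.Reader here
	})

	direct = testing.AllocsPerRun(1000, func() {
		input := bytes.NewBuffer(data)
		input.Read(buf) // concrete method call, no interface conversion
	})

	return viaInterface, direct
}

func main() {
	viaInterface, direct := countAllocs()
	fmt.Printf("via interface: %.0f allocs/op\n", viaInterface)
	fmt.Printf("direct call:   %.0f allocs/op\n", direct)
}
```

The interface-based path should report at least one allocation per call (the bytes.Buffer value), and the direct call should report fewer. Exact counts depend on the compiler version and its inlining decisions, so treat the numbers as indicative and confirm them with -gcflags "-m" on your own toolchain.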

Let's call the Read method of bytes.Buffer directly instead of going through the io.ReadFull function, and then look at the benchmark results.

        if _, err := input.Read(buf[end:]); err != nil {

            // Flush the rest of the bytes we have.
            output.Write(buf[:end])
            return
        }

go test -run none -bench AlgorithmOne -benchtime 3s -benchmem -memprofile mem.out
BenchmarkAlgorithmOne-8    	2000000 	     1814 ns/op         5 B/op  	      1 allocs/op

With one fewer allocation per operation, we see about a 29% improvement in performance.

Conclusion

Go has some great tools for understanding the decisions the compiler makes during escape analysis. Based on this information, we can refactor code to keep values on the stack that do not need to be on the heap. Do not write code with performance as the first priority. Focus first on correctness, readability, and simplicity. Once you have a working program, determine whether it is fast enough. If it is not, use the tools the language provides to find and fix the performance problems.

Reference (original article)

www.ardanlabs.com/blog/2017/0…