Translator: belonHan

torch.utils.bottleneck is a tool that can be used as a first step when debugging bottlenecks in your program. It summarizes runs of your script with the Python profiler and PyTorch's autograd profiler.

Run it from the command line with:

python -m torch.utils.bottleneck /path/to/source/script.py [args]

where [args] are any number of arguments to script.py. Run python -m torch.utils.bottleneck -h for more usage instructions.
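For illustration, a minimal toy script such as the following could be handed to the tool; the file name script.py and the model are placeholders, not part of the original documentation:

```python
# script.py -- a tiny workload to profile with torch.utils.bottleneck (illustrative only)
import torch
import torch.nn as nn

model = nn.Linear(512, 512)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for _ in range(100):
    x = torch.randn(64, 512)
    loss = model(x).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Running python -m torch.utils.bottleneck script.py on it prints an environment summary followed by the cProfile output and the autograd profiler tables.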

Warning

Because the script will be profiled, make sure that it exits within a finite amount of time.

Warning

When profiling CUDA code, the cProfile output and the CPU-mode autograd profiler may not show correct timings because of the asynchronous nature of CUDA kernels: the reported CPU time is the time spent launching a kernel, not the time the kernel spent executing on the GPU, unless the operation synchronizes. Operations that do synchronize appear to be extremely expensive under regular CPU-mode profilers. In these cases where the timings are incorrect, the CUDA-mode autograd profiler can be helpful.
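As a sketch, the CUDA-mode autograd profiler can also be enabled directly in your own code; the model and input below are placeholders, and on recent PyTorch releases the use_cuda flag is superseded by the newer torch.profiler API, so treat this as one possible invocation rather than the canonical one:

```python
import torch
import torch.nn as nn
import torch.autograd.profiler as profiler

model = nn.Linear(512, 512).cuda()
x = torch.randn(64, 512, device="cuda")

# use_cuda=True times kernels with CUDA events rather than only their launch cost
with profiler.profile(use_cuda=True) as prof:
    model(x).sum().backward()

# Sort by CUDA time to see where GPU time is actually spent;
# sorting by "cpu_time_total" instead gives the CPU-bound view
print(prof.key_averages().table(sort_by="cuda_time_total"))
```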

Note

To decide which profiler output to look at (CPU-only mode or CUDA mode), first check whether your script is CPU-bound ("total CPU time is much greater than total CUDA time"). If it is CPU-bound, look at the results of the CPU-mode profiler. If, on the other hand, the script spends most of its time executing on the GPU, look for the responsible CUDA operators in the output of the CUDA-mode autograd profiler.

In reality, depending on the part of the model you are evaluating, your script may not fall cleanly into either of these two extremes. If the profiler outputs do not help, you can try looking at the result of torch.autograd.profiler.emit_nvtx() under nvprof. Note, however, that the NVTX overhead is very high and often produces a heavily skewed timeline.
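A minimal sketch of that workflow, assuming an NVIDIA profiler such as nvprof is available (the model is again a placeholder):

```python
import torch
import torch.nn as nn
import torch.cuda.profiler as cuda_profiler

model = nn.Linear(512, 512).cuda()
x = torch.randn(64, 512, device="cuda")

# emit_nvtx() wraps each autograd op in an NVTX range that nvprof can record
with torch.autograd.profiler.emit_nvtx():
    cuda_profiler.start()   # begin capture; pairs with --profile-from-start off
    model(x).sum().backward()
    cuda_profiler.stop()
```

The script would then be run under the external profiler, for example with nvprof --profile-from-start off -o trace.prof -- python script.py (the output file name is arbitrary).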

Warning

If you are profiling CUDA code, the first profiler that bottleneck runs (cProfile) will include the CUDA startup time (CUDA buffer allocation cost) in its timings. This does not matter if the code at your bottleneck is much slower than the CUDA startup time.

For more complicated uses of the profilers (such as multi-GPU runs), see docs.python.org/3/library/p… or torch.autograd.profiler.profile().
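For instance, torch.autograd.profiler.profile() can be used directly to label regions of interest and export a trace for inspection; the labels and file name below are arbitrary choices for this sketch:

```python
import torch
import torch.nn as nn
import torch.autograd.profiler as profiler

model = nn.Linear(512, 512)
x = torch.randn(64, 512)

with profiler.profile() as prof:
    with profiler.record_function("forward"):   # custom label in the trace
        y = model(x)
    with profiler.record_function("backward"):
        y.sum().backward()

prof.export_chrome_trace("trace.json")   # viewable in chrome://tracing
print(prof.key_averages().table(sort_by="cpu_time_total"))
```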