Software and Hardware Environment
- Ubuntu 18.04 64-bit
- Anaconda3 with Python 3.7.6
- TensorFlow 2.2.0
- Bazel 2.0.0
- CUDA 10.1
- cuDNN 7.6.5
- GCC 7
- NVIDIA GTX 1070 Ti
TensorFlow overview
TensorFlow is an open-source machine learning framework from Google. It provides APIs for C++, Python, Java, JavaScript, Go, and other languages. It is fast, flexible, and suited to large-scale production applications, letting every developer apply artificial intelligence to practical problems, which is why it is so popular.
TensorFlow takes its name from how it works: a tensor is an n-dimensional array, flow refers to computation over a dataflow graph, and TensorFlow is the process of tensors flowing from one end of the graph to the other.
Computation in TensorFlow is represented as a directed graph, or computation graph: each operation is a node, and the links between nodes are edges. The graph describes how the data is computed, and it is also responsible for maintaining and updating state. Users can apply conditional control and loops to branches of the graph. Each node describes one operation (a node is an instantiation of an operation) and can have any number of inputs and outputs.
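The dataflow idea can be illustrated with a toy evaluator in plain Python. This is a conceptual sketch of nodes, edges, and graph execution, not TensorFlow's actual API or implementation:

```python
# Toy dataflow-graph evaluator illustrating the idea behind a
# computation graph: nodes hold operations, edges carry values,
# and evaluation walks the graph from inputs to outputs.

class Node:
    def __init__(self, op, *inputs):
        self.op = op          # callable performing this node's operation
        self.inputs = inputs  # edges: upstream nodes feeding this one

    def run(self, cache=None):
        # Evaluate upstream nodes first, memoizing shared results so a
        # node used by several downstream nodes is computed only once.
        if cache is None:
            cache = {}
        if id(self) not in cache:
            args = [n.run(cache) for n in self.inputs]
            cache[id(self)] = self.op(*args)
        return cache[id(self)]

# Build the graph for c = (a + b) * b
a = Node(lambda: 2.0)
b = Node(lambda: 3.0)
s = Node(lambda x, y: x + y, a, b)   # a node with two input edges
c = Node(lambda x, y: x * y, s, b)   # b feeds two different nodes

print(c.run())  # 15.0
```

Note that building the graph and running it are separate steps, which mirrors the define-then-execute model of TensorFlow 1.x graphs.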
Preparation
Installing the Python environment
We use Anaconda; for detailed installation steps, refer to the article on Anaconda usage.
To speed up conda package installation, use the Tsinghua mirror (in China): edit the file ~/.condarc and add
channels:
- defaults
show_channel_urls: true
default_channels:
- https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
- https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/r
custom_channels:
conda-forge: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
msys2: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
bioconda: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
menpo: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
pytorch: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
Next, create a separate, clean virtual environment named tfgpu
conda create -n tfgpu python=3.7
conda activate tfgpu
Install protobuf
Protobuf is also a Google product: a data exchange/storage format. We install it through conda; the current default version is 3.11.4
conda install protobuf
Protobuf 3.11.4 did not produce any errors when I compiled TensorFlow 2.2.0. If your environment hits an error, check the tensorflow/workspace.bzl file for the protobuf version TensorFlow expects and install that one
For example, protobuf 3.8.0, the version matching TensorFlow 2.2.0, can be built and installed from source as follows
wget https://github.com/protocolbuffers/protobuf/releases/download/v3.8.0/protobuf-all-3.8.0.tar.gz
tar xvf protobuf-all-3.8.0.tar.gz
cd protobuf-3.8.0
./autogen.sh
./configure
make
sudo make install
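If you want to check the pinned version programmatically rather than reading workspace.bzl by eye, a quick sketch follows. The sample string is illustrative (modeled on the usual `protobuf-X.Y.Z` strip_prefix naming in workspace.bzl); in practice you would read tensorflow/workspace.bzl from your checkout:

```python
import re

# Illustrative excerpt in the style of tensorflow/workspace.bzl;
# replace with the contents of the real file in your source tree.
sample = '''
tf_http_archive(
    name = "com_google_protobuf",
    strip_prefix = "protobuf-3.8.0",
)
'''

# Pull out the first protobuf version string of the form X.Y.Z.
match = re.search(r'protobuf-(\d+\.\d+\.\d+)', sample)
if match:
    print(match.group(1))  # 3.8.0
```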
Install CUDA and cuDNN
We chose the current mainstream CUDA 10.1 and cuDNN 7.6.5. Please refer to this article for installing CUDA on Ubuntu
TensorFlow version
This article uses the latest official release, 2.2.0. You can download the archive directly from the official site and extract it; address: github.com/tensorflow/…
tar xvf tensorflow-2.2.0.tar.gz
Install Bazel
Bazel is an engineering build system from Google, and its version directly affects compiling TensorFlow from source. Looking at the configure.py file in the TensorFlow source directory, you can find the following line
_TF_MAX_BAZEL_VERSION = '2.0.0'
You can see that TensorFlow 2.2.0 requires Bazel 2.0.0. Go straight to Bazel's release page at github.com/bazelbuild/ to download 2.0.0… Here the downloaded file is the binary bazel-2.0.0-linux-x86_64; then execute
sudo mv bazel-2.0.0-linux-x86_64 /usr/bin/bazel
sudo chmod a+x /usr/bin/bazel
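configure.py actually enforces a version range, not just a maximum. A sketch of the same check is below; the maximum mirrors the `_TF_MAX_BAZEL_VERSION` line quoted above, while the minimum is an assumption, so verify both against configure.py in your checkout:

```python
# Check a Bazel version string against the range configure.py enforces.

def parse(version):
    # "2.0.0" -> (2, 0, 0), so tuples compare component-wise
    return tuple(int(x) for x in version.split("."))

MIN_BAZEL = "2.0.0"   # assumed; see _TF_MIN_BAZEL_VERSION in configure.py
MAX_BAZEL = "2.0.0"   # _TF_MAX_BAZEL_VERSION for TensorFlow 2.2.0

def bazel_ok(installed):
    return parse(MIN_BAZEL) <= parse(installed) <= parse(MAX_BAZEL)

print(bazel_ok("2.0.0"))  # True
print(bazel_ok("3.1.0"))  # False
```

Running `bazel version` shows the installed version string you would feed into a check like this.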
Install the necessary software packages
These tools are also needed to compile TensorFlow
pip install numpy six keras_preprocessing
GCC version
Ubuntu 18.04 uses GCC 7.5.0 by default, and compilation completes without problems. If problems arise, consider downgrading GCC, since this version has not been officially tested
sudo apt install gcc-6 g++-6
Then switch versions
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-6 100
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-6 100
To switch back to gcc-7, run similar commands with a higher value for the last parameter (the priority), such as 101
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-7 101
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-7 101
The build process
Start by configuring the project: run configure
cd tensorflow-2.2.0
./configure
The terminal is presented with a series of options, which you need to select according to your needs. This is the configuration for this article
Then execute the build command
bazel build --verbose_failures --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
If you want to build the dynamic link library required for C++ development, use the following command instead
bazel build --verbose_failures --noincompatible_do_not_split_linking_cmdline --config=opt --config=cuda //tensorflow:libtensorflow_cc.so //tensorflow:install_headers
Or you can build both targets together
bazel build --verbose_failures --noincompatible_do_not_split_linking_cmdline --config=opt --config=cuda //tensorflow:libtensorflow_cc.so //tensorflow/tools/pip_package:build_pip_package
A common external-library link error
To solve this problem, use the --noincompatible_do_not_split_linking_cmdline parameter. For details, see the issue at github.com/tensorflow/…
The overall build time depends on your machine configuration
Testing found that the same build command crashed on a 16 GB machine due to running out of memory, but not on a 32 GB machine. You can limit resource usage with the following Bazel options
- --local_ram_resources: the amount of RAM to use, in MB
- --local_cpu_resources: the number of CPU cores to use
- --jobs: the number of concurrent jobs
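For example, on a memory-constrained machine the build command might be capped like this. The specific numbers (8192 MB, 4 cores, 4 jobs) are illustrative values, not recommendations; tune them for your hardware:

```shell
# Limit Bazel's resource usage to avoid out-of-memory crashes during
# the TensorFlow build; values here are illustrative.
bazel build --verbose_failures --config=opt --config=cuda \
    --local_ram_resources=8192 \
    --local_cpu_resources=4 \
    --jobs=4 \
    //tensorflow/tools/pip_package:build_pip_package
```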
Finally, generate the .whl file required for the Python installation
sudo ./tensorflow/tools/pip_package/build_pip_package.sh /tmp/tensorflow_pkg
Here /tmp/tensorflow_pkg is the directory that stores the .whl file; you can specify any path. After it is generated successfully, install it
pip install /tmp/tensorflow_pkg/tensorflow-2.2.0-cp37-cp37m-linux_x86_64.whl
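The wheel's filename encodes its compatibility tags (cp37 for CPython 3.7, cp37m for the ABI, linux_x86_64 for the platform), which is why this wheel only installs into the Python 3.7 environment we created. A quick sketch of pulling the tags apart:

```python
# Decompose a wheel filename into its PEP 425 compatibility tags.
wheel = "tensorflow-2.2.0-cp37-cp37m-linux_x86_64.whl"

name, version, python_tag, abi_tag, platform_tag = wheel[:-len(".whl")].split("-")
print(name)          # tensorflow
print(version)       # 2.2.0
print(python_tag)    # cp37  -> CPython 3.7
print(abi_tag)       # cp37m
print(platform_tag)  # linux_x86_64
```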
Python validation
Open ipython (do not open it in the TensorFlow source directory, or it will report an error) and run a simple TensorFlow test to check for errors and the corresponding output
import tensorflow as tf
tf.__version__
tf.test.is_gpu_available()
The output shows that the installed version is 2.2.0 and that the GPU can be used normally
C++ validation
After the libtensorflow_cc target builds successfully, the bazel-bin/tensorflow directory contains the corresponding header files (include) and dynamic link libraries (libtensorflow_cc.so and libtensorflow_framework.so.2, both symlinks).
Using the CLion IDE, create a CMake-based project and write the source file main.cpp. This example comes from the web
#include <iostream>
#include "tensorflow/cc/client/client_session.h"
#include "tensorflow/cc/ops/standard_ops.h"
#include "tensorflow/core/framework/tensor.h"
using namespace tensorflow;
using namespace tensorflow::ops;
int main()
{
Scope root = Scope::NewRootScope();
// Matrix A = [3 2; -1 0]
auto A = Const(root, { {3.f, 2.f}, {-1.f, 0.f} });
// Vector b = [3 5]
auto b = Const(root, { {3.f, 5.f} });
// v = Ab^T
auto v = MatMul(root.WithOpName("v"), A, b, MatMul::TransposeB(true));
std::vector<Tensor> outputs;
ClientSession session(root);
// Run and fetch v
TF_CHECK_OK(session.Run({v}, &outputs));
std::cout << "tensorflow session run ok" << std::endl;
// Expect outputs[0] == [19; -3]
std::cout << outputs[0].matrix<float>();
return 0;
}
Next, write the CMakeLists.txt rules
cmake_minimum_required(VERSION 3.0)
project(libtf)
add_definitions(-std=c++11)
set(TENSORFLOW_ROOT_DIR /home/xugaoxiang/Downloads/tensorflow-2.2.0-cc)
include_directories(
    ${TENSORFLOW_ROOT_DIR}/bazel-bin/tensorflow/include
)
aux_source_directory(./ DIR_SRCS)
link_directories(/home/xugaoxiang/Downloads/tensorflow-2.2.0-cc/bazel-bin/tensorflow)
add_executable(libtf ${DIR_SRCS})
#target_link_libraries(libtf
#    tensorflow_cc
#    tensorflow_framework
#)
target_link_libraries(libtf
    /home/xugaoxiang/Downloads/tensorflow-2.2.0-cc/bazel-bin/tensorflow/libtensorflow_cc.so
    /home/xugaoxiang/Downloads/tensorflow-2.2.0-cc/bazel-bin/tensorflow/libtensorflow_framework.so.2
)
Note that the two libraries' file names are not the standard -l names (libtensorflow_framework carries the .so.2 version suffix), so I simply write the absolute paths. Then compile
mkdir build
cd build
cmake ..
make
./libtf
The execution result is as follows
(base) xugaoxiang@1070Ti:~/CLionProjects/libtf/build$ ./libtf
2020-05-22 16:30:25.469170: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 3092885000 Hz
2020-05-22 16:30:25.469647: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x560f48a29630 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-05-22 16:30:25.469694: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-05-22 16:30:25.473522: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-05-22 16:30:25.613992: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x560f48988550 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-05-22 16:30:25.614051: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): GeForce GTX 1070 Ti, Compute Capability 6.1
2020-05-22 16:30:25.615325: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:03:00.0 name: GeForce GTX 1070 Ti computeCapability: 6.1
coreClock: 1.683GHz coreCount: 19 deviceMemorySize: 7.93GiB deviceMemoryBandwidth: 238.66GiB/s
2020-05-22 16:30:25.615715: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-05-22 16:30:25.619174: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-05-22 16:30:25.621307: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-05-22 16:30:25.621587: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-05-22 16:30:25.623604: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-05-22 16:30:25.625051: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-05-22 16:30:25.629452: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-05-22 16:30:25.630598: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-05-22 16:30:25.630633: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-05-22 16:30:25.631317: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-05-22 16:30:25.631335: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108]      0
2020-05-22 16:30:25.631335: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0:   N
2020-05-22 16:30:25.632489: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6477 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1070 Ti, pci bus id: 0000:03:00.0, compute capability: 6.1)
tensorflow session run ok
19
Notice that the GPU is correctly identified and used, exactly as expected
Common errors
The official site gives a list of common errors and their solutions, which is very helpful: if you run into problems during compilation, look there first. The address is: www.tensorflow.org/install/err…
Download link
The dynamic link library and header files have been packaged for download if needed
CSDN link
References
- www.tensorflow.org/install/sou…
- github.com/bazelbuild/…
- docs.bazel.build/versions/ma…
- github.com/tensorflow/…
- github.com/tensorflow/…
- xugaoxiang.com/2019/12/08/…
- xugaoxiang.com/2019/12/13/…
- github.com/tensorflow/…