- Understanding TensorFlow using Go
- Author: Paolo Galeone
- The Nuggets Translation Project
- Translator: lsvih
- Proofreaders: Whatbeg, yifili09
Understanding TensorFlow in Go
TensorFlow is not strictly a machine learning library; it is a general-purpose library that represents computations as graphs. Its core is implemented in C++, and bindings expose it to many different languages. The Go bindings not only let you use TensorFlow from Go, but also help you understand how TensorFlow is implemented under the hood.
The bindings
According to the official documentation, the TensorFlow developers provide:

- The C++ source code: the real, low-level and high-level implementations live here; this is the heart of TensorFlow.
- The Python bindings and the Python library: the bindings are automatically generated from the C++ implementation, so we can call C++ functions directly from Python (this is also how the numpy core works). The Python library combines calls to the bindings to form the well-known high-level APIs.
- The Java bindings.
- The Go bindings.
As a Gopher, not a Java enthusiast, I paid close attention to the Go bindings, and I wanted to find out what kind of tasks they are suited for. (These wrappers are also known as language bindings.)

The Gopher (created by Takuya Ueda, @tenntenn, licensed under CC 3.0), together with the TensorFlow logo.
The first thing to note is that, as its maintainers themselves admit, the Go API lacks support for Variable: the API is designed to use trained models, not to train them from scratch.

This is stated clearly in Installing TensorFlow for Go:

TensorFlow provides APIs for use in Go programs. These APIs are particularly well-suited to loading models created in Python and executing them within a Go application.

This limitation is fine if we are not interested in training machine learning models. If you do plan to train a model, however, here is a piece of advice:

As a Gopher, keep it simple! Define and train the model in Python; afterwards, you can always load the trained model from Go! (In other words, the maintainers would rather not implement training support.)

In short, the Go bindings can be used to import and define constant graphs — graphs in which nothing is trained and there are no variables to train.
Let's dig deeper into TensorFlow with Go and create our first application!

Before reading on, I recommend preparing your Go environment and compiling and installing the TensorFlow Go bindings (see the README for compilation and installation instructions).
Understanding the structure of TensorFlow

Let's review what TensorFlow is (in my own words; the official site phrases it differently):

TensorFlow™ is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them.

We can think of TensorFlow as a descriptive language, a bit like SQL: you describe the result you want, and the underlying engine (the database) parses your query, checks it for syntactic and semantic errors, converts it into its private representation, optimizes it, and computes the result — all while guaranteeing that you get the correct output.

Therefore, whatever API we use, we are really just describing a graph. Evaluation begins when we place the graph in a Session and explicitly tell the Session to run the graph.

With this in mind, let's try to define a graph of computations and evaluate it inside a Session.

The API documentation gives us a clear list of the available methods in the tf package and the op package.

As you can see, these two packages contain everything we need to define and evaluate a graph.

The tf package contains the functions for building basic structures, such as the Graph itself; the op package is the most important one and contains the bindings automatically generated from the C++ implementation.

Now, suppose we want to compute the matrix product y = A·x.

I'll assume you're already familiar with how TensorFlow graphs are defined, with placeholders, and with how they work.
The following code is what a TensorFlow Python user might write on a first try. Let's call this file attempt1.go.
package main

import (
	"fmt"

	tf "github.com/tensorflow/tensorflow/tensorflow/go"
	"github.com/tensorflow/tensorflow/tensorflow/go/op"
)

func main() {
	// Step 1: Create the graph.

	// We want to define two placeholders to be filled at runtime:
	// the first placeholder A will be replaced by a [2, 2] integer tensor,
	// the second placeholder x will be replaced by a [2, 1] integer tensor.
	// Then we compute Y = Ax.

	// Create the first node of the graph: an empty node, the root of our graph.
	root := op.NewScope()

	// Define the two placeholders.
	A := op.Placeholder(root, tf.Int64, op.PlaceholderShape(tf.MakeShape(2, 2)))
	x := op.Placeholder(root, tf.Int64, op.PlaceholderShape(tf.MakeShape(2, 1)))

	// Define the op node that accepts A and x as inputs.
	product := op.MatMul(root, A, x)

	// Every time we define an operation we pass it a scope, and the
	// operation is placed within that scope. Here we have an empty scope
	// (created by NewScope): it is the root of our graph, which we can
	// represent as "/".
	// Now ask TensorFlow to build the graph from our definition: the
	// constant graph is created from the abstract graph defined by the
	// scopes and the ops.
	graph, err := root.Finalize()
	if err != nil {
		// It is useless to try to handle this error programmatically:
		// if we defined the graph wrongly, we have to fix the definition
		// by hand. It's like a SQL query: if the query is not syntactically
		// valid, we can only rewrite it.
		panic(err.Error())
	}

	// If we reach this point, the graph is syntactically correct.
	// We can now place it in a Session and execute it!
	var sess *tf.Session
	sess, err = tf.NewSession(graph, &tf.SessionOptions{})
	if err != nil {
		panic(err.Error())
	}

	// To use the placeholders, we create the tensors holding the values to
	// feed into the network.
	var matrix, column *tf.Tensor

	// A = [ [1, 2], [-1, -2] ]
	if matrix, err = tf.NewTensor([2][2]int64{{1, 2}, {-1, -2}}); err != nil {
		panic(err.Error())
	}
	// x = [ [10], [100] ]
	if column, err = tf.NewTensor([2][1]int64{{10}, {100}}); err != nil {
		panic(err.Error())
	}

	var results []*tf.Tensor
	if results, err = sess.Run(map[tf.Output]*tf.Tensor{
		A: matrix,
		x: column,
	}, []tf.Output{product}, nil); err != nil {
		panic(err.Error())
	}
	for _, result := range results {
		fmt.Println(result.Value().([][]int64))
	}
}
The code above is fully commented; I encourage you to read every comment.

Now, our TensorFlow Python user expects this code to compile and run just fine. Let's see:
go run attempt1.go
Then he would see:
panic: failed to add operation "Placeholder": Duplicate node name in graph: 'Placeholder'
Wait, why is that?
The problem is obvious. There are two “Placeholder” operations with the same name in the code above.
Lesson 1: Node IDs
Every time we call a method that defines an operation, the Python API generates a new node with a different name, regardless of whether the same operation has been called before.

So the following Python code runs without problems and prints 3:
import tensorflow as tf
a = tf.placeholder(tf.int32, shape=())
b = tf.placeholder(tf.int32, shape=())
add = tf.add(a,b)
sess = tf.InteractiveSession()
print(sess.run(add, feed_dict={a: 1, b: 2}))
We can verify that this program creates two different placeholder nodes by printing their names: print(a.name, b.name).

This prints Placeholder:0 Placeholder_1:0: placeholder a is Placeholder:0, while placeholder b is Placeholder_1:0.

In Go, however, the earlier program fails, because A and x are both named Placeholder. We can conclude:

The Go API does not automatically generate a new name each time we call a function that defines an operation: the operation name is therefore fixed, and we cannot change it.
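The Python behavior can be sketched in a few lines of Go: a per-graph counter hands out each base name once, then starts appending numeric suffixes. This is only an illustration of the naming scheme — the nameRegistry type and its methods are invented here and are not part of the TensorFlow codebase.

```go
package main

import (
	"fmt"
	"strconv"
)

// nameRegistry mimics the per-graph counter the Python API uses when it
// auto-generates node names: the first request for a base name returns it
// unchanged, later requests return the name with an "_<n>" suffix.
type nameRegistry struct {
	counts map[string]int
}

func newNameRegistry() *nameRegistry {
	return &nameRegistry{counts: make(map[string]int)}
}

func (r *nameRegistry) uniqueName(base string) string {
	n := r.counts[base]
	r.counts[base]++
	if n == 0 {
		return base
	}
	return base + "_" + strconv.Itoa(n)
}

func main() {
	reg := newNameRegistry()
	fmt.Println(reg.uniqueName("Placeholder")) // Placeholder
	fmt.Println(reg.uniqueName("Placeholder")) // Placeholder_1
}
```

The Go API does no such bookkeeping, which is why the two identical Placeholder calls collide.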
Question time:
- What have we learned about TensorFlow's architecture? Every node in the graph must have a unique name; all nodes are identified by their name.
- Is the node name the same as the name of the operation that defines it? Yes — or, more precisely, the node name is the last component of the operation's full path.
Let's fix the duplicate node names, which will also clarify the second question above.
Lesson 2: Scope
As we have seen, the Python API automatically creates a new name every time an operation is defined. Under the hood, the Python API calls the WithOpName method of the C++ Scope class.

Here are the method's documentation and signature, from scope.h:
/// Return a new scope. All ops created within the returned scope will have
/// names of the form <name>/<op_name>[_<suffix>].
Scope WithOpName(const string& op_name) const;
Note that this method returns a Scope used to name the node, so a node's name is in fact a Scope.

A scope is the full path from / (the root, the empty graph) down to op_name.

When we try to add a node whose path from / to op_name duplicates an existing one within the same scope, the WithOpName method appends the suffix _<n> (where <n> is a counter) to op_name, thereby avoiding duplicate nodes.
Knowing this, we would expect to solve the duplicate-node-name problem by looking for a WithOpName method on type Scope. Unfortunately, this method is not available in the Go tf API.

Looking at the documentation for type Scope, we see that the only method that returns a new Scope is SubScope(namespace string).

Quoting the documentation:

SubScope returns a new Scope which will cause all operations added to the graph to be namespaced with 'namespace'. If namespace collides with an existing namespace within the scope, it will be made unique by suffixing it.

Note the difference from C++: WithOpName suffixes op_name within the same scope, whereas Go's SubScope suffixes the scope name itself.

This leads to completely different graphs (the nodes live in different scopes), but they are computationally equivalent.
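The way SubScope composes node names can be sketched in plain Go. The scope type below is a toy invented for illustration — the real op.Scope internals differ — but it reproduces the naming convention: a node's full name is its scope path plus the operation name, and duplicate namespaces under the same parent get a numeric suffix.

```go
package main

import "fmt"

// scope is a toy model of a graph scope: it tracks its own path and the
// namespaces already taken by its children.
type scope struct {
	path string
	used map[string]int
}

func newRootScope() *scope {
	return &scope{used: make(map[string]int)}
}

// subScope returns a child scope named ns, suffixing ns with "_<n>" when it
// collides with a namespace already used under this scope.
func (s *scope) subScope(ns string) *scope {
	n := s.used[ns]
	s.used[ns]++
	if n > 0 {
		ns = fmt.Sprintf("%s_%d", ns, n)
	}
	return &scope{path: s.path + ns + "/", used: make(map[string]int)}
}

// opName builds the full node name: scope path + operation name.
func (s *scope) opName(op string) string {
	return s.path + op
}

func main() {
	root := newRootScope()
	a := root.subScope("input")
	x := root.subScope("input")
	fmt.Println(a.opName("Placeholder"), x.opName("Placeholder"))
	// input/Placeholder input_1/Placeholder
}
```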
Let's change the placeholder definitions so that they define two different nodes, and print the scope names as well.

Let's create attempt2.go, changing the following lines
A := op.Placeholder(root, tf.Int64, op.PlaceholderShape(tf.MakeShape(2, 2)))
x := op.Placeholder(root, tf.Int64, op.PlaceholderShape(tf.MakeShape(2, 1)))
to
// Define two subscopes of the root scope, both named "input". This gives
// us input/ and input_1/ under the root scope.
A := op.Placeholder(root.SubScope("input"), tf.Int64, op.PlaceholderShape(tf.MakeShape(2, 2)))
x := op.Placeholder(root.SubScope("input"), tf.Int64, op.PlaceholderShape(tf.MakeShape(2, 1)))
fmt.Println(A.Op.Name(), x.Op.Name())
Running go run attempt2.go prints:

input/Placeholder input_1/Placeholder
Question time:
- What have we learned about TensorFlow's architecture? A node is completely identified by the scope in which it is defined; the scope is the path from the root of the graph down to the node. There are two ways to define nodes that perform the same operation: 1. define the operation in a different scope (the Go way); 2. change the operation name (possible in C++, done automatically by Python).
We have now solved the duplicate-node-name problem, but a new one shows up in the console:

panic: failed to add operation "MatMul": Value for attr 'T' of int64 is not in the list of allowed values: half, float, double, int32, complex64, complex128
Why is the MatMul node definition wrong? All we want is to multiply two tf.Int64 matrices! It looks like MatMul simply does not accept int64.

Value for attr 'T' of int64 is not in the list of allowed values: half, float, double, int32, complex64, complex128

Where does this list come from? And why can we multiply two int32 matrices but not two int64 matrices?
We’ll solve this problem next.
Lesson 3: The TensorFlow type system

Let's dig into the C++ source code to see how the MatMul operation is defined:
REGISTER_OP("MatMul")
.Input("a: T")
.Input("b: T")
.Output("product: T")
.Attr("transpose_a: bool = false")
.Attr("transpose_b: bool = false")
.Attr("T: {half, float, double, int32, complex64, complex128}")
.SetShapeFn(shape_inference::MatMulShape)
.Doc(R"doc(
Multiply the matrix "a" by the matrix "b".
The inputs must be two-dimensional matrices and the inner dimension of
"a" (after being transposed if transpose_a is true) must match the
outer dimension of "b" (after being transposed if transposed_b is
true).
*Note*: The default kernel implementation for MatMul on GPUs uses
cublas.
transpose_a: If true, "a" is transposed before multiplication.
transpose_b: If true, "b" is transposed before multiplication.
)doc");
These lines define the interface of the MatMul operation; through the REGISTER_OP macro, they describe the op as follows:

- Name: MatMul
- Inputs: a, b
- Attributes (optional): transpose_a, transpose_b
- Template T, with supported types: half, float, double, int32, complex64, complex128
- Output: product, with its shape inferred automatically
- Documentation
This macro contains no C++ implementation, but it tells us that even when an operation is defined with a template, we must specify the list of supported types for (the attribute) T. In fact, the attribute .Attr("T: {half, float, double, int32, complex64, complex128}") constrains T to one of the types in the list. As the TensorFlow tutorial explains, when templating with T we have to register a kernel for every supported overload; a kernel is a reference to the C++/CUDA implementation that will be executed, possibly in parallel.
The MatMul author may have decided to support only the types listed above, and thus exclude int64, for two reasons:

- An oversight: possible, since TensorFlow's authors are human!
- Support for devices that cannot handle int64: the kernel's specific implementation might not run on all of the supported hardware.
Back to our problem, the fix is now clear: we need to pass MatMul parameters of a supported type.

Let's create attempt3.go, changing every line that mentions int64 to use int32.

One thing to note: the Go tf package has its own set of types that map 1:1 onto Go's own types. We must follow this mapping when feeding values into the graph (for example, passing int32 values to a placeholder of type tf.Int32), and likewise when reading values out of it.

Evaluating a tensor yields a *tf.Tensor, whose Value() method returns an interface{} that must be asserted to the correct type (which we know from the graph's structure).
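Reading results out of the graph follows the standard Go type-assertion pattern. The sketch below fakes the tensor value with a plain interface{} so it runs without the TensorFlow C library; with the real bindings, result.Value() is what hands you that interface{}.

```go
package main

import "fmt"

func main() {
	// Stand-in for result.Value(): for a tf.Int32 matrix the dynamic type
	// behind the interface is [][]int32.
	var value interface{} = [][]int32{{210}, {-210}}

	// Use the two-value assertion so a dtype mismatch (e.g. asking for
	// [][]int64) is reported instead of panicking.
	m, ok := value.([][]int32)
	if !ok {
		panic("tensor dtype does not match the asserted Go type")
	}
	fmt.Println(m[0][0], m[1][0]) // 210 -210
}
```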
Running go run attempt3.go gives:

input/Placeholder input_1/Placeholder
[[210] [-210]]
Success!
Here is the complete code of attempt3, which you can compile and run. (It lives in a Gist; if you see anything that could be improved, contributions are welcome: gist.github.com/galeone/096…)
package main

import (
	"fmt"

	tf "github.com/tensorflow/tensorflow/tensorflow/go"
	"github.com/tensorflow/tensorflow/tensorflow/go/op"
)

func main() {
	// Step 1: Create the graph.

	// We want to define two placeholders to be filled at runtime:
	// the first placeholder A will be replaced by a [2, 2] integer tensor,
	// the second placeholder x will be replaced by a [2, 1] integer tensor.
	// Then we compute Y = Ax.

	// Create the first node of the graph: an empty node, the root of our graph.
	root := op.NewScope()

	// Define the two placeholders inside two subscopes of the root scope,
	// both named "input": this gives us input/ and input_1/ under the root.
	A := op.Placeholder(root.SubScope("input"), tf.Int32, op.PlaceholderShape(tf.MakeShape(2, 2)))
	x := op.Placeholder(root.SubScope("input"), tf.Int32, op.PlaceholderShape(tf.MakeShape(2, 1)))
	fmt.Println(A.Op.Name(), x.Op.Name())

	// Define the op node that accepts A and x as inputs.
	product := op.MatMul(root, A, x)

	// Ask TensorFlow to build the constant graph from the abstract graph
	// defined by the scopes and the ops.
	graph, err := root.Finalize()
	if err != nil {
		// If we defined the graph wrongly, we have to fix the definition
		// by hand; trying to handle this error programmatically is useless.
		panic(err.Error())
	}

	// The graph is syntactically correct: place it in a Session and run it.
	var sess *tf.Session
	sess, err = tf.NewSession(graph, &tf.SessionOptions{})
	if err != nil {
		panic(err.Error())
	}

	// Create the tensors holding the values to feed into the placeholders.
	var matrix, column *tf.Tensor

	// A = [ [1, 2], [-1, -2] ]
	if matrix, err = tf.NewTensor([2][2]int32{{1, 2}, {-1, -2}}); err != nil {
		panic(err.Error())
	}
	// x = [ [10], [100] ]
	if column, err = tf.NewTensor([2][1]int32{{10}, {100}}); err != nil {
		panic(err.Error())
	}

	var results []*tf.Tensor
	if results, err = sess.Run(map[tf.Output]*tf.Tensor{
		A: matrix,
		x: column,
	}, []tf.Output{product}, nil); err != nil {
		panic(err.Error())
	}
	for _, result := range results {
		fmt.Println(result.Value().([][]int32))
	}
}
Question time:
What have we learned about TensorFlow's architecture? Every operation has its own set of associated kernels. TensorFlow can be seen as a strongly-typed descriptive language: it not only follows C++ typing rules, but also requires that the supported types be declared when the operation is registered.

Conclusion

Using Go to define and work with a graph gave us a better understanding of TensorFlow's underlying structure. Step by step, by trial and error, we solved this simple problem and learned about graphs, nodes, and the type system.
If you found this post useful, please like it or share it with others