This article summarizes my experience learning TensorFlow, written as a beginner's notes.
One, foreword
I am learning artificial intelligence with a view to applications in deep reinforcement learning. I had no relevant background, and since many projects on GitHub are based on TensorFlow, I started there. Reference book: TensorFlow Practical Google Deep Learning Framework (2nd edition).
Two, Introduction to TensorFlow
TensorFlow is one of the mainstream development tools for artificial intelligence. It is a computing framework that Google officially open-sourced on November 9, 2015. It is a general-purpose computing framework developed by the Google Brain team, led by Jeff Dean, as an improvement on DistBelief, Google's first-generation internal deep learning system. It is widely used both on GitHub and in industry.
Three, Getting started with TensorFlow
The computation graph is the most basic concept in TensorFlow and serves as its computation model. Every computation in TensorFlow is converted into a node on the computation graph, and the edges between nodes describe the dependencies between computations.
The tensor is the underlying data model of TensorFlow. The name TensorFlow already points to its two most important concepts: Tensor and Flow. A tensor can be thought of as a multi-dimensional array. A zero-order tensor is a scalar, that is, a single number. A first-order tensor is a vector, that is, a one-dimensional array. An nth-order tensor can be understood as an n-dimensional array. "Flow" intuitively expresses the process by which tensors are transformed into one another through computation. TensorFlow does not implement a tensor directly as an array; a tensor is only a reference to the result of a TensorFlow operation.
import tensorflow as tf
# Importing TensorFlow under the alias tf is a very common convention.
a = tf.constant([1.0, 2.0], name="a")
b = tf.constant([2.0, 3.0], name="b")
result = tf.add(a, b, name="add")
print(result)
# Output: Tensor("add:0", shape=(2,), dtype=float32)
As the code above shows, the result of a TensorFlow computation is not a concrete number but a tensor structure that holds three properties: name, shape, and type. shape=(2,) means that the result is a one-dimensional array of length 2. dtype=float32 specifies the data type of its values.
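To make the link between a tensor's order and its shape concrete, here is a minimal sketch in plain Python. It does not use TensorFlow. The helper shape_of is my own hypothetical function and assumes regularly nested lists.

```python
def shape_of(value):
    """Return the shape of a regularly nested list as a tuple (hypothetical helper)."""
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        value = value[0] if value else None
    return tuple(shape)


scalar = 3.0                                    # zero-order tensor
vector = [1.0, 2.0]                             # first-order tensor
matrix = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]   # second-order tensor

for name, value in [("scalar", scalar), ("vector", vector), ("matrix", matrix)]:
    shape = shape_of(value)
    # The order of the tensor is simply the number of entries in its shape.
    print(name, "order:", len(shape), "shape:", shape)
```

The vector case gives the same shape, (2,), that TensorFlow reports for the result tensor above.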
As shown in the figure above, add is a node of the computation graph, and it depends on reading the values of two constants. result refers to the output of the add node. This is the basic computation model of TensorFlow.
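The idea that a result is only a reference to a node, and that the node depends on other nodes, can be sketched without TensorFlow. The Node class and the constant, add, and run helpers below are hypothetical names of my own and say nothing about how TensorFlow is implemented internally.

```python
class Node:
    """A hypothetical graph node: an operation plus the nodes it depends on."""

    def __init__(self, name, op, inputs=()):
        self.name = name
        self.op = op                  # function that computes this node's value
        self.inputs = tuple(inputs)   # edges: dependencies on other nodes

    def __repr__(self):
        return "Node(%r)" % self.name


def constant(value, name):
    return Node(name, lambda: value)


def add(x, y, name):
    return Node(name, lambda u, v: [p + q for p, q in zip(u, v)], inputs=(x, y))


def run(node):
    """Evaluate a node by first evaluating everything it depends on."""
    args = [run(dep) for dep in node.inputs]
    return node.op(*args)


a = constant([1.0, 2.0], name="a")
b = constant([2.0, 3.0], name="b")
result = add(a, b, name="add")

print(result)        # only a reference to the node; nothing has been computed yet
print(run(result))   # the addition is actually carried out here
```

Defining the graph and evaluating it are two separate steps here, which is the same split that TensorFlow makes between building a graph and running it.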
When defining tensors and computations in TensorFlow, pay attention to data types and dimensions. Only data of the same type and with compatible dimensions can be combined in a computation. TensorFlow supports 14 different types, mainly real numbers (tf.float32, tf.float64), integers (tf.int8, tf.int16, tf.int32, tf.int64, tf.uint8), Booleans (tf.bool), and complex numbers (tf.complex64, tf.complex128).
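The kind of check described above can be sketched in plain Python. This is only an illustration under my own assumptions: matmul_shape is a hypothetical helper, data types are plain strings, and None stands for a dimension that is not known yet, such as a batch size.

```python
def matmul_shape(a_shape, a_dtype, b_shape, b_dtype):
    """Return the result shape of a 2-D matrix product, or raise on a mismatch."""
    if a_dtype != b_dtype:
        raise TypeError("dtype mismatch: %s vs %s" % (a_dtype, b_dtype))
    if len(a_shape) != 2 or len(b_shape) != 2:
        raise ValueError("both operands must be 2-D")
    inner_a, inner_b = a_shape[1], b_shape[0]
    # An unknown (None) dimension is accepted; two known dimensions must agree.
    if inner_a is not None and inner_b is not None and inner_a != inner_b:
        raise ValueError("inner dimensions differ: %s vs %s" % (inner_a, inner_b))
    return (a_shape[0], b_shape[1])


hidden = matmul_shape((None, 2), "float32", (2, 3), "float32")
print(hidden)

try:
    matmul_shape((None, 2), "float32", (2, 3), "int32")
except TypeError as error:
    print("rejected:", error)
```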
The session is the execution model of TensorFlow. Computations in TensorFlow are only defined in the graph; to be carried out, they must be run through a session.
# Create a session.
sess = tf.Session()
# Use the created session to compute the result of interest. For example, you can call sess.run(result)
# to get the value of the result tensor defined above.
sess.run(...)
# Close the session so that the resources used in this run can be released.
sess.close()
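If an exception is raised before sess.close() is reached, the session is never closed. In TensorFlow 1.x, tf.Session can also be used in a with statement, which closes it automatically when the block exits. The sketch below shows that mechanism without TensorFlow. FakeSession and failing_fetch are hypothetical stand-ins of my own.

```python
class FakeSession:
    """Stand-in for a session; it only tracks whether it has been closed."""

    def __init__(self):
        self.closed = False

    def run(self, fetch):
        if self.closed:
            raise RuntimeError("session is closed")
        return fetch()

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
        return False  # do not swallow the exception


def failing_fetch():
    raise ValueError("computation failed")


sess = FakeSession()
try:
    with sess:
        sess.run(failing_fetch)
except ValueError:
    pass

print(sess.closed)  # the session was closed even though run() raised
```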
Four, Implementing a neural network with TensorFlow
The neural network flow chart based on the back propagation algorithm is as follows:
The steps are as follows:
- Define the network structure and the output of forward propagation;
- Define the loss function and select the back propagation optimization algorithm;
- Create a session (tf.Session) and repeatedly run the back propagation optimization algorithm on the training data.
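The three steps above can be sketched without any framework. This is a simplified illustration under my own assumptions: a single sigmoid unit with a bias instead of a multi-layer network, hand-written gradients, plain gradient descent on the full data set, and a made-up labeling rule (1 if x1 + x2 < 1).

```python
import math
import random


# Step 1: network structure and forward propagation (one sigmoid unit).
def forward(params, x):
    w1, w2, b = params
    z = w1 * x[0] + w2 * x[1] + b
    return 1.0 / (1.0 + math.exp(-z))


# Step 2a: loss function (cross entropy, clipped to avoid log(0)).
def cross_entropy(params, xs, ys):
    total = 0.0
    for x, y in zip(xs, ys):
        p = min(max(forward(params, x), 1e-10), 1.0 - 1e-10)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(xs)


# Step 2b: optimization algorithm, here plain gradient descent.
def gradient_step(params, xs, ys, learning_rate):
    grads = [0.0, 0.0, 0.0]
    for x, y in zip(xs, ys):
        error = forward(params, x) - y  # d(loss)/dz for sigmoid + cross entropy
        grads[0] += error * x[0]
        grads[1] += error * x[1]
        grads[2] += error
    n = len(xs)
    return [p - learning_rate * g / n for p, g in zip(params, grads)]


# Step 3: run the optimization repeatedly on the training data.
rng = random.Random(1)
xs = [(rng.random(), rng.random()) for _ in range(128)]
ys = [int(x1 + x2 < 1) for x1, x2 in xs]

params = [0.0, 0.0, 0.0]
initial_loss = cross_entropy(params, xs, ys)
for _ in range(200):
    params = gradient_step(params, xs, ys, learning_rate=0.5)
final_loss = cross_entropy(params, xs, ys)

print(initial_loss, final_loss)
```

In TensorFlow, the gradient computation in step 2b does not have to be written by hand: the optimizer's minimize call derives it from the graph.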
Implementation example (written against the TensorFlow 1.x API):
import tensorflow as tf
from numpy.random import RandomState
# Define the batch size of the training data
batch_size = 8
# Define a two-layer network structure
w1 = tf.Variable(tf.random_normal((2, 3), stddev=1, seed=1))
w2 = tf.Variable(tf.random_normal((3, 1), stddev=1, seed=1))
# Placeholders reserve space for the input data
x = tf.placeholder(tf.float32, shape=(None, 2), name="x-input")
y_ = tf.placeholder(tf.float32, shape=(None, 1), name="y-input")
# Define how the layers are connected (forward propagation)
a = tf.matmul(x, w1)
y = tf.matmul(a, w2)
# Define the loss function and the back propagation algorithm
y = tf.sigmoid(y)
cross_entropy = -tf.reduce_mean(
    y_ * tf.log(tf.clip_by_value(y, 1e-10, 1.0))
    + (1 - y_) * tf.log(tf.clip_by_value(1 - y, 1e-10, 1.0))
)
train_step = tf.train.AdadeltaOptimizer(0.01).minimize(cross_entropy)
# Generate a simulated data set from random numbers
rdm = RandomState(1)
dataset_size = 128
X = rdm.rand(dataset_size, 2)
# Labeling rule for the simulated data: 1 if x1 + x2 < 1, otherwise 0
Y = [[int(x1 + x2 < 1)] for (x1, x2) in X]
# Create a session to run the TensorFlow program
sess = tf.Session()
init_op = tf.global_variables_initializer()
sess.run(init_op)
# Print the parameters before training
print(sess.run(w1))
print(sess.run(w2))
# Set the number of training rounds; each round selects one batch and trains on it
STEPS = 50000
for i in range(STEPS):
    start = (i * batch_size) % dataset_size
    end = min(start + batch_size, dataset_size)
    sess.run(train_step, feed_dict={x: X[start:end], y_: Y[start:end]})
    # Print the loss on the whole data set every 10,000 rounds
    if i % 10000 == 0:
        total_cross_entropy = sess.run(
            cross_entropy, feed_dict={x: X, y_: Y}
        )
        print(i)
        print(total_cross_entropy)
# Print the parameters after training
print(sess.run(w1))
print(sess.run(w2))
sess.close()
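The batch selection in the training loop is easy to misread, so here is the same index arithmetic on its own, in plain Python. The function name batch_bounds is my own; the two formulas are taken from the loop.

```python
def batch_bounds(step, batch_size, dataset_size):
    """Return the (start, end) slice bounds used at a given training step."""
    start = (step * batch_size) % dataset_size
    end = min(start + batch_size, dataset_size)
    return start, end


batch_size, dataset_size = 8, 128
for step in (0, 1, 15, 16, 17):
    print(step, batch_bounds(step, batch_size, dataset_size))
```

With 128 samples and a batch size of 8, the window wraps back to the start of the data every 16 steps, so 50,000 steps correspond to 3,125 passes over the data set. The min call only has an effect when the data set size is not a multiple of the batch size.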
Deep neural network analysis:
- Commonly used activation functions: tf.nn.relu, tf.sigmoid, tf.tanh;
- Cross entropy implementation: cross_entropy = -tf.reduce_mean(y_ * tf.log(tf.clip_by_value(y, 1e-10, 1.0)));
- MSE: tf.reduce_mean(tf.square(y_ - y));
- Commonly used optimization algorithms: tf.train.GradientDescentOptimizer(learning_rate).minimize(loss_function), tf.train.AdamOptimizer, tf.train.MomentumOptimizer;
- Learning rate decay method: tf.train.exponential_decay;
- Regularization: tf.contrib.layers.l1_regularizer(lambda)(w), tf.contrib.layers.l2_regularizer(lambda)(w);
- Moving average model: tf.train.ExponentialMovingAverage(decay, num_updates).
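To show what several of the items above compute, here is a plain-Python sketch of the formulas as I understand them from the TensorFlow 1.x documentation. The function names are my own and operate on flat lists of numbers; they are not the TensorFlow implementations.

```python
import math


def clipped_cross_entropy(y_true, y_pred, low=1e-10, high=1.0):
    """-mean(y_true * log(clip(y_pred))); clipping keeps log() away from 0."""
    terms = [t * math.log(min(max(p, low), high)) for t, p in zip(y_true, y_pred)]
    return -sum(terms) / len(terms)


def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def exponential_decay(learning_rate, global_step, decay_steps, decay_rate,
                      staircase=False):
    """learning_rate * decay_rate ** (global_step / decay_steps)."""
    exponent = global_step / decay_steps
    if staircase:
        exponent = math.floor(exponent)  # decay in discrete jumps
    return learning_rate * decay_rate ** exponent


def l1_penalty(scale, weights):
    return scale * sum(abs(w) for w in weights)


def l2_penalty(scale, weights):
    # Uses the sum(w ** 2) / 2 convention of tf.nn.l2_loss.
    return scale * sum(w * w for w in weights) / 2.0


def ema_update(shadow, value, decay, num_updates=None):
    """One moving-average update; num_updates lowers the decay early in training."""
    if num_updates is not None:
        decay = min(decay, (1.0 + num_updates) / (10.0 + num_updates))
    return decay * shadow + (1.0 - decay) * value


print(clipped_cross_entropy([1.0, 0.0], [0.0, 0.5]))  # finite despite a 0 prediction
print(exponential_decay(0.1, 150, 100, 0.96, staircase=True))
print(l1_penalty(0.5, [1.0, -2.0]), l2_penalty(0.5, [1.0, -2.0]))
print(ema_update(0.0, 5.0, 0.99, num_updates=0))
```

In the last call the effective decay drops from 0.99 to 0.1 because num_updates is 0, so the average moves quickly toward the new value at the start of training.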
Five, learning summary
Overall, TensorFlow is fairly easy to get started with and not as hard as it sounds. I have left out the TensorFlow download and environment configuration steps because that part is tedious and compatibility between versions is poor. The fixes I used for specific problems came from the Internet and are not worth citing here.
The core concept of TensorFlow is quite novel. It is organized around computation rather than data, which reflects the importance of computation for artificial intelligence and offers a new perspective on building neural networks and implementing algorithms. As for comparisons, I have no experience with PyTorch or other frameworks, so I will not comment.
I’m looking forward to the next hands-on project.
(Thanks for reading. If there are any mistakes, please correct them.)