One, code implementation

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

x_data_arr = np.linspace(-0.5, 0.5, 200)         # 200 evenly spaced points in [-0.5, 0.5]
x_data = x_data_arr[np.newaxis, :]               # reshape to a (1, 200) row matrix
noise = np.random.normal(0, 0.02, x_data.shape)  # Gaussian noise, mean 0, std 0.02
y_data = x_data**2 + noise                       # noisy samples of y = x^2

print(x_data.shape, y_data.shape)
plt.scatter(x_data, y_data)
plt.title("scatter points")

x = tf.placeholder(tf.float32, name="x")  # placeholder for the input matrix
y = tf.placeholder(tf.float32, name="y")  # placeholder for the target matrix
w1 = tf.Variable(tf.random_normal([10, 1]))  # w^[1]: (n^[1], n^[0]) = (10, 1)
b1 = tf.Variable(tf.zeros([10, 1]))          # b^[1]: (n^[1], 1) = (10, 1)
z1 = tf.matmul(w1, x) + b1                   # (10, 200) after broadcasting b1
a1 = tf.nn.tanh(z1)                          # hidden-layer activation

w2 = tf.Variable(tf.random_normal([1, 10]))  # w^[2]: (n^[2], n^[1]) = (1, 10)
b2 = tf.Variable(tf.zeros([1, 1]))           # b^[2]: (n^[2], 1) = (1, 1)
z2 = tf.matmul(w2, a1) + b2                  # (1, 200)
prediction = tf.nn.tanh(z2)                  # output-layer activation
loss = tf.reduce_mean(tf.square(y - prediction))                    # mean squared error
train_step = tf.train.GradientDescentOptimizer(0.1).minimize(loss)  # learning rate 0.1

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer()) 
    for i in range(6000):
        sess.run(train_step, feed_dict={x: x_data, y: y_data}) 
        if i % 1000 == 0:
            print(sess.run(loss, feed_dict={x: x_data, y: y_data}))
 
    prediction_y = sess.run(prediction, feed_dict={x: x_data}) 
    # squeeze the (1, 200) matrices down to rank-1 arrays so plt.plot can draw them
    prediction_y_squeeze = np.squeeze(prediction_y)
    x_data_squeeze = np.squeeze(x_data)
    # drawing
    plt.figure()
    plt.scatter(x_data, y_data) 
    plt.plot(x_data_squeeze, prediction_y_squeeze, c='r', lw=3)
    plt.title("curve")
    plt.xlabel("x")
    plt.ylabel("y")
    plt.show()

Two, matrix to array in plt

When plotting with matplotlib.pyplot, note that plt.plot can only draw one-dimensional arrays, not matrices. To reduce a matrix to an array of rank 1, use the following code:

x_squeeze = np.squeeze(x)  # e.g. (1, 200) -> (200,)

If x and y are (1, 200) matrices, only a scatter plot can be drawn directly:

plt.scatter(x, y, c='r', s=1)  # scatter flattens its inputs, so matrices work

If you want to plot a continuous curve, you need to squeeze the matrices first:

x_squeeze = np.squeeze(x)  # (1, 200) -> (200,)
y_squeeze = np.squeeze(y)  # (1, 200) -> (200,)
plt.plot(x_squeeze, y_squeeze, c='r', lw=5)
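A minimal self-contained sketch of the difference (the data here is illustrative, not the fitted results from above). With (1, 200) inputs plt.plot treats each column as a separate one-point line, so no curve appears; the squeezed rank-1 arrays draw the expected curve:

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-0.5, 0.5, 200)[np.newaxis, :]  # (1, 200) matrix
y = x**2                                        # (1, 200) matrix

plt.scatter(x, y, c='b', s=1)                   # fine: scatter flattens its inputs
plt.plot(np.squeeze(x), np.squeeze(y), c='r')   # fine: rank-1 (200,) arrays
plt.show()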

Three, array to matrix

# np.linspace generates a one-dimensional array, whereas the neural network needs matrices
x_data_arr = np.linspace(-0.5, 0.5, 200)
# indexing with [np.newaxis, :] turns the array into a row vector (1, 200);
# [:, np.newaxis] would turn it into a column vector (200, 1)
x_data = x_data_arr[np.newaxis, :]
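A quick sketch of the two reshapes, with the resulting shapes printed:

import numpy as np

arr = np.linspace(-0.5, 0.5, 200)
print(arr.shape)                 # (200,)  - rank-1 array
print(arr[np.newaxis, :].shape)  # (1, 200) - row vector
print(arr[:, np.newaxis].shape)  # (200, 1) - column vector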

Four, the dimensions of a neural network

The forward-propagation formulas:

z^{[l]} = w^{[l]} a^{[l-1]} + b^{[l]}

a^{[l]} = g^{[l]}(z^{[l]})
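A minimal NumPy sketch of one forward-propagation step under these formulas (the layer sizes match the 10-unit hidden layer used above):

import numpy as np

n_prev, n_l, m = 1, 10, 200          # n^[l-1], n^[l], number of samples

w = np.random.randn(n_l, n_prev)     # w^[l]: (n^[l], n^[l-1])
b = np.zeros((n_l, 1))               # b^[l]: (n^[l], 1)
a_prev = np.random.randn(n_prev, m)  # a^[l-1]: (n^[l-1], m)

z = w @ a_prev + b                   # (n^[l], m); b broadcasts across columns
a = np.tanh(z)                       # a^[l] = g^[l](z^[l])
print(z.shape, a.shape)              # (10, 200) (10, 200)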

Backpropagation: TensorFlow builds it in a single step, so you can trust TF to handle the derivatives and the optimization!
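In TF 1.x that one step is minimize(loss). Internally it performs two phases, which the optimizer API also exposes separately (a sketch reusing the loss defined above):

# minimize() both computes the gradients and applies the update:
train_step = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

# equivalent, split into the two phases it runs internally:
optimizer = tf.train.GradientDescentOptimizer(0.1)
grads_and_vars = optimizer.compute_gradients(loss)      # backpropagation
train_step = optimizer.apply_gradients(grads_and_vars)  # gradient-descent update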

The dimension rules of deep learning, summarized (matching the code above):

w^{[l]} : (n^{[l]}, n^{[l-1]})

b^{[l]} : (n^{[l]}, 1)

z^{[l]} : (n^{[l]}, 1)

a^{[l]} : (n^{[l]}, 1)
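These rules can be checked mechanically. A small sketch that instantiates them for the 1-10-1 network used in this post:

import numpy as np

n = [1, 10, 1]  # n^[0]=1, n^[1]=10, n^[2]=1

for l in range(1, len(n)):
    w = np.random.randn(n[l], n[l-1])  # w^[l]: (n^[l], n^[l-1])
    b = np.zeros((n[l], 1))            # b^[l]: (n^[l], 1)
    print("layer %d: w %s, b %s" % (l, w.shape, b.shape))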

Ok, that's the key part, but those are just the dimension basics. The point I really want to make is this: I've read a lot of blog posts on curve fitting, and none of them make clear how the dimensions are derived when initializing the weights w and biases b, or what needs attention. Here are the programming tips I've summarized: ⚠

  • In the curve-fitting problem there is a mapping X \to Y: m values of x correspond to m values of y. Thinking about it in the standard deep-learning way: the input X and the output Y are both (1, 200) matrices, so the number of samples is 200, the dimension of each sample is 1, and each column of X corresponds to one element of the Y matrix.
  • Although there are m samples, so that in vectorized form b^{[l]} : (n^{[l]}, 1) \to b^{[l]} : (n^{[l]}, m), you do not change the 1 to 200 in the code. The declared shape is for a single sample; the expansion happens by broadcasting when the placeholder is fed the real data. That is where vectorization begins, and deep-learning students should not confuse the theory with the practice.
  • Dimension code for initializing the parameters (a shape check follows the code block below):
n_x = 1   # input dimension, n^[0]
n_1 = 10  # hidden-layer width, n^[1]

w1 = tf.Variable(tf.random_normal([n_1, n_x]))  # w^[1]: (n^[1], n^[0])
b1 = tf.Variable(tf.zeros([n_1, 1]))            # b^[1]: (n^[1], 1)

z1 = tf.matmul(w1, x) + b1
a1 = tf.nn.tanh(z1)
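As a check on the second tip, a small sketch (NumPy standing in for the TF runtime) of how the (n^[1], 1) bias broadcasts once m = 200 samples arrive:

import numpy as np

w1 = np.random.randn(10, 1)  # w^[1]: (n^[1], n^[0]) = (10, 1)
b1 = np.zeros((10, 1))       # b^[1]: (n^[1], 1), declared for a single sample
x = np.random.randn(1, 200)  # the real data fed to the placeholder: (1, 200)

z1 = w1 @ x + b1             # (10, 200) + (10, 1) -> broadcast to (10, 200)
print(z1.shape)              # (10, 200): b1 never had to be reshaped in the code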