Last week I spent two or three days writing the code for this DNN, and I ran into quite a few problems along the way, which is why the blog summary is only appearing now. Don't panic if you find that the network you build fails to recognize the images you upload yourself: on this dataset the model does well on the training set and on the dev/test set, but its accuracy on our "real" test set (images uploaded by users) is nowhere near as good. With such a small dataset, the purpose of writing the DNN model is to become familiar with the internal structure of a deep neural network and to put the theory of deep learning into practice. Once again, the theory is the core: if you understand how neural networks work, you will have much better intuition when optimizing them.

1. Dataset and tools

The dataset is the same cat-image recognition dataset used in the logistic regression model; the download link is the same as before (extraction code: XX1W).

2. Image dimensionality reduction and normalization

```python
import numpy as np                     # NumPy scientific computing library
import h5py                            # for interacting with datasets stored in H5 files
from lr_utils import load_dataset      # helper that loads this dataset
import matplotlib.pyplot as plt        # plotting
from dnn_utils import sigmoid, sigmoid_backward, relu, relu_backward
```
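The post does not show the contents of dnn_utils. A minimal sketch of what these four helpers typically look like (assuming the standard implementation, where each activation returns its cache and each backward step multiplies dA by the activation's derivative) is:

```python
import numpy as np

def sigmoid(Z):
    A = 1 / (1 + np.exp(-Z))
    return A, Z                      # the cache is just Z

def relu(Z):
    A = np.maximum(0, Z)
    return A, Z

def sigmoid_backward(dA, cache):
    Z = cache
    s = 1 / (1 + np.exp(-Z))
    return dA * s * (1 - s)          # dZ = dA * sigma'(Z)

def relu_backward(dA, cache):
    Z = cache
    dZ = np.array(dA, copy=True)
    dZ[Z <= 0] = 0                   # the gradient is 0 wherever Z <= 0
    return dZ
```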
```python
train_x, train_y, test_x, test_y, classes = load_dataset()

# show the image with index 2
index = 2
plt.imshow(train_x[index])
print("y=" + str(train_y[:, index]) + ", it is a " + classes[np.squeeze(train_y[:, index])].decode("utf-8"))
```

```python
# show the image with index 1
index = 1
plt.imshow(train_x[index])
print("y=" + str(train_y[:, index]) + ", it is a " + classes[np.squeeze(train_y[:, index])].decode("utf-8"))
```

```python
# View the dataset
m_train = train_y.shape[1]   # number of training samples
m_test = test_y.shape[1]     # number of test samples
num_px = train_x.shape[1]    # width/height of each image
print("Number of training set samples: " + str(m_train))
print("Number of test set samples: " + str(m_test))
print("Width/height of each image: " + str(num_px))
print("Size of each image: (" + str(num_px) + ", " + str(num_px) + ", 3)")
print("Training set image dimensions: " + str(train_x.shape))
print("Training set label dimensions: " + str(train_y.shape))
print("Test set image dimensions: " + str(test_x.shape))
print("Test set label dimensions: " + str(test_y.shape))
# Why the extra 3 in the image dimensions? Each pixel is made up of three primary colors (R, G, B)
```
```
Number of training set samples: 209
Number of test set samples: 50
Width/height of each image: 64
Size of each image: (64, 64, 3)
Training set image dimensions: (209, 64, 64, 3)
Training set label dimensions: (1, 209)
Test set image dimensions: (50, 64, 64, 3)
Test set label dimensions: (1, 50)
```
```python
# The images must be flattened into 2-D matrices:
# each (num_px, num_px, 3) image is stretched into a single column of length num_px * num_px * 3,
# and the m samples are stacked as columns
train_x_flatten = train_x.reshape(train_x.shape[0], -1).T
test_x_flatten = test_x.reshape(test_x.shape[0], -1).T
print("Training set dimensions after flattening: " + str(train_x_flatten.shape))
print("Test set dimensions after flattening: " + str(test_x_flatten.shape))
```
```
Training set dimensions after flattening: (12288, 209)
Test set dimensions after flattening: (12288, 50)
```
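To make the reshape trick concrete, here is a tiny sketch with made-up shapes (2 toy "images" of size 4×4×3, not from the dataset) showing how reshape(m, -1).T turns an (m, h, w, 3) array into an (h*w*3, m) matrix:

```python
import numpy as np

toy = np.arange(2 * 4 * 4 * 3).reshape(2, 4, 4, 3)    # 2 toy "images", each 4x4x3
flat = toy.reshape(toy.shape[0], -1).T                 # flatten each image into one column
print(toy.shape)    # (2, 4, 4, 3)
print(flat.shape)   # (48, 2) -> 4*4*3 = 48 features per sample, 2 samples
```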
```python
# Normalize the pixel values (0-255) by simply dividing by 255
train_x1 = train_x_flatten / 255
test_x1 = test_x_flatten / 255
```

3. Random initialization of the network parameters

```python
# Random initialization for a shallow (two-layer) neural network
def initialization(n_x, n_h, n_y):
    W1 = np.random.randn(n_h, n_x) * 0.1   # multiply by 0.1 to keep the initial weights small
    b1 = np.random.randn(n_h, 1)
    W2 = np.random.randn(n_y, n_h) * 0.1
    b2 = np.random.randn(n_y, 1)

    parameters = {
        "W1": W1,
        "b1": b1,
        "W2": W2,
        "b2": b2
    }
    return parameters
```
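Why keep the initial weights small and random? This is the usual reasoning, not spelled out in the post: random (rather than zero) initialization breaks the symmetry between hidden units, so they do not all compute the same thing; and small weights keep the sigmoid output away from its saturated region, where its gradient

$$\sigma'(z) = \sigma(z)\,(1 - \sigma(z))$$

is close to 0 for large $|z|$, which would make learning very slow at the start of training.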
```python
# Random initialization for an L-layer neural network
def initialization_deep(layer_dims):
    L = len(layer_dims)
    parameters = {}
    for l in range(1, L):
        # dividing by the square root of the previous layer's width helps prevent exploding/vanishing gradients
        parameters["W" + str(l)] = np.random.randn(layer_dims[l], layer_dims[l-1]) / np.sqrt(layer_dims[l-1])
        parameters["b" + str(l)] = np.zeros((layer_dims[l], 1))
    return parameters
```
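A quick, hedged illustration of why the 1/sqrt(n) factor matters (this little experiment is my own addition, not part of the original post): with plain randn weights the pre-activations grow with the width of the previous layer, while the scaled version keeps them at roughly unit size.

```python
import numpy as np

np.random.seed(0)
n_prev = 12288                                              # width of the previous layer (e.g. the flattened image)
a_prev = np.random.randn(n_prev, 1)                         # a made-up activation vector

W_plain  = np.random.randn(20, n_prev)                      # unscaled initialization
W_scaled = np.random.randn(20, n_prev) / np.sqrt(n_prev)    # scaled as in initialization_deep

print(np.std(W_plain @ a_prev))    # on the order of sqrt(n_prev), i.e. roughly 110
print(np.std(W_scaled @ a_prev))   # on the order of 1
```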

4. Forward propagation

```python
# Linear part of forward propagation
def linear_forward(A, W, b):
    Z = np.dot(W, A) + b
    cache = (A, W, b)
    return Z, cache
```
```python
# Linear + activation part of forward propagation
def linear_activation_forward(A_previous, W, b, activation):
    Z, linear_cache = linear_forward(A_previous, W, b)
    if activation == "sigmoid":
        A, activation_cache = sigmoid(Z)
    elif activation == "relu":
        A, activation_cache = relu(Z)
    cache = (linear_cache, activation_cache)
    return A, cache
```
```python
# Forward propagation through the whole L-layer network
def L_model_forward(X, parameters):
    A = X
    L = len(parameters) // 2   # parameters holds one W and one b per layer, so dividing by 2 gives the number of layers L
    caches = []
    for l in range(1, L):
        A_previous = A
        A, cache = linear_activation_forward(A_previous, parameters['W' + str(l)], parameters['b' + str(l)], activation="relu")
        caches.append(cache)
    AL, cache = linear_activation_forward(A, parameters['W' + str(L)], parameters['b' + str(L)], activation="sigmoid")
    caches.append(cache)
    return AL, caches
```
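In formulas, each forward step computes (for layers $l = 1, \dots, L$, with $A^{[0]} = X$):

$$Z^{[l]} = W^{[l]} A^{[l-1]} + b^{[l]}, \qquad A^{[l]} = g^{[l]}(Z^{[l]}),$$

where $g^{[l]}$ is ReLU for the hidden layers and the sigmoid for the output layer, so that $A^{[L]} = \hat{Y}$ is the predicted probability of "cat".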

5. Computing the cost function

```python
# Compute the cost
def compute_cost(AL, Y):
    m = Y.shape[1]
    # cross-entropy loss, averaged over the m samples
    cost = -np.sum(np.multiply(np.log(AL), Y) + np.multiply(np.log(1 - AL), 1 - Y)) / m
    cost = np.squeeze(cost)   # make sure the cost is a plain scalar, which makes plotting the cost curve easier
    return cost
```
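This is the usual cross-entropy cost for binary classification:

$$J = -\frac{1}{m} \sum_{i=1}^{m} \Big( y^{(i)} \log a^{[L](i)} + \big(1 - y^{(i)}\big) \log\big(1 - a^{[L](i)}\big) \Big).$$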

6. Backpropagation

```python
# Linear part of backpropagation
def linear_backward(dZ, cache):
    A_prev, W, b = cache
    m = A_prev.shape[1]
    dW = np.dot(dZ, A_prev.T) / m
    db = np.sum(dZ, axis=1, keepdims=True) / m
    dA_previous = np.dot(W.T, dZ)
    return dA_previous, dW, db
```
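The corresponding gradient formulas for one layer, given $dZ^{[l]}$, are:

$$dW^{[l]} = \frac{1}{m}\, dZ^{[l]} A^{[l-1]\,T}, \qquad db^{[l]} = \frac{1}{m} \sum_{i=1}^{m} dZ^{[l](i)}, \qquad dA^{[l-1]} = W^{[l]\,T} dZ^{[l]},$$

and at the output layer the derivative of the cross-entropy cost with respect to $A^{[L]}$ is $dA^{[L]} = -\big(\tfrac{Y}{A^{[L]}} - \tfrac{1-Y}{1-A^{[L]}}\big)$, which is exactly the dAL used in L_model_backward below.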
```python
# Linear + activation part of backpropagation
def linear_activation_backward(dA, cache, activation):
    linear_cache, activation_cache = cache
    if activation == "relu":
        dZ = relu_backward(dA, activation_cache)
    elif activation == "sigmoid":
        dZ = sigmoid_backward(dA, activation_cache)
    dA_previous, dW, db = linear_backward(dZ, linear_cache)
    return dA_previous, dW, db
```
```python
# Backpropagation through the whole L-layer network
def L_model_backward(AL, Y, caches):
    grads = {}
    L = len(caches)
    m = AL.shape[1]
    # derivative of the cross-entropy cost with respect to AL
    dAL = - (np.divide(Y, AL) - np.divide(1 - Y, 1 - AL))

    # output layer (sigmoid)
    current_cache = caches[L-1]
    grads["dA" + str(L)], grads["dW" + str(L)], grads["db" + str(L)] = linear_activation_backward(dAL, current_cache, "sigmoid")

    # hidden layers (relu), walking backwards from layer L-1 down to layer 1
    # note: grads["dA" + str(k)] stores the gradient dA of layer k-1, so the naming is shifted by one
    for l in reversed(range(L-1)):
        current_cache = caches[l]
        dA_prev_temp, dW_temp, db_temp = linear_activation_backward(grads["dA" + str(l + 2)], current_cache, "relu")
        grads["dA" + str(l + 1)] = dA_prev_temp
        grads["dW" + str(l + 1)] = dW_temp
        grads["db" + str(l + 1)] = db_temp

    return grads
```

7. Parameter update

```python
# Update the parameters with one step of gradient descent
def update_parameters(parameters, grads, learning_rate):
    L = len(parameters) // 2
    for l in range(1, L+1):
        parameters['W' + str(l)] = parameters['W' + str(l)] - learning_rate * grads["dW" + str(l)]
        parameters['b' + str(l)] = parameters['b' + str(l)] - learning_rate * grads["db" + str(l)]
    return parameters
```
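This is plain batch gradient descent; for every layer $l$:

$$W^{[l]} := W^{[l]} - \alpha\, dW^{[l]}, \qquad b^{[l]} := b^{[l]} - \alpha\, db^{[l]},$$

where $\alpha$ is the learning rate (0.0075 in the experiments below).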

8. Build a two-layer neural network

```python
# Two-layer neural network
def two_layer_model(X, Y, layers_dims, learning_rate=0.0075, num_iterations=3000, print_cost=False, isPlot=True):
    grads = {}
    costs = []
    (n_x, n_h, n_y) = layers_dims
    parameters = initialization(n_x, n_h, n_y)

    W1 = parameters["W1"]
    b1 = parameters["b1"]
    W2 = parameters["W2"]
    b2 = parameters["b2"]

    for i in range(0, num_iterations):
        # forward propagation
        A1, cache1 = linear_activation_forward(X, W1, b1, "relu")
        A2, cache2 = linear_activation_forward(A1, W2, b2, "sigmoid")

        # compute the cost
        cost = compute_cost(A2, Y)

        # backpropagation, starting from the derivative of the cost with respect to A2
        dA2 = - (np.divide(Y, A2) - np.divide(1 - Y, 1 - A2))

        # inputs: dA2, cache2, cache1; outputs: dA1, dW2, db2, then dA0 (unused), dW1, db1
        dA1, dW2, db2 = linear_activation_backward(dA2, cache2, "sigmoid")
        dA0, dW1, db1 = linear_activation_backward(dA1, cache1, "relu")

        # backpropagation is done, save the gradients in grads
        grads["dW1"] = dW1
        grads["db1"] = db1
        grads["dW2"] = dW2
        grads["db2"] = db2

        # update the parameters
        parameters = update_parameters(parameters, grads, learning_rate)
        W1 = parameters["W1"]
        b1 = parameters["b1"]
        W2 = parameters["W2"]
        b2 = parameters["b2"]

        # record and print the cost every 100 iterations
        if i % 100 == 0:
            costs.append(cost)
            if print_cost:
                print("Cost after iteration " + str(i) + ": " + str(np.squeeze(cost)))

    # after the iterations are finished, optionally plot the cost curve
    if isPlot:
        plt.plot(np.squeeze(costs))
        plt.ylabel('cost')
        plt.xlabel('iterations (per hundreds)')
        plt.title("Learning rate = " + str(learning_rate))
        plt.show()

    # return the trained parameters
    return parameters
```

9. Build an L-layer neural network

```python
# L-layer neural network model
def L_layer_model(X, Y, layers_dims, learning_rate=0.0075, num_iterations=3000, print_cost=False, isPlot=True):
    np.random.seed(1)
    costs = []
    parameters = initialization_deep(layers_dims)

    for i in range(0, num_iterations):
        AL, caches = L_model_forward(X, parameters)
        cost = compute_cost(AL, Y)
        grads = L_model_backward(AL, Y, caches)
        parameters = update_parameters(parameters, grads, learning_rate)

        # record and print the cost every 100 iterations (printing is skipped if print_cost=False)
        if i % 100 == 0:
            costs.append(cost)
            if print_cost:
                print("Cost after iteration " + str(i) + ": " + str(np.squeeze(cost)))

    # after the iterations are finished, optionally plot the cost curve
    if isPlot:
        plt.plot(np.squeeze(costs))
        plt.ylabel('cost')
        plt.xlabel('iterations (per hundreds)')
        plt.title("Learning rate = " + str(learning_rate))
        plt.show()
    return parameters
```

10. Prediction function

```python
# Prediction
def predict(X, y, parameters):
    m = X.shape[1]
    n = len(parameters) // 2
    p = np.zeros((1, m))

    # forward propagation with the trained parameters
    probas, caches = L_model_forward(X, parameters)

    # round the probabilities to decide whether each image is a cat
    for i in range(0, probas.shape[1]):
        if probas[0, i] > 0.5:
            p[0, i] = 1
        else:
            p[0, i] = 0

    print("Accuracy is: " + str(float(np.sum((p == y)) / m)))
    return p
```
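The thresholding loop can also be replaced by a single vectorized comparison. A small sketch (my own variant, equivalent to the loop in predict above and relying on the same L_model_forward):

```python
import numpy as np

def predict_vectorized(X, y, parameters):
    # same as predict(), but the loop is replaced by a vectorized comparison against 0.5
    probas, _ = L_model_forward(X, parameters)
    p = (probas > 0.5).astype(float)
    print("Accuracy is: " + str(float(np.sum(p == y) / X.shape[1])))
    return p
```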

11. Two-layer neural network test

```python
# Test of the two-layer neural network
n_x = 12288
n_h = 7
n_y = 1
layers_dims = (n_x, n_h, n_y)

parameters = two_layer_model(train_x1, train_y, layers_dims=(n_x, n_h, n_y), learning_rate=0.0075, num_iterations=2500, print_cost=True, isPlot=True)
```
```
Cost after iteration 0: 0.7063761340884793
Cost after iteration 100: 0.6419469811594473
Cost after iteration 200: 0.6208666114090164
Cost after iteration 300: 0.5933708778671708
Cost after iteration 400: 0.5603947807293141
Cost after iteration 500: 0.52507577356433
Cost after iteration 600: 0.48483872973856995
Cost after iteration 700: 0.43879411822651754
Cost after iteration 800: 0.38704167645831195
Cost after iteration 900: 0.333046068471014
Cost after iteration 1000: 0.280695105134195
Cost after iteration 1100: 0.23217499051903273
Cost after iteration 1200: 0.19121167548582782
Cost after iteration 1300: 0.16152294268990633
Cost after iteration 1400: 0.1269867481897763
Cost after iteration 1500: 0.10258796371356442
Cost after iteration 1600: 0.08479208245264541
Cost after iteration 1700: 0.07013694938988196
Cost after iteration 1800: 0.05924202127477445
Cost after iteration 1900: 0.05064065400193104
Cost after iteration 2000: 0.04377464932374992
Cost after iteration 2100: 0.038263203798063965
Cost after iteration 2200: 0.03378462039921001
Cost after iteration 2300: 0.03007443521321217
Cost after iteration 2400: 0.02698793871455636
```

```python
predictions_train = predict(train_x1, train_y, parameters)   # training set
predictions_test = predict(test_x1, test_y, parameters)      # test set
```
```
Accuracy is: 1.0
Accuracy is: 0.68
```
  • In error-analysis terms, the accuracy on the dev/test set is much lower than on the training set (100% vs. 68%); in other words, the model has high variance and is overfitting the training data.

12. L-layer neural network test

```python
layers_dims = [12288, 20, 7, 5, 1]   # a 4-layer model (the input layer is not counted): 12288 -> 20 -> 7 -> 5 -> 1
parameters = L_layer_model(train_x1, train_y, layers_dims, learning_rate=0.0075, num_iterations=3000, print_cost=True, isPlot=True)
```
```
Cost after iteration 0: 0.7717493284237686
Cost after iteration 100: 0.6720534400822913
Cost after iteration 200: 0.6482632048575212
Cost after iteration 300: 0.6115068816101354
Cost after iteration 400: 0.567047326836611
Cost after iteration 500: 0.5401376634547801
Cost after iteration 600: 0.5279299569455267
Cost after iteration 700: 0.4654773771766852
Cost after iteration 800: 0.369125852495928
Cost after iteration 900: 0.3917469743480534
Cost after iteration 1000: 0.31518698886006163
Cost after iteration 1100: 0.2726998441789384
Cost after iteration 1200: 0.23741853400268131
Cost after iteration 1300: 0.19960120532208644
Cost after iteration 1400: 0.18926300388463305
Cost after iteration 1500: 0.1611885466582775
Cost after iteration 1600: 0.14821389662363316
Cost after iteration 1700: 0.13777487812972938
Cost after iteration 1800: 0.1297401754919012
Cost after iteration 1900: 0.12122535068005211
Cost after iteration 2000: 0.1138206066863371
Cost after iteration 2100: 0.10783928526254133
Cost after iteration 2200: 0.10285466069352682
Cost after iteration 2300: 0.10089745445261784
Cost after iteration 2400: 0.09287821526472397
Cost after iteration 2500: 0.0884125117761504
Cost after iteration 2600: 0.08595130416146428
Cost after iteration 2700: 0.08168126914926334
Cost after iteration 2800: 0.07824661275815534
Cost after iteration 2900: 0.07544408693855481
```

```python
predictions_train = predict(train_x1, train_y, parameters)   # training set
predictions_test = predict(test_x1, test_y, parameters)      # test set
```
```
Accuracy is: 0.9904306220095693
Accuracy is: 0.82
```

13. Display misclassified images for analysis

```python
# Show the test-set images that the model classified incorrectly
def print_mislabeled_images(classes, X, y, p):
    a = p + y
    mislabeled_indices = np.asarray(np.where(a == 1))   # p + y == 1 exactly when the prediction and the label differ
    plt.rcParams['figure.figsize'] = (40.0, 40.0)       # set default size of plots
    num_images = len(mislabeled_indices[0])
    for i in range(num_images):
        index = mislabeled_indices[1][i]

        plt.subplot(2, num_images, i + 1)
        plt.imshow(X[:, index].reshape(64, 64, 3), interpolation='nearest')
        plt.axis('off')
        plt.title("Prediction: " + classes[int(p[0, index])].decode("utf-8") + " \n Class: " + classes[y[0, index]].decode("utf-8"))
```
```python
print_mislabeled_images(classes, test_x1, test_y, predictions_test)
```

  • You can see that blurry images, a cat that is off-center, unusually bright or dark lighting, and similar factors can all cause the model to misjudge, so we can pick out the error categories that account for the largest share of the mistakes and optimize for those first.

14. Predict your own images

```python
from PIL import Image   # needed for resizing the uploaded image

my_image = "cat.jpg"
my_label = [1]           # the true label of this image (1 = cat)
fname = "./" + my_image
image = np.array(plt.imread(fname))
# resize to 64x64, flatten into a column vector and normalize, just like the training data
my_image = np.array(Image.fromarray(image).resize(size=(num_px, num_px))).reshape((1, num_px * num_px * 3)).T
my_image = my_image / 255
my_predicted_image = predict(my_image, my_label, parameters)

plt.imshow(image)
print("y = " + str(np.squeeze(my_predicted_image)) + ", your L-layer model predicts a \"" + classes[int(np.squeeze(my_predicted_image))].decode("utf-8") + "\" picture.")
```
```
Accuracy is: 1.0
y = 1.0, your L-layer model predicts a "cat" picture.
```

Don't be alarmed if a non-cat image you upload gets identified as a cat: for this dataset, even though the model works well on the training set and on the dev/test set, its accuracy on our "real" test set (user-uploaded images) is simply not attainable. With such a small dataset, the purpose of writing the DNN model is to become familiar with the internal structure of a deep neural network and to put the theory of deep learning into practice. Once again, the theory is the core: if you understand how neural networks work, you will have much better intuition for optimizing them.