This article demonstrates image classification on the CIFAR-10 dataset. A large part of the text and code is shared with the previous tutorials, so readers who have already seen those can skim through quickly.

01 – Simple Linear Model | 02 – Convolutional Neural Network | 03 – PrettyTensor | 04 – Save & Restore | 05 – Ensemble Learning

By Magnus Erik Hvass Pedersen / GitHub / Videos on YouTube (translated from the English original)

Introduction

This tutorial describes how to create a convolutional neural network for classifying images in the CIFAR-10 dataset. It also shows how to use different networks for training and testing.

This is based on the previous tutorials, so you should know the basics of TensorFlow and the add-on package Pretty Tensor. Much of the code and text is similar to the previous tutorials, so if you have already seen them you can skim through this article quickly.

Flowchart

The chart below shows roughly how data flows through the convolutional neural network implemented below. First there is a preprocessing layer that distorts the input images, which is used to artificially enlarge the training set. Then there are two convolutional layers, two fully connected layers, and a softmax classification layer. Larger diagrams further below show the weights and the outputs of the convolutional layers, and Tutorial #02 has more details on how convolution works.

The image in this example is misclassified: it shows a dog, but the neural network is unsure whether it is a dog or a cat, and ends up deciding it is more likely a cat.

from IPython.display import Image
Image('images/06_network_flowchart.png')

Imports

%matplotlib inline
import matplotlib.pyplot as plt
import tensorflow as tf
import numpy as np
from sklearn.metrics import confusion_matrix
import time
from datetime import timedelta
import math
import os

# Use PrettyTensor to simplify Neural Network construction.
import prettytensor as pt

This was developed using Python 3.5.2 (Anaconda). The TensorFlow version is:

tf.__version__

'0.12.0-rc0'

PrettyTensor version:

pt.__version__

‘0.7.1’

Load the data

import cifar10

Set the path to save the data set on your computer.

# cifar10.data_path = "data/CIFAR-10/"

The CIFAR-10 dataset is about 163 MB and will be downloaded automatically if it is not found in the given path.

cifar10.maybe_download_and_extract()

Data has apparently already been downloaded and unpacked.

Load the class names.

class_names = cifar10.load_class_names()
class_names

Loading data: data/CIFAR-10/cifar-10-batches-py/batches.meta

[‘airplane’,

‘automobile’,

‘bird’,

‘cat’,

‘deer’,

‘dog’,

‘frog’,

‘horse’,

‘ship’,

‘truck’]

Load the training set. This function returns the images, the class numbers as integers, and the class numbers as one-hot encoded arrays, which are called labels.

images_train, cls_train, labels_train = cifar10.load_training_data()

Loading data: data/CIFAR-10/cifar-10-batches-py/data_batch_1

Loading data: data/CIFAR-10/cifar-10-batches-py/data_batch_2

Loading data: data/CIFAR-10/cifar-10-batches-py/data_batch_3

Loading data: data/CIFAR-10/cifar-10-batches-py/data_batch_4

Loading data: data/CIFAR-10/cifar-10-batches-py/data_batch_5

Load the test set.

images_test, cls_test, labels_test = cifar10.load_test_data()

Loading data: data/CIFAR-10/cifar-10-batches-py/test_batch

The CIFAR-10 dataset has now been loaded. It consists of 60,000 images and their associated labels (the classification of each image), and is split into two independent subsets: the training set and the test set.

print("Size of:")
print("- Training-set:\t\t{}".format(len(images_train)))
print("- Test-set:\t\t{}".format(len(images_test)))Copy the code

Size of:

  • Training-set: 50000
  • Test-set: 10000
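As a quick sanity check of the arrays just loaded (a small addition, not in the original notebook), the integer class numbers and the one-hot encoded labels describe the same information:

print(images_train.shape)                           # (50000, 32, 32, 3)
print(labels_train[0])                              # one-hot vector of length 10
print(np.argmax(labels_train[0]) == cls_train[0])   # True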

Data dimensions

Data dimensions are used several times in the code below. They have already been defined in the cifar10 module, so we just import them.

from cifar10 import img_size, num_channels, num_classes

The images are 32 x 32 pixels, but we will crop them to 24 x 24 pixels.

img_size_cropped = 24

Helper function for plotting images

This function is used to draw nine images in a 3×3 grid and write the real category and the predicted category under each image.

def plot_images(images, cls_true, cls_pred=None, smooth=True):

    assert len(images) == len(cls_true) == 9

    # Create figure with sub-plots.
    fig, axes = plt.subplots(3, 3)

    # Adjust vertical spacing if we need to print ensemble and best-net.
    if cls_pred is None:
        hspace = 0.3
    else:
        hspace = 0.6
    fig.subplots_adjust(hspace=hspace, wspace=0.3)

    for i, ax in enumerate(axes.flat):
        # Interpolation type.
        if smooth:
            interpolation = 'spline16'
        else:
            interpolation = 'nearest'

        # Plot image.
        ax.imshow(images[i, :, :, :],
                  interpolation=interpolation)

        # Name of the true class.
        cls_true_name = class_names[cls_true[i]]

        # Show true and predicted classes.
        if cls_pred is None:
            xlabel = "True: {0}".format(cls_true_name)
        else:
            # Name of the predicted class.
            cls_pred_name = class_names[cls_pred[i]]

            xlabel = "True: {0}\nPred: {1}".format(cls_true_name, cls_pred_name)

        # Show the classes as the label on the x-axis.
        ax.set_xlabel(xlabel)

        # Remove ticks from the plot.
        ax.set_xticks([])
        ax.set_yticks([])

    # Ensure the plot is shown correctly with multiple plots
    # in a single Notebook cell.
    plt.show()

Draw a few images to see if the data is correct

# Get the first images from the test-set.
images = images_test[0:9]

# Get the true classes for those images.
cls_true = cls_test[0:9]

# Plot the images and labels using our helper-function above.
plot_images(images=images, cls_true=cls_true, smooth=False)

The pixelated images above are what the neural network receives as input. Smoothing the images might make them a bit easier for the human eye to recognize.

plot_images(images=images, cls_true=cls_true, smooth=True)

TensorFlow graph

The whole point of TensorFlow is to use a so-called computational graph, which is much more efficient than doing the same computations directly in Python. TensorFlow can be more efficient than NumPy because TensorFlow knows the entire computation graph that must be executed, whereas NumPy only knows the single mathematical operation it is computing at any one time.

TensorFlow can also automatically calculate the gradients of the variables that need to be optimized to make the model perform better. This is possible because the graph is a combination of simple mathematical expressions, so the gradient of the entire graph can be derived using the chain rule.
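As a minimal illustration (not part of the tutorial's code, and built in a scratch graph so it does not interfere with it), the sketch below shows this deferred execution: nothing is computed until the graph is run in a session, and TensorFlow derives the gradient automatically.

with tf.Graph().as_default():
    a = tf.placeholder(tf.float32)   # input that is fed at run-time
    b = tf.Variable(2.0)             # a variable that could be optimized
    c = a * b                        # adds a node to the graph, computes nothing yet
    grad = tf.gradients(c, [b])      # gradient of c w.r.t. b, derived via the chain rule

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        print(sess.run(c, feed_dict={a: 3.0}))        # 6.0
        print(sess.run(grad[0], feed_dict={a: 3.0}))  # 3.0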

TensorFlow can also take advantage of multi-core CPUs and GPUs. Google has even built special chips for TensorFlow, called Tensor Processing Units (TPUs), which are faster than GPUs.

A TensorFlow graph consists of the following parts, described in detail below:

  • Placeholder variables used to feed input into the graph.
  • Model variables that will be optimized so the model performs better.
  • The model, which is essentially just a mathematical function that computes some output given the placeholder variables and the model variables.
  • A cost measure used to guide the optimization of the variables.
  • An optimization method which updates the variables of the model.

In addition, the TensorFlow graph may also contain various debugging statements, e.g. for logging data to be displayed with TensorBoard, which is not covered in this tutorial.

Placeholder variables

Placeholder variables serve as the input to the graph, which we may change each time we run the graph. This is called feeding the placeholder variables and is demonstrated further below.

First we define the placeholder variable for the input images. This allows us to change the images that are fed into the TensorFlow graph. This is a so-called tensor, which just means a multi-dimensional vector or matrix. The data type is set to float32 and the shape is set to [None, img_size, img_size, num_channels], which means the tensor may hold an arbitrary number of images, with each image being img_size pixels high, img_size pixels wide, and having num_channels colour channels.

x = tf.placeholder(tf.float32, shape=[None, img_size, img_size, num_channels], name='x')

Next we define the placeholder variable for the true labels associated with the images that were input in the placeholder variable x. The shape of this placeholder variable is [None, num_classes], which means it may hold an arbitrary number of labels, and each label is a vector of length num_classes, which is 10 in this case.

y_true = tf.placeholder(tf.float32, shape=[None, num_classes], name='y_true')

We could also have a placeholder variable for the class number, but we will instead calculate it using argmax. Note that this only adds an operator to the TensorFlow graph; nothing is calculated at this point.

y_true_cls = tf.argmax(y_true, dimension=1)

Preprocessing helper functions

The following helper function creates the part of the TensorFlow computational graph that preprocesses the input images. Nothing is actually calculated here; the function merely adds nodes to the graph.

The preprocessing is different for the training and test phases of the neural network:

  • For training, the input images are randomly cropped, randomly flipped horizontally, and their hue, contrast, and saturation are adjusted with random values. This creates random variations of the original input images, artificially enlarging the training set. Examples of distorted images are shown further below.

  • For testing, the input images are cropped around the centre and nothing else is adjusted.

def pre_process_image(image, training):
    # This function takes a single image as input,
    # and a boolean whether to build the training or testing graph.

    if training:
        # For training, add the following to the TensorFlow graph.

        # Randomly crop the input image.
        image = tf.random_crop(image, size=[img_size_cropped, img_size_cropped, num_channels])

        # Randomly flip the image horizontally.
        image = tf.image.random_flip_left_right(image)

        # Randomly adjust hue, contrast and saturation.
        image = tf.image.random_hue(image, max_delta=0.05)
        image = tf.image.random_contrast(image, lower=0.3, upper=1.0)
        image = tf.image.random_brightness(image, max_delta=0.2)
        image = tf.image.random_saturation(image, lower=0.0, upper=2.0)

        # Some of these functions may overflow and result in pixel
        # values beyond the [0, 1] range. It is unclear from the
        # documentation of TensorFlow 0.10.0rc0 whether this is
        # intended. A simple solution is to limit the range.

        # Limit the image pixels between [0, 1] in case of overflow.
        image = tf.minimum(image, 1.0)
        image = tf.maximum(image, 0.0)
    else:
        # For testing, add the following to the TensorFlow graph.

        # Crop the input image around the centre so it is the same
        # size as images that are randomly cropped during training.
        image = tf.image.resize_image_with_crop_or_pad(image,
                                                       target_height=img_size_cropped,
                                                       target_width=img_size_cropped)

    return image

The function below calls the above function for each image in the input batch.

def pre_process(images, training):
    # Use TensorFlow to loop over all the input images and call
    # the function above which takes a single image as input.
    images = tf.map_fn(lambda image: pre_process_image(image, training), images)

    return images

In order to plot the distorted images, we create this preprocessing graph for TensorFlow, which will be run further below.

distorted_images = pre_process(images=x, training=True)

Helper function for creating the main network

The following helper function creates the main part of the convolutional neural network. It uses Pretty Tensor, which was described in Tutorial #03.

def main_network(images, training):
    # Wrap the input images as a Pretty Tensor object.
    x_pretty = pt.wrap(images)

    # Pretty Tensor uses special numbers to distinguish between
    # the training and testing phases.
    if training:
        phase = pt.Phase.train
    else:
        phase = pt.Phase.infer

    # Create the convolutional neural network using Pretty Tensor.
    # It is very similar to the previous tutorials, except
    # the use of so-called batch-normalization in the first layer.
    with pt.defaults_scope(activation_fn=tf.nn.relu, phase=phase):
        y_pred, loss = x_pretty.\
            conv2d(kernel=5, depth=64, name='layer_conv1', batch_normalize=True).\
            max_pool(kernel=2, stride=2).\
            conv2d(kernel=5, depth=64, name='layer_conv2').\
            max_pool(kernel=2, stride=2).\
            flatten().\
            fully_connected(size=256, name='layer_fc1').\
            fully_connected(size=128, name='layer_fc2').\
            softmax_classifier(num_classes=num_classes, labels=y_true)

    return y_pred, loss

Helper function for creating the neural network

The following helper function creates the full neural network, consisting of the preprocessing and the main network defined above.

Note that the neural network is enclosed in the variable scope named 'network'. This is because we actually create two neural networks in the TensorFlow graph. By assigning a variable scope like this, the variables can be reused by the two neural networks, so the variables optimized by the training network are reused by the test network.

def create_network(training):
    # Wrap the neural network in the scope named 'network'.
    # Create new variables during training, and re-use during testing.
    with tf.variable_scope('network', reuse=not training):
        # Just rename the input placeholder variable for convenience.
        images = x

        # Create TensorFlow graph for pre-processing.
        images = pre_process(images=images, training=training)

        # Create TensorFlow graph for the main processing.
        y_pred, loss = main_network(images=images, training=training)

    return y_pred, loss
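To make the variable reuse mechanism concrete, here is a tiny standalone sketch (not part of the tutorial, built in a scratch graph so it does not touch the notebook's default graph): get_variable() returns the same underlying variable when the scope is re-entered with reuse=True.

with tf.Graph().as_default():
    with tf.variable_scope('demo'):
        v1 = tf.get_variable('w', shape=[1])
    with tf.variable_scope('demo', reuse=True):
        v2 = tf.get_variable('w')
    print(v1.name, v2.name)   # both print 'demo/w:0' - the same variable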

Create a neural network for the training phase

First create a TensorFlow variable that keeps track of the number of optimization iterations performed so far. In the previous tutorials this was a Python variable, but in this tutorial we want to save this variable in the checkpoints together with all the other TensorFlow variables.

Setting trainable=False means TensorFlow will not try to optimize this variable.

global_step = tf.Variable(initial_value=0,
                          name='global_step', trainable=False)

Create the neural network to be used for training. The create_network() function returns both y_pred and loss, but during training we only need the loss.

_, loss = create_network(training=True)

Create an optimizer that minimizes the loss function. Also pass global_step to the optimizer so it is increased by one after each iteration.

optimizer = tf.train.AdamOptimizer(learning_rate=1e-4).minimize(loss, global_step=global_step)

Create a neural network for the test phase

Now create the neural network for the test phase. Again, create_network() returns the predicted labels y_pred for the input images, as well as the loss function used during optimization. During testing we only need y_pred.

y_pred, _ = create_network(training=False)

We then calculate the predicted class number as an integer. The output of the network, y_pred, is an array of 10 elements. The class number is the index of the largest element in this array.

y_pred_cls = tf.argmax(y_pred, dimension=1)

We then create a Boolean vector that tells us whether the true category of each image is the same as the predicted category.

correct_prediction = tf.equal(y_pred_cls, y_true_cls)

The classification accuracy is calculated by first type-casting the vector of Booleans to floats, so that False becomes 0 and True becomes 1, and then taking the average of these numbers.

accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

Saver

In order to save the variables of the neural network, so they can be reloaded without training the network again, we create a so-called Saver object, which is used to store and retrieve all the variables of the TensorFlow graph. Nothing is actually saved at this point; that is done further below in the optimize() function.

saver = tf.train.Saver()

Getting the weights

Further below we want to plot the weights of the neural network. When the network is created with Pretty Tensor, all the variables of the layers are created indirectly by Pretty Tensor, so we have to retrieve them from TensorFlow.

We used the names layer_conv1 and layer_conv2 for the two convolutional layers. These are also called variable scopes (not to be confused with the defaults_scope described above). Pretty Tensor automatically names the variables it creates for each layer, so we can retrieve a layer's weights using the scope name and the variable name.

The implementation is somewhat awkward because we have to use the TensorFlow function get_variable(), which was really designed for another purpose: either creating a new variable or reusing an existing one. The easiest approach is to make the following helper function.

def get_weights_variable(layer_name):
    # Retrieve an existing variable named 'weights' in the scope
    # with the given layer_name.
    # This is awkward because the TensorFlow function was
    # really intended for another purpose.

    with tf.variable_scope("network/" + layer_name, reuse=True):
        variable = tf.get_variable('weights')

    return variable

Using this helper function we can retrieve the variables. They are TensorFlow objects; to get their contents you need to run them, e.g. contents = session.run(weights_conv1), as demonstrated further below.

weights_conv1 = get_weights_variable(layer_name='layer_conv1')
weights_conv2 = get_weights_variable(layer_name='layer_conv2')

Getting the layer outputs

Similarly, we need to retrieve the outputs of the convolutional layers. This is done differently than getting the weights above: here we retrieve a reference to the tensor that is output by each convolutional layer.

def get_layer_output(layer_name):
    # The name of the last operation of the convolutional layer.
    # This assumes you are using Relu as the activation-function.
    tensor_name = "network/" + layer_name + "/Relu:0"

    # Get the tensor with this name.
    tensor = tf.get_default_graph().get_tensor_by_name(tensor_name)

    return tensor

Get the output of the convolution layer for later drawing.

output_conv1 = get_layer_output(layer_name='layer_conv1')
output_conv2 = get_layer_output(layer_name='layer_conv2')

Run TensorFlow

Create TensorFlow session

Once the TensorFlow graph has been created, we have to create a TensorFlow session in order to run the graph.

session = tf.Session()

Initialize or restore variables

Training this neural network may take a long time, especially if you do not have a GPU. We therefore save checkpoints during training so we can continue training at a later time (e.g. overnight), and also so we can analyze the results later without having to train the network from scratch every time.

If you want to restart the training of the neural network, you first have to delete these checkpoints.

This is a folder that holds the checkpoints.

save_dir = 'checkpoints/'

Create a folder if it does not exist.

if not os.path.exists(save_dir):
    os.makedirs(save_dir)

This is the base filename for the checkpoints; TensorFlow will append the iteration number, etc.

save_path = os.path.join(save_dir, 'cifar10_cnn')

Try to restore the latest checkpoint. This may fail and raise an exception, e.g. if the checkpoint does not exist or if the TensorFlow graph has been changed.

try:
    print("Trying to restore last checkpoint ...")

    # Use TensorFlow to find the latest checkpoint - if any.
    last_chk_path = tf.train.latest_checkpoint(checkpoint_dir=save_dir)

    # Try and load the data in the checkpoint.
    saver.restore(session, save_path=last_chk_path)

    # If we get to this point, the checkpoint was successfully loaded.
    print("Restored checkpoint from:", last_chk_path)
except:
    # If the above failed for some reason, simply
    # initialize all the variables for the TensorFlow graph.
    print("Failed to restore checkpoint. Initializing variables instead.")
    session.run(tf.global_variables_initializer())

Trying to restore last checkpoint ...
Restored checkpoint from: checkpoints/cifar10_cnn-150000

Helper function for creating a random training batch

There are 50,000 images in the training set. It would take a long time to calculate the gradient of the model using all of these images, so we only use a small batch of images in each iteration of the optimizer.

If your computer crashes or becomes very slow because it runs out of RAM, you can try lowering this number, but you may then need to perform more optimization iterations.

train_batch_size = 64

The function below picks a random batch of images from the training set.

def random_batch():
    # Number of images in the training-set.
    num_images = len(images_train)

    # Create a random index.
    idx = np.random.choice(num_images,
                           size=train_batch_size,
                           replace=False)

    # Use the random index to select random images and labels.
    x_batch = images_train[idx, :, :, :]
    y_batch = labels_train[idx, :]

    return x_batch, y_batch
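As a quick check (not part of the original notebook), a batch drawn with this function has the following shapes:

x_batch, y_batch = random_batch()
print(x_batch.shape)   # (64, 32, 32, 3)
print(y_batch.shape)   # (64, 10)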

Helper function for performing optimization iterations

This function performs a number of optimization iterations so as to gradually improve the variables of the network layers. In each iteration, a new batch of data is selected from the training set and TensorFlow executes the optimizer on these training samples. Progress is printed every 100 iterations. A checkpoint is saved every 1000 iterations and also after the last iteration.

def optimize(num_iterations):
    # Start-time used for printing time-usage below.
    start_time = time.time()

    for i in range(num_iterations):
        # Get a batch of training examples.
        # x_batch now holds a batch of images and
        # y_true_batch are the true labels for those images.
        x_batch, y_true_batch = random_batch()

        # Put the batch into a dict with the proper names
        # for placeholder variables in the TensorFlow graph.
        feed_dict_train = {x: x_batch,
                           y_true: y_true_batch}

        # Run the optimizer using this batch of training data.
        # TensorFlow assigns the variables in feed_dict_train
        # to the placeholder variables and then runs the optimizer.
        # We also want to retrieve the global_step counter.
        i_global, _ = session.run([global_step, optimizer],
                                  feed_dict=feed_dict_train)

        # Print status to screen every 100 iterations (and last).
        if (i_global % 100 == 0) or (i == num_iterations - 1):
            # Calculate the accuracy on the training-batch.
            batch_acc = session.run(accuracy,
                                    feed_dict=feed_dict_train)

            # Print status.
            msg = "Global Step: {0:>6}, Training Batch Accuracy: {1:>6.1%}"
            print(msg.format(i_global, batch_acc))

        # Save a checkpoint to disk every 1000 iterations (and last).
        if (i_global % 1000 == 0) or (i == num_iterations - 1):
            # Save all variables of the TensorFlow graph to a
            # checkpoint. Append the global_step counter
            # to the filename so we save the last several checkpoints.
            saver.save(session,
                       save_path=save_path,
                       global_step=global_step)

            print("Saved checkpoint.")

    # Ending time.
    end_time = time.time()

    # Difference between start and end-times.
    time_dif = end_time - start_time

    # Print the time-usage.
    print("Time usage: " + str(timedelta(seconds=int(round(time_dif)))))Copy the code

Helper function for plotting misclassified images

This function plots examples of images from the test set that have been misclassified.

def plot_example_errors(cls_pred, correct):
    # This function is called from print_test_accuracy() below.

    # cls_pred is an array of the predicted class-number for
    # all images in the test-set.

    # correct is a boolean array whether the predicted class
    # is equal to the true class for each image in the test-set.

    # Negate the boolean array.
    incorrect = (correct == False)

    # Get the images from the test-set that have been
    # incorrectly classified.
    images = images_test[incorrect]

    # Get the predicted classes for those images.
    cls_pred = cls_pred[incorrect]

    # Get the true classes for those images.
    cls_true = cls_test[incorrect]

    # Plot the first 9 images.
    plot_images(images=images[0:9],
                cls_true=cls_true[0:9],
                cls_pred=cls_pred[0:9])

Helper function for plotting the confusion matrix

def plot_confusion_matrix(cls_pred):
    # This is called from print_test_accuracy() below.

    # cls_pred is an array of the predicted class-number for
    # all images in the test-set.

    # Get the confusion matrix using sklearn.
    cm = confusion_matrix(y_true=cls_test,  # True class for test-set.
                          y_pred=cls_pred)  # Predicted class.

    # Print the confusion matrix as text.
    for i in range(num_classes):
        # Append the class-name to each line.
        class_name = "({}) {}".format(i, class_names[i])
        print(cm[i, :], class_name)

    # Print the class-numbers for easy reference.
    class_numbers = ["({0})".format(i) for i in range(num_classes)]
    print("".join(class_numbers))Copy the code

Helper function for computing classifications

This function computes the predicted categories of images and returns a Boolean array representing whether each image is correctly classified.

The calculation is done in batches because it might otherwise use too much RAM. If your computer crashes, try lowering the batch size.

# Split the data-set in batches of this size to limit RAM usage.
batch_size = 256

def predict_cls(images, labels, cls_true):
    # Number of images.
    num_images = len(images)

    # Allocate an array for the predicted classes which
    # will be calculated in batches and filled into this array.
    cls_pred = np.zeros(shape=num_images, dtype=np.int)

    # Now calculate the predicted classes for the batches.
    # We will just iterate through all the batches.
    # There might be a more clever and Pythonic way of doing this.

    # The starting index for the next batch is denoted i.
    i = 0

    while i < num_images:
        # The ending index for the next batch is denoted j.
        j = min(i + batch_size, num_images)

        # Create a feed-dict with the images and labels
        # between index i and j.
        feed_dict = {x: images[i:j, :],
                     y_true: labels[i:j, :]}

        # Calculate the predicted class using TensorFlow.
        cls_pred[i:j] = session.run(y_pred_cls, feed_dict=feed_dict)

        # Set the start-index for the next batch to the
        # end-index of the current batch.
        i = j

    # Create a boolean array whether each image is correctly classified.
    correct = (cls_true == cls_pred)

    return correct, cls_pred

Calculate the predicted classes for the test set.

def predict_cls_test():
    return predict_cls(images = images_test,
                       labels = labels_test,
                       cls_true = cls_test)

Helper function for calculating the classification accuracy

This function calculates the classification accuracy given a Boolean array indicating whether each image was correctly classified. For example, classification_accuracy([True, True, False, False, False]) = 2/5 = 0.4. The function also returns the number of correct classifications.

def classification_accuracy(correct):
    # When averaging a boolean array, False means 0 and True means 1.
    # So we are calculating: number of True / len(correct) which is
    # the same as the classification accuracy.

    # Return the classification accuracy
    # and the number of correct classifications.
    return correct.mean(), correct.sum()
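For example (a small check mirroring the calculation described above, not part of the original notebook):

demo = np.array([True, True, False, False, False])
print(classification_accuracy(demo))   # (0.4, 2)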

Helper function for showing performance

The print_test_accuracy() function prints the classification accuracy on the test set.

It takes a while to compute the classifications for all the images in the test set, which is why the functions above are called directly from this function, so the classifications do not have to be recalculated by each function.

def print_test_accuracy(show_example_errors=False, show_confusion_matrix=False):

    # For all the images in the test-set,
    # calculate the predicted classes and whether they are correct.
    correct, cls_pred = predict_cls_test()

    # Classification accuracy and the number of correct classifications.
    acc, num_correct = classification_accuracy(correct)

    # Number of images being classified.
    num_images = len(correct)

    # Print the accuracy.
    msg = "Accuracy on Test-Set: {0:.1%} ({1} / {2})"
    print(msg.format(acc, num_correct, num_images))

    # Plot some examples of mis-classifications, if desired.
    if show_example_errors:
        print("Example errors:")
        plot_example_errors(cls_pred=cls_pred, correct=correct)

    # Plot the confusion matrix, if desired.
    if show_confusion_matrix:
        print("Confusion Matrix:")
        plot_confusion_matrix(cls_pred=cls_pred)

Helper function for plotting convolutional weights

def plot_conv_weights(weights, input_channel=0):
    # Assume weights are TensorFlow ops for 4-dim variables
    # e.g. weights_conv1 or weights_conv2.

    # Retrieve the values of the weight-variables from TensorFlow.
    # A feed-dict is not necessary because nothing is calculated.
    w = session.run(weights)

    # Print statistics for the weights.
    print("Min: {0:.5f}, Max: {1:.5f}".format(w.min(), w.max()))
    print("Mean: {0:.5f}, Stdev: {1:.5f}".format(w.mean(), w.std()))

    # Get the lowest and highest values for the weights.
    # This is used to correct the colour intensity across
    # the images so they can be compared with each other.
    w_min = np.min(w)
    w_max = np.max(w)
    abs_max = max(abs(w_min), abs(w_max))

    # Number of filters used in the conv. layer.
    num_filters = w.shape[3]

    # Number of grids to plot.
    # Rounded-up, square-root of the number of filters.
    num_grids = math.ceil(math.sqrt(num_filters))

    # Create figure with a grid of sub-plots.
    fig, axes = plt.subplots(num_grids, num_grids)

    # Plot all the filter-weights.
    for i, ax in enumerate(axes.flat):
        # Only plot the valid filter-weights.
        if i<num_filters:
            # Get the weights for the i'th filter of the input channel.
            # The format of this 4-dim tensor is determined by the
            # TensorFlow API. See Tutorial #02 for more details.
            img = w[:, :, input_channel, i]

            # Plot image.
            ax.imshow(img, vmin=-abs_max, vmax=abs_max,
                      interpolation='nearest', cmap='seismic')

        # Remove ticks from the plot.
        ax.set_xticks([])
        ax.set_yticks([])

    # Ensure the plot is shown correctly with multiple plots
    # in a single Notebook cell.
    plt.show()

Helper function for plotting the output of a convolutional layer

def plot_layer_output(layer_output, image):
    # Assume layer_output is a 4-dim tensor
    # e.g. output_conv1 or output_conv2.

    # Create a feed-dict which holds the single input image.
    # Note that TensorFlow needs a list of images,
    # so we just create a list with this one image.
    feed_dict = {x: [image]}

    # Retrieve the output of the layer after inputting this image.
    values = session.run(layer_output, feed_dict=feed_dict)

    # Get the lowest and highest values.
    # This is used to correct the colour intensity across
    # the images so they can be compared with each other.
    values_min = np.min(values)
    values_max = np.max(values)

    # Number of image channels output by the conv. layer.
    num_images = values.shape[3]

    # Number of grid-cells to plot.
    # Rounded-up, square-root of the number of filters.
    num_grids = math.ceil(math.sqrt(num_images))

    # Create figure with a grid of sub-plots.
    fig, axes = plt.subplots(num_grids, num_grids)

    # Plot all the filter-weights.
    for i, ax in enumerate(axes.flat):
        # Only plot the valid image-channels.
        if i<num_images:
            # Get the images for the i'th output channel.
            img = values[0, :, :, i]

            # Plot image.
            ax.imshow(img, vmin=values_min, vmax=values_max,
                      interpolation='nearest', cmap='binary')

        # Remove ticks from the plot.
        ax.set_xticks([])
        ax.set_yticks([])

    # Ensure the plot is shown correctly with multiple plots
    # in a single Notebook cell.
    plt.show()

Examples of distorted input images

In order to artificially increase the number of training images, the neural network's preprocessing creates random variations of the input images. This should make the neural network more flexible at recognizing and classifying images.

This is a helper function used to draw variations of the input image.

def plot_distorted_image(image, cls_true):
    # Repeat the input image 9 times.
    image_duplicates = np.repeat(image[np.newaxis, :, :, :], 9, axis=0)

    # Create a feed-dict for TensorFlow.
    feed_dict = {x: image_duplicates}

    # Calculate only the pre-processing of the TensorFlow graph
    # which distorts the images in the feed-dict.
    result = session.run(distorted_images, feed_dict=feed_dict)

    # Plot the images.
    plot_images(images=result, cls_true=np.repeat(cls_true, 9))

Helper function for getting an image from the test set along with its true class number.

def get_test_image(i):
    return images_test[i, :, :, :], cls_test[i]

Take an image from the test set and its real category.

img, cls = get_test_image(16)

Draw nine random variations of the image. If you rerun the code, you might get a different result.

plot_distorted_image(img, cls)

Perform optimization

My laptop has a quad-core 2 GHz CPU. It also has a GPU, but the GPU is not fast enough for TensorFlow, so only the CPU is used. On this computer it takes about an hour to perform 10,000 optimization iterations. For this tutorial I performed 150,000 optimization iterations, which took about 15 hours; I let it run overnight and in several sessions during the day.

Checkpoints are saved during optimization and the latest checkpoint is restored when the code is re-run, so the optimization can be stopped and continued later. Set the if-statement below to True if you want to perform the optimization yourself.

if False:
    optimize(num_iterations=1000)

Results

After 150,000 optimization iterations, the classification accuracy on the test set is about 79-80%. Some of the misclassified images are plotted below. Some of them are hard even for the human eye to classify, and some are reasonable mistakes, e.g. confusing an automobile with a truck or a cat with a dog, but a few of the errors seem a bit strange.

print_test_accuracy(show_example_errors=True,
                    show_confusion_matrix=True)

Accuracy on Test-Set: 79.3% (7932 / 10000)
Example errors:

Confusion Matrix:
[775  20  71   8  14   4  18  10  44  36] (0) airplane
[  7 914   5   0   3   7   9   3  14  38] (1) automobile
[ 32   2 724  28  42  44  94  17   9   8] (2) bird
[ 18   7  48 508  56 209  99  29   7  19] (3) cat
[  4   2  45  25 769  29  75  43   3   5] (4) deer
[  8   6  34  89  35 748  38  32   1   9] (5) dog
[  4   2  18   9  14  14 930   4   2   3] (6) frog
[  6   2  23  18  31  55  17 833   0  15] (7) horse
[ 31  29  15  11   8   7  15   0 856  28] (8) ship
[ 13  67   4   5   0   4   7   7  18 875] (9) truck
 (0) (1) (2) (3) (4) (5) (6) (7) (8) (9)

Convolutional weights

Some of the weights (or filters) for the first convolutional layer are shown below. There is one such set of filters for each of the three input channels; change input_channel to plot the weights for another channel.

Positive weights are red, negative weights are blue.

plot_conv_weights(weights=weights_conv1, input_channel=0)

Min:  -0.61643, Max:   0.63949
Mean: -0.00177, Stdev: 0.16469
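As noted above, there is one such set of filters per input channel. To plot the filters for a different channel you could, for example, call (output not reproduced here):

plot_conv_weights(weights=weights_conv1, input_channel=1)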

Some weights (or filters) for the second convolution layer are shown below. They’re much closer to zero than the first convolution layer, and you can see the lower standard deviation.

plot_conv_weights(weights=weights_conv2, input_channel=1)

Min:  -0.73326, Max:   0.25344
Mean: -0.00394, Stdev: 0.05466

Output of the convolutional layers

Helper function for plotting an image.

def plot_image(image):
    # Create figure with sub-plots.
    fig, axes = plt.subplots(1, 2)

    # References to the sub-plots.
    ax0 = axes.flat[0]
    ax1 = axes.flat[1]

    # Show raw and smoothened images in sub-plots.
    ax0.imshow(image, interpolation='nearest')
    ax1.imshow(image, interpolation='spline16')

    # Set labels.
    ax0.set_xlabel('Raw')
    ax1.set_xlabel('Smooth')

    # Ensure the plot is shown correctly with multiple plots
    # in a single Notebook cell.
    plt.show()

Draw an image from the test set. An unprocessed pixel image is used as input to the neural network.

img, cls = get_test_image(16)
plot_image(img)

The raw image is used as the input of the neural network, and then the output of the first convolution layer is drawn.

plot_layer_output(output_conv1, image=img)

Take the same image as input and plot the output of the second convolution layer.

plot_layer_output(output_conv2, image=img)

Predicted class labels

Get the predicted class label and class number for this image.

label_pred, cls_pred = session.run([y_pred, y_pred_cls],
                                   feed_dict={x: [img]})

Print the predicted class label.

# Set the rounding options for numpy.
np.set_printoptions(precision=3, suppress=True)

# Print the predicted label.
print(label_pred[0])

[ 0.     0.     0.     0.493  0.     0.49   0.006  0.01   0.     0.   ]

The predicted class label is an array of 10 elements, where each element indicates how confident the neural network is that the image belongs to that class.

In this example, index 3 has a value of 0.493 and index 5 has a value of 0.490. This means that the neural network believes that the image is either category 3 or category 5, cat or dog.

class_names[3]

‘cat’

class_names[5]

‘dog’
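If you just want the single most likely class, it can also be picked directly with argmax (a small addition, not in the original notebook):

best = np.argmax(label_pred[0])
print(class_names[best])    # 'cat' in this run
print(label_pred[0, best])  # 0.493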

Close the TensorFlow session

We are now done using TensorFlow, so we close the session to release its resources.

# This has been commented out in case you want to modify and experiment
# with the Notebook without having to restart it.
# session.close()

Conclusion

This tutorial described how to create a convolutional neural network for classifying images in the CIFAR-10 dataset. The classification accuracy on the test set was about 79-80%.

The outputs of the convolutional layers were also plotted, but it is hard to see from them how the neural network recognizes and classifies the input images. Better visualization techniques are needed.

Exercises

Here are some suggested exercises that may help improve your skills with TensorFlow. Practical experience is important in order to learn how to use TensorFlow properly.

Before you make changes to this Notebook, you may want to make a backup.

  • Perform 10,000 optimization iterations to see how accurate the classification becomes. A checkpoint will be saved that stores all the variables of the TensorFlow graph.
  • Perform another 100,000 optimization iterations to see if the classification accuracy improves, and then another 100,000 after that. Does the accuracy improve, and do you think it is worth the extra computation time?
  • Try changing the distortions applied to the images in the preprocessing phase.
  • Try changing the structure of the neural network. You can make it bigger or smaller. How does this affect the training time and the classification accuracy? Note that after changing the architecture you cannot reload the checkpoint with the previously optimized variables.
  • Try using batch-normalization in the second convolutional layer as well. Also try removing it from both layers.
  • Look at some of the better neural networks on CIFAR-10 and try to implement them.
  • Explain to a friend how the program works.