In analyzing the official TensorFlow ResNet model implementation, we examine the basics of implementing and running a model with Estimator. In addition, the source code provides an implementation of a technique commonly used with neural networks: transfer learning.

Transfer learning

Depending on the task, training a deep neural network from scratch can require a huge amount of data to produce good results. If you have limited data and want to use a neural network as a solution, try transfer learning.

Take an example: you maintain an automated production line in a factory, where 10 different kinds of parts pass randomly on a conveyor belt. An industrial camera captures each complete part one by one, but you need to adjust the subsequent manipulator action according to the type of part. Very few part images are available for training, yet you happen to have an ImageNet image classification model trained on a large dataset. Making full use of these two resources is a typical application scenario for transfer learning.

What does transfer learning transfer?

Deep neural networks are organized into layers, and for a CNN, feature extraction across the convolutional layers is likewise hierarchical. Specifically, the bottom convolutional layers are sensitive to low-order features, such as edges and blobs; as the hierarchy rises, the extracted features become more and more abstract. This hierarchical feature extraction capability is the basis of transfer learning. It ensures that when tasks are similar, such as classifying 1,000 kinds of natural objects (as in ImageNet) versus 10 kinds of parts, the feature extraction layers of a trained neural network can be "migrated" to the new classification task and continue to extract features there.
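To make the hierarchy concrete, here is a minimal sketch that probes one early and one late activation of a pretrained ResNet50. It uses the Keras API rather than the Estimator-based code discussed below, and the layer names follow the TF 2.x version of tf.keras.applications.ResNet50 (they differ in older releases).

import tensorflow as tf

# Pretrained feature extractor; the ImageNet classification head is dropped.
base = tf.keras.applications.ResNet50(weights='imagenet', include_top=False)

# Expose one early and one late activation to compare feature levels.
probe = tf.keras.Model(
    inputs=base.input,
    outputs=[base.get_layer('conv2_block1_out').output,   # low-order: edges, blobs
             base.get_layer('conv5_block3_out').output])  # high-order: abstract features

# A random input, used only to illustrate the shapes involved.
image = tf.random.uniform([1, 224, 224, 3])
low, high = probe(image)
print(low.shape, high.shape)  # (1, 56, 56, 256) and (1, 7, 7, 2048)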

How transfer learning is done in practice

Common practice involves two steps (a sketch follows the list):

  1. "Freeze" the feature extraction part of the pretrained network.
  2. Use the new data to train only the last few fully connected layers, which produce the classification output.
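As a concrete illustration, here is a minimal Keras sketch of the two steps. It is not the Estimator-based code the official implementation uses, and NUM_PARTS and the training data variables are hypothetical stand-ins for the factory example above.

import tensorflow as tf

NUM_PARTS = 10  # hypothetical: 10 kinds of parts on the conveyor belt

# Step 1: "freeze" the pretrained feature extraction part.
base = tf.keras.applications.ResNet50(
    weights='imagenet', include_top=False, pooling='avg')
base.trainable = False  # no gradient updates for these layers

# Step 2: train only a new fully connected classification head.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(NUM_PARTS, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
# model.fit(part_images, part_labels, epochs=...)  # your small dataset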

How it is implemented in TensorFlow

The official ResNet model implementation supports transfer learning out of the box: just specify --pretrained_model_checkpoint_path and --fine_tune.
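A hypothetical invocation might look as follows; imagenet_main.py is the ResNet entry script in the official models repository, and the paths here are placeholders.

python imagenet_main.py \
  --model_dir=/tmp/parts_model \
  --pretrained_model_checkpoint_path=/path/to/pretrained/checkpoint \
  --fine_tune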

In the code, the final dense layer is skipped when the pretrained model is loaded:

if flags_obj.pretrained_model_checkpoint_path is not None:
  warm_start_settings = tf.estimator.WarmStartSettings(
      flags_obj.pretrained_model_checkpoint_path,
      vars_to_warm_start='^(?!.*dense)')

The vars_to_warm_start parameter uses a regular expression, a negative lookahead, to exclude the final fully connected layer from warm starting.
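To see what the pattern does, here is a small standalone check; the variable names are hypothetical, modeled on the resnet_model/... naming convention of the official implementation.

import re

# Negative lookahead: match (and therefore warm-start) only names
# that do not contain "dense" anywhere.
pattern = re.compile('^(?!.*dense)')

names = [
    'resnet_model/conv2d/kernel',              # warm-started
    'resnet_model/batch_normalization/gamma',  # warm-started
    'resnet_model/dense/kernel',               # final FC layer: skipped
    'resnet_model/dense/bias',                 # final FC layer: skipped
]
for name in names:
    print(name, bool(pattern.match(name)))

The two 'dense' variables fail the match, so they are left at their fresh initialization and trained from scratch on the new task.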

Then, when parameters are updated from the gradients, the parts that should not be updated are filtered out:

grad_vars = optimizer.compute_gradients(loss)
if fine_tune:
  grad_vars = _dense_grad_filter(grad_vars)
minimize_op = optimizer.apply_gradients(grad_vars, global_step)

The implementation of _dense_grad_filter is as follows:

def _dense_grad_filter(gvs):
  """Only apply gradient updates to the final layer.

  This function is used for fine tuning.

  Args:
    gvs: list of tuples with gradients and variable info
  Returns:
    filtered gradients so that only the dense layer remains
  """
  return [(g, v) for g, v in gvs if 'dense' in v.name]

This implementation relies on the variables' name attribute. Therefore, when modifying the network, make sure the names of any nodes you add do not collide with this convention, that is, do not accidentally contain "dense".
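A quick sanity check with stand-in (gradient, variable) pairs shows the risk; the names are hypothetical, and "condenser" is chosen deliberately because it contains the substring "dense".

import collections

def _dense_grad_filter(gvs):
  return [(g, v) for g, v in gvs if 'dense' in v.name]

# Lightweight stand-in for a tf.Variable: only the name attribute matters here.
Var = collections.namedtuple('Var', ['name'])

grad_vars = [
    ('grad_conv', Var('resnet_model/conv2d/kernel')),  # dropped, as intended
    ('grad_fc', Var('resnet_model/dense/kernel')),     # kept: the new FC head
    ('grad_new', Var('my_block/condenser/kernel')),    # kept by accident
]
print([v.name for _, v in _dense_grad_filter(grad_vars)])
# ['resnet_model/dense/kernel', 'my_block/condenser/kernel']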

References

Transfer learning Tensorflow implementation for image recognition

Yinguobing.com/tensorflow-…

Tensorflow Estimator uses hooks to implement fine-tuning

Github.com/tensorflow/…

medium.com/@utsumuki_n…

Github.com/tensorflow/…

Stackoverflow.com/questions/4…

How does TensorFlow implement Transfer Learning

TensorFlow transfer learning: a practical case of flower recognition

Building a Custom Estimator for large-scale distributed deep learning models based on the TensorFlow high-level API (using a CNN model for text classification as an example)