This is a story about an IT guy.

The IT guy starts learning how to identify roses using the Inception code base.

Background knowledge

In the field of machine learning, convolutional neural networks are at the heart of most state-of-the-art computer vision solutions for a wide variety of tasks. TensorFlow’s Inception code base solves the image classification problem with this kind of network. The Inception network is the implementation of the 2015 paper Rethinking the Inception Architecture for Computer Vision. That paper explores how to use added computation as efficiently as possible through suitably factorized convolutions and aggressive regularization, achieving substantial gains over the previous state of the art. The paper also describes several guiding principles for network design. Although the usefulness of these principles, which are based on large-scale experiments, is somewhat speculative and more experimental data will be needed to assess their accuracy and validity, it has been observed that serious deviations from them tend to degrade network quality, and that fixing such deviations generally improves the overall architecture.

Principle 1: Do not over-compress the data early in the network. When designing the network, features should be extracted roughly in proportion to the dimensions of each feature map. The size of the feature maps usually decreases gradually from input to output, but if the data is compressed too aggressively in the early layers, a large amount of information is lost. Information content is not measured only by the dimensionality of the feature map; the correlation structure and other important factors matter as well, and this kind of information loss hurts the quality of training.

Principle 2: Higher-dimensional feature-map representations are easier to process locally within the network. Gradually adding nonlinear activations to a convolutional network lets it disentangle more features and speeds up training.

Principle 3: Reducing the dimensionality of multi-channel, low-dimensional features before a convolution does not reduce the feature information. For example, before performing a 3×3 convolution, the dimension of the input representation can be reduced before spatial aggregation without serious adverse effects. The presumed reason is that the strong correlation between adjacent units means that much less information is lost during dimensionality reduction. Given this, such compression should be possible, and the smaller representations should also speed up learning.

Principle 4: Balance the width and depth of the network to find the best structure. Optimal performance is achieved by balancing the number of filters per stage against the depth of the network. Increasing both width and depth can improve quality, and increasing them in parallel gives the best result for a constant amount of computation, so the computational budget should be distributed in a balanced way between depth and width.
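As a concrete illustration of Principle 3, the sketch below (not taken from the Inception code base; the layer sizes and names are assumptions) reduces the channel dimension with a 1×1 convolution before applying a 3×3 convolution:

```python
import tensorflow as tf

def reduce_then_convolve(inputs):
    # inputs: a feature map, e.g. of shape [batch, height, width, 256]
    # A 1x1 convolution reduces the channel dimension before spatial aggregation.
    reduced = tf.layers.conv2d(inputs, filters=64, kernel_size=1,
                               activation=tf.nn.relu, name='reduce_1x1')
    # The 3x3 convolution then operates on the lower-dimensional embedding.
    return tf.layers.conv2d(reduced, filters=192, kernel_size=3, padding='same',
                            activation=tf.nn.relu, name='conv_3x3')
```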

While these principles may make sense, applying them directly to improve network quality is not easy; even the Inception network uses them only in a loose, gradual way. For further details, please refer to Rethinking the Inception Architecture for Computer Vision. The 2017 ICLR paper An Analysis of Deep Neural Network Models for Practical Applications provides a comprehensive comparison of the metrics that matter in practice: accuracy, memory footprint, number of parameters, operation count, inference time, and power consumption. Among the networks compared, Inception achieves the best accuracy, though not the best result along every other dimension; the accuracy comparison can be found in that paper.

Introduction to Inception Code

Along with TensorFlow, Google has open-sourced a code repository called Models. It contains many different models implemented in [TensorFlow](https://www.tensorflow.org), organized as follows:

- official: a collection of example models that use TensorFlow's high-level APIs. They are developed against the latest stable TensorFlow API, tested, and kept up to date; this is where new TensorFlow users are recommended to start.
- research: a large collection of models implemented in TensorFlow by researchers. They are not officially supported and are maintained by the individual researchers, but this is the most active code directory.
- samples: code snippets and smaller models that demonstrate TensorFlow features, including code from various blog posts.
- tutorials: the set of models described in the TensorFlow tutorials.

The Inception code base introduced in this article is part of research; its file structure and description are as follows:

├── README.md                       # A very comprehensive description of the Inception code base
├── WORKSPACE                       # Used by the bazel build
├── inception_v3_architecture.png   # Illustration of the Inception v3 network architecture
├── BUILD
├── data                            # Annotation and conversion code for the training data
│   ├── build_image_data.py         # Convert image data to TFRecord format using Example protos
│   ├── build_imagenet_data.py      # Convert the ImageNet dataset to TFRecord format using Example protos
│   ├── download_and_preprocess_flowers.sh    # Download the flowers dataset and convert it to TFRecord format
│   ├── download_and_preprocess_imagenet.sh   # Download the 2012 ImageNet training and evaluation datasets and convert them to TFRecord format
│   ├── download_imagenet.sh        # Download the 2012 ImageNet training and evaluation datasets
│   ├── imagenet_2012_validation_synset_labels.txt   # Validation labels, processed with preprocess_imagenet_validation_data.py
│   ├── imagenet_lsvrc_2015_synsets.txt   # Training data labels
│   ├── imagenet_metadata.txt       # Training data labels and their corresponding semantics
│   ├── preprocess_imagenet_validation_data.py   # Associate 2012 ImageNet images with labels
│   └── process_bounding_boxes.py   # Script that associates image data with bounding-box label data
├── dataset.py                      # Lightweight library for managing training data sets
├── flowers_data.py                 # Flower training data managed through DataSet
├── flowers_eval.py                 # Evaluation script for the flower classification model; wraps inception_eval.py
├── flowers_train.py                # Training script for flower classification; wraps inception_train.py
├── image_processing.py             # Library for processing a single image; supports multi-threaded, parallel preprocessing
├── imagenet_data.py                # ImageNet training data managed through DataSet
├── imagenet_distributed_train.py   # Distributed ImageNet training; wraps inception_distributed_train.py
├── imagenet_eval.py                # Evaluation script for the ImageNet training model
├── imagenet_train.py               # Training script for the ImageNet dataset
├── inception_distributed_train.py  # Distributed training of the Inception network
├── inception_eval.py               # Evaluation and validation library for a trained network model
├── inception_model.py              # Builds the Inception v3 network model on a dataset
├── inception_train.py              # Inception network training script library
└── slim                            # A lightweight TensorFlow code base with model definition, training, and evaluation code; not discussed in this article

This article analyzes only the core code of inception_train.py (later updates to this script have been moved to the Slim code base; please refer to the source code for the other files).

First, the flag parameters used by the Inception network training script are described.

train_dir: the directory for event logs and checkpoint files, i.e. the events.out.xxx and model.ckpt-xxxx files generated by training. The default value is '/tmp/imagenet_train'. max_steps: the maximum number of training steps. The default value is 10000000. subset: either 'train' or 'validation'. The default value is 'train'.

num_gpus: manages the hardware that runs TensorFlow by specifying the number of GPUs to use. The default value is 1. log_device_placement: whether to log device placement. The default value is False.

fine_tune: manages the type of training. If set, the weights of the last layer are randomly initialized so the network can be trained on a new task. The default value is False, which means training continues from a pre-trained model. pretrained_model_checkpoint_path: the path of the pre-trained model to load when a new training task starts.

The following are tuning parameters for the learning rate; they depend heavily on the hardware architecture, the batch size, and changes to the model architecture. Choosing a well-tuned learning rate is an empirical process that requires some experimentation. initial_learning_rate: the initial learning rate. The default value is 0.1. num_epochs_per_decay: the number of epochs between learning-rate decays. learning_rate_decay_factor: the factor by which the learning rate is decayed.
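As a rough sketch (not the actual source; the exact definitions live in inception_train.py and the related scripts and may differ in wording), flags like these are typically declared with tf.app.flags:

```python
import tensorflow as tf

tf.app.flags.DEFINE_string('train_dir', '/tmp/imagenet_train',
                           'Directory for event logs and checkpoints.')
tf.app.flags.DEFINE_integer('max_steps', 10000000, 'Maximum number of training steps.')
tf.app.flags.DEFINE_string('subset', 'train', "Either 'train' or 'validation'.")
tf.app.flags.DEFINE_integer('num_gpus', 1, 'How many GPUs to use.')
tf.app.flags.DEFINE_boolean('log_device_placement', False,
                            'Whether to log device placement.')
tf.app.flags.DEFINE_boolean('fine_tune', False,
                            'If set, randomly initialize the final layer of weights.')
tf.app.flags.DEFINE_string('pretrained_model_checkpoint_path', '',
                           'Path to a pre-trained model checkpoint.')
tf.app.flags.DEFINE_float('initial_learning_rate', 0.1, 'Initial learning rate.')
tf.app.flags.DEFINE_float('num_epochs_per_decay', 30.0,
                          'Epochs after which the learning rate decays.')
tf.app.flags.DEFINE_float('learning_rate_decay_factor', 0.16,
                          'Learning rate decay factor.')

FLAGS = tf.app.flags.FLAGS
```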

Next, let's analyze the _tower_loss method defined in the script.

Function: _tower_loss calculates the total loss on a single tower of the training model. During training, the program splits each batch of images across the towers; that is, if batch_size = 32 and num_gpus = 2, each tower processes 16 images. 1. Build the network inference graph using Inception.

with tf.variable_scope(tf.get_variable_scope(), reuse=reuse_variables):
    logits = inception.inference(images, num_classes, for_training=True,
                                 restore_logits=restore_logits,
                                 scope=scope)

2. Construct the part of the inference graph that calculates the loss.

split_batch_size = images.get_shape().as_list()[0]
inception.loss(logits, labels, batch_size=split_batch_size)

3. Assemble only the losses of the current tower.

losses = tf.get_collection(slim.losses.LOSSES_COLLECTION, scope)

Here slim comes from the losses.py file in the slim directory of Inception; this call retrieves all of the network's loss values, where LOSSES_COLLECTION = '_losses'.
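To make the collection mechanism concrete, here is a hedged sketch (not the actual slim/losses.py code) of how a loss op ends up in the '_losses' collection that _tower_loss later reads:

```python
import tensorflow as tf

LOSSES_COLLECTION = '_losses'

def cross_entropy_loss(logits, one_hot_labels):
    # Compute the data loss and register it in the shared collection so that
    # tf.get_collection(LOSSES_COLLECTION, scope) can gather it per tower.
    loss = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(labels=one_hot_labels, logits=logits))
    tf.add_to_collection(LOSSES_COLLECTION, loss)
    return loss
```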

4. Calculate the total loss of the current tower, including regularization losses.

regularization_losses = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
total_loss = tf.add_n(losses + regularization_losses, name='total_loss')

Adding regularization to the loss function is an important way to prevent overfitting; the regularization loss values of all variables are obtained from the tf.GraphKeys.REGULARIZATION_LOSSES collection.
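As an illustration of this generic TensorFlow mechanism (not a line from the Inception code; the weight-decay scale is an example value), a variable created with a regularizer automatically contributes an entry to that collection:

```python
import tensorflow as tf

# The L2 penalty on `weights` is added to tf.GraphKeys.REGULARIZATION_LOSSES automatically.
weights = tf.get_variable(
    'weights', shape=[3, 3, 64, 64],
    regularizer=tf.contrib.layers.l2_regularizer(0.00004))

regularization_losses = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
```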

5. Compute the moving average of the individual losses and the total loss, using a decay rate of 0.9.

loss_averages = tf.train.ExponentialMovingAverage(0.9, name='avg')
loss_averages_op = loss_averages.apply(losses + [total_loss])

tf.train.ExponentialMovingAverage creates the moving averages and the op that updates them, controlling how quickly the tracked values change. Each loss is recorded in a global collection (GraphKeys.MOVING_AVERAGE_VARIABLES), and apply() then maintains the iterative moving average for each of those variables.
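A tiny standalone example of this mechanism (the values are made up; this is not the training script itself):

```python
import tensorflow as tf

total_loss = tf.Variable(2.5, name='total_loss', trainable=False)
loss_averages = tf.train.ExponentialMovingAverage(0.9, name='avg')
loss_averages_op = loss_averages.apply([total_loss])    # creates a shadow variable

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(tf.assign(total_loss, 1.5))                # the loss drops on a new step
    sess.run(loss_averages_op)                          # shadow := 0.9*shadow + 0.1*value
    print(sess.run(loss_averages.average(total_loss)))  # 0.9*2.5 + 0.1*1.5 = 2.4
```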

Finally, let's analyze Inception's training function.

Function: train trains on the data set.

Create a variable, global_step, to count the number of train() calls. This equals the number of batches processed multiplied by FLAGS.num_gpus.

global_step = tf.get_variable(
        'global_step', [],
        initializer=tf.constant_initializer(0), trainable=False)

# Calculate the learning rate adjustment scheme.

num_batches_per_epoch = (dataset.num_examples_per_epoch() / FLAGS.batch_size)
decay_steps = int(num_batches_per_epoch * FLAGS.num_epochs_per_decay)

# Decay the learning rate exponentially based on the number of steps.

lr = tf.train.exponential_decay(FLAGS.initial_learning_rate,
                                    global_step,
                                    decay_steps,
                                    FLAGS.learning_rate_decay_factor,
                                    staircase=True)
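With staircase=True, the decayed learning rate follows a step function; a quick sketch of the equivalent formula (not the library source):

```python
# lr = initial_learning_rate * decay_factor ** floor(global_step / decay_steps)
def staircase_lr(initial_learning_rate, decay_factor, global_step, decay_steps):
    return initial_learning_rate * decay_factor ** (global_step // decay_steps)

# With the script defaults (initial rate 0.1, decay factor 0.16), after two full
# decay periods the rate becomes 0.1 * 0.16**2 = 0.00256.
print(staircase_lr(0.1, 0.16, 2 * 10000, 10000))
```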

Create an optimizer that performs gradient descent using the RMSProp algorithm.

opt = tf.train.RMSPropOptimizer(lr, RMSPROP_DECAY,
                                    momentum=RMSPROP_MOMENTUM,
                                    epsilon=RMSPROP_EPSILON)

# Fetch ImageNet images and labels and split the batch across the GPUs.

split_batch_size = int(FLAGS.batch_size / FLAGS.num_gpus)

# Override the number of preprocessing threads to account for the increased number of towers.

num_preprocess_threads = FLAGS.num_preprocess_threads * FLAGS.num_gpus
images, labels = image_processing.distorted_inputs(
        dataset,
        num_preprocess_threads=num_preprocess_threads)

Label 0 is reserved for the (unused) background class.

num_classes = dataset.num_classes() + 1

# Split tower image and label batches.

images_splits = tf.split(axis=0, num_or_size_splits=FLAGS.num_gpus, value=images)
labels_splits = tf.split(axis=0, num_or_size_splits=FLAGS.num_gpus, value=labels)

Calculate the gradient of each model tower. This function constructs the entire ImageNet model, but shares variables across all towers.

with slim.arg_scope([slim.variables.variable], device='/cpu:0'):
	loss = _tower_loss(images_splits[i], labels_splits[i], num_classes,
                           scope, reuse_variables)

Reuse the variable for the next tower.

reuse_variables = True

Keep a summary of the final tower.

summaries = tf.get_collection(tf.GraphKeys.SUMMARIES, scope)

Only the batch-normalization update operations of the final tower are retained. Ideally we would collect the updates from every tower, but these statistics accumulate so quickly that we can ignore the other towers' statistics without significant loss.

batchnorm_updates = tf.get_collection(slim.ops.UPDATE_OPS_COLLECTION, scope)

Calculate the gradient of ImageNet batch data on this tower.

grads = opt.compute_gradients(loss)

Track the gradients of all towers.

tower_grads.append(grads)
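Putting the per-tower pieces above together, the loop over GPUs inside train() looks roughly like the following simplified sketch (scope names and details are abridged from the real script):

```python
tower_grads = []
reuse_variables = None
for i in range(FLAGS.num_gpus):
    with tf.device('/gpu:%d' % i):
        with tf.name_scope('tower_%d' % i) as scope:
            # Force all variables onto the CPU so that every tower shares them.
            with slim.arg_scope([slim.variables.variable], device='/cpu:0'):
                loss = _tower_loss(images_splits[i], labels_splits[i],
                                   num_classes, scope, reuse_variables)

            # Reuse the shared variables for the next tower.
            reuse_variables = True

            # Keep only this (ultimately the last) tower's summaries and BN updates.
            summaries = tf.get_collection(tf.GraphKeys.SUMMARIES, scope)
            batchnorm_updates = tf.get_collection(slim.ops.UPDATE_OPS_COLLECTION, scope)

            # Gradients for this tower's portion of the batch.
            grads = opt.compute_gradients(loss)
            tower_grads.append(grads)
```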

Calculate the average of each gradient. Note that this is where all towers begin to synchronize.

grads = _average_gradients(tower_grads)
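The gradient-averaging helper is not shown in the excerpts above; here is a sketch of what it does (modeled on the widely used multi-GPU example, so details may differ from inception_train.py):

```python
def _average_gradients(tower_grads):
    """Average the gradient of each variable across all towers."""
    average_grads = []
    # zip(*tower_grads) yields, for each variable, the list of
    # (gradient, variable) pairs computed by the individual towers.
    for grad_and_vars in zip(*tower_grads):
        grads = [tf.expand_dims(g, 0) for g, _ in grad_and_vars]
        grad = tf.reduce_mean(tf.concat(grads, axis=0), 0)
        # Variables are shared across towers, so the first tower's reference suffices.
        average_grads.append((grad, grad_and_vars[0][1]))
    return average_grads
```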

# Add summaries to track learning rates.

summaries.append(tf.summary.scalar('learning_rate', lr))

# Add gradient histogram.

for grad, var in grads:
      if grad is not None:
        summaries.append(
            tf.summary.histogram(var.op.name + '/gradients', grad))

Apply gradients to adjust shared variables.

apply_gradient_op = opt.apply_gradients(grads, global_step=global_step)

# Add histograms for trainable variables.

for var in tf.trainable_variables():
      summaries.append(tf.summary.histogram(var.op.name, var))

Track the moving averages of all trainable variables. This maintains a "double average" of the BatchNormalization global statistics, which is more complicated than necessary but is kept for backward compatibility with earlier models.

variable_averages = tf.train.ExponentialMovingAverage(
        inception.MOVING_AVERAGE_DECAY, global_step)
variables_to_average = (tf.trainable_variables() +  tf.moving_average_variables())
variables_averages_op = variable_averages.apply(variables_to_average)

Group all updates into a single training op.

batchnorm_updates_op = tf.group(*batchnorm_updates)
train_op = tf.group(apply_gradient_op, variables_averages_op, batchnorm_updates_op)

Create a saver.

saver = tf.train.Saver(tf.global_variables())

Build the summary operation from the last tower's summaries.

summary_op = tf.summary.merge(summaries)

Build an initialization operation to run below.

init = tf.global_variables_initializer()

# Start running operations on the Graph. allow_soft_placement must be set to True in order to build towers on GPUs, because some ops do not have GPU implementations.

sess = tf.Session(config=tf.ConfigProto(
        allow_soft_placement=True,
        log_device_placement=FLAGS.log_device_placement))
    sess.run(init)

    if FLAGS.pretrained_model_checkpoint_path:
      ...
      variables_to_restore = tf.get_collection(slim.variables.VARIABLES_TO_RESTORE)
      restorer = tf.train.Saver(variables_to_restore)
      restorer.restore(sess, FLAGS.pretrained_model_checkpoint_path)
      ...

Start the queue runners.

tf.train.start_queue_runners(sess=sess)

    # Create a summary writer
    summary_writer = tf.summary.FileWriter(FLAGS.train_dir, graph=sess.graph)
    for step in range(FLAGS.max_steps):
      ...
      # Execute the training operation in this step
      start_time = time.time()
      _, loss_value = sess.run([train_op, loss])
      ...
      # Update summary information every 100 steps
      if step % 100 == 0:
        summary_str = sess.run(summary_op)
        summary_writer.add_summary(summary_str, step)

      # Save checkpoint files periodically
      if step % 5000 == 0 or (step + 1) == FLAGS.max_steps:
        checkpoint_path = os.path.join(FLAGS.train_dir, 'model.ckpt')
        saver.save(sess, checkpoint_path, global_step=step)

Once the train method is understood, the network can be fine-tuned.

Practical verification

This article covers only 4 parts of the Inception README document:

### How to fine-tune a pre-trained model for a new task

First, an introduction

Before calling the training method, the image and label data must be preprocessed; that is, a new dataset must be converted into the sharded TFRecord format, in which each entry is a serialized tf.Example proto. The code base provides a script that demonstrates how to do this for a small dataset of a few thousand flower images spread across five labels.

daisy, dandelion, roses, sunflowers, tulips

The code base provides an automated script (download_and_preprocess_flowers.sh) to download the dataset and convert it to TFRecord format. Much as with the ImageNet dataset, each record in the TFRecord files is a serialized tf.Example proto whose entries include a JPEG-encoded string and an integer label. See parse_example_proto for more information. The script takes only a few minutes to run, depending on how fast your network connection can download and process the images. Your hard drive needs 200MB of free space (much less than the full 500GB+ 2012 ImageNet dataset). Here we choose DATA_DIR=/tmp/flowers-data/ as the location, but feel free to edit it.
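For illustration, the sketch below builds one such record by hand (the feature keys and file names used here are assumptions; the real keys and the full feature set are defined in build_image_data.py):

```python
import tensorflow as tf

def make_example(jpeg_bytes, label):
    # One record: a JPEG-encoded image string plus an integer class label.
    return tf.train.Example(features=tf.train.Features(feature={
        'image/encoded': tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[jpeg_bytes])),
        'image/class/label': tf.train.Feature(
            int64_list=tf.train.Int64List(value=[label])),
    }))

# Write a single record to a TFRecord file (paths are hypothetical).
with tf.python_io.TFRecordWriter('/tmp/flowers-data/example.tfrecord') as writer:
    with open('rose.jpg', 'rb') as f:
        writer.write(make_example(f.read(), 3).SerializeToString())
```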

# Set the flower data storage location
FLOWERS_DATA_DIR=/tmp/flowers-data/

# Build the preprocessing script
cd tensorflow-models/inception
bazel build //inception:download_and_preprocess_flowers

# Run the script to download and process the data
bazel-bin/inception/download_and_preprocess_flowers "${FLOWERS_DATA_DIR}"

If the script runs successfully, the last line of terminal output should look like this:

2016-02-24 20:42:25.067551: Finished writing all 3170 images in data set.

When the script finishes, the sharded training and validation files can be found in DATA_DIR. The files are named train-?????-of-00002 and validation-?????-of-00002, representing the training dataset and the validation dataset respectively. Note: if you are preparing a custom image dataset, you need to run build_image_data.py on it.

Alternatively, you can download the pre-trained model provided by the code base:

# specify the path to store the downloaded model
INCEPTION_MODEL_DIR=$HOME/inception-v3-model
  
mkdir -p ${INCEPTION_MODEL_DIR}

cd ${INCEPTION_MODEL_DIR}

# Use curl to download and unpack the model archive
curl -O http://download.tensorflow.org/models/image/imagenet/inception-v3-2016-03-01.tar.gz
tar xzf inception-v3-2016-03-01.tar.gz

# A directory named inception-v3 will be created here, containing the following files
> ls inception-v3

README.txt

checkpoint

model.ckpt-157585

You can now fine-tune the pre-trained Inception V3 model using flower data sets.

Second, how to retrain the model on flower data

To fine-tune the pre-trained Inception-v3 model on the flower dataset, two parameters of the training script need to be set:

  • pretrained_model_checkpoint_path: the path of the pre-trained Inception-v3 model. If this flag is specified, the entire model is loaded from the checkpoint before the script begins training.
  • fine_tune: a boolean indicating whether the last classification layer should be randomly initialized or restored. If you want to continue training a pre-trained model from a checkpoint, set this flag to false. If you set it to true, a new classification layer is trained from scratch.

Putting it together, you can retrain the pre-trained Inception-v3 model on the flower dataset with the following commands.

# Build flowers_train network with bazel, which encapsulates the call to inception_train in the source code.
cd tensorflow-models/inception

bazel build //inception:flowers_train

# Set the downloaded pre-trained model path.

MODEL_PATH="${INCEPTION_MODEL_DIR}/inception-v3/model.ckpt-157585"

# Set the path to save the data, the data is already in TFRecord file format
FLOWERS_DATA_DIR=/tmp/flowers-data/

# Set the directory to save event logs and checkpoint data output during training
TRAIN_DIR=/tmp/flowers_train/

# Retrain the flower data using the pre-trained model via the flowers_train network with fine_tune set to True.
bazel-bin/inception/flowers_train \
  --train_dir="${TRAIN_DIR}" \
  --data_dir="${FLOWERS_DATA_DIR}" \
  --pretrained_model_checkpoint_path="${MODEL_PATH}" \
  --fine_tune=True \
  --initial_learning_rate=0.001 \
  --input_queue_memory_factor=1

Some additional options have been added to the training process.

  • Fine-tuning the model to a separate data set requires a significant reduction in the initial learning rate. We set the initial learning rate to 0.001.
  • The flower data set is very small, so we reduced the queue size for our example. For more details, see Tuning Memory requirements.

The training script only reports the loss. To assess the quality of the fine-tuned model, you need to run the evaluation script flowers_eval:

# Start building the Flowers_eval model
cd tensorflow-models/inception

bazel build //inception:flowers_eval

# Set the event log and checkpoint directory output during training
TRAIN_DIR=/tmp/flowers_train/

# Set the directory where the training dataset resides; the dataset is in TFRecord file format
FLOWERS_DATA_DIR=/tmp/flowers-data/

# Set the directory to save the validation result log files
EVAL_DIR=/tmp/flowers_eval/

# Start validating the results of the flowers_train training with flowers_eval
bazel-bin/inception/flowers_eval \
  --eval_dir="${EVAL_DIR}" \
  --data_dir="${FLOWERS_DATA_DIR}" \
  --subset=validation \
  --num_examples=500 \
  --checkpoint_dir="${TRAIN_DIR}" \
  --input_queue_memory_factor=1 \
  --run_once

During training, it was found that after running the model for 2000 steps, the evaluation accuracy reached about 93.4%.

Successfully loaded model from /tmp/flowers/model.ckpt-1999 at step=1999.
2016-03-01 16:52:51.761219: Starting evaluation on (validation).
2016-03-01 16:53:05.450419: [20 batches out of 20] (36.5 examples/sec; 0.684 sec/batch)
2016-03-01 16:53:05.450471: Precision @ 1 = 0.9340 recall @ 5 = 0.9960 [500 examples]

Third, how to construct a new data set for retraining

Use the existing scripts provided with the model to build new data sets for training or fine tuning. The main script is build_image_data.py. In short, this script requires a structured catalog of images and transforms it into a sharded TFRecord that can be read by the Inception model. Specifically, you need to create a directory of training images with a specified structure, located in TRAIN_DIR and VALIDATION_DIR, arranged as follows:

$TRAIN_DIR/dog/image0.jpeg

$TRAIN_DIR/dog/image1.jpg

$TRAIN_DIR/dog/image2.png

...

$TRAIN_DIR/cat/weird-image.jpeg

$TRAIN_DIR/cat/my-image.jpeg

$TRAIN_DIR/cat/my-image.JPG

...

$VALIDATION_DIR/dog/imageA.jpeg

$VALIDATION_DIR/dog/imageB.jpg

$VALIDATION_DIR/dog/imageC.png

...

$VALIDATION_DIR/cat/weird-image.PNG

$VALIDATION_DIR/cat/that-image.jpg

$VALIDATION_DIR/cat/cat.JPG
...

Note: this script appends an extra background class with index 0, so your class labels range from 0 to num_labels. Using the example above, the class labels generated by build_image_data.py look like this:

0
  
1 dog
 
2 cat

Each subdirectory in TRAIN_DIR and VALIDATION_DIR corresponds to a unique label for the images that reside in that subdirectory. The images may be JPEG or PNG; no other image types are currently supported. Once the data is arranged in this directory structure, build_image_data.py can be run on it to generate the sharded TFRecord dataset. The complete list of information contained in each tf.Example is described in the comments of build_image_data.py. To run build_image_data.py, use the following command line:

# set the directory where the TFRecord file is generated
OUTPUT_DIRECTORY=$HOME/my-custom-data/

# build script build_image_data
cd tensorflow-models/inception
bazel build //inception:build_image_data

# Run the script to convert the data
bazel-bin/inception/build_image_data \
  --train_directory="${TRAIN_DIR}" \
  --validation_directory="${VALIDATION_DIR}" \
  --output_directory="${OUTPUT_DIRECTORY}" \
  --labels_file="${LABELS_FILE}" \
  --train_shards=128 \
  --validation_shards=24 \
  --num_threads=8

OUTPUT_DIRECTORY is the location of the sharded TFRecords. LABELS_FILE is a text file read by the script that lists all the labels. For example, for the flower dataset, LABELS_FILE contains the following data:

daisy

dandelion

roses

sunflowers

tulips

Note that each line of the labels file corresponds to an entry in the model's final classifier. That is, daisy corresponds to classifier entry 1, dandelion to entry 2, and so on. Label 0 is skipped and reserved for the background class. After running this script, the following files are generated:

$TRAIN_DIR/train-00000-of-00128

$TRAIN_DIR/train-00001-of-00128

...

$TRAIN_DIR/train-00127-of-00128
and
$VALIDATION_DIR/validation-00000-of-00024

$VALIDATION_DIR/validation-00001-of-00024

...

$VALIDATION_DIR/validation-00023-of-00024

Here, 128 and 24 are the number of shards specified for each dataset respectively. The goal is to choose the number of shards so that each shard holds roughly 1024 images. Once this dataset is built, an Inception model can be trained or fine-tuned on it. In addition, if you use the training scripts, modify num_classes() and num_examples_per_epoch() in flowers_data.py to match the newly created data.
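As a hypothetical example of that last step, for the dog/cat dataset sketched above the data class might be adjusted like this (the method names follow flowers_data.py; the class name, import path, and example counts are assumptions):

```python
from dataset import Dataset   # base class provided by dataset.py in the code base

class DogsCatsData(Dataset):
  """Hypothetical dataset class for the dog/cat example above."""

  def num_classes(self):
    return 2           # dog and cat; the background class 0 is added separately

  def num_examples_per_epoch(self):
    if self.subset == 'train':
      return 2000      # number of images under $TRAIN_DIR
    if self.subset == 'validation':
      return 400       # number of images under $VALIDATION_DIR
```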

Fourth, practical considerations when training the model

The model architecture and training procedure depend heavily on the hardware used to train the model. If you want to train or fine-tune this model on your own machine, you need to adjust and empirically determine a set of training hyperparameters that suit your setup.

1) Look for good hyperparameters

Roughly 5-10 hyperparameters control the speed of network training. In addition to --batch_size and --num_gpus, there are several constants defined in inception_train.py that govern the learning task.

RMSPROP_DECAY = 0.9                # Decay term for the RMSProp algorithm.
RMSPROP_MOMENTUM = 0.9             # Momentum term for the RMSProp algorithm.
RMSPROP_EPSILON = 1.0              # Epsilon term for the RMSProp algorithm.
INITIAL_LEARNING_RATE = 0.1        # Initial learning rate.
NUM_EPOCHS_PER_DECAY = 30.0        # Epochs after which the learning rate decays.
LEARNING_RATE_DECAY_FACTOR = 0.16  # Learning rate decay factor.

In the training, the adjustment of the following parameters has a great influence on the results.

  • INITIAL_LEARNING_RATE: Higher learning rate can speed up training. But too high a rate can lead to instability, allowing model parameters to diverge to infinity or NaN.
  • batch_size: larger batch sizes improve the quality of the gradient estimates and allow the model to be trained with a higher learning rate.
  • num_gpus: GPU memory is usually the bottleneck that prevents larger batch sizes; using more GPUs allows larger batches.
2) Adjust memory requirements

Training this model requires a large amount of memory on both the CPU and the GPU, though GPU memory is relatively small compared with CPU memory. Two factors determine how much GPU memory is used: the model architecture and the batch size (batch_size). Assuming you keep the model architecture unchanged, the only training parameter that controls GPU memory demand is the batch size.

“A good rule of thumb is to use the largest batch size that fits on the GPU.”

If you run out of GPU memory, lower --batch_size or use more GPUs on your machine. The model splits each batch across the GPUs, so N GPUs can handle a batch N times the size that one GPU can. The model also requires a lot of CPU memory; Inception is tuned to use roughly 20GB of CPU memory, so having around 40GB of CPU memory is ideal. If that is not possible, you can lower the model's memory requirements by reducing --input_queue_memory_factor. Images are preprocessed asynchronously with respect to the main training, across num_preprocess_threads threads; the preprocessed images are stored in a queue from which each GPU dequeues a batch. To ensure a smooth flow of data, a large queue of 1024 × input_queue_memory_factor images is maintained, which for the current model architecture amounts to roughly 4GB of CPU memory. You can reduce memory usage by lowering input_queue_memory_factor, but a large reduction may result in slightly lower prediction accuracy when the model is trained from scratch. See the comments in image_processing.py for more details.

Through the study and hands-on verification above, the IT guy can finally use the trained model to tell the probability that a picture shows a rose.

References and notes

References:
1. Inception
2. Rethinking the Inception Architecture for Computer Vision
3. An Analysis of Deep Neural Network Models for Practical Applications
4. ImageNet, a common academic dataset used in machine learning to train image recognition systems
5. The Inception v3 model architecture diagram (inception_v3_architecture.png in the code base)

Note: although the Inception code base under research is open source, Google's developers have moved most of their subsequent improvements to the Slim code base. For continued learning, the latest version of the code can be found in Slim.

PS: As a newcomer to machine learning, there are many concepts and algorithms I have yet to understand. Please correct me if there are any mistakes.
