Keras is a high-level Python API that helps you quickly build and train your own deep learning models, with TensorFlow or Theano as its backend. This article assumes that you are already familiar with TensorFlow and convolutional neural networks. If not, check out the 10-minute introduction to TensorFlow and the Convolutional Neural Networks tutorial, then come back to this article.

In this tutorial, we will learn the following aspects:

  1. Why Keras? Why is Keras considered the future of deep learning?
  2. Installing Keras step by step on Ubuntu.
  3. Keras basics.
  4. The Keras Sequential model, with 4.1 a practical example explaining a linear regression problem.
  5. Saving and restoring pre-trained models with Keras.
  6. The Keras functional API, with 6.1 developing a VGG convolutional neural network and 6.2 building and running a SqueezeNet convolutional neural network.

1. Why Keras?

Keras is a framework developed by Francois Chollet, a Google engineer, to enable rapid prototyping on top of Theano. It was later extended to support TensorFlow as a backend as well, and TensorFlow has since shipped it as part of its contrib module.

Keras is considered the future of building neural networks, and here are some reasons why it’s popular:

  1. Lightweight and fast development: Keras aims to eliminate boilerplate code. A few lines of Keras code can do more than the native TensorFlow code. You can also easily implement CNN and RNN and run them on a CPU or GPU.

  2. Framework “winner”: Keras is an API that runs on top of other deep learning frameworks, currently TensorFlow or Theano, and Microsoft also plans to add CNTK as a Keras backend. At present, the world of neural network frameworks is very fragmented and developing very fast. For details, see this tweet from Karpathy:


Imagine how painful it is to have to learn a new framework every year. So far, TensorFlow appears to be the trend, and as more and more frameworks start to support Keras, it may become a standard.

Keras is currently one of the fastest growing deep learning frameworks, and the ability to swap in different deep learning frameworks as backends is a big reason for its popularity. Imagine a scenario where you read an interesting paper and want to test the model on your own dataset. Suppose also that you are familiar with TensorFlow but know very little about Theano. If the paper's code is written directly in Theano, reproducing it in TensorFlow would take a long time. But if the code is written in Keras, you can use it simply by switching the backend to TensorFlow. That is a great boost for the community.

2. How to install Keras with TensorFlow as the back end

A) Installing dependencies

Install h5py, which is used for saving and loading models:

pip install h5py

There are a few other dependency packages to install as well:

pip install numpy scipy
pip install pillow

If you have not already installed TensorFlow, follow this tutorial to install TensorFlow first. Once you have TensorFlow installed, you can easily install Keras using pip.

sudo pip install keras

Use the following commands to check the Keras version:

>>> import keras
Using TensorFlow backend.
>>> keras.__version__
'2.0.4'

Once Keras is installed, you need to edit the backend configuration file to choose whether TensorFlow or Theano serves as the backend. The configuration file is located at ~/.keras/keras.json and looks like this:

{
    "floatx": "float32",
    "epsilon": 1e-07,
    "backend": "tensorflow",
    "image_data_format": "channels_last"
}

Note that the image_data_format parameter is channels_last, which matches the TensorFlow backend. In TensorFlow, an image is stored as [height, width, channels], but in Theano the order is completely different: [channels, height, width]. If you don't set this parameter correctly, the intermediate results of your model will be wrong. For Theano, this parameter should be channels_first.
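To see what this ordering difference means in practice, here is a small NumPy sketch (not Keras code) that converts a batch of images from channels_last to channels_first with a single transpose:

```python
import numpy as np

# A batch of 8 RGB images of size 32*32 in channels_last order
# (the TensorFlow layout): [batch, height, width, channels]
batch_channels_last = np.zeros((8, 32, 32, 3))

# The Theano layout is channels_first: [batch, channels, height, width].
# A transpose converts between the two layouts.
batch_channels_first = np.transpose(batch_channels_last, (0, 3, 1, 2))

print(batch_channels_last.shape)   # (8, 32, 32, 3)
print(batch_channels_first.shape)  # (8, 3, 32, 32)
```

If the layout and the image_data_format setting disagree, the network silently interprets height or width as channels, which is why the intermediate results look so strange.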

So, you are now ready to use Keras to build the model, with TensorFlow as the back end.

3. Keras basics

The primary data structure in Keras is the Model, which defines a complete graph. You can add any network structure to an existing graph.

import keras

Keras has two different approaches to modeling:

  1. Sequential models: used to implement simple models, where you just add layers to the model one after another.

  2. Functional API: the Keras functional API is very powerful; you can use it to construct more complex models, such as multi-output models, directed acyclic graphs, and so on.

In the next sections of this article, we will cover the theory and examples of Keras's Sequential model and functional API.

4. Keras Sequential models

In this section, I will introduce the theory behind Keras Sequential models and quickly explain how they work with concrete code. After that, we'll solve a simple linear regression problem, so you can run the code as you read and deepen your understanding.

Here’s how to start importing and building the sequence model.

from keras.models import Sequential
model = Sequential()

Next we can add layers such as Dense (fully connected layer), Activation, Conv2D, and MaxPooling2D to the model.

from keras.layers import Dense, Activation, Conv2D, MaxPooling2D, Flatten, Dropout

model.add(Conv2D(64, (3, 3), activation='relu', input_shape=(100, 100, 32)))
# This adds a convolutional layer with 64 filters of size 3*3 to the graph

Here's how to add some of the most popular layers to the network. I've written much more about these layers in the convolutional neural network tutorial.

1. The convolution layer

Here, we use a convolution layer with 64 convolution kernels of size 3*3, activated with the ReLU function; the input data has dimensions 100*100*32. Note that the input dimensions only need to be specified for the first convolution layer; for subsequent layers this parameter can be omitted.

model.add(Conv2D(64, (3, 3), activation='relu', input_shape=(100, 100, 32)))

2. MaxPooling layer

Specify the layer type and the pool size, and Keras automatically performs the pooling operation. Cool!

model.add(MaxPooling2D(pool_size=(2, 2)))
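To make the pooling operation concrete, here is a minimal NumPy sketch (not Keras code) of what 2*2 max pooling computes on a 4*4 feature map:

```python
import numpy as np

# A 4*4 feature map; 2*2 max pooling keeps the largest value in each
# non-overlapping 2*2 window, halving both spatial dimensions.
fmap = np.array([[1, 3, 2, 4],
                 [5, 6, 1, 2],
                 [7, 2, 9, 1],
                 [3, 4, 5, 6]])

# Reshape into 2*2 blocks, then take the max within each block.
pooled = fmap.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)  # [[6 4]
               #  [7 9]]
```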

3. Full connection layer

This layer is called Dense in Keras; we just set the output dimension, and Keras handles the rest automatically.

model.add(Dense(256, activation='relu'))

4. Dropout

model.add(Dropout(0.5))

5. Flatten layer

model.add(Flatten())

Data input

The first layer of the network needs to read the training data, so we must define the dimensions of the input data. The input_shape parameter is used to specify this.

model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(224, 224, 3)))

In this example, the first layer of data input is a convolution layer, and the size of the input data is 224*224*3.

This is how you build a model with the Sequential API. Next comes the most important part: once you have specified the network architecture, you also need to specify the optimizer and loss function, using Keras's compile function. For example, the code below uses rmsprop as the optimizer and binary_crossentropy as the loss function.

model.compile(loss='binary_crossentropy', optimizer='rmsprop')

If you want to use stochastic gradient descent, then you need to choose appropriate initial values and hyperparameters:

from keras.optimizers import SGD

sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd)

Now we have built the model. Next, let's feed data into it, which in Keras is done with the fit function. You can also control training by specifying batch_size and epochs in this function.

model.fit(x_train, y_train, batch_size=32, epochs=10, validation_data=(x_val, y_val))

Finally, we use the evaluate function to test the model's performance.

score = model.evaluate(x_test, y_test, batch_size = 32)

These are the steps to build a neural network in Keras using a sequence model. Now, let’s build a simple linear regression model.

4.1 Practical examples to explain linear regression problems

Problem statement

In a linear regression problem, you are given many data points and must fit a straight line through them. In this example, we create 101 discrete points and fit them with a straight line.

A) Create training data

trX ranges from -1 to 1, and trY is three times trX plus some noise.

import keras
from keras.models import Sequential
from keras.layers import Dense
import numpy as np

trX = np.linspace(-1, 1, 101)
trY = 3 * trX + np.random.randn(*trX.shape) * 0.33

B) Model building

First we build a Sequential model. All we need is a single connection, so a Dense layer with a linear activation suffices.

model = Sequential()
model.add(Dense(1, input_dim=1, kernel_initializer='uniform', activation='linear'))

The Dense layer holds the weight w and the bias b applied to the input x. Let's look at their initial values:

weights = model.layers[0].get_weights()
w_init = weights[0][0][0]
b_init = weights[1][0]
print('Linear regression model is initialized with weights w: %.2f, b: %.2f' % (w_init, b_init)) 
## Linear regression model is initialized with weight w: -0.03, b: 0.00

Now we can train the linear model on trX and trY, where trY is three times trX, so after training the weight w should be close to 3.

We use simple gradient descent as the optimizer and mean square error (MSE) as the loss value. As follows:

model.compile(optimizer='sgd', loss='mse')

Finally, we use the FIT function to enter the data.

model.fit(trX, trY, epochs=200, verbose=1)

After training, we print the weights again:

weights = model.layers[0].get_weights()
w_final = weights[0][0][0]
b_final = weights[1][0]
print('Linear regression model is trained to have weight w: %.2f, b: %.2f' % (w_final, b_final))
## Linear regression model is trained to have weight w: 2.94, b: 0.08

As you can see, after 200 epochs the weight is very close to 3. Try varying the number of epochs between 100 and 300 and see how the output changes. You have now built a linear regression model with very little code; the same model would require far more code in raw TensorFlow.
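As a sanity check on what gradient descent should converge to, we can compute the closed-form least-squares fit of the same data with NumPy (a sketch independent of Keras; the seed is fixed here only for reproducibility):

```python
import numpy as np

np.random.seed(0)  # fixed seed so the result is reproducible
trX = np.linspace(-1, 1, 101)
trY = 3 * trX + np.random.randn(*trX.shape) * 0.33

# Degree-1 polynomial fit: returns the slope w and intercept b that
# minimize the mean squared error, the same objective Keras optimizes.
w, b = np.polyfit(trX, trY, 1)
print('w: %.2f, b: %.2f' % (w, b))  # w close to 3, b close to 0
```

The SGD-trained Keras weights should land near these closed-form values.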

5. Save and restore pre-trained models using Keras

HDF5 binary format

Once you have completed training with Keras, you can save your network in HDF5 format; you'll need h5py installed first. HDF5 is ideal for storing large amounts of numerical data and manipulating it from NumPy. For example, you can slice multi-terabyte datasets stored on disk as if they were real NumPy arrays. You can also store multiple datasets in a single file, iterate over them, or check their .shape and .dtype attributes.

If you need convincing: NASA also uses HDF5 for data storage. h5py is a Python wrapper around the HDF5 C API; almost anything you can do with HDF5 from C can be done from Python with h5py.
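Here is a small h5py sketch of the slicing behavior described above: it writes a dataset to an HDF5 file, then reads back just a slice without loading the whole array (the file name and dataset name are illustrative):

```python
import os
import tempfile

import h5py
import numpy as np

path = os.path.join(tempfile.mkdtemp(), 'demo.h5')
data = np.arange(100).reshape(10, 10)

# Write a named dataset into the HDF5 file.
with h5py.File(path, 'w') as f:
    f.create_dataset('my_dataset', data=data)

# Read back only a slice; h5py fetches just that region from disk.
with h5py.File(path, 'r') as f:
    dset = f['my_dataset']
    print(dset.shape)      # (10, 10)
    chunk = dset[2, :5]
    print(chunk)           # [20 21 22 23 24]
```

Keras weight files saved below are regular HDF5 files and can be inspected the same way.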

Saving weights

If you want to save the trained weights, you can use save_weights directly.

model.save_weights("my_model.h5")

Loading pre-trained weights

If you want to load previously trained models, you can use the load_weights function.

model.load_weights('my_model.h5')

6. Keras API

For simple models and problems, the Sequential model works great. But to build complex real-world networks, you need to know the functional API. Many popular neural networks are built from small sub-structures that are combined into the whole model, and the functional API lets you build models layer by layer in exactly this way. As a result, very little code is needed for a complete and complex neural network.

Let’s see how it works. First, you need to import some packages.

from keras.models import Model

Now, you need to declare the input up front, rather than only supplying data in the fit call at the end as with the Sequential model. This is one of the most significant differences between the two styles. We use the Input() function to declare a 28*28*1 tensor (channels last, matching the TensorFlow backend):

from keras.layers import Input
## First, define the vision modules

digit_input = Input(shape=(28, 28, 1))

Now, let's use the API to build convolutional layers. With the functional API, each layer is called on the output of the previous layer. The specific code is as follows:

from keras.layers import Conv2D, MaxPooling2D, Flatten

x = Conv2D(64, (3, 3))(digit_input)
x = Conv2D(64, (3, 3))(x)
x = MaxPooling2D((2, 2))(x)
out = Flatten()(x)

Finally, we build a model for the specified input and output data.

vision_model = Model(digit_input, out)

Of course, we still need to specify the loss function, optimizer, and so on, but this works exactly as in the Sequential model, using the compile and fit functions.
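The key pattern of the functional API is that every layer is a callable that takes a tensor and returns a tensor, so a network is just a chain of calls. Here is a toy sketch of that pattern in plain Python, with simple functions standing in for layers (the names are illustrative, not Keras APIs):

```python
# A "layer factory": dense(units) returns a callable,
# just as Dense(units) does in Keras.
def dense(units):
    def layer(x):
        # Stand-in for a real weighted sum: every output unit
        # receives the sum of the inputs.
        return [sum(x)] * units
    return layer

inputs = [1.0, 2.0, 3.0]
x = dense(4)(inputs)  # like x = Dense(4)(inputs) in Keras
out = dense(2)(x)     # like out = Dense(2)(x)
print(out)            # [24.0, 24.0]
```

Because layers are values, you can store them, reuse them on several inputs, and branch and merge them, which is what makes complex graphs possible.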

Next, let's build a VGG-16 model. It is a large and somewhat "old" model, but its simplicity makes it a good one to learn from.

6.1 Develop VGG convolutional neural network using Keras API

VGG:

VGG is a convolutional neural network proposed by researchers at Oxford University in 2014. Thanks to its simplicity and practicality, it immediately became the most popular convolutional network of its time, and it performs very well on image classification and object detection. In the 2014 ILSVRC competition, VGG achieved 92.3% top-5 accuracy. The model has several variants; the most popular is VGG-16, a model with 16 weight layers. As you can see, it requires input data with dimensions 224*224*3.


*Vgg 16 architecture*

Let's write out the full model. (Here input_shape and classes are assumed to be defined elsewhere, e.g. (224, 224, 3) and 1000.)

img_input = Input(shape=input_shape)
# Block 1
x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv1')(img_input)
x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv2')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)

# Block 2
x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv1')(x)
x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv2')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(x)

# Block 3
x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv1')(x)
x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv2')(x)
x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv3')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool')(x)

# Block 4
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv1')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv2')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv3')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool')(x)

# Block 5
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv1')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv2')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv3')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool')(x)

x = Flatten(name='flatten')(x)
x = Dense(4096, activation='relu', name='fc1')(x)
x = Dense(4096, activation='relu', name='fc2')(x)
x = Dense(classes, activation='softmax', name='predictions')(x)

We can save this complete model as vgg16.py.

In this example, let's test the model on an image using pre-trained ImageNet weights. The specific code is as follows:

from keras import applications
from keras.preprocessing import image
from keras.applications.imagenet_utils import preprocess_input, decode_predictions
import numpy as np

model = applications.VGG16(weights='imagenet')
img = image.load_img('cat.jpeg', target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)
preds = model.predict(x)
for results in decode_predictions(preds):
    for result in results:
        print('Probability %0.2f%% => [%s]' % (100 * result[2], result[1]))


As you can see in the picture, the model predicts the object shown in the image.

We built a VGG model with the functional API, but since VGG is a very simple model, we haven't exercised the API's full power. Next, let's demonstrate it by building a SqueezeNet model.

6.2 Construct and run SqueezeNet convolutional neural network using Keras API

SqueezeNet is a remarkable network architecture, notable not for improved accuracy but for reduced computation. While SqueezeNet's accuracy is close to AlexNet's, its ImageNet pre-trained model takes less than 5 MB of storage, which is very favorable for using CNNs in the real world. SqueezeNet introduces the Fire module, which consists of a squeeze part (1*1 convolutions) followed by an expand part (a mix of 1*1 and 3*3 convolutions).


*SqueezeNet fire module*
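The channel arithmetic of a Fire module can be checked with a quick NumPy sketch (channels_last layout; the spatial size 55*55 and the squeeze/expand widths here are illustrative): the two expand branches each produce `expand` channels, so their concatenation has twice that many.

```python
import numpy as np

squeeze, expand = 16, 64
# Outputs of the two expand branches. padding='same' on the 3*3 branch
# keeps its spatial dimensions equal to the 1*1 branch, so the two
# tensors differ only conceptually, not in shape.
left = np.zeros((1, 55, 55, expand))   # 1*1 expand branch
right = np.zeros((1, 55, 55, expand))  # 3*3 expand branch
out = np.concatenate([left, right], axis=3)  # concat on channel axis
print(out.shape)  # (1, 55, 55, 128)
```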

Now, we stack multiple copies of the Fire module to build the complete network, as follows:


To build this network, we first write out a single Fire module using the functional API.

# Squeeze part of fire module with 1 * 1 convolutions, followed by Relu
x = Convolution2D(squeeze, (1, 1), padding='valid', name='fire2/squeeze1x1')(x)
x = Activation('relu', name='fire2/relu_squeeze1x1')(x)

#Expand part has two portions, left uses 1 * 1 convolutions and is called expand1x1 
left = Convolution2D(expand, (1, 1), padding='valid', name='fire2/expand1x1')(x)
left = Activation('relu', name='fire2/relu_expand1x1')(left)

#Right part uses 3 * 3 convolutions and is called expand3x3. Both branches are followed by a Relu layer, and note that both receive x as input, as designed.
right = Convolution2D(expand, (3, 3), padding='same', name='fire2/expand3x3')(x)
right = Activation('relu', name='fire2/relu_expand3x3')(right)

# Final output of Fire Module is concatenation of left and right. 
x = concatenate([left, right], axis=3, name='fire2/concat')

To reuse this code, we first define the shared layer-name constants (and the URL of the pre-trained weights):

sq1x1 = "squeeze1x1"
exp1x1 = "expand1x1"
exp3x3 = "expand3x3"
relu = "relu_"
WEIGHTS_PATH = "https://github.com/rcmalli/keras-squeezenet/releases/download/v1.0/squeezenet_weights_tf_dim_ordering_tf_kernels.h5"

Modular processing


def fire_module(x, fire_id, squeeze=16, expand=64):
    s_id = 'fire' + str(fire_id) + '/'
    x = Convolution2D(squeeze, (1, 1), padding='valid', name=s_id + sq1x1)(x)
    x = Activation('relu', name=s_id + relu + sq1x1)(x)

    left = Convolution2D(expand, (1, 1), padding='valid', name=s_id + exp1x1)(x)
    left = Activation('relu', name=s_id + relu + exp1x1)(left)

    right = Convolution2D(expand, (3, 3), padding='same', name=s_id + exp3x3)(x)
    right = Activation('relu', name=s_id + relu + exp3x3)(right)

    x = concatenate([left, right], axis=3, name=s_id + 'concat')
    return x

Now we can build the complete model using the **Fire** module we just defined.

x = Convolution2D(64, (3, 3), strides=(2, 2), padding='valid', name='conv1')(img_input)
x = Activation('relu', name='relu_conv1')(x)
x = MaxPooling2D(pool_size=(3, 3), strides=(2, 2), name='pool1')(x)

x = fire_module(x, fire_id=2, squeeze=16, expand=64)
x = fire_module(x, fire_id=3, squeeze=16, expand=64)
x = MaxPooling2D(pool_size=(3, 3), strides=(2, 2), name='pool3')(x)

x = fire_module(x, fire_id=4, squeeze=32, expand=128)
x = fire_module(x, fire_id=5, squeeze=32, expand=128)
x = MaxPooling2D(pool_size=(3, 3), strides=(2, 2), name='pool5')(x)

x = fire_module(x, fire_id=6, squeeze=48, expand=192)
x = fire_module(x, fire_id=7, squeeze=48, expand=192)
x = fire_module(x, fire_id=8, squeeze=64, expand=256)
x = fire_module(x, fire_id=9, squeeze=64, expand=256)

x = Dropout(0.5, name='drop9')(x)
x = Convolution2D(classes, (1, 1), padding='valid', name='conv10')(x)
x = Activation('relu', name='relu_conv10')(x)
x = GlobalAveragePooling2D()(x)
out = Activation('softmax', name='loss')(x)

model = Model(inputs, out, name='squeezenet')

The complete network model is in squeezenet.py. We can download the ImageNet pre-trained weights and then test the model on our own images. The following code does this:

import numpy as np
from keras_squeezenet import SqueezeNet
from keras.applications.imagenet_utils import preprocess_input, decode_predictions
from keras.preprocessing import image

model = SqueezeNet()
img = image.load_img('pexels-photo-280207.jpeg', target_size=(227, 227))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)
preds = model.predict(x)
all_results = decode_predictions(preds)
for results in all_results:
    for result in results:
        print('Probability %0.2f%% => [%s]' % (100 * result[2], result[1]))

For the same image, we get the following prediction probabilities.


This concludes our Keras TensorFlow tutorial. Hope it can help you 🙂


Source: CV – tricks