Using Keras to create and evaluate a neural network model is very convenient, but you need to follow a few specific steps to build the model. In this article we will walk step by step through the creation, training, and evaluation of neural network models in Keras, and learn how to make predictions with a trained model. After reading this article, you will know:

- How to define, compile, train, and evaluate a deep neural network in Keras.
- How to choose standard default models to solve regression and classification prediction problems.
- How to develop and run your first multi-layer perceptron network using Keras.

Update March 2017: updated the examples for Keras 2.0.2 / TensorFlow 1.0.1 / Theano 0.9.0.

Here is a summary of the five steps we will take to build a neural network model in Keras:

1. Define the network.
2. Compile the network.
3. Fit (train) the network.
4. Evaluate the network.
5. Use the trained network to make predictions.
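As a quick preview of how these five steps map onto Keras calls, here is a minimal sketch. The data arrays X and y are random placeholders, and the layer sizes are arbitrary; a full worked example on a real data set follows later in the article.

```python
from keras.models import Sequential
from keras.layers import Dense
import numpy

# placeholder training data: 100 samples, 2 features, 1 binary target
X = numpy.random.rand(100, 2)
y = numpy.random.randint(2, size=100)

# 1. Define the network
model = Sequential()
model.add(Dense(5, input_dim=2, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
# 2. Compile the network
model.compile(optimizer='sgd', loss='binary_crossentropy', metrics=['accuracy'])
# 3. Fit the network
history = model.fit(X, y, epochs=10, batch_size=10)
# 4. Evaluate the network
loss, accuracy = model.evaluate(X, y)
# 5. Make predictions
predictions = model.predict(X)
```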
Step 1: Define the network

The first thing to do is define your neural network. In Keras, a neural network is defined as a series of layers. The container for these layers is the Sequential class, so the first step is to create an instance of Sequential. You can then add the layers you need in the order they should be connected. For example, we can do this in two steps:

```python
model = Sequential()
model.add(Dense(2))
```

Alternatively, we can define the model by creating an array of layers and passing it to the Sequential constructor:

```python
layers = [Dense(2)]
model = Sequential(layers)
```

The first layer of the network must define the expected input dimension. This parameter can be specified in a number of ways, depending on the type of model being built, but for the multi-layer perceptron models in this article we will specify it through the input_dim argument.

For example, suppose we want to define a small multi-layer perceptron model with 2 inputs in the visible layer, 5 neurons in the hidden layer, and 1 neuron in the output layer. This model can be defined as follows:

```python
model = Sequential()
model.add(Dense(5, input_dim=2))
model.add(Dense(1))
```

You can think of this Sequential model as a pipeline: feed data in at one end and get predictions out at the other. Splitting what would normally be interconnected layers into separate, self-contained layers is a very useful concept in Keras, because it makes clear what each layer is responsible for in transforming the data from input to output. For example, the activation function that transforms the summed signal of each neuron can be pulled out and added to the Sequential model as its own layer:

```python
model = Sequential()
model.add(Dense(5, input_dim=2))
model.add(Activation('relu'))
model.add(Dense(1))
```

The choice of activation function in the output layer is particularly important: it determines the format of the predicted values. For example, here are some common types of predictive modeling problems and the output-layer structures and standard activation functions they can use (a concrete sketch of all three follows this list):

- Regression problems: use the linear activation function 'linear', with the number of neurons matching the number of outputs.
- Binary classification problems: use the logistic activation function 'sigmoid', with a single neuron in the output layer.
- Multi-class classification problems: use the softmax activation function 'softmax'; if you use the one-hot encoded output format, there is one neuron per class.
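To make those three output-layer conventions concrete, here is a small sketch. The input dimension of 8, the hidden layer size of 10, and the 3 classes in the multi-class model are arbitrary placeholders, not tied to any particular data set.

```python
from keras.models import Sequential
from keras.layers import Dense

# Regression: linear output, one neuron per output value
reg_model = Sequential()
reg_model.add(Dense(10, input_dim=8, activation='relu'))
reg_model.add(Dense(1, activation='linear'))

# Binary classification: sigmoid output, a single neuron
bin_model = Sequential()
bin_model.add(Dense(10, input_dim=8, activation='relu'))
bin_model.add(Dense(1, activation='sigmoid'))

# Multi-class classification: softmax output, one neuron per class
# (here 3 classes, assuming one-hot encoded targets)
multi_model = Sequential()
multi_model.add(Dense(10, input_dim=8, activation='relu'))
multi_model.add(Dense(3, activation='softmax'))
```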
Step 2: Compile the network

Once we have defined the network, we must compile it. Compilation is an efficiency step: it takes the sequence of layers we defined and, through a series of efficient matrix transforms, converts it into a format that can be executed on the GPU or CPU, depending on how Keras is configured. You can think of compilation as a precompute step for your network.

Compilation is always required after defining a model, whether you intend to train it with an optimization scheme or to load a set of pre-trained weights from a saved file, because the compilation step transforms your network into an efficient structure suitable for your hardware. The same applies to making predictions.

Compilation requires a number of parameters to be specified, tailored to training your network. In particular, you must specify the optimization algorithm used to train the network and the loss function that the optimization algorithm will minimize.

For example, the following specifies the stochastic gradient descent (sgd) optimization algorithm and the mean squared error (mse) loss function when compiling a model defined for a regression problem:

```python
model.compile(optimizer='sgd', loss='mse')
```

The type of predictive modeling problem also limits the types of loss functions that can be used. For example, here are the standard loss functions for several different types of predictive modeling:

- Regression problems: mean squared error, 'mse'.
- Binary classification problems: logarithmic loss (also known as cross entropy), 'binary_crossentropy'.
- Multi-class classification problems: multi-class logarithmic loss, 'categorical_crossentropy'.

You can check the full list of loss functions supported by Keras. The most common optimization algorithm is stochastic gradient descent, but Keras also supports other optimization algorithms. The following are probably the most commonly used because of their generally good performance:

- Stochastic gradient descent, 'sgd', which requires tuning of the learning rate and momentum.
- Adam, 'adam', which requires tuning of the learning rate.
- RMSprop, 'rmsprop', which requires tuning of the learning rate.

Finally, in addition to the loss value, you can specify metrics to collect while training the model. In general, for classification problems, the most useful metric to collect is accuracy. The metrics to collect are given by name in an array. For example:

```python
model.compile(optimizer='sgd', loss='mse', metrics=['accuracy'])
```

Step 3: Fit the network

Once the network is compiled, it can be trained. This process can also be viewed as adjusting the weights to fit the training data set. Training the network requires training data, consisting of an input matrix X and a corresponding array of outputs y. In this step the network is trained using the backpropagation algorithm, guided by the optimization algorithm and loss function specified at compile time.

The backpropagation algorithm requires that you specify the number of epochs, that is, the number of exposures of the network to the entire training data set. Each epoch can be divided into groups of input/output pairs known as batches. The batch size defines the number of input/output pairs the network processes before the weights are updated within an epoch. It is also an efficiency optimization, ensuring that not too many input/output pairs are loaded into memory (or video memory) at the same time.

Here is the simplest example of fitting a network:

```python
history = model.fit(X, y, batch_size=10, epochs=100)
```

Fitting the network returns a history object that summarizes the model's performance during training, including the loss value for each epoch and any metrics collected at compile time.
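If you want to inspect that training summary yourself, the returned object exposes a history dictionary with one entry per metric. A minimal runnable sketch, using placeholder random data; note that in Keras 2.0.x the accuracy metric is recorded under the key 'acc', while newer versions use 'accuracy':

```python
from keras.models import Sequential
from keras.layers import Dense
import numpy

# placeholder data and model, just so there is something to fit
X = numpy.random.rand(100, 8)
y = numpy.random.randint(2, size=100)
model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

history = model.fit(X, y, epochs=5, batch_size=10)

# history.history maps each metric name to a list with one value per epoch
print(history.history['loss'])  # training loss after every epoch
print(history.history['acc'])   # accuracy per epoch ('acc' in Keras 2.0.x)
```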
Step 4: Evaluate the network

Once the network is trained, it can be evaluated. The network can be evaluated on the training data, but the resulting metrics are not a useful estimate of its predictive performance, because the network has already "seen" that data during training. Instead, we can evaluate the network on a separate data set that it has not "seen" before. This provides an estimate of how the network will perform when predicting on unseen data in the future.

The evaluation computes the loss over all input/output pairs in the test set, as well as any other metrics (such as classification accuracy) specified when the model was compiled, and returns a set of evaluation results. For example, a model compiled with accuracy as a metric can be evaluated on a new data set as follows:

```python
loss, accuracy = model.evaluate(X, y)
```

Step 5: Make predictions

Finally, if we are satisfied with the performance of the trained model, we can use it to make predictions on new data. This is as simple as calling the predict() function on the model with a new set of inputs. For example:

```python
predictions = model.predict(X)
```

The predictions are returned in the format defined by the network's output layer. For a regression problem, these predictions, produced by a linear activation function, may already be in the format required by the problem. For a binary classification problem, the predictions are a set of probabilities indicating the likelihood that each sample belongs to the first class; these can be converted to crisp 0 and 1 values by rounding. For a multi-class classification problem, the result is also a set of probabilities (assuming the output variables are one-hot encoded), which need to be converted to a single class prediction using the argmax function.

Let's use a small example to put everything together. We will take the Pima Indians diabetes binary classification problem as an example. You can download this data set from the UCI Machine Learning Repository. The problem has eight input variables and a single output class value of 0 or 1.

We will construct a multi-layer perceptron neural network with a visible layer of 8 inputs, a hidden layer of 12 neurons with the rectifier activation function, and an output layer of 1 neuron with the sigmoid activation function. We will train the network for 100 epochs with a batch size of 10, using the Adam optimization algorithm and the logarithmic loss function. After training, we evaluate the model on the training data and then make separate predictions on the training data. This is done purely for convenience; in general you would evaluate on a separate test set and make predictions on new data (see the sketch after the example output below).

The complete code is as follows:

```python
from keras.models import Sequential
from keras.layers import Dense
import numpy

# load the dataset
dataset = numpy.loadtxt("pima-indians-diabetes.csv", delimiter=",")
X = dataset[:,0:8]
Y = dataset[:,8]
# 1. Define the network
model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
# 2. Compile the network
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# 3. Fit the network
history = model.fit(X, Y, epochs=100, batch_size=10)
# 4. Evaluate the network
loss, accuracy = model.evaluate(X, Y)
print("\nLoss: %.2f, accuracy: %.2f%%" % (loss, accuracy*100))
# 5. Make predictions
probabilities = model.predict(X)
predictions = [float(round(x)) for x in probabilities]
accuracy = numpy.mean(predictions == Y)
print("Prediction Accuracy: %.2f%%" % (accuracy*100))
```

Running the example produces output like the following (truncated):

```
...
768/768 [==============================] - 0s - loss: 0.5219 - acc: 0.7591
Epoch 99/100
768/768 [==============================] - 0s - loss: 0.5250 - acc: 0.7474
Epoch 100/100
768/768 [==============================] - 0s - loss: 0.5416 - acc: 0.7331
 32/768 [>.............................]
...
```
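The example above evaluates and predicts on the training data for convenience. As a sketch of the more realistic workflow the text recommends, here is one way to hold out part of the data as a test set and evaluate on data the network has never "seen". The 67/33 split ratio is an arbitrary choice for illustration:

```python
from keras.models import Sequential
from keras.layers import Dense
import numpy

dataset = numpy.loadtxt("pima-indians-diabetes.csv", delimiter=",")
X, Y = dataset[:, 0:8], dataset[:, 8]

# hold out the last third of the rows as a test set (arbitrary split)
split = int(len(X) * 0.67)
X_train, Y_train = X[:split], Y[:split]
X_test, Y_test = X[split:], Y[split:]

model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, Y_train, epochs=100, batch_size=10)

# evaluate on data the network did not train on
loss, accuracy = model.evaluate(X_test, Y_test)
print("Test loss: %.2f, test accuracy: %.2f%%" % (loss, accuracy * 100))
```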
In this article we explored the five-step life-cycle for constructing a neural network with the Keras deep learning library. You learned:

- How to define, compile, train, and evaluate a deep neural network in Keras.
- How to choose standard default models to solve regression and classification prediction problems.
- How to develop and run your first multi-layer perceptron network using Keras.

Do you have any other questions about Keras neural network models, or any suggestions about this article? Leave them in the comments and I will do my best to answer them.