Original link: tecdat.cn/?p=15791
This tutorial uses machine learning to classify irises by species. It uses TensorFlow to:
- Build a model,
- Train the model on a sample dataset, and
- Use the trained model to make predictions on unseen data.
TensorFlow programming
This guide uses the following high-level TensorFlow concepts:
- TensorFlow's eager execution development environment (enabled by default),
- The Datasets API for importing data,
- TensorFlow's Keras API for building layers and whole models.
The structure of this tutorial is similar to that of many TensorFlow programs:
- Import a dataset
- Select a model type
- Train the model
- Evaluate the model's effectiveness
- Use the trained model to make predictions
Setting up
Configure imports
Import TensorFlow and any other Python libraries you need. By default, TensorFlow uses eager execution to evaluate operations immediately.
from __future__ import absolute_import, division, print_function, unicode_literals

import os
import matplotlib.pyplot as plt
import tensorflow as tf

print("TensorFlow version: {}".format(tf.__version__))
print("Eager execution: {}".format(tf.executing_eagerly()))

TensorFlow version: 2.0.0
Eager execution: True
Iris classification problem
Imagine that you are a botanist looking for a way to automatically categorize every iris you find. Machine learning offers many algorithms for classifying flowers statistically. For example, a sophisticated machine learning program could classify flowers from photos. We will do something more modest: classify irises by the length and width of their sepals and petals.
There are about 300 species of iris, but our program will classify only the following three:
- Iris setosa
- Iris virginica
- Iris versicolor
Import and parse the training data set
Download the dataset file and convert it into a structure this Python program can use.
Download the dataset
Use the tf.keras.utils.get_file function to download the training dataset file. This function returns the path to the downloaded file:
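A minimal sketch of that call, assuming the training-set URL parallels the test-set URL shown later in this tutorial:

train_dataset_url = "https://storage.googleapis.com/download.tensorflow.org/data/iris_training.csv"

train_dataset_fp = tf.keras.utils.get_file(fname=os.path.basename(train_dataset_url),
                                           origin=train_dataset_url)

print("Local copy of the dataset file: {}".format(train_dataset_fp))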
Local copy of the dataset file: /home/kbuilder/.keras/datasets/iris_training.csv
Check the data
The dataset, iris_training.csv, is a plain-text file that stores tabular data in comma-separated values (CSV) format. Use the head -n5 command to view the first five entries:
!head -n5 {train_dataset_fp}
120,4,setosa,versicolor,virginica
6.4,2.8,5.6,2.2,2
5.0,2.3,3.3,1.0,1
4.9,2.5,4.5,1.7,2
4.9,3.1,1.5,0.1,0
We can note the following from this view of the dataset:
- The first line is a header containing information about the dataset: there are 120 examples, and each example has four features and one of three possible label names.
- The following lines are data records, one example per line, where:
  - The first four fields are features: they describe the example's characteristics. Here, the fields hold floating-point numbers representing flower measurements.
  - The last column is the label: the value we want to predict. For this dataset, it is an integer value of 0, 1, or 2 that corresponds to a flower name.
Let’s express it in code:
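A sketch of the setup these print statements rely on, assuming the column order shown in the CSV header above:

# Column order in the CSV file
column_names = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species']

feature_names = column_names[:-1]
label_name = column_names[-1]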
Print ("Features: {}". Format (feature_names)) print("Label: {}". Format (label_name))Copy the code
Features: ['sepal_length', 'sepal_width', 'petal_length', 'petal_width']
Label: species
Each label is associated with a string name (for example, "setosa"), but machine learning typically relies on numeric values. The label numbers are mapped to named representations, such as:
- 0: Iris setosa
- 1: Iris versicolor
- 2: Iris virginica
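The prediction code at the end of this tutorial refers to these names through a class_names list; a one-line sketch of that mapping:

class_names = ['Iris setosa', 'Iris versicolor', 'Iris virginica']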
Create a tf.data.Dataset
TensorFlow’s Dataset API handles many common situations encountered when loading data into a model. This is a high-level API for reading data and converting it into a trainable format.
Since the dataset is a CSV-formatted text file, use the make_csv_dataset function to parse the data into a suitable format. Since this function generates data for training models, the default behavior is to shuffle the data (shuffle=True, shuffle_buffer_size=10000) and repeat the dataset forever (num_epochs=None). We also set the batch_size parameter:
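A sketch of that call, assuming batch_size=32 (the batch size visible in the tensor shapes below) and an explicit num_epochs=1 so that one iteration over the Dataset is one pass over the data:

batch_size = 32

train_dataset = tf.data.experimental.make_csv_dataset(
    train_dataset_fp,
    batch_size,
    column_names=column_names,
    label_name=label_name,
    num_epochs=1)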
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow_core/python/data/experimental/ops/readers.py:521: parallel_interleave (from tensorflow.python.data.experimental.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.experimental.AUTOTUNE)` instead. If sloppy execution is desired, use `tf.data.Options.experimental_deterministic`.
make_csv_dataset returns a tf.data.Dataset of (features, label) pairs, where features is a dictionary: {'feature_name': value}.
These Dataset objects are iterable. Let's look at a batch of features:
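A minimal sketch: pull one batch out of the Dataset and print the feature dictionary:

features, labels = next(iter(train_dataset))
print(features)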
OrderedDict([('sepal_length', <tf.Tensor: id=68, shape=(32,), dtype=float32, numpy=
array([6.7, 6.1, 6.6, 6.7, 5.4, 5.5, 5.1, 5.8, 5.2, 6.4, 7.3, 4.9, 6.1,
       4.6, 4.6, 5.5, 6.7, 6. , 5.7, 6. , 7.7, 5. , 5.8, 5. , 4.5, 5.1,
       5.3, 5.6, 5.2, 6.4, 6.6, 4.6], dtype=float32)>), ('sepal_width', <tf.Tensor: id=69, shape=(32,), dtype=float32, numpy=
array([3. , 2.6, 3. , 3. , 3.4, 2.6, 3.7, 2.7, 2.7, 3.2, 2.9, 2.4, 2.8,
       3.4, 3.6, 2.4, 3.1, 2.9, 2.8, 2.2, 3.8, 3.3, 2.7, 3.2, 2.3, 2.5,
       3.7, 2.5, 3.4, 2.8, 2.9, 3.2], dtype=float32)>), ('petal_length', <tf.Tensor: id=66, shape=(32,), dtype=float32, numpy=
array([5.2, 5.6, 4.4, 5. , 1.5, 4.4, 1.5, 4.1, 3.9, 5.3, 6.3, 3.3, 4. ,
       1.4, 1. , 3.7, 5.6, 4.5, 4.5, 5. , 6.7, 1.4, 5.1, 1.2, 1.3, 3. ,
       1.5, 3.9, 1.4, 5.6, 4.6, 1.4], dtype=float32)>), ('petal_width', <tf.Tensor: id=67, shape=(32,), dtype=float32, numpy=
array([2.3, 1.4, 1.4, 1.7, 0.4, 1.2, 0.4, 1. , 1.4, 2.3, 1.8, 1. , 1.3,
       0.3, 0.2, 1. , 2.4, 1.5, 1.3, 1.5, 2.2, 0.2, 1.9, 0.2, 0.3, 1.1,
       0.2, 1.1, 0.2, 2.2, 1.3, 0.2], dtype=float32)>)])
Notice that like features are grouped together, or batched. Setting batch_size determines the number of examples stored in these feature arrays.
After plotting a few features from this batch, you can begin to see some clustering:
plt.scatter(features['petal_length'],
features['sepal_length'],
c=labels,
cmap='viridis')
plt.xlabel("Petal length")
plt.ylabel("Sepal length")
plt.show()
To simplify the model-building steps, create a function that repackages the feature dictionary into a single array of shape (batch_size, num_features).
This function uses the tf.stack method, which takes values from a list of tensors and creates a combined tensor along the specified dimension:
def pack_features_vector(features, labels):
    """Pack the features into a single array."""
    features = tf.stack(list(features.values()), axis=1)
    return features, labels
Then use the tf.data.Dataset.map method to pack the features of each (features, label) pair into the training dataset:
train_dataset = train_dataset.map(pack_features_vector)
The feature elements of the Dataset are now arrays of shape (batch_size, num_features).
Select model type
Why use models?
A model is the relationship between features and labels. For iris classification, the model defines the relationship between the sepal and petal measurements and the predicted iris species. Some simple models can be described with a few lines of algebra, but complex machine learning models have large numbers of parameters that are difficult to summarize.
Can you determine the relationship between the four features and iris varieties without using machine learning? That is, can you create models using traditional programming techniques, such as lots of conditional statements? Maybe, if the data set is repeatedly analyzed and the relationship between petal and calyx measurements and specific species is finally determined. For more complex data sets, this becomes very difficult, or perhaps impossible. A good machine learning approach can determine the model for you. If you feed enough representative samples into the right type of machine learning model, the program will figure out the relationship for you.
Select the model
We need to select the kind of model to train. There are many types of models, and picking a good one takes experience. This tutorial uses a neural network to solve the iris classification problem. Neural networks can find complex relationships between features and labels. A neural network is a highly structured graph, organized into one or more hidden layers, and each hidden layer consists of one or more neurons. There are several categories of neural networks; this program uses a dense neural network, also called a fully connected neural network: the neurons in one layer receive input connections from every neuron in the previous layer. For example, Figure 2 shows a dense neural network with one input layer, two hidden layers, and one output layer:
When the model from Figure 2 is trained and fed an unlabeled example, it yields three predictions: the likelihood that this flower is each of the given iris species. Such predictions are called inference. For this example, the sum of the output predictions is 1.0. In Figure 2, the prediction breaks down as: 0.02 for Iris setosa, 0.95 for Iris versicolor, and 0.03 for Iris virginica. This means the model predicts, with 95% probability, that the unlabeled example is an Iris versicolor.
Create the model using Keras
The TensorFlow tf.keras API is the preferred way to create models and layers. With this API, you can easily build models and experiment, while Keras handles the complexity of connecting everything together.
The tf.keras.Sequential model is a linear stack of layers. Its constructor takes a list of layer instances; in this example, two Dense layers (each with 10 nodes) and an output layer with 3 nodes representing the label predictions. The first layer's input_shape parameter corresponds to the number of features in the dataset and is required:
model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, activation=tf.nn.relu, input_shape=(4,)),
    tf.keras.layers.Dense(10, activation=tf.nn.relu),
    tf.keras.layers.Dense(3)
])
The activation function determines the output shape of each node in the layer. These nonlinearities are important; without them, the model would be equivalent to a single layer. There are many activation functions, but ReLU is the most common for hidden layers.
The ideal number of hidden layers and neurons depends on the question and data set. As with many aspects of machine learning, choosing the best shape of a neural network requires a certain level of knowledge and experimental foundation. In general, increasing the number of hidden layers and neurons usually results in more powerful models, which require more data to train effectively.
Using the model
Let's take a quick look at what this model does to a batch of features:
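A sketch, assuming the features batch pulled from the Dataset above: call the model and inspect the first five rows of raw outputs (logits):

predictions = model(features)
predictions[:5]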
<tf.Tensor: id=In 231, shape=(5, 3), dtype=float32, numpy=
Array ([[0.40338838, 0.01194552.1.964499 ],
[0.5877474 , 0.02103703.2.9969394 ],
[0.40222907.0.35343137.0.7817157 ],
[0.4376807 , 0.40464264.0.8379218 ],
[0.39644662.0.31841943.0.8436158 ]], dtype=float32)>
Copy the code
In this example, each example returns a logit for each class.
To convert these logits into a probability for each class, use the softmax function:
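A one-line sketch of that conversion, applied to the predictions computed above:

tf.nn.softmax(predictions[:5])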
<tf.Tensor: id=In 236, shape=(5, 3), dtype=float32, numpy=
Array ([[0.36700222, 0.55596304.0.07703481],
[0.3415203 , 0.62778115.0.03069854],
[0.2622449 , 0.55832386.0.17943124],
[0.25050646.0.58161455.0.167879 ],
[0.27149206.0.5549062 , 0.17360175]], dtype=float32)>
Copy the code
Taking the tf.argmax across classes gives the predicted class index. But the model hasn't been trained yet, so these aren't good predictions:
print("Prediction: {}".format(tf.argmax(predictions, axis=1)))
print(" Labels: {}".format(labels))
Prediction: [1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]
Labels:     [1 2 0 0 0 2 0 1 0 2 0 0 2 2 2 2 1 2 2 2 1 2 0 2 1 0 2 2 1 1 1 2 2 2]
Train the model
Training is the stage of machine learning when the model is gradually optimized, or when the model learns the dataset. The goal is to learn enough about the structure of the training dataset to make predictions about unseen data. If a model learns too much about the training dataset, its predictions only work for the data it has seen and will not generalize. This problem is called overfitting: it is like memorizing the answers instead of understanding how to solve the problem.
The iris classification problem is an example of supervised machine learning: models are trained on samples containing labels. In unsupervised machine learning, samples do not contain labels. Instead, models usually find patterns in features.
Define loss and gradient functions
During both the training and evaluation phases, we need to calculate the loss of the model. This measures how far the model’s predictions deviate from the expected label, that is, how badly the model works. We want to minimize or optimize this value as much as possible.
Our model will use the tf.keras.losses.SparseCategoricalCrossentropy function to calculate its loss. This function takes the model's class probability predictions and the desired labels, and returns the average loss across the examples.
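A minimal sketch of that setup, with from_logits=True because the model's last Dense layer has no activation; the loss helper defined here is reused by the training loop below:

loss_object = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

def loss(model, x, y):
    # Run the model and score its raw outputs against the true labels
    y_ = model(x)
    return loss_object(y_true=y, y_pred=y_)

l = loss(model, features, labels)
print("Loss test: {}".format(l))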
Loss test: 2.1644210815429688
Use the tf.GradientTape context to calculate the gradients used to optimize the model.
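A sketch of the gradient helper used by the training loop below, assuming the loss function defined above:

def grad(model, inputs, targets):
    # Record the forward pass so the tape can differentiate the loss
    with tf.GradientTape() as tape:
        loss_value = loss(model, inputs, targets)
    return loss_value, tape.gradient(loss_value, model.trainable_variables)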
Creating the optimizer
The optimizer applies the computed gradients to the model's variables to minimize the loss function. You can think of the loss function as a curved surface on which we want to find the lowest point by walking around. The gradient points in the direction of steepest ascent, so we travel the opposite way and move down the hill. By iteratively calculating the loss and gradients for each batch, we adjust the model during training. Gradually, the model will find the best combination of weights and bias to minimize the loss. The lower the loss, the better the model's predictions.
TensorFlow has many optimization algorithms available for training. learning_rate sets the step size to take for each iteration down the hill. This is a hyperparameter that you will often need to adjust to achieve better results.
Let’s set up the optimizer:
optimizer = tf.keras.optimizers.Adam(learning_rate=0.01)
We will use it to calculate a single optimization step:
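A sketch of one step, assuming the grad helper above; optimizer.iterations counts the steps taken so far:

loss_value, grads = grad(model, features, labels)

print("Step: {}, Initial Loss: {}".format(optimizer.iterations.numpy(),
                                          loss_value.numpy()))

optimizer.apply_gradients(zip(grads, model.trainable_variables))

print("Step: {},         Loss: {}".format(optimizer.iterations.numpy(),
                                          loss(model, features, labels).numpy()))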
Step: 0, Initial Loss: 2.1644210815429688
Step: 1, Loss: 1.8952136039733887
Training loop
With everything in place, it's time to train the model! The training loop feeds the dataset examples into the model to help it make better predictions. The following code block sets up these training steps:
- Iterate over each epoch. One epoch is one pass through the dataset.
- Within an epoch, iterate over each example in the training Dataset, grabbing its features (x) and label (y).
- Using the example's features, make a prediction and compare it with the label. Measure the prediction's inaccuracy and use that to calculate the model's loss and gradients.
- Use an optimizer to update the model's variables.
- Keep track of some statistics for visualization.
- Repeat these steps for each epoch.
The num_epochs variable is the number of times to loop over the dataset collection. Counter-intuitively, training a model longer does not guarantee a better model. num_epochs is a hyperparameter that can be tuned; choosing the right number usually requires both experience and experimentation.
## Note: Rerunning this cell uses the same model variables

# Keep the results for plotting
train_loss_results = []
train_accuracy_results = []

num_epochs = 201

for epoch in range(num_epochs):
    epoch_loss_avg = tf.keras.metrics.Mean()
    epoch_accuracy = tf.keras.metrics.SparseCategoricalAccuracy()

    # Training loop - using batches of 32
    for x, y in train_dataset:
        # Optimize the model
        loss_value, grads = grad(model, x, y)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))

        # Track progress
        epoch_loss_avg(loss_value)  # Add the current batch loss
        # Compare the predicted label with the actual label
        epoch_accuracy(y, model(x))

    # End of epoch
    train_loss_results.append(epoch_loss_avg.result())
    train_accuracy_results.append(epoch_accuracy.result())

    if epoch % 50 == 0:
        print("Epoch {:03d}: Loss: {:.3f}, Accuracy: {:.3%}".format(
            epoch, epoch_loss_avg.result(), epoch_accuracy.result()))
Epoch 000: Loss: 1.435, Accuracy: 30.000%
Epoch 050: Loss: 0.091, Accuracy: 97.500%
Epoch 100: Loss: 0.062, Accuracy: 97.500%
Epoch 150: Loss: 0.052, Accuracy: 98.333%
Epoch 200: Loss: 0.055, Accuracy: 99.167%
Visualize the loss function over time
While it's helpful to print out the model's training progress, it's often more helpful to see that progress in a chart, watching the loss go down and the accuracy go up.
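A minimal sketch of the chart code leading up to the plt.show() call below, assuming the train_loss_results and train_accuracy_results lists collected during training:

fig, axes = plt.subplots(2, sharex=True, figsize=(12, 8))
fig.suptitle('Training Metrics')

axes[0].set_ylabel("Loss", fontsize=14)
axes[0].plot(train_loss_results)

axes[1].set_ylabel("Accuracy", fontsize=14)
axes[1].set_xlabel("Epoch", fontsize=14)
axes[1].plot(train_accuracy_results)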
plt.show()
Evaluate the effectiveness of the model
The model has been trained, and now we can get some statistics on its performance.
Evaluating means determining how effectively the model makes predictions. To determine the model's effectiveness at iris classification, pass some sepal and petal measurements to the model and ask it to predict the iris species they represent. Then compare the model's predictions against the actual labels. For example, a model that picked the correct species on half of the input examples has an accuracy of 0.5. The table below shows a slightly more effective model, getting 4 out of 5 predictions correct, for 80% accuracy:
| Sepal length | Sepal width | Petal length | Petal width | Label | Model prediction |
|---|---|---|---|---|---|
| 5.9 | 3.0 | 4.3 | 1.5 | 1 | 1 |
| 6.9 | 3.1 | 5.4 | 2.1 | 2 | 2 |
| 5.1 | 3.3 | 1.7 | 0.5 | 0 | 0 |
| 6.0 | 3.4 | 4.5 | 1.6 | 1 | 2 |
| 5.5 | 2.5 | 4.0 | 1.3 | 1 | 1 |

Figure. An iris classifier that is 80% accurate.
Build the test dataset
Evaluating the model is similar to training the model. The biggest difference is that the examples come from a separate test set rather than the training set. To fairly assess a model's effectiveness, the examples used to evaluate a model must be different from the examples used to train it.
The test Dataset is set up just like the training Dataset: download the CSV text file and parse the values:
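A sketch, reusing column_names, batch_size, and the pack_features_vector function from the training pipeline; the URL matches the download log below:

test_url = "https://storage.googleapis.com/download.tensorflow.org/data/iris_test.csv"

test_fp = tf.keras.utils.get_file(fname=os.path.basename(test_url),
                                  origin=test_url)

test_dataset = tf.data.experimental.make_csv_dataset(
    test_fp,
    batch_size,
    column_names=column_names,
    label_name=label_name,
    num_epochs=1,
    shuffle=False)  # evaluation doesn't need shuffled batches

test_dataset = test_dataset.map(pack_features_vector)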
Downloading data from https://storage.googleapis.com/download.tensorflow.org/data/iris_test.csv
8192/573 [======================================] - 0s 0us/step
Evaluate the model on the test dataset
In the following code cell, we iterate over each example in the test set and compare the model's prediction against the actual label, measuring the model's accuracy across the entire test set:
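A sketch of that measurement, assuming a tf.keras.metrics.Accuracy accumulator that feeds the print statement below:

test_accuracy = tf.keras.metrics.Accuracy()

for (x, y) in test_dataset:
    # The model returns logits; the largest one marks the predicted class
    logits = model(x)
    prediction = tf.argmax(logits, axis=1, output_type=tf.int32)
    test_accuracy(prediction, y)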
print("Test set accuracy: {:.3%}".format(test_accuracy.result()))
Test set accuracy: 96.667%
For example, we can see that for the last batch of data, the model usually predicts correctly:
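A one-line sketch that pairs each true label with its prediction for the final batch left over from the loop above:

tf.stack([y, prediction], axis=1)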
<tf.Tensor: id=In 115075, shape=(30, 2), dtype=int32, numpy=
array([[1, 1],
[2.2],
[0.0],
[1.1],
[1.1],
[1.1],
[0.0],
[2.1],
[1.1],
[2.2],
[2.2],
[0.0],
[2.2],
[1.1],
[1.1],
[0.0],
[1.1],
[0.0],
[0.0],
[2.2],
[0.0],
[1.1],
[2.2],
[1.1],
[1.1],
[1.1],
[0.0],
[1.1],
[2.2],
[1.1]], dtype=int32)>
Copy the code
Use the trained model to make predictions
We've trained a model and "proven" that it's good, but not perfect, at classifying iris species. Now let's use the trained model to make some predictions on unlabeled examples, that is, examples that contain features but no label.
In real life, unlabeled examples could come from many different sources, including apps, CSV files, and data feeds. For now, we will manually provide three unlabeled examples and predict their labels. Recall the label-number mapping:
- 0: Iris setosa
- 1: Iris versicolor
- 2: Iris virginica
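A sketch of the unlabeled input, with three illustrative measurement vectors (hypothetical values chosen to span the three species) fed directly to the model:

predict_dataset = tf.convert_to_tensor([
    [5.1, 3.3, 1.7, 0.5],   # measurements of a setosa-like flower
    [5.9, 3.0, 4.2, 1.5],   # versicolor-like
    [6.9, 3.1, 5.4, 2.1]    # virginica-like
])

predictions = model(predict_dataset)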
for i, logits in enumerate(predictions):
    class_idx = tf.argmax(logits).numpy()
    p = tf.nn.softmax(logits)[class_idx]
    name = class_names[class_idx]
    print("Example {} prediction: {} ({:4.1f}%)".format(i, name, 100 * p))
Example 0 prediction: Iris setosa (99.9%)
Example 1 prediction: Iris versicolor (100.0%)
Example 2 prediction: Iris virginica (96.2%)