There are no complex systems in reality — Minimalism Paradigm 1: Things are simple
0. Preface
Why was this article written?
1. This is not an advertisement. The author is no influencer, and nobody asked him to write it, so please read with confidence.
2. This is not an analysis piece either; the author's ability is limited, and he cannot offer a deep interpretation. His earlier article on TF2.0 (zhuanlan.zhihu.com/p/85111240) remains half-finished, to everyone's amusement.
3. It can be regarded as a technical article: through installing OneFlow and running simple operations, it compares OneFlow against the mainstream deep learning frameworks.
A brief look at the current deep learning framework landscape:
Although this is not an analysis article, some analysis of deep learning frameworks is in order. The mainstream deep learning frameworks today (excluding inference frameworks) are TensorFlow, PyTorch, and MXNet; behind them, it is really a competition of capital and resources. TensorFlow still dominates thanks to its completeness and its accumulated deployment ecosystem, which forced PyTorch, MXNet, and other smaller frameworks to back ONNX in order to break TensorFlow's deployment monopoly. PyTorch's success in taking a large share of the academic pie from TensorFlow owes much to its choice of direction: rather than attacking TensorFlow's strengths, it targeted TensorFlow's pain points and won users over through ease of use. MXNet, on the other hand, is unlikely to catch up with TensorFlow and PyTorch, because its advantages do not address their pain points.
Looking at the domestic scene, Baidu's PaddlePaddle was open-sourced early but has remained lukewarm. Domestic frameworks indeed cannot yet compete with the international mainstream, and the gap is large. But this year four domestic institutions and companies chose to open-source their frameworks: Huawei's MindSpore, Tsinghua University's Jittor, Megvii's MegEngine, and OneFlow Inc.'s OneFlow. Even if they are not yet competitive, amid the China-US trade friction, independently owned technology is drawing more and more attention.
In my humble opinion, developers do not really care whose framework they use; they care about its completeness and ease of use. The massive migration of developers from TensorFlow 1.x to PyTorch showed that brand matters little to developers, while the framework itself matters a great deal. So the author remains optimistic about domestic open-source frameworks: do the work well, and users will follow.
Let’s take a look at how well OneFlow is doing.
1. Get started
The following sections draw on the OneFlow documentation: docs.oneflow.org/index.html
1.0 Introduction
What is OneFlow?
OneFlow is an open-source deep learning framework with an all-new architecture design, aiming to be a world-leading, industrial-grade, general-purpose framework.
What are the advantages of OneFlow?
- A new distributed training experience: multi-machine, multi-GPU training as simple as a single GPU
- Fits perfectly into one-stop platforms (K8s + Docker)
- Native support for very large models
- Near-zero runtime overhead and a linear speedup ratio
- Flexible support for multiple deep learning compilers
- Automatic mixed precision
- Neutral, open, and broadly cooperative
- A continuously improving operator set and model library
1.1 Installation Guide
Install OneFlow stable release
System requirements: The Nvidia driver and CUDA have been installed
- Python >= 3.5
- Nvidia Linux x86_64 driver version >= 440.33
Install OneFlow with legacy CUDA support using the command matching your CUDA version:
pip install --find-links https://oneflow-inc.github.io/nightly oneflow_cu102 --user
pip install --find-links https://oneflow-inc.github.io/nightly oneflow_cu101 --user
pip install --find-links https://oneflow-inc.github.io/nightly oneflow_cu100 --user
pip install --find-links https://oneflow-inc.github.io/nightly oneflow_cu92 --user
pip install --find-links https://oneflow-inc.github.io/nightly oneflow_cu91 --user
pip install --find-links https://oneflow-inc.github.io/nightly oneflow_cu90 --user
How do you check the Nvidia Linux x86_64 driver version? Below you can see the author's driver version is 450.57:
(base) song@songxpc:~$ nvidia-smi
Thu Aug 20 12:10:44 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.57       Driver Version: 450.57       CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 208...  Off  | 00000000:01:00.0 Off |                  N/A |
| 47%   54C    P2    80W / 257W |   5136MiB / 11019MiB |     38%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1668      G   /usr/bin/totem                     10MiB |
|    0   N/A  N/A      1974      G   /usr/lib/xorg/Xorg                857MiB |
|    0   N/A  N/A      2128      G   /usr/bin/gnome-shell              331MiB |
|    0   N/A  N/A      2563      G   ...token=2514374358980620094      376MiB |
|    0   N/A  N/A     14664      C   python                           2531MiB |
|    0   N/A  N/A     27295      G   ...AAAAAAAAA= --shared-files      872MiB |
|    0   N/A  N/A     31690      G   ...token=3577040725527546973      149MiB |
+-----------------------------------------------------------------------------+
(base) song@songxpc:~$
How do you check the local CUDA version? As shown below, the author's CUDA is 10.2:
(base) song@songxpc:~$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Oct_23_19:24:38_PDT_2019
Cuda compilation tools, release 10.2, V10.2.89
Install the OneFlow framework in a Conda environment (Miniconda is recommended).
Enter the following commands in sequence:
conda create --name OF python=3.7 -y
conda activate OF
conda install cudatoolkit=10.2 cudnn=7 -y
pip install --find-links https://oneflow-inc.github.io/nightly oneflow_cu102 --user
Test the installation using the following instructions:
(OF) song@songxpc:~$ python
Python 3.7.8 | packaged by conda-forge | (default, Jul 31 2020, 02:25:08)
[GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import oneflow as of
>>> of.__version__
'0.1.8'
>>>
1.2 Three minutes to get started
This section shows how to get started with OneFlow quickly: you can complete a full neural network training run in 3 minutes.
Run an example
If OneFlow is already installed, use the following command to download the mlp_mnist.py script from the document repository and run it.
wget https://docs.oneflow.org/code/quick_start/mlp_mnist.py # Download script
python mlp_mnist.py  # Run the script
You get output similar to the following:
2.7290366 0.81281316 0.50629824 0.35949975 0.35245502 ...
The output is a series of numbers, each representing the loss value after a round of training; the goal of training is to make the loss as small as possible. At this point you have completed a full neural network training run with OneFlow.
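For intuition about what these loss numbers measure: the example below trains with sparse softmax cross entropy, and the following NumPy-only sketch (an illustration of the math, not OneFlow's implementation) shows how one batch's loss is computed:

```python
import numpy as np

def sparse_softmax_cross_entropy(labels, logits):
    """Per-sample cross entropy between integer labels and raw logits.
    NumPy sketch of the math only -- not OneFlow code."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels]  # true-class log-probability

labels = np.array([2, 0])
logits = np.array([[0.1, 0.2, 3.0], [2.0, 1.0, 0.5]])
loss = sparse_softmax_cross_entropy(labels, logits)
print(loss.mean())  # one scalar per batch, like the numbers printed above
```

The closer a batch's predicted class probabilities are to the true labels, the smaller this number gets, which is why the printed values shrink over training.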
Code reading
Here is the complete code.
# mlp_mnist.py
import oneflow as flow
import oneflow.typing as tp

BATCH_SIZE = 100


@flow.global_function(type="train")
def train_job(
    images: tp.Numpy.Placeholder((BATCH_SIZE, 1, 28, 28), dtype=flow.float),
    labels: tp.Numpy.Placeholder((BATCH_SIZE,), dtype=flow.int32),
) -> tp.Numpy:
    with flow.scope.placement("cpu", "0:0"):
        initializer = flow.truncated_normal(0.1)
        reshape = flow.reshape(images, [images.shape[0], -1])
        hidden = flow.layers.dense(
            reshape,
            512,
            activation=flow.nn.relu,
            kernel_initializer=initializer,
            name="dense1",
        )
        logits = flow.layers.dense(
            hidden, 10, kernel_initializer=initializer, name="dense2"
        )
        loss = flow.nn.sparse_softmax_cross_entropy_with_logits(labels, logits)
    lr_scheduler = flow.optimizer.PiecewiseConstantScheduler([], [0.1])
    flow.optimizer.SGD(lr_scheduler, momentum=0).minimize(loss)
    return loss


if __name__ == "__main__":
    check_point = flow.train.CheckPoint()
    check_point.init()
    (train_images, train_labels), (test_images, test_labels) = flow.data.load_mnist(
        BATCH_SIZE, BATCH_SIZE
    )
    for i, (images, labels) in enumerate(zip(train_images, train_labels)):
        loss = train_job(images, labels)
        if i % 20 == 0:
            print(loss.mean())
Let's take a quick look at this code.
Compared with other deep learning frameworks, the distinctive part of OneFlow is here:
@flow.global_function(type="train")
def train_job(
    images: tp.Numpy.Placeholder((BATCH_SIZE, 1, 28, 28), dtype=flow.float),
    labels: tp.Numpy.Placeholder((BATCH_SIZE,), dtype=flow.int32),
) -> tp.Numpy:
train_job is a function decorated with @flow.global_function, usually called a job function. Only functions decorated with @flow.global_function can be recognized by OneFlow. type="train" marks a training job; type="predict" marks a validation or prediction job.
Training or prediction of a neural network in OneFlow requires two parts of information:
- One part is the structure of the neural network itself and its related parameters, which are defined in the job function above.
- The other part is the configuration used to train the network, such as the learning rate and the model optimization/update method. In the job function it is configured as follows:

lr_scheduler = flow.optimizer.PiecewiseConstantScheduler([], [0.1])
flow.optimizer.SGD(lr_scheduler, momentum=0).minimize(loss)
Apart from the job function and its configuration described above, this code contains all the elements for training a neural network:

- check_point.init(): initializes the network's model parameters;
- flow.data.load_mnist(BATCH_SIZE, BATCH_SIZE): prepares and loads the training data;
- train_job(images, labels): returns the loss value of each training step;
- print(loss.mean()): prints a loss value once every 20 training steps.
The above is only a simple network example. A more comprehensive and concrete walkthrough follows below, using a convolutional neural network for handwritten-digit recognition. You can also refer to the OneFlow basics topics for detailed discussion of various training issues, as well as the sample code and data provided for several classical networks.
2. Recognizing MNIST handwritten digits
In this article, we will learn how to:
- Configure the software and hardware environment using OneFlow's interfaces
- Define a training model using OneFlow's interfaces
- Implement a OneFlow training job function
- Save and load a model
- Implement a OneFlow validation job function
This article walks through every core step of using OneFlow by training the LeNet model on the MNIST dataset, with complete example code at the end.
Before studying the details, you can also run the scripts below to see them work (running them requires a GPU).
First, synchronize the document repository and switch to the corresponding path:
git clone https://github.com/Oneflow-Inc/oneflow-documentation.git
cd oneflow-documentation/cn/docs/code/quick_start/
- Model training: python lenet_train.py trains on the MNIST dataset and saves the model.
Output:
File mnist.npz already exist, path: ./mnist.npz
5.9947124 1.0865117 0.5317516 0.20937675 0.26428983 0.21764673 0.23443426 ...
The trained model is a prerequisite for lenet_eval.py and lenet_test.py below. You can also skip the training step and download a pre-trained model instead:
# in docs/code/quick_start/
wget https://oneflow-public.oss-cn-beijing.aliyuncs.com/online_document/docs/quick_start/lenet_models_1.zip
unzip lenet_models_1.zip
- Model validation: python lenet_eval.py validates the model just generated against the MNIST test set and reports its accuracy.
Output:
File mnist.npz already exist, path: ./mnist.npz
accuracy: 99.4%
- Image recognition
python lenet_test.py ./9.png
# output: prediction: 9
The above command uses the previously trained model to predict the content of the prepared image "9.png". You can also download the MNIST test images we have extracted to check the predictions of your own trained model.
MNIST dataset introduction
MNIST is a database of handwritten digits comprising a training set and a test set: the training set contains 60,000 images with their corresponding labels, and the test set contains 10,000 images with their labels. Yann LeCun et al. have normalized and centered the images and packaged them as binary files for download: yann.lecun.com/exdb/mnist/
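The flow.data.load_mnist(BATCH_SIZE, BATCH_SIZE) call used throughout this article returns the data already grouped into batches. The batching itself is plain array slicing; the sketch below demonstrates the idea on random data with MNIST-like shapes (the helper to_batches and the sample count are assumptions for illustration, not the loader's actual code):

```python
import numpy as np

def to_batches(images, labels, batch_size):
    """Group flat arrays of samples into fixed-size batches (sketch)."""
    n = (len(images) // batch_size) * batch_size  # drop the remainder
    image_batches = images[:n].reshape(-1, batch_size, *images.shape[1:])
    label_batches = labels[:n].reshape(-1, batch_size)
    return image_batches, label_batches

# stand-in data with MNIST-like shapes: NCHW images, integer labels
images = np.random.rand(600, 1, 28, 28).astype(np.float32)
labels = np.random.randint(0, 10, size=600).astype(np.int32)
image_batches, label_batches = to_batches(images, labels, 100)
print(image_batches.shape)  # (6, 100, 1, 28, 28)
```

Iterating over zip(image_batches, label_batches) then yields one (images, labels) pair per step, which is exactly the shape the job functions below expect.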
Define training model
oneflow.nn and oneflow.layers provide the operators needed for building models.
def lenet(data, train=False):
    initializer = flow.truncated_normal(0.1)
    conv1 = flow.layers.conv2d(
        data,
        32,
        5,
        padding="SAME",
        activation=flow.nn.relu,
        name="conv1",
        kernel_initializer=initializer,
    )
    pool1 = flow.nn.max_pool2d(
        conv1, ksize=2, strides=2, padding="SAME", name="pool1", data_format="NCHW"
    )
    conv2 = flow.layers.conv2d(
        pool1,
        64,
        5,
        padding="SAME",
        activation=flow.nn.relu,
        name="conv2",
        kernel_initializer=initializer,
    )
    pool2 = flow.nn.max_pool2d(
        conv2, ksize=2, strides=2, padding="SAME", name="pool2", data_format="NCHW"
    )
    reshape = flow.reshape(pool2, [pool2.shape[0], -1])
    hidden = flow.layers.dense(
        reshape,
        512,
        activation=flow.nn.relu,
        kernel_initializer=initializer,
        name="dense1",
    )
    if train:
        hidden = flow.nn.dropout(hidden, rate=0.5, name="dropout")
    return flow.layers.dense(hidden, 10, kernel_initializer=initializer, name="dense2")
In the above code, we set up a LeNet network model.
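It is worth verifying the tensor shapes this LeNet variant produces. With SAME padding a convolution keeps the spatial size, and each stride-2 pooling halves it, so the flattened input to dense1 has 64 * 7 * 7 = 3136 features. A quick arithmetic check, derived from the layer parameters above:

```python
def same_conv(h, w):
    # SAME padding, stride 1: spatial size is unchanged
    return h, w

def same_pool2(h, w):
    # SAME padding, stride 2: spatial size halves, rounded up
    return (h + 1) // 2, (w + 1) // 2

h, w = same_conv(28, 28)  # conv1: 28x28
h, w = same_pool2(h, w)   # pool1: 14x14
h, w = same_conv(h, w)    # conv2: 14x14
h, w = same_pool2(h, w)   # pool2: 7x7
channels = 64             # conv2 output channels
print(channels * h * w)   # -> 3136, the width of the flattened dense1 input
```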
Implement the training job function
OneFlow provides the oneflow.global_function decorator, which turns an ordinary Python function into a job function.
The global_function decorator
The oneflow.global_function decorator accepts a type argument specifying the job type: type="train" marks a training job, while type="predict" marks a prediction or validation job. The decorator also accepts a function_config object parameter for job-type-specific configuration.
@flow.global_function(type="train")
def train_job(
    images: tp.Numpy.Placeholder((BATCH_SIZE, 1, 28, 28), dtype=flow.float),
    labels: tp.Numpy.Placeholder((BATCH_SIZE,), dtype=flow.int32),
):
    # job function implementation ...
Here, tp.Numpy.Placeholder is a data placeholder, and the tp.Numpy return annotation specifies that the job function will return a numpy object when called.
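These placeholder annotations effectively declare a contract on the shape and dtype of each input. The toy class below (a hypothetical illustration of the concept, not OneFlow internals) shows what such a contract amounts to when checked against a numpy array:

```python
import numpy as np

class Placeholder:
    """Toy shape/dtype contract, loosely mimicking tp.Numpy.Placeholder
    (hypothetical illustration, not OneFlow internals)."""
    def __init__(self, shape, dtype):
        self.shape, self.dtype = shape, dtype

    def check(self, array):
        # a call site's numpy input must match the declared contract
        return array.shape == self.shape and array.dtype == self.dtype

BATCH_SIZE = 100
images_spec = Placeholder((BATCH_SIZE, 1, 28, 28), np.float32)
batch = np.zeros((BATCH_SIZE, 1, 28, 28), dtype=np.float32)
print(images_spec.check(batch))  # -> True
```

This is why the training loops below feed batches of exactly BATCH_SIZE images: the numpy inputs must match the declared placeholder shapes.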
Specify optimization features
Through the flow.optimizer interface we can specify the parameters to be optimized and the optimizer; OneFlow will then optimize in the specified way during each iteration of the training job.
@flow.global_function(type="train")
def train_job(
    images: tp.Numpy.Placeholder((BATCH_SIZE, 1, 28, 28), dtype=flow.float),
    labels: tp.Numpy.Placeholder((BATCH_SIZE,), dtype=flow.int32),
) -> tp.Numpy:
    with flow.scope.placement("gpu", "0:0"):
        logits = lenet(images, train=True)
        loss = flow.nn.sparse_softmax_cross_entropy_with_logits(
            labels, logits, name="softmax_loss"
        )
    lr_scheduler = flow.optimizer.PiecewiseConstantScheduler([], [0.1])
    flow.optimizer.SGD(lr_scheduler, momentum=0).minimize(loss)
    return loss
Above, we obtain loss via flow.nn.sparse_softmax_cross_entropy_with_logits and set minimizing loss as the optimization target.
lr_scheduler sets the learning-rate schedule; [0.1] means the initial learning rate is 0.1.
flow.optimizer.SGD specifies SGD as the optimizer; passing loss to minimize means the optimizer aims to minimize the loss.
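The logic of a piecewise-constant schedule can be sketched in a few lines of plain Python (an illustration of the schedule's behavior, not OneFlow's PiecewiseConstantScheduler class): with an empty boundary list and values [0.1], the learning rate is simply a constant 0.1.

```python
import bisect

def piecewise_constant(boundaries, values):
    """Toy piecewise-constant schedule: values[i] applies to steps before
    boundaries[i]; the last value applies from the last boundary onward."""
    assert len(values) == len(boundaries) + 1
    def lr(step):
        return values[bisect.bisect_right(boundaries, step)]
    return lr

lr = piecewise_constant([], [0.1])  # the constant schedule used in the text
staged = piecewise_constant([1000, 2000], [0.1, 0.01, 0.001])
print(lr(0), staged(500), staged(1500), staged(5000))  # -> 0.1 0.1 0.01 0.001
```

A non-empty boundary list, as in the staged example, is how a learning rate is typically decayed in stages during longer training runs.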
Call job functions and interact
The training can begin by calling the job function.
The result returned when the job function is called is determined by the return value type specified when the job function is defined. It can return one or more results.
Returns an example of the result
Define job functions in lenet_train.py:
@flow.global_function(type="train")
def train_job(
    images: tp.Numpy.Placeholder((BATCH_SIZE, 1, 28, 28), dtype=flow.float),
    labels: tp.Numpy.Placeholder((BATCH_SIZE,), dtype=flow.int32),
) -> tp.Numpy:
    with flow.scope.placement("gpu", "0:0"):
        logits = lenet(images, train=True)
        loss = flow.nn.sparse_softmax_cross_entropy_with_logits(
            labels, logits, name="softmax_loss"
        )
    lr_scheduler = flow.optimizer.PiecewiseConstantScheduler([], [0.1])
    flow.optimizer.SGD(lr_scheduler, momentum=0).minimize(loss)
    return loss
The job function returns a value of type tp.Numpy, so when called, a Numpy object is returned:
for epoch in range(20):
    for i, (images, labels) in enumerate(zip(train_images, train_labels)):
        loss = train_job(images, labels)
        if i % 20 == 0:
            print(loss.mean())
We call train_job in a loop and print one loss value every 20 iterations.
An example of returning multiple results
Job functions defined in the validation model code lenet_eval.py:
@flow.global_function(type="predict")
def eval_job(
    images: tp.Numpy.Placeholder((BATCH_SIZE, 1, 28, 28), dtype=flow.float),
    labels: tp.Numpy.Placeholder((BATCH_SIZE,), dtype=flow.int32),
) -> Tuple[tp.Numpy, tp.Numpy]:
    with flow.scope.placement("gpu", "0:0"):
        logits = lenet(images, train=False)
        loss = flow.nn.sparse_softmax_cross_entropy_with_logits(
            labels, logits, name="softmax_loss"
        )
    return (labels, logits)
The return annotation of the job function is Tuple[tp.Numpy, tp.Numpy], so calling it returns a tuple of two elements, each a numpy object:
for i, (images, labels) in enumerate(zip(test_images, test_labels)):
    labels, logits = eval_job(images, labels)
    acc(labels, logits)
We call the job function to get labels and logits back and use them to evaluate the model's accuracy.
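Given labels and logits as numpy arrays, the accuracy computation reduces to an argmax comparison. A small self-contained sketch with toy data (the helper batch_accuracy is illustrative, not part of OneFlow):

```python
import numpy as np

def batch_accuracy(labels, logits):
    """Fraction of samples whose highest logit matches the label."""
    predictions = np.argmax(logits, axis=1)
    return np.mean(predictions == labels)

labels = np.array([1, 0, 2, 2])
logits = np.array([
    [0.1, 0.8, 0.1],  # predicts 1 (correct)
    [0.7, 0.2, 0.1],  # predicts 0 (correct)
    [0.5, 0.3, 0.2],  # predicts 0 (wrong, label is 2)
    [0.1, 0.2, 0.7],  # predicts 2 (correct)
])
print(batch_accuracy(labels, logits))  # -> 0.75
```

The acc function shown later does the same thing, but accumulates counts across batches instead of averaging within one.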
Synchronous versus asynchronous invocation
All the code in this article calls job functions synchronously. In fact, OneFlow also supports calling job functions asynchronously, which is detailed in the article getting the results of job functions.
Initialization, saving, and loading of models
Initialization and saving of the model
The oneflow.train.CheckPoint class constructs objects that can initialize, save, and load models. During training we can initialize the model with the init method and save it with the save method, for example:
if __name__ == '__main__':
    check_point = flow.train.CheckPoint()
    check_point.init()
    # load data and train ...
    check_point.save('./lenet_models_1')
Once saved, we will have a directory named “lenet_models_1” that contains the subdirectories and files corresponding to the model parameters.
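The idea behind save and load is writing each named parameter to disk and reading it back. The sketch below mimics that idea generically with numpy files; it is not OneFlow's actual on-disk checkpoint format, and the parameter name is made up for illustration:

```python
import os
import tempfile
import numpy as np

def save_params(params, directory):
    """Write each named array to <directory>/<name>.npy (generic sketch)."""
    os.makedirs(directory, exist_ok=True)
    for name, value in params.items():
        np.save(os.path.join(directory, name + ".npy"), value)

def load_params(directory):
    """Read every .npy file back into a name -> array dict."""
    params = {}
    for fname in os.listdir(directory):
        if fname.endswith(".npy"):
            params[fname[:-4]] = np.load(os.path.join(directory, fname))
    return params

with tempfile.TemporaryDirectory() as d:
    weights = {"dense1-weight": np.random.rand(512, 784)}
    save_params(weights, os.path.join(d, "lenet_models_1"))
    restored = load_params(os.path.join(d, "lenet_models_1"))
    print(np.allclose(weights["dense1-weight"], restored["dense1-weight"]))  # -> True
```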
Model loading
During validation or prediction, we can load existing model parameters with the oneflow.train.CheckPoint.load method, for example:
if __name__ == '__main__':
    check_point = flow.train.CheckPoint()
    check_point.load("./lenet_models_1")
    # validation process ...
Load automatically reads the previously saved model and loads it.
Model validation
A validation job function differs little from a training job function, except that during validation the model parameters come from an already-saved model, so they need neither initialization nor updating during iteration.
Writing the validation job function
@flow.global_function(type="predict")
def eval_job(
    images: tp.Numpy.Placeholder((BATCH_SIZE, 1, 28, 28), dtype=flow.float),
    labels: tp.Numpy.Placeholder((BATCH_SIZE,), dtype=flow.int32),
) -> Tuple[tp.Numpy, tp.Numpy]:
    with flow.scope.placement("gpu", "0:0"):
        logits = lenet(images, train=False)
        loss = flow.nn.sparse_softmax_cross_entropy_with_logits(
            labels, logits, name="softmax_loss"
        )
    return (labels, logits)
The return annotation Tuple[tp.Numpy, tp.Numpy] means a tuple of two elements is returned, each a numpy object. We call the validation job function and compute the accuracy from the results it returns.
Iterative validation
The acc function below accumulates the total number of samples and the number of correctly predicted samples; we call the job function to get labels and logits:
g_total = 0
g_correct = 0


def acc(labels, logits):
    global g_total
    global g_correct
    predictions = np.argmax(logits, 1)
    right_count = np.sum(predictions == labels)
    g_total += labels.shape[0]
    g_correct += right_count
Call the validation job function:
if __name__ == "__main__":
    check_point = flow.train.CheckPoint()
    check_point.load("./lenet_models_1")
    (train_images, train_labels), (test_images, test_labels) = flow.data.load_mnist(
        BATCH_SIZE, BATCH_SIZE
    )
    for epoch in range(1):
        for i, (images, labels) in enumerate(zip(test_images, test_labels)):
            labels, logits = eval_job(images, labels)
            acc(labels, logits)
    print("accuracy: {0:.1f}%".format(g_correct * 100 / g_total))
Above, the validation job function is called in a loop, and the final output is the accuracy on the test set.
Predicting images
By modifying the validation code above so that the input comes from an original image rather than the prepared dataset, we can use the model to predict image content.
def load_image(file):
    im = Image.open(file).convert("L")
    im = im.resize((28, 28), Image.ANTIALIAS)
    im = np.array(im).reshape(1, 1, 28, 28).astype(np.float32)
    im = (im - 128.0) / 255.0
    im.reshape((-1, 1, 1, im.shape[1], im.shape[2]))
    return im


def main():
    if len(sys.argv) != 2:
        usage()
        return

    check_point = flow.train.CheckPoint()
    check_point.load("./lenet_models_1")
    image = load_image(sys.argv[1])
    logits = eval_job(image, np.zeros((1,)).astype(np.int32))
    prediction = np.argmax(logits, 1)
    print("prediction: {}".format(prediction[0]))


if __name__ == "__main__":
    main()
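The preprocessing in load_image mirrors what the training data looked like: grayscale, 28x28, NCHW layout, and values shifted by (x - 128) / 255. The NumPy-only check below (with a synthetic image, no PIL or OneFlow needed) confirms the range that normalization produces:

```python
import numpy as np

def normalize(im):
    # the same (x - 128) / 255 shift used in load_image above
    return (im.astype(np.float32) - 128.0) / 255.0

im = np.random.randint(0, 256, size=(1, 1, 28, 28)).astype(np.float32)
out = normalize(im)
# pixel 0 maps to -128/255, pixel 255 maps to 127/255
print(-128.0 / 255.0 <= out.min() and out.max() <= 127.0 / 255.0)  # -> True
```

If an input image were fed in without this shift, it would sit far outside the value distribution the model was trained on, and the predictions would be unreliable.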
The complete code
Model training
Code: lenet_train.py
# lenet_train.py
import oneflow as flow
import oneflow.typing as tp

BATCH_SIZE = 100


def lenet(data, train=False):
    initializer = flow.truncated_normal(0.1)
    conv1 = flow.layers.conv2d(
        data,
        32,
        5,
        padding="SAME",
        activation=flow.nn.relu,
        name="conv1",
        kernel_initializer=initializer,
    )
    pool1 = flow.nn.max_pool2d(
        conv1, ksize=2, strides=2, padding="SAME", name="pool1", data_format="NCHW"
    )
    conv2 = flow.layers.conv2d(
        pool1,
        64,
        5,
        padding="SAME",
        activation=flow.nn.relu,
        name="conv2",
        kernel_initializer=initializer,
    )
    pool2 = flow.nn.max_pool2d(
        conv2, ksize=2, strides=2, padding="SAME", name="pool2", data_format="NCHW"
    )
    reshape = flow.reshape(pool2, [pool2.shape[0], -1])
    hidden = flow.layers.dense(
        reshape,
        512,
        activation=flow.nn.relu,
        kernel_initializer=initializer,
        name="dense1",
    )
    if train:
        hidden = flow.nn.dropout(hidden, rate=0.5, name="dropout")
    return flow.layers.dense(hidden, 10, kernel_initializer=initializer, name="dense2")


@flow.global_function(type="train")
def train_job(
    images: tp.Numpy.Placeholder((BATCH_SIZE, 1, 28, 28), dtype=flow.float),
    labels: tp.Numpy.Placeholder((BATCH_SIZE,), dtype=flow.int32),
) -> tp.Numpy:
    with flow.scope.placement("gpu", "0:0"):
        logits = lenet(images, train=True)
        loss = flow.nn.sparse_softmax_cross_entropy_with_logits(
            labels, logits, name="softmax_loss"
        )
    lr_scheduler = flow.optimizer.PiecewiseConstantScheduler([], [0.1])
    flow.optimizer.SGD(lr_scheduler, momentum=0).minimize(loss)
    return loss


if __name__ == "__main__":
    flow.config.gpu_device_num(1)
    check_point = flow.train.CheckPoint()
    check_point.init()
    (train_images, train_labels), (test_images, test_labels) = flow.data.load_mnist(
        BATCH_SIZE, BATCH_SIZE
    )
    for epoch in range(20):
        for i, (images, labels) in enumerate(zip(train_images, train_labels)):
            loss = train_job(images, labels)
            if i % 20 == 0:
                print(loss.mean())
    check_point.save("./lenet_models_1")  # the existing folder must be removed first
    print("model saved")
Model validation
Code: lenet_eval.py
Pre-trained model: lenet_models_1.zip
# lenet_eval.py
import numpy as np
import oneflow as flow
from typing import Tuple
import oneflow.typing as tp

BATCH_SIZE = 100


def lenet(data, train=False):
    initializer = flow.truncated_normal(0.1)
    conv1 = flow.layers.conv2d(
        data,
        32,
        5,
        padding="SAME",
        activation=flow.nn.relu,
        name="conv1",
        kernel_initializer=initializer,
    )
    pool1 = flow.nn.max_pool2d(
        conv1, ksize=2, strides=2, padding="SAME", name="pool1", data_format="NCHW"
    )
    conv2 = flow.layers.conv2d(
        pool1,
        64,
        5,
        padding="SAME",
        activation=flow.nn.relu,
        name="conv2",
        kernel_initializer=initializer,
    )
    pool2 = flow.nn.max_pool2d(
        conv2, ksize=2, strides=2, padding="SAME", name="pool2", data_format="NCHW"
    )
    reshape = flow.reshape(pool2, [pool2.shape[0], -1])
    hidden = flow.layers.dense(
        reshape,
        512,
        activation=flow.nn.relu,
        kernel_initializer=initializer,
        name="dense1",
    )
    if train:
        hidden = flow.nn.dropout(hidden, rate=0.5, name="dropout")
    return flow.layers.dense(hidden, 10, kernel_initializer=initializer, name="dense2")


@flow.global_function(type="predict")
def eval_job(
    images: tp.Numpy.Placeholder((BATCH_SIZE, 1, 28, 28), dtype=flow.float),
    labels: tp.Numpy.Placeholder((BATCH_SIZE,), dtype=flow.int32),
) -> Tuple[tp.Numpy, tp.Numpy]:
    with flow.scope.placement("gpu", "0:0"):
        logits = lenet(images, train=False)
        loss = flow.nn.sparse_softmax_cross_entropy_with_logits(
            labels, logits, name="softmax_loss"
        )
    return (labels, logits)


g_total = 0
g_correct = 0


def acc(labels, logits):
    global g_total
    global g_correct
    predictions = np.argmax(logits, 1)
    right_count = np.sum(predictions == labels)
    g_total += labels.shape[0]
    g_correct += right_count


if __name__ == "__main__":
    check_point = flow.train.CheckPoint()
    check_point.load("./lenet_models_1")
    (train_images, train_labels), (test_images, test_labels) = flow.data.load_mnist(
        BATCH_SIZE, BATCH_SIZE
    )
    for epoch in range(1):
        for i, (images, labels) in enumerate(zip(test_images, test_labels)):
            labels, logits = eval_job(images, labels)
            acc(labels, logits)
    print("accuracy: {0:.1f}%".format(g_correct * 100 / g_total))
Digit prediction
Code: lenet_test.py
Pre-trained model: lenet_models_1.zip
MNIST dataset images: mnist_raw_images.zip
# lenet_test.py
import numpy as np
import oneflow as flow
from PIL import Image
import sys
import os
import oneflow.typing as tp

BATCH_SIZE = 1


def usage():
    usageHint = """
    usage:
        python {0} <image file>
    eg:
        python {0} {1}
    """.format(
        os.path.basename(sys.argv[0]), os.path.join(".", "9.png")
    )
    print(usageHint)


def lenet(data, train=False):
    initializer = flow.truncated_normal(0.1)
    conv1 = flow.layers.conv2d(
        data,
        32,
        5,
        padding="SAME",
        activation=flow.nn.relu,
        name="conv1",
        kernel_initializer=initializer,
    )
    pool1 = flow.nn.max_pool2d(
        conv1, ksize=2, strides=2, padding="SAME", name="pool1", data_format="NCHW"
    )
    conv2 = flow.layers.conv2d(
        pool1,
        64,
        5,
        padding="SAME",
        activation=flow.nn.relu,
        name="conv2",
        kernel_initializer=initializer,
    )
    pool2 = flow.nn.max_pool2d(
        conv2, ksize=2, strides=2, padding="SAME", name="pool2", data_format="NCHW"
    )
    reshape = flow.reshape(pool2, [pool2.shape[0], -1])
    hidden = flow.layers.dense(
        reshape,
        512,
        activation=flow.nn.relu,
        kernel_initializer=initializer,
        name="dense1",
    )
    if train:
        hidden = flow.nn.dropout(hidden, rate=0.5, name="dropout")
    return flow.layers.dense(hidden, 10, kernel_initializer=initializer, name="dense2")


@flow.global_function(type="predict")
def eval_job(
    images: tp.Numpy.Placeholder((BATCH_SIZE, 1, 28, 28), dtype=flow.float),
    labels: tp.Numpy.Placeholder((BATCH_SIZE,), dtype=flow.int32),
) -> tp.Numpy:
    with flow.scope.placement("gpu", "0:0"):
        logits = lenet(images, train=False)
    return logits


def load_image(file):
    im = Image.open(file).convert("L")
    im = im.resize((28, 28), Image.ANTIALIAS)
    im = np.array(im).reshape(1, 1, 28, 28).astype(np.float32)
    im = (im - 128.0) / 255.0
    im.reshape((-1, 1, 1, im.shape[1], im.shape[2]))
    return im


def main():
    if len(sys.argv) != 2:
        usage()
        return

    check_point = flow.train.CheckPoint()
    check_point.load("./lenet_models_1")
    image = load_image(sys.argv[1])
    logits = eval_job(image, np.zeros((1,)).astype(np.int32))
    prediction = np.argmax(logits, 1)
    print("prediction: {}".format(prediction[0]))


if __name__ == "__main__":
    main()
3. Summary
As you can see, building and training models with OneFlow is straightforward, and the cost of migrating to this framework is low. The author will now put down the pen and go exercise; I hope everyone stays in good health (~~).
4. References
1. "The Road to Supremacy: TensorFlow's history of struggle from 0.1 to 2.0" — the author's earlier article on Zhihu: zhuanlan.zhihu.com/p/85111240
2. docs.oneflow.org/build_ship/…