While Baidu has been promoting PaddlePaddle and Megvii is planning to release its own deep learning training framework MegEngine on the 25th of this month, Tsinghua University has suddenly released Jittor, the first deep learning framework developed by a Chinese university.
Website: cg.cs.tsinghua.edu.cn/jittor/
Code: github.com/Jittor/jitt…
If the development of deep learning frameworks is compared to China's Spring and Autumn and Warring States periods, the years before 2018 count as the Spring and Autumn period: the early Torch (2002), Theano (2007), Caffe (2013), Deeplearning4J (2014), Keras (2015), TensorFlow (2015), MXNet (2015), Chainer (2015), Microsoft CNTK (2016), Baidu PaddlePaddle (2016), Caffe2 (2017), PyTorch (2017), and so on. There were even micro-frameworks built on top of these frameworks, such as FastAI and TFLayer (Keras also started this way). New deep learning frameworks emerged one after another, and the competition was heated.
However, since 2018 deep learning has entered the Warring States period, with powerful states constantly encroaching on the territory of weaker ones, so that today PyTorch and TensorFlow dominate. Caffe, once the industry leader, still sees some production use. Keras has essentially been absorbed into TensorFlow, Chainer has announced its migration to PyTorch, and Caffe2 has been merged into PyTorch. MXNet has excellent textbooks and many open-source models, but remains lukewarm. Meanwhile PaddlePaddle, developed in China, has recently spent heavily promoting its detection framework, server deployment framework, and AI Studio on major public accounts, still striving to become a major power.
Megvii is also expected to launch its MegEngine training framework on the 25th of this month, open-sourcing several model examples at the same time and joining the fray. And today, Tsinghua University released Jittor, the first deep learning framework developed by a Chinese university.
First, an introduction to the Jittor development team. The team comes from the graphics laboratory of the Department of Computer Science at Tsinghua University. The laboratory is directed by Professor Hu Shimin, and the main developers are Liang Dun, Yang Guoye, Yang Guowei, and Zhou Wenyang.
1
The basic composition of Jittor
According to the Jittor website, the framework is built on dynamic compilation and consists of two core parts: meta-operators and a unified computation graph.
The meta-operator is the basic unit of computation, also called an OP. The authors claim meta-operators are as easy to use as NumPy while supporting more complex and more efficient operations. Basic meta-operators can be fused into convolution, pooling, BN, and other operators. The original post includes a diagram showing how a model is built up from meta-operators to basic deep learning units (the diagram looks a bit dated, rather like an automation-system introduction from ten years ago~).
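To make the fusion idea concrete, below is a rough sketch of building a convolution out of meta-operators, adapted from the spirit of the meta-operator tutorial listed later in this post; the reindex and broadcast_var names and their signatures are assumptions based on the release-time documentation and may differ in other versions.

import numpy as np
import jittor as jt

def conv(x, w):
    # x: feature map in NHWC layout, w: kernel of shape (Kh, Kw, C, Kc)
    N, H, W, C = x.shape
    Kh, Kw, _C, Kc = w.shape
    assert C == _C
    # reindex (a meta-operator) gathers every Kh x Kw patch of x into a 7-D view
    xx = x.reindex([N, H - Kh + 1, W - Kw + 1, Kh, Kw, C, Kc], [
        'i0',      # batch index
        'i1+i3',   # output row + kernel row
        'i2+i4',   # output column + kernel column
        'i5',      # input channel
    ])
    ww = w.broadcast_var(xx)   # broadcast (a meta-operator) the kernel to xx's shape
    yy = xx * ww               # element-wise multiply (a meta-operator)
    return yy.sum([3, 4, 5])   # reduction (a meta-operator) over Kh, Kw and C

# tiny usage check on random data
x = jt.float32(np.random.rand(1, 8, 8, 3))
w = jt.float32(np.random.rand(3, 3, 3, 16))
y = conv(x, w)                 # expected shape: (1, 6, 6, 16)

The point of the sketch is that a handful of generic meta-operators (reindex, broadcast, multiply, reduce) compose into a convolution, which the framework can then fuse and compile as a single kernel.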
Jittor's built-in meta-operator compiler dynamically compiles these Python operators into C++ code.
The unified computation graph combines the advantages of static and dynamic graphs, is easy to use, and can efficiently optimize models on hardware such as CPUs and GPUs.
Jittor's built-in compiler is compatible with LLVM and automatically, dynamically compiles code for the target hardware, further optimizing the C++ code into low-level operators that are friendlier to the underlying hardware.
One of Jittor's more innovative aspects is its unified management of GPU memory and host memory: when video memory runs out, it can spill over to host memory, so you worry less about exhausting an 8-12 GB GPU. How much throughput drops when falling back to host memory, compared with staying in video memory, still needs benchmarking.
According to the presentation, Jittor contains some genuine low-level innovation, unlike the various domestic chips or operating systems that are merely repackaged shells.
2
Front-end usage
Jittor's back end is implemented in CUDA and C++, while the front end is Python and is very similar to PyTorch. Because parameters are saved and data is exchanged in the same NumPy and Pickle formats as PyTorch, Jittor can even load PyTorch models directly. Amazing~
The following is an example of building and training a two-layer fully connected network. For anyone who already knows PyTorch, you can practically import jittor as torch and switch seamlessly. This is no joke.
import numpy as np
import jittor as jt
from jittor import Module
from jittor import nn

class Model(Module):
    def __init__(self):
        self.layer1 = nn.Linear(1, 10)
        self.relu = nn.Relu()
        self.layer2 = nn.Linear(10, 1)

    def execute(self, x):
        x = self.layer1(x)
        x = self.relu(x)
        x = self.layer2(x)
        return x

def get_data(n):
    # generate random data for training and testing
    for i in range(n):
        x = np.random.rand(batch_size, 1)
        y = x * x
        yield jt.float32(x), jt.float32(y)

batch_size = 50   # not defined in the original snippet; values chosen to make it runnable
n = 1000
learning_rate = 0.1

model = Model()
optim = nn.SGD(model.parameters(), learning_rate)

for i, (x, y) in enumerate(get_data(n)):
    pred_y = model(x)
    loss = (pred_y - y) ** 2
    loss_mean = loss.mean()
    optim.step(loss_mean)   # combines the backward pass and the parameter update
    print(f"step {i}, loss = {loss_mean.data.sum()}")
It seems the only difference compared with PyTorch is that no loss.backward() is needed; the migration really is pretty seamless.
Jittor also provides a handy conversion script for PyTorch code. The conversion really is easy: the two frameworks have different back ends, but their front ends are practically twins, so anyone coming from PyTorch has essentially no learning cost.
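As a rough illustration of the direct-loading claim above, here is a minimal sketch of pulling PyTorch weights into a Jittor model. It assumes a Jittor-side loader along the lines of load_parameters(dict), a hypothetical checkpoint file model.pth, and matching parameter names on both sides; treat the exact API and file name as assumptions.

import torch   # only used to read the PyTorch checkpoint

jt_model = Model()   # the Jittor model defined in the example above

state = torch.load("model.pth", map_location="cpu")       # PyTorch state dict (hypothetical file)
numpy_state = {k: v.numpy() for k, v in state.items()}    # convert tensors to plain NumPy arrays
jt_model.load_parameters(numpy_state)                     # assumed Jittor loader; name may differ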
For installation instructions and other details, see the official website; I won't go into them here.
3
Model examples
It is fair to say that a framework's popularity depends not only on the ease of use of the framework itself, but also on its official examples and on how actively developers contribute. That is why the Baidu Paddle team has been implementing models of every kind at a furious pace; basically whatever TensorFlow has, Paddle builds its own version of.
Jittor likewise open-sourced complete examples of SSD, DeepLabV3+, and LSGAN this morning, at the same time as the release. Its published tutorials and examples include:
- Basic concepts: Op, Var
- Meta-operators: implement your own convolution layer with meta-operators
- Custom operators: write your operators in C++ and CUDA and compile them on the fly
- Example 1: Linear regression
- Example 2: MNIST image classification
- Example 3: LSGAN for image generation
- Example 4: DeepLabV3+ for semantic segmentation
- Example 5: SSD for object detection
In its SSD implementation with VGG16 as the backbone, Jittor's mAP on the Pascal dataset is even a thousandth higher than PyTorch's, and its inference is 10-50% faster.
I took a look at the DeepLabV3+ and SSD implementations; the code is well written, and beginners can use its structure as a reference.
4
Conclusion
Overall, Jittor has put real effort into the back end, while the front end keeps the same interface as PyTorch, making migration easy for developers.
However, deep learning frameworks are past their early days: the basic functional units of training frameworks are now much alike, and there is even a common exchange format, ONNX, so training frameworks may keep converging. PyTorch seems to be in the ascendant, TensorFlow is not to be outdone, and Paddle is spending heavily on promotion. On the deployment side, TensorRT, OpenVINO, and TVM are locked in a three-way battle. Let us know in the comments whether you think Jittor can survive.
In any case, domestic products need to become strong. Caffe, MXNet, and TVM were all created by Chinese developers, but within American universities and companies. We wish Jittor, PaddlePaddle, and Megvii's MegEngine, due for release a week later, success in growing big and strong, at home and even abroad.