Theano

Theano is the elder statesman of deep learning frameworks. Its development began in 2007, and its early developers include Yoshua Bengio and Ian Goodfellow.

Theano is a Python library that specializes in operations on multidimensional arrays (similar to NumPy in this respect), and when combined with other deep learning libraries it is well suited to data exploration. It is designed to carry out the large-scale computations that deep neural network algorithms require. In fact, it is best understood as a compiler for mathematical expressions: you define the result you want in a symbolic language, and the framework compiles your program to run efficiently on the GPU or CPU.

Theano is so similar to the later TensorFlow (or rather, TensorFlow is so similar to Theano) that the two are often compared. Both are low-level, and Theano is more a research platform than a deep learning library: you have to do a great deal of work from the bottom up to build the models you need. Theano, for example, provides no ready-made neural network layers.

Over the years, however, a number of open-source deep learning libraries have been built on top of Theano, including Keras, Lasagne, and Blocks. These higher-level wrapper APIs can significantly reduce development time and hassle. In fact, as far as Lei Feng knows, few developers use bare Theano; most rely on one of these auxiliary APIs. Theano is best treated as a whole ecosystem — don't run it bare and then complain that it doesn't work.

Theano was the industry standard for deep learning development and research for a long time. Because of its academic origins, it was originally designed for research, and many scholars in the field still use it today. But as TensorFlow rose to prominence with Google's backing, Theano faded and its user base shrank. A landmark event in that decline was when Ian Goodfellow, one of its early developers, left Theano behind and went to Google to work on TensorFlow.

As a result, experienced developers tend to see no harm in beginners starting out with Theano; for professional development, however, TensorFlow is recommended.

Advantages:

  • Python + NumPy

  • Uses computational graphs

  • RNNs fit the computational-graph model well

  • There are high-level libraries like Keras and Lasagne

  • Many developers report a lower learning barrier than TensorFlow

Disadvantages:

  • It’s very low level

  • More bloated than Torch

  • No distributed support

  • Some error messages are useless

  • Large models sometimes take a long time to compile

  • Insufficient support for pre-trained models

  • Fewer and fewer people use it

Caffe

This is another veteran deep learning framework, around since 2013.

Its full name is “Convolutional Architecture for Fast Feature Embedding”, which clearly reflects its purpose. Caffe was created by Yangqing Jia, then a Chinese PhD student at the University of California, Berkeley, doing research at the Berkeley Vision and Learning Center. After his PhD, he worked at Google and later Facebook.

Caffe is well known among AI developers: it is second only to TensorFlow in GitHub’s latest popularity rankings of machine learning projects. It is a widely used machine-vision library that brought Matlab’s approach to fast convolutional networks to C and C++. Although some developers treat Caffe as a general-purpose framework, it was designed for computer vision and is not ideal for other deep learning applications, such as text and speech recognition or time-series data.

Caffe’s main application is image classification with convolutional neural networks, where it represents the state of the industry and is a first choice for developers.

When it comes to Caffe, we have to mention the Model Zoo, a collection of pre-trained models built with Caffe. The biggest benefit for developers is that they can pick from the Model Zoo’s large selection of pre-trained neural networks, download one directly, and use it immediately.

As far as Lei Feng knows, many of these models are world class, and there are plenty of tutorials on them:

  • Alex’s CIFAR-10 tutorial with Caffe

  • Training LeNet on MNIST with Caffe

  • ImageNet with Caffe

Industry insiders generally agree that Caffe is suitable for industrial applications where the primary goal is to implement basic algorithms and facilitate rapid development. For more specific tasks, however, it suffers from a lack of flexibility — making changes to a model often requires C++ and CUDA, although minor adjustments can be made through the Python and Matlab interfaces.

Advantages:

  • Ideal for feedforward neural networks and image processing tasks

  • Well suited to working with existing neural networks

  • You can train models without writing code

  • Good Python interface

Disadvantages:

  • Writing new GPU layers requires C++ and CUDA

  • Does not perform well on recurrent neural networks

  • Cumbersome for large neural networks (GoogLeNet, ResNet)

  • No commercial support

Torch

Torch is an outlier compared to other open source frameworks.

That’s right: it is based not on Python, the language most widely used in machine learning, but on Lua, a language born in Brazil in the 1990s. Both are relatively easy to pick up, but Python has clearly come to dominate machine learning, especially in academia, while software engineers in the corporate world are most familiar with Java; Lua remains relatively obscure. This has made Torch hard to promote, so Torch, while powerful, has never reached the mass of developers.

So what makes it so powerful?

  • First, Torch is well suited to convolutional neural networks. Its developers say Torch’s native interface is more natural and comfortable to use than other frameworks’.

  • Second, third-party extension toolkits provide a rich set of recurrent neural network (RNN) models.

Because of these strengths, many Internet giants have developed custom versions of Torch for their AI research, including Facebook, Twitter, and DeepMind (before its acquisition by Google).

Defining a new layer in Torch is easier than in Caffe because you don’t need to write C++ code. Compared with TensorFlow and Theano, Torch is also more flexible, because it is imperative while the other two are declarative: with them, you must declare a computational graph up front. This makes operations such as beam search much easier in Torch than in either of those frameworks.

Torch’s hottest application: reinforcement learning, using convolutional neural networks and agents for image problems.

For developers whose primary interest is reinforcement learning, Torch is the preferred choice.

Advantages:

  • High degree of flexibility

  • Highly modular

  • Easy to write your own layers

  • Many pre-trained models available

Disadvantages:

  • Need to learn Lua

  • You usually need to write your own training code

  • Not well suited to recurrent neural networks

  • No commercial support

Scikit-learn

Scikit-learn is an open-source Python machine learning framework that David Cournapeau began developing in 2007 as a Google Summer of Code project.

It is a concise, efficient algorithm library that provides a range of supervised and unsupervised learning algorithms for data mining and data analysis. Scikit-learn covers almost all the major algorithms of machine learning, making it a fixture in the Python open source world.

It is built on SciPy (Scientific Python), which must be installed before you can use scikit-learn. Its stack includes:

  • NumPy: Basic multidimensional array package

  • SciPy: A basic library for scientific computing

  • Matplotlib: Comprehensive 2D/3D plotting

  • IPython: Enhanced interactive console

  • SymPy: Symbolic mathematics

  • Pandas: Data structure and analysis

How it got its name: extension modules for SciPy have traditionally been named SciKits, and the module providing learning algorithms became scikit-learn.

The main difference between it and TensorFlow, the other big algorithm framework in the Python world, is that TensorFlow is much lower-level. Scikit-learn instead provides modular, off-the-shelf implementations of machine learning algorithms, many of which can be used directly.
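That modularity shows in scikit-learn’s uniform estimator API: every algorithm exposes the same `fit`/`predict` interface, so swapping one model for another is a one-line change. A minimal sketch on the built-in iris dataset:

```python
# Minimal scikit-learn sketch: the uniform fit/predict estimator API.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)           # train the model
print(clf.predict(X_test[:3]))      # predict classes for new samples
print(clf.score(X_test, y_test))    # mean accuracy on the held-out set
```

Replacing `RandomForestClassifier` with, say, `sklearn.svm.SVC` requires changing only the constructor line; the rest of the code stays the same.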

Advantages:

  • Curated, high-quality algorithms

  • Covers most machine learning tasks

  • Scalable to large data sizes

  • Simple to use

Disadvantages:

  • Low flexibility

MXNet

When it comes to open-source frameworks from academia, MXNet has to be mentioned. But because Amazon has adopted it as its official platform, it was already covered in our last roundup of seven open-source machine learning projects from Google, Microsoft, OpenAI, and others; interested readers can refer to that article.

Source: Lei Feng