In 2016 it seemed that every big company was either doing machine learning or on its way to doing it, so 2016 may go down in history as the year of the machine learning boom. In 2017 the craze shows no sign of abating. Recently, the upgraded version of AlphaGo swept past almost all of China’s top Go players unopposed, finishing with 60 straight wins, and enthusiasm for machine learning was rekindled.
With machine learning this popular, you may be looking for resources to learn from and experiment with. This article compiles a list of excellent open source frameworks, platforms, systems, libraries, and toolkits for machine learning.
Platforms and systems
- TensorFlow – Google’s second-generation machine learning system, with extended built-in support for deep learning. Any computation that can be expressed as a data flow graph can be run with TensorFlow
- PaddlePaddle — A deep learning platform developed by Baidu that is easy to use, efficient, flexible and scalable. PaddlePaddle supports deep learning algorithms for multiple products within Baidu
- Apache SINGA – A general distributed deep learning platform for training large deep learning models on large datasets. SINGA supports a variety of popular deep learning models
- Scikit Flow – A simplified interface to TensorFlow that mimics scikit-learn, so users can apply it to predictive analytics and data mining
- VELES – A distributed deep learning application platform developed by Samsung and positioned as an alternative to TensorFlow. Users only need to provide the parameters, and VELES handles the rest
- SpeeDO — Parallel deep learning system for general-purpose hardware. SpeeDO requires no special I/O hardware, supports CPU/GPU clusters, and can be easily deployed on a variety of cloud environments such as AWS, Google GCE, Microsoft Azure, and more
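The TensorFlow entry above describes computation expressed as a flow graph. As a rough illustration of that idea only, the sketch below builds a tiny graph in plain Python and then evaluates it. It does not use TensorFlow's API, and the names `Node`, `constant`, `add`, `mul`, and `evaluate` are hypothetical, chosen for this example.

```python
# Toy data flow graph in plain Python (an illustration, not TensorFlow's API).
# Node, constant, add, mul and evaluate are hypothetical names.

class Node:
    """One vertex of the graph: an operation plus the nodes feeding it."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op
        self.inputs = tuple(inputs)
        self.value = value  # only used by "const" nodes


def constant(value):
    return Node("const", value=value)


def add(a, b):
    return Node("add", (a, b))


def mul(a, b):
    return Node("mul", (a, b))


def evaluate(node, cache=None):
    """Evaluate a node by first evaluating the nodes it depends on."""
    if cache is None:
        cache = {}
    if id(node) in cache:
        return cache[id(node)]
    if node.op == "const":
        result = node.value
    else:
        args = [evaluate(n, cache) for n in node.inputs]
        if node.op == "add":
            result = args[0] + args[1]
        elif node.op == "mul":
            result = args[0] * args[1]
        else:
            raise ValueError("unknown op: " + node.op)
    cache[id(node)] = result
    return result


# Describe the computation (2 + 3) * 4 as a graph first, then run it.
x = constant(2.0)
y = constant(3.0)
z = mul(add(x, y), constant(4.0))
print(evaluate(z))
```

Separating the description of a computation from its execution is what lets a real system decide later where and how each node runs, for example on a CPU, a GPU, or across a cluster.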
Frameworks
- Torchnet – Facebook’s open source deep learning framework, intended to speed up AI research
- LightGBM – Microsoft’s open source framework that implements GBDT (gradient boosting decision tree) algorithms and supports efficient parallel training. It aims to solve the problems GBDT runs into on massive datasets, so that GBDT can be applied better and faster in industrial practice
- Guagua – A subproject of Shifu, PayPal’s open source machine learning framework, that handles the distributed side of model training
- Chainer – A flexible deep learning framework that bridges the gap between the theoretical algorithms of deep learning and their practical application
- Shifu — A fast and scalable machine learning framework based on Hadoop
- KeystoneML – A framework written in Scala and built on Apache Spark, designed to simplify the construction of large-scale, end-to-end machine learning pipelines
- LightNet – A lightweight, versatile, MATLAB-based deep learning framework. It aims to provide an easy-to-understand, easy-to-use, and efficient computing platform for deep learning research
- DeepLearningKit — Open source deep learning framework for iOS, OS X, and tvOS
- GoLearn — GoLearn is a machine learning framework for Go
- YCML — Machine learning framework written in Objective-C that also supports Swift
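The LightGBM entry above refers to GBDT. For readers unfamiliar with the technique, here is a minimal sketch of gradient boosting with one-split trees (stumps) on squared error, written in plain Python. It is an illustration under simplifying assumptions (a single numeric feature, no regularization) and makes no attempt to reproduce LightGBM's implementation or API. The names `fit_stump`, `fit_gbdt`, `predict`, and `mse` are hypothetical.

```python
# Minimal gradient boosting with depth-1 trees (stumps) on squared error.
# Illustration only: all function names are hypothetical, and none of
# LightGBM's optimizations or APIs are used here.

def fit_stump(xs, residuals):
    """Pick the single threshold split that minimizes squared error."""
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not left or not right:
            continue
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:
        # No valid split (all x values equal): predict the mean everywhere.
        mean = sum(residuals) / len(residuals)
        return (xs[0], mean, mean)
    return best[1:]


def predict(model, x):
    base, learning_rate, stumps = model
    total = base
    for threshold, left_value, right_value in stumps:
        total += learning_rate * (left_value if x <= threshold else right_value)
    return total


def fit_gbdt(xs, ys, rounds=50, learning_rate=0.3):
    base = sum(ys) / len(ys)
    stumps = []
    model = (base, learning_rate, stumps)
    for _ in range(rounds):
        # For squared error, the negative gradient is simply the residual.
        residuals = [y - predict(model, x) for x, y in zip(xs, ys)]
        stumps.append(fit_stump(xs, residuals))
    return model


def mse(model, xs, ys):
    return sum((y - predict(model, x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]
baseline = (sum(ys) / len(ys), 0.3, [])   # predicts the mean, no trees
model = fit_gbdt(xs, ys)
print(mse(baseline, xs, ys), mse(model, xs, ys))
```

Each round fits a small tree to the errors left by the rounds before it, which is why the training error of the boosted model falls well below that of the mean-only baseline.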
Toolkits and libraries
- DMTK – Microsoft’s open source distributed machine learning toolkit, which includes the DMTK distributed machine learning framework, LightLDA for training topic models, and distributed word embedding
- CNTK – Microsoft’s open source deep learning toolkit, used for speech recognition, which runs very efficiently on GPUs
- DSSTNE – Amazon’s open source deep learning tool for intelligent search and recommendations, which can run on two graphics processors (GPUs) at the same time
- Scikit-learn – A Python machine learning library: a concise, efficient collection of supervised and unsupervised learning algorithms for data mining and data analysis, covering most of the mainstream machine learning algorithms
- Deeplearning4j – The first commercial-grade, open source, distributed deep learning library written for Java and Scala. Designed for business environments, it is meant to be plug-and-play, favoring convention over configuration so that non-researchers can prototype quickly
- CaffeOnSpark – Yahoo’s open source distributed deep learning package, based on Hadoop/Spark
- BigDL – Intel’s open source distributed deep learning library, based on Apache Spark, with support for high-performance big data analysis
- Swift AI – A high-performance AI and machine learning library written entirely in Swift. It currently supports iOS and OS X and includes a set of general-purpose AI and machine learning tools
- Gorgonia – A Go machine learning library for writing and evaluating mathematical equations involving multidimensional arrays. Like Theano and TensorFlow, it supports GPU/CUDA and distributed computing
- Shark C++ — a fast, modular, feature-rich open source C++ machine learning library that provides a variety of machine learning related techniques such as linear/nonlinear optimization, kernel-based learning algorithms, neural networks, and more
- MLPACK – A C++ machine learning library that emphasizes scalability, speed, and ease of use. It is designed to let new users apply machine learning through simple, consistent APIs, while giving expert users the high performance and flexibility of C++
- Smile – A Java library containing a variety of established machine learning algorithms, along with extras such as adjacency-list and matrix graph algorithms and a Swing-based visualization library
- PredictionIO – An open source machine learning server that developers and data analysts can use to build intelligent applications with predictive features such as personalized recommendations and content discovery
- Aerosolve — Machine learning engine that powers Airbnb’s pricing advice system
- Vowpal Wabbit – A machine learning system that pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning
- Apache SystemML – A flexible, scalable machine learning (ML) language written in Java. It optimizes automatically according to data and cluster characteristics to ensure efficiency and scalability, and it can run on MapReduce or Spark
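The Vowpal Wabbit entry above mentions online learning and hashing. The sketch below shows roughly how the two ideas fit together: feature names are hashed into a fixed-size weight table, and the weights are updated one example at a time. It is written in plain Python, does not use Vowpal Wabbit or its API, and the names `slot`, `predict`, and `update`, the table size, and the toy data are all assumptions made for this example.

```python
import zlib

# Feature hashing plus online (one example at a time) learning.
# Illustration only: this is not Vowpal Wabbit's code or API, and the
# names, table size and toy data below are assumptions for the example.

NUM_BITS = 18                 # size of the weight table, chosen for this sketch
NUM_SLOTS = 1 << NUM_BITS


def slot(feature):
    """Map a feature name to a fixed slot; crc32 is stable across runs."""
    return zlib.crc32(feature.encode("utf-8")) % NUM_SLOTS


def predict(weights, features):
    return sum(weights[slot(f)] for f in features)


def update(weights, features, label, learning_rate=0.1):
    """One online step of squared-error gradient descent on one example."""
    error = predict(weights, features) - label
    for f in features:
        weights[slot(f)] -= learning_rate * error
    return error


weights = [0.0] * NUM_SLOTS
stream = [
    (["word=great", "word=movie"], 1.0),
    (["word=awful", "word=movie"], -1.0),
] * 200

# The model sees each example once as it arrives and never stores the stream.
for features, label in stream:
    update(weights, features, label)

print(predict(weights, ["word=great", "word=movie"]))
print(predict(weights, ["word=awful", "word=movie"]))
```

Because features are hashed rather than looked up in a dictionary, memory use stays fixed no matter how many distinct feature names appear, at the cost of occasional collisions between features that land in the same slot.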
Original link: my.oschina.net/editorial-s… Editor in charge: Open Source China – Director. Reposts must credit the source in the body text and keep the original link and author information