MLDemos introduction

MLDemos is an open source visualization tool for machine learning algorithms, designed to help study and understand how several algorithms function and how their parameters affect and modify the results in problems of classification, regression, clustering, dimensionality reduction, dynamical systems and reinforcement learning (reward maximization).

MLDemos is open source and free to use for personal and academic purposes.

Created by Dr. Basilio Noris at the Learning Algorithms and Systems Lab (LASA), the program was developed with the support of several entities, organizations and groups.



Installation

Binary package:

MLDemos 0.5.2 for Windows

Minimum requirements: XP SP3

MLDemos 0.5.2 for Mac

Minimum requirements: Snow Leopard

MLDemos 0.3.2 CDE (Linux)

Minimum requirements: kernel 2.6.X. Thanks to Philip Guo (website)

Licensing

These packages contain binary versions of many open source libraries. I include them here knowing that this may not be fully compliant with the distribution policy of each respective library. I will try, to the extent possible, to contact the relevant parties and obtain the necessary permissions. In the meantime, I distribute this software in good faith, with the goal of enabling people to study and use the different methods implemented here. See the acknowledgements section below for a list of contributors.

You may use this software for personal and educational purposes; you may not use it for commercial purposes. You may redistribute the software as long as you provide a link to this page. This page will always link to the latest version of the software, so it is best to obtain it from here.

The source code:

The MLDemos source code can be obtained directly from Git or from the public repository (get the latest version from the devel branch):

 git clone git://gitorious.org/mldemos/mldemos.git -b devel

public GitHub repository

Source Backup (0.3.0)

Prerequisites

This code requires Qt (5.10) and, partially, OpenCV (3.1) and Boost (1.47). Earlier versions of these libraries may also work, as may newer ones. Be sure to adjust the include and lib paths to point to the correct directories.
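
If OpenCV and Boost live in non-standard locations, one way to point the build at them (without editing the project files) is to pass the extra paths to qmake when generating the Makefiles. This is only a sketch; the paths below are placeholders for wherever the libraries are installed on your machine:

 qmake "INCLUDEPATH += /opt/opencv/include /opt/boost/include" \
       "LIBS += -L/opt/opencv/lib -L/opt/boost/lib"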

The software was compiled and tested using Qt Creator 2.1 and 2.6 on macOS High Sierra, Windows 10, Gentoo, Ubuntu and Kubuntu 10.04.

  • Windows

To compile MLDemos on Windows, you need MinGW (the Qt SDK installation usually ships with MinGW).
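
As a rough outline (assuming qmake and the MinGW tools are on the PATH, e.g. from the Qt command prompt), the build amounts to generating Makefiles with qmake and compiling with MinGW's make; the project file name may differ in your checkout:

 cd mldemos
 qmake mldemos.pro   # or whatever the top-level project file is called
 mingw32-make        # compile with MinGW's make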

  • Debian

Professor Barak A. Pearlmutter has created a Debian package that will be available soon. In the meantime, you can build it with the following instructions:

 git clone git://github.com/barak/mldemos.git
 cd mldemos
 git checkout debian
 dpkg-checkbuilddeps
 fakeroot debian/rules binary
 sudo dpkg --install ../mldemos_*.deb

Note: OpenCV 2.4 is not directly available (only 2.1), so you will also need to build OpenCV 2.4. This is only needed for MLP and Boosting, but these are two important algorithms, so it is worth the effort:

 git clone git://github.com/barak/opencv.git
 cd opencv
 git checkout master
 dpkg-checkbuilddeps
 fakeroot debian/rules binary
 sudo dpkg --install ../*opencv*.deb

Again, thank you Barak!

Known bugs

  • Windows: clearing the canvas while in the 3D view leaves some memory allocated, which may accumulate over repeated clears (Windows-only memory bug)
  • Linux (CDE package): loading and saving files external to the package does not work
  • KNN classification using approximate KNN generates strange empty areas with certain distance metrics on some machines
  • Resizing the canvas while the reward map is being drawn does not update the underlying data (avoid doing so)
  • In Boosting, changing the data does not recompute the weak learners, which can give poor results if the new data significantly changes the boundaries

New features

Changelog

v0.5.0

New visualization and data set capabilities

  • Added 3D visualization of samples and of classification, regression and maximization results
  • Added a visualization panel with individual plots, correlations, densities, etc.
  • Added editing tools to drag/magnetize data, change classes, and increase or decrease the size of the dataset
  • Added categorical dimensions (dimensions indexed by non-numerical values)
  • Added a "Dataset Editing" panel to swap, delete and rename dimensions, classes or categorical values
  • Several bug fixes for display, data import/export, and classification performance

New algorithms and methods

  • Added “Grid Search” panel to batch test value ranges of up to two parameters at a time

  • Added one-vs-all multi-class classification for algorithms that are not inherently multi-class

  • Added training and testing on different data (train on one dataset, test on another)

  • Added Automatic Relevance Determination (ARD) for the SVM RBF kernel (thanks Ashwini Shukla!)

  • Added Growing Hierarchical Self-Organizing Maps (original code by Michael Dittenbach)

  • Added Random Forest classification

  • Added LDA as a classifier (in addition to its use as a projector)

  • Added save/load model options for GMM and SVM

Software screenshots

Algorithms

Implemented methods

Classification

  • Support Vector Machines (SVM) (C, nu, Pegasos)
  • Relevance Vector Machines (RVM)
  • Gaussian Mixture Models (GMM)
  • Multi-Layer Perceptrons + Back-Propagation
  • Gentle AdaBoost + Naive Bayes
  • K-Nearest Neighbors (KNN)
  • Gaussian Process Classification (GP)
  • Random Forests

Regression

  • Support vector regression (SVR)
  • Relevance Vector Regression (RVR)
  • Gaussian Mixture Regression (GMR)
  • MLP + BackProp
  • Approximate KNN
  • Gaussian Process regression (GPR)
  • Sparse Optimized Gaussian Process (SOGP)
  • Locally Weighted Scatterplot smoothing (LOWESS)
  • Locally weighted projection regression (LWPR)

Power system

  • GMM + GMR
  • LWPR
  • SVR
  • SEDS
  • SOGP (Slow!)
  • MLP
  • KNN
  • Augmented SVM (ASVM)

Clustering

  • K-Means
  • Soft K-Means
  • Kernel K-Means
  • K-Means++
  • GMM
  • One Class SVM
  • FLAME
  • DBSCAN

Projections

  • Principal Component Analysis (PCA)
  • Kernel PCA
  • Independent Component Analysis (ICA)
  • Canonical correlation analysis (CCA)
  • Linear discriminant analysis (LDA)
  • Fisher Linear Discriminant
  • EigenFaces to 2D (using PCA)

Maximization of rewards (Reinforcement learning)

  • Random search
  • Random walk
  • PoWER
  • Genetic Algorithm (GA)
  • Particle Swarm Optimization (PSO)
  • Particle filter
  • Donut
  • Gradient-free methods (NLOpt)

Contributing

If you are developing a new algorithm that fits the MLDemos framework and would like to integrate it into the software, please contact us (see information below) to discuss what kind of help you might need to implement an MLDemos plugin.

Acknowledgements

This program would not exist without the considerable effort many people have put into implementing the different algorithms that are combined here into a single piece of software.

  • Florent D’Hallouin (GMM + GMR) – LASA
  • Dan Grollman (SOGP) – LASA
  • Mohammad Khansari (SEDS + DSAvoid) – LASA
  • Ashwini Shukla (ASVM, ARD Kernels) – LASA
  • Stephane Magnenat (ESMLR) – website
  • Chih-Chung Chang and Chih-Jen Lin (libSVM) – website
  • David Mount and Sunil Arya (ANN library) – website
  • Davis E. King (DLIB) – website
  • Stefan Klanke and Sethu Vijayakumar (LWPR) – website
  • Robert Davies (Newmat) – website
  • JF Cardoso (ICA) – website
  • Steven G. Johnson (NLOpt) – website
  • The WillowGarage crowd (OpenCV) – website
  • Trolltech/Nokia/Digia (Qt) – website
  • The authors of several of the icons – website
  • PhD students attending the EPFL 2012 ML course (Julien Eberle, Pierre-Antoine Sondag, Guillaume de Chambrier, Klas Kronander, Renaud Richardet, Raphael Ullman)

Furthermore, the program itself would be much less capable without the support of LASA and the work of the development team: Christophe Paccolat, Nicolas Sommer and Otpal Vittoz.

Thanks also to those who contributed without writing code: Aude Billard and Francois Fleuret, as some of the best bosses one could hope for and for a series of fruitful discussions, and the AML 2010 and 2011 classes, who patiently gave the software its first test drive.

Quick start

  • Start the software and draw samples by clicking the left or right mouse button; right-clicking generates samples of the class currently selected in the toolbar (default: 1)
  • Select the "Show Options" icon to display model information and confidence/likelihood maps, or to hide the original samples
  • Use the mouse wheel to zoom in and out
  • Alt + drag pans the space

  • Select the Algorithm Options icon
  • Select an algorithm icon to open its respective options panel
  • Click the Classify button to run the algorithm against the current data

Import data

There are three ways to get data into MLDemos: drawing samples manually, projecting image data through PCA (via the PCAFaces plugin), or loading external data. You can drag and drop comma-separated-value files or other text-based tables of values onto the interface. A data loading dialog then appears, letting you choose which columns or rows should be loaded and which should be interpreted as class labels or headers, and so on.
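
For example, a small hypothetical CSV file such as the following can be dropped onto the canvas; the loading dialog then lets you mark the first row as a header and the last column as the class labels:

 x1,x2,x3,label
 0.10,0.11,0.12,0
 0.14,0.91,0.11,0
 0.43,0.74,0.41,1
 0.28,0.34,0.33,1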

Alternatively, the native data format used by the software is ASCII-based and contains:

  1. The number of samples, followed by the number of dimensions
  2. For each sample, one row containing:
    1. the sample values, separated by spaces (floating point, one per dimension)
    2. the sample's class index (integer 0...255)
    3. a flag value terminating the line (integer 0-3, not currently used)

A simple example:

 4 3
 0.10 0.11 0.12 0 0
 0.14 0.91 0.11 0 0
 0.43 0.74 0.41 1 0
 0.28 0.34 0.33 1 0

This gives four three-dimensional samples, two from class 0 and two from class 1.

When a file is saved from MLDemos, the software appends the current algorithm parameters (if an algorithm is selected), which can be useful for demonstration purposes. If no such information is present, default algorithm parameters are used.

Drawing some samples by hand, or importing a standard dataset and saving it from MLDemos should give you plenty of examples of file syntax.
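
If you already have data in a comma-separated file (feature columns followed by an integer class label, no header row), a small script along the following lines could convert it into the native format described above. This is only a sketch; the file names data.csv and data.ml are placeholders:

 awk -F',' '
   { n++; d = NF - 1; rows[n] = $0 }          # count samples and remember each row
   END {
     print n, d                               # header line: #samples #dimensions
     for (i = 1; i <= n; i++) {
       split(rows[i], f, ",")
       line = ""
       for (j = 1; j <= d; j++) line = line f[j] " "
       print line f[d+1], 0                   # sample values, class index, flag (unused)
     }
   }' data.csv > data.ml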

Website: mldemos.b4silio.com/