Matthew Mayo

GitHub Python Data Science Spotlight: AutoML, NLP, Visualization, ML Workflows

Let’s take a look at some recently popular Python libraries for data science. Hopefully this article will help you in your work:

  1. Auto-Keras – Automated machine learning library

Project link: github.com/jhfjhfj1/au…

Documentation: autokeras.com

Getting started: autokeras.com/#example

Auto-Keras is an open-source software library for automated machine learning (AutoML). The ultimate goal of AutoML is to make deep learning models easily accessible to domain experts with limited data science or machine learning background. Auto-Keras provides functions that automatically search for the architecture and hyperparameters of deep learning models.
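To make the idea of searching over architectures and hyperparameters concrete, here is a toy sketch in plain Python. It is not Auto-Keras's API, and it swaps in simple random search for whatever search strategy the library actually uses. The search space, the `mock_validation_score` function, and all names below are hypothetical stand-ins for "train a model and measure how well it validates".

```python
import random

# Hypothetical search space: the names and values are illustrative only.
SEARCH_SPACE = {
    "num_layers": [1, 2, 3, 4],
    "units": [16, 32, 64, 128],
    "learning_rate": [0.1, 0.01, 0.001],
}


def mock_validation_score(config):
    """Made-up stand-in for training a model and scoring it on validation data.

    The score peaks at 1.0 for 2 layers, 64 units, and a learning rate of 0.01.
    """
    score = 1.0
    score -= 0.1 * abs(config["num_layers"] - 2)
    score -= 0.001 * abs(config["units"] - 64)
    score -= 0.05 if config["learning_rate"] != 0.01 else 0.0
    return score


def random_search(space, score_fn, n_trials, seed=0):
    """Sample n_trials random configurations and keep the best-scoring one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        # Draw one value per hyperparameter, independently and uniformly.
        config = {name: rng.choice(values) for name, values in space.items()}
        score = score_fn(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score
```

Calling `random_search(SEARCH_SPACE, mock_validation_score, n_trials=20)` returns the best configuration found and its score. An AutoML library automates this kind of loop, with real model training in place of the mock score.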

  1. Finetune – scikit-learn style model finetuning for natural language processing

Project link: github.com/IndicoDataS…

Documentation: finetune.indico.io

Getting started: finetune.indico.io

Finetune ships with a pre-trained language model from the paper “Improving Language Understanding by Generative Pre-Training” and builds on the OpenAI/finetune-language-model repository.

  1. GluonNLP – Natural language processing made easy

Project link: github.com/dmlc/gluon-…

Documentation: gluon-nlp.mxnet.io

Getting Started: github.com/dmlc/gluon-…

GluonNLP speeds up research in natural language processing by making it easier to process text, load data and build neural models.
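As a rough picture of the text-processing chores a toolkit like this takes off your hands, the sketch below tokenizes sentences, builds a vocabulary, and encodes text as integer indices. It uses only the Python standard library and is not GluonNLP's API. The function names and the `<unk>` token convention are assumptions made for illustration.

```python
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_vocab(sentences, min_freq=1):
    """Map each sufficiently frequent token to an integer index.

    Index 0 is reserved for the unknown-token placeholder "<unk>".
    """
    counts = Counter(tok for sentence in sentences for tok in tokenize(sentence))
    kept = sorted(tok for tok, count in counts.items() if count >= min_freq)
    return {tok: idx for idx, tok in enumerate(["<unk>"] + kept)}


def encode(text, vocab):
    """Convert text to a list of vocabulary indices, using 0 for unseen tokens."""
    return [vocab.get(tok, vocab["<unk>"]) for tok in tokenize(text)]
```

A neural model would consume the integer lists produced by `encode`. A dedicated NLP library typically adds better tokenizers, dataset loaders, and pre-built model components on top of these basics.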

  1. Animatplot – Python library for animated plots, built on Matplotlib

Project link: github.com/t-makaro/an…

Documentation: animatplot.readthedocs.io/en/latest

Getting started: animatplot.readthedocs.io/en/latest/t…

Note that the examples in the library’s documentation are relatively simple. The more full-featured and visually striking sample animations are the ones shown in the project’s GitHub repository.

  1. MLflow – Open source platform for the machine learning lifecycle

Project link: github.com/mlflow/mlfl…

Documentation: mlflow.org/docs/latest…

Getting Started: mlflow.org/docs/latest…

MLflow is an open source platform for managing the end-to-end machine learning lifecycle. The platform provides three main components:

  • MLflow Tracking: Records the parameters and results of machine learning experiments so that runs can be compared.
  • MLflow Projects: Packages machine learning code in a reusable, reproducible form for sharing with other data scientists or transferring to production.
  • MLflow Models: Manages models from various machine learning libraries and deploys them to a range of model serving and inference platforms.

MLflow exposes its functionality through a REST API and a CLI, so it does not depend on any single library and can be used with many machine learning libraries and programming languages. A Python API is also included for convenience.
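As a minimal sketch of what experiment tracking through the Python API can look like, the example below logs a parameter and a metric for a toy "training" run. It assumes MLflow is installed (`pip install mlflow`). The `train` and `run_tracked_experiment` functions are hypothetical names invented for this illustration; `mlflow.start_run`, `mlflow.log_param`, and `mlflow.log_metric` are the MLflow calls being demonstrated.

```python
def train(learning_rate, steps=50):
    """Toy 'training': gradient descent on f(w) = (w - 3)^2; returns the final loss."""
    w = 0.0
    for _ in range(steps):
        # The gradient of (w - 3)^2 with respect to w is 2 * (w - 3).
        w -= learning_rate * 2 * (w - 3)
    return (w - 3) ** 2


def run_tracked_experiment(learning_rate, steps=50):
    """Run the toy training once and record its inputs and result with MLflow."""
    # Imported here so that train() above stays usable without MLflow installed.
    import mlflow

    with mlflow.start_run():
        mlflow.log_param("learning_rate", learning_rate)
        mlflow.log_param("steps", steps)
        loss = train(learning_rate, steps)
        mlflow.log_metric("final_loss", loss)
    return loss
```

Calling `run_tracked_experiment` with a few different learning rates and then running `mlflow ui` from the same directory should let you compare the recorded runs in a browser.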