On November 28, 2019, at Flink Forward Asia, Alink, the machine learning algorithm platform, was announced as open source and drew the attention of many developers. In February, Alink 1.1.0 was released, adding support for Flink 1.10 and Flink 1.9. Recently, the Alink team followed up with the latest version, Alink 1.1.1, which adds new features and also enhances and improves some existing ones.

This article takes a closer look at the new features and fixes in Alink 1.1.1 and shares some usability tips for Alink 1.1.1.

Download Alink and star it on GitHub: github.com/alibaba/Ali…

Alink 1.1.1 Release Note Overview

Alink 1.1.1 Enhancements and New Features:

  • Validation and hints for data column parameters
  • Validation and hints for enumeration type parameters
  • Optimize the speed of data conversion between Alink batch components and Python DataFrame
  • Automatically detect localIp when using useRemoteEnv
  • New component to parse CSV, JSON, and KV strings into multiple columns
  • New component WindowGroupByStreamOp to simplify window grouping of streaming data
  • Tokenizer supports splitting strings on multiple spaces
  • Add an FTRL example

Alink 1.1.1 Fixes and Improvements:

  • Fix dill version conflicts
  • Fix HasVectorSize alias error
  • Fix MySqlSource error when using the collect method

Alink 1.1.1 Feature: github.com/alibaba/Ali…

Usability tips for Alink 1.1.1

When using Alink algorithms, we often encounter parameters of enumeration type. For example, the SelectorType parameter of ChiSqSelector can be NumTopFeatures, Percentile, FPR, and so on. Because it is an enumeration type, we may misremember the value when writing a script and type, say, “aaa”. The script code is as follows:

In previous versions of Alink, the following message was displayed:

SelectorType was given the invalid value AAA, but the exception is not clear and does not indicate which parameter was written incorrectly.

After the optimization in 1.1.1, the exception message states which parameter was filled in incorrectly and which values it may take.
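To make the idea concrete, here is a minimal Python sketch of this kind of validation. It is not Alink's actual implementation: the `SelectorType` enum below lists only the three values named in this article, and `parse_enum_param` and the wording of the message are hypothetical names chosen for illustration.

```python
from enum import Enum


class SelectorType(Enum):
    # Only the values named in the article; the real parameter may accept more.
    NumTopFeatures = "NumTopFeatures"
    Percentile = "Percentile"
    FPR = "FPR"


def parse_enum_param(param_name, value, enum_cls):
    """Return the enum member matching `value` (case-insensitive).

    On failure, raise an error that names the parameter and lists
    the values it accepts. Hypothetical helper, not an Alink API.
    """
    by_upper = {member.name.upper(): member for member in enum_cls}
    key = value.strip().upper()
    if key in by_upper:
        return by_upper[key]
    options = ", ".join(member.name for member in enum_cls)
    raise ValueError(
        f"Invalid value '{value}' for parameter {param_name}. "
        f"Possible values: {options}"
    )


if __name__ == "__main__":
    print(parse_enum_param("selectorType", "fpr", SelectorType))
    try:
        parse_enum_param("selectorType", "aaa", SelectorType)
    except ValueError as err:
        print(err)
```

The point of the design is that the error carries both the parameter name and the candidate values, so the user does not have to look up the documentation to fix the script.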

If you are writing Java in an editor, we recommend using the methods that take enumeration types as arguments, since the editor then automatically prompts you with the valid values to select from.

A similar situation often arises when we use algorithm components. An algorithm has some column name parameters, and we may mistype a column name, as shown in the figure below. Text1 is the text column name.

In version 1.1.1, the exception not only reports which column does not exist, but also suggests the most likely column name to help the user decide.
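A hint like this can be produced with ordinary fuzzy string matching. The sketch below uses Python's standard `difflib` module; it only illustrates the technique and is not how Alink implements it. The function name `check_column`, the message wording, and the sample column names are assumptions made for the example.

```python
import difflib


def check_column(col_name, available_cols):
    """Return `col_name` if it exists, otherwise raise with a suggestion.

    Hypothetical helper: reports the missing column and, when a
    sufficiently similar name exists, proposes it as the likely intent.
    """
    if col_name in available_cols:
        return col_name
    message = f"Column '{col_name}' does not exist."
    # get_close_matches returns the best candidates above a similarity cutoff.
    matches = difflib.get_close_matches(col_name, available_cols, n=1)
    if matches:
        message += f" Did you mean '{matches[0]}'?"
    raise ValueError(message)


if __name__ == "__main__":
    columns = ["id", "text", "label"]  # sample schema, for illustration only
    try:
        check_column("text1", columns)
    except ValueError as err:
        print(err)
```

When no candidate is close enough, the message simply reports the missing column without a suggestion, which avoids misleading hints.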