On August 22, Apache Flink 1.9.0 was officially released. Back in January of this year, Alibaba announced that it would open source Blink, the big data processing engine it had polished internally over the past few years, and contribute the code to Apache Flink. Flink 1.9.0 is the first release since Alibaba's internal Blink was merged into Flink, with 1.5 million lines of code modified. The release not only brings major architectural changes but also makes the functionality and features more powerful and complete. This article introduces the major changes and new features of Flink 1.9.0.

Let us briefly review the key points of the Alibaba Blink open source effort:

  • The open-sourced Blink code consists of a large amount of new functionality, performance optimizations, stability improvements, and other core code that Alibaba accumulated for stream computing and batch processing, built on the open source Flink engine and driven by the group's internal business.
  • Blink was open sourced as a branch; that is, it became a branch of the Apache Flink project.
  • Blink's goal is not to become another active project, but to make Flink better. By open sourcing it, everyone can see all the details of the Blink implementation, the efficiency of merging Blink features into Flink improves, and we can cooperate with the community more effectively.

Half a year later, with the release of Flink 1.9.0, we are proud to announce that the Blink team has fulfilled its promise! Although not all features have been merged back into the community yet, we have been working closely with the community and have made great strides toward the original vision for Flink.

Here are a few comparisons of Flink version 1.9.0 with previous releases:

                           1.7.0        1.8.0        1.9.0
Issues resolved            428          422          977
Code commits               969          1,094        1,964
Lines of code changed      260,000      230,000      1,500,000
Contributors               112          140          190



  • In terms of the number of issues resolved and the number of code commits, 1.9.0 already matches or exceeds the previous two versions combined.
  • In terms of lines of code changed, the figure is a staggering 1.5 million. Even accounting for module refactoring and the Blink merge, there is no denying that 1.9.0 is the most active release in Flink's history.
  • Flink has also attracted more and more contributors, many of them users and developers from China, and the community has responded by creating a Chinese-language mailing list.

The changes in 1.9.0 that produced such a large volume of work are explained below.

Architecture upgrade

A change of this size in a system almost always comes with an architectural upgrade, and this release is no exception: Flink has taken a big step toward unifying stream and batch processing. First, let's look at the architecture of previous Flink versions:



This architecture will be familiar to readers who know Flink. Simply put, on top of its distributed streaming execution engine, Flink has two relatively independent APIs, DataStream and DataSet, to describe streaming and batch jobs, respectively. On top of these two APIs, a unified stream-batch API is provided, namely the Table API and SQL. You can use the same Table API program or SQL to describe either a streaming or a batch job, but at submission time you need to tell the Flink engine whether to run it as a stream or as a batch. The optimizer in the Table layer then translates the program into a DataStream or DataSet job.

But if we look at the implementation details underneath DataStream and DataSet, the two APIs actually do not share much. They have separate translation and optimization processes, and at runtime they use completely different task implementations. Such inconsistency is a problem for both users and developers.
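To make the duplication concrete, here is a minimal, hypothetical sketch (not taken from the release itself) of the same word-count logic written twice against the two separate APIs; the class name and the sample data are made up for illustration.

    import org.apache.flink.api.common.typeinfo.Types;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class TwoStacksExample {
        public static void main(String[] args) throws Exception {
            // Batch version: goes through the DataSet translation and optimization stack.
            ExecutionEnvironment batchEnv = ExecutionEnvironment.getExecutionEnvironment();
            DataSet<String> batchWords = batchEnv.fromElements("flink", "blink", "flink");
            batchWords
                .map(w -> Tuple2.of(w, 1))
                .returns(Types.TUPLE(Types.STRING, Types.INT))
                .groupBy(0)
                .sum(1)
                .print();

            // Streaming version: the same logic, translated by a completely different stack.
            StreamExecutionEnvironment streamEnv = StreamExecutionEnvironment.getExecutionEnvironment();
            DataStream<String> streamWords = streamEnv.fromElements("flink", "blink", "flink");
            streamWords
                .map(w -> Tuple2.of(w, 1))
                .returns(Types.TUPLE(Types.STRING, Types.INT))
                .keyBy(0)
                .sum(1)
                .print();
            streamEnv.execute("streaming word count");
        }
    }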

From the user's point of view, when writing a job they have to choose between two APIs that not only have different semantics but also support different connectors, which causes confusion. Although the Table API is unified, its underlying implementation is still based on DataStream and DataSet, so it is also affected by this inconsistency.

From a developer's point of view, the two stacks are relatively independent, making it difficult to reuse code. When developing new features, we often need to implement similar functionality twice, and the development path for each API is long, usually requiring end-to-end changes, which greatly reduces development efficiency. If two independent technology stacks coexist for a long time, it not only wastes manpower but may eventually slow down the development of Flink as a whole.

Building on some initial exploration in Blink, we had close discussions with developers in the community and finally settled on the path for Flink's future technical architecture.



In future versions, Flink will abandon the DataSet API. The user-facing APIs will consist mainly of the DataStream API, which describes the physical execution plan, and the Table API & SQL, which describe jobs relationally. The DataStream API offers more of a "what you see is what you get" experience: the user describes the relationships between operators, and the engine does not interfere or optimize too much. The Table API & SQL continue in their current style, providing a relational expression API in which the engine optimizes based on the user's intent and selects the optimal execution plan. Notably, both APIs will provide streaming and batch capabilities in the future, and both will share the same technology stack at the implementation level, for example a common DAG data structure to describe jobs, a common StreamOperator to write operator logic, and a common streaming distributed execution engine.

Table API & SQL

When Blink was open sourced, its Table module already used the new architecture that Flink envisions for the future. As a result, the Table module in Flink 1.9 was the first to benefit from the architecture change. However, to minimize the impact on the user experience of previous versions, we needed a way for both architectures to coexist.

To this end, community developers made a series of efforts, including FLIP-32 (a Flink Improvement Proposal to restructure the Table module), sorting out the dependencies between the Java and Scala APIs, and proposing a Planner interface that supports multiple Planner implementations. The Planner is responsible for optimizing a Table job and translating it into an execution graph, so we could move the original implementation into the Flink Planner and put the code for the new architecture into the Blink Planner.



(The Query Processor shown in the diagram is the implementation of the Planner.)

This approach kills two birds with one stone. Not only does it make the Table module clearer after being split, but more importantly, it does not affect the user experience of the old version.
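For example, starting with 1.9 the planner is chosen when the TableEnvironment is created. The following minimal sketch (not tied to any specific job) selects the Blink Planner in streaming mode; swapping in useOldPlanner() keeps the original Flink Planner behavior.

    import org.apache.flink.table.api.EnvironmentSettings;
    import org.apache.flink.table.api.TableEnvironment;

    public class PlannerSelection {
        public static void main(String[] args) {
            // Choose the Blink Planner (the new architecture) in streaming mode.
            EnvironmentSettings settings = EnvironmentSettings.newInstance()
                .useBlinkPlanner()   // or .useOldPlanner() for the original Flink Planner
                .inStreamingMode()   // or .inBatchMode(), which the Blink Planner also supports
                .build();
            TableEnvironment tEnv = TableEnvironment.create(settings);

            // From here on, the same Table API / SQL program runs on whichever planner was chosen.
        }
    }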

In version 1.9, we have merged most of the SQL functionality that was originally open sourced in Blink. These are new features and performance optimizations distilled from Alibaba's internal scenarios over recent years, and I believe they can take Flink to the next level!



For an introduction to specific features, please follow the community announcements and documentation, as well as our follow-up series of articles.

In addition to the architecture upgrade, the Table module has received several relatively major refactorings and new features in version 1.9, including:

  1. FLIP-37: Rework of the Table API type system
  2. FLIP-29: New Table API for multi-row, multi-column operations
  3. FLINK-10232: Preliminary SQL DDL support (see the sketch below)
  4. FLIP-30: A new unified Catalog API
  5. FLIP-38: A Python version of the Table API

With these new features, and subsequent fixes and refinements, the Flink Table API and SQL will play an increasingly important role in the future.
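As a small illustration of the preliminary DDL support (item 3 above), a source table can now be declared directly in SQL via sqlUpdate(). This is only a sketch: the table schema and the connector properties in the WITH clause are illustrative placeholders, not a complete, tested configuration.

    import org.apache.flink.table.api.EnvironmentSettings;
    import org.apache.flink.table.api.TableEnvironment;

    public class DdlExample {
        public static void main(String[] args) {
            EnvironmentSettings settings = EnvironmentSettings.newInstance()
                .useBlinkPlanner()
                .inStreamingMode()
                .build();
            TableEnvironment tEnv = TableEnvironment.create(settings);

            // Declare a table with CREATE TABLE DDL instead of registering it programmatically.
            // The WITH properties are placeholders; a real job needs a valid connector configuration.
            tEnv.sqlUpdate(
                "CREATE TABLE orders (" +
                "  order_id BIGINT," +
                "  amount DOUBLE" +
                ") WITH (" +
                "  'connector.type' = 'filesystem'," +
                "  'connector.path' = 'file:///tmp/orders.csv'," +
                "  'format.type' = 'csv'" +
                ")");

            // The declared table can then be queried like any other registered table.
            tEnv.sqlQuery("SELECT order_id, amount FROM orders");
        }
    }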

Batch processing improvements

Flink's batch processing capabilities have improved significantly in version 1.9, with several enhancements added on top of the architecture overhaul.

The first is a reduction in the cost of error recovery for batch jobs. FLIP-1 (Fine Grained Recovery from Task Failures) was proposed long ago, and 1.9 finally finishes some of the remaining work. In the new version, if a batch job fails, Flink first computes the scope affected by the failure, called the Failover Region. In a batch job, some nodes can pass data downstream through the network in a pipelined fashion, while other nodes persist their output data in a blocking fashion, with downstream nodes reading the stored data afterwards. If an operator's output has been completely persisted, there is no need to restart that operator, so error recovery can be confined to a relatively small region.



Taken to the extreme, if the data is shuffled to files everywhere a data exchange is needed, the behavior is similar to MapReduce and Spark. Flink, however, supports more advanced usage: you can control, for each shuffle, whether to use a direct network connection or to spill to files.

With file-based shuffle in place, it is natural to wonder whether the shuffle implementation could become a plugin. Indeed, the community is working in that direction: FLIP-31 (Pluggable Shuffle Service). For example, Yarn's Auxiliary Service can be used as a shuffle implementation, and one could even write a standalone distributed service to handle shuffling for batch jobs. Facebook recently shared some work along these lines, and inside Alibaba we have used this architecture to support jobs processing hundreds of terabytes each. With such a plugin mechanism, Flink can easily connect to these more efficient and flexible implementations, effectively solving the long-standing shuffle problem in batch processing.

Stream processing improvements

Stream computing is, after all, Flink's core strength, and version 1.9 brings improvements in this area as well. This release adds a very useful feature, FLIP-43 (State Processor API). Access to Flink's state data, and to the savepoints composed of that state, has always been a frequently requested feature among community users. Before 1.9, Flink offered Queryable State, but its usage scenarios have remained limited and it has never worked particularly well. This time, the State Processor API provides much more flexible access and enables users to implement some fairly advanced scenarios:

  1. Users can use this API to read data from other external systems in advance, write it out in Flink's savepoint format, and then start a Flink job from that savepoint. This avoids many cold-start problems.
  2. Analyze state data directly with Flink's batch API (see the sketch after this list). State data used to be a black box: users had no way to tell whether the stored data was correct or whether anomalies existed. With this API, users can analyze state data just like any other data.
  3. Dirty data correction. If a piece of dirty data pollutes the state, users can use this API to fix and correct it.
  4. State migration. When users modify the job logic and want to reuse most of the original job's state with only minor adjustments, they can use this API to do so.
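Below is a minimal sketch of use case 2: reading keyed state out of an existing savepoint with the batch API. The savepoint path, the operator uid, the key type, and the state name "count" are hypothetical and would have to match the job that produced the savepoint.

    import org.apache.flink.api.common.state.ValueState;
    import org.apache.flink.api.common.state.ValueStateDescriptor;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.runtime.state.memory.MemoryStateBackend;
    import org.apache.flink.state.api.ExistingSavepoint;
    import org.apache.flink.state.api.Savepoint;
    import org.apache.flink.state.api.functions.KeyedStateReaderFunction;
    import org.apache.flink.util.Collector;

    public class SavepointInspection {

        // One output record per key found in the savepoint.
        public static class KeyedCount {
            public String key;
            public Long count;
        }

        // Reads the ValueState registered under the name "count" for every key of the operator.
        public static class CountReader extends KeyedStateReaderFunction<String, KeyedCount> {
            private ValueState<Long> countState;

            @Override
            public void open(Configuration parameters) {
                countState = getRuntimeContext().getState(
                    new ValueStateDescriptor<>("count", Long.class));
            }

            @Override
            public void readKey(String key, Context ctx, Collector<KeyedCount> out) throws Exception {
                KeyedCount result = new KeyedCount();
                result.key = key;
                result.count = countState.value();
                out.collect(result);
            }
        }

        public static void main(String[] args) throws Exception {
            ExecutionEnvironment bEnv = ExecutionEnvironment.getExecutionEnvironment();

            // Load an existing savepoint; the path and the operator uid are placeholders.
            ExistingSavepoint savepoint = Savepoint.load(
                bEnv, "hdfs:///flink/savepoints/savepoint-xxxx", new MemoryStateBackend());

            DataSet<KeyedCount> counts =
                savepoint.readKeyedState("my-keyed-operator-uid", new CountReader());

            counts.print();
        }
    }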

All of the above are common requirements and problems in stream computing that now have a chance to be solved through this flexible API, so I am personally optimistic about its prospects.

Speaking of savepoints, the community has delivered another practical feature, FLIP-34 (Stop with Savepoint). As is well known, Flink periodically takes checkpoints and maintains a global state snapshot. Suppose a user actively stops a job between two checkpoints and restarts it later. Flink then automatically reads the last successfully saved global snapshot and starts processing data from that point onward. Although this ensures that the state is neither missing nor duplicated, the sink may already have received duplicate output. With this new capability, Flink takes a global snapshot while stopping the job and writes it to a savepoint; the next start resumes from that savepoint, so the sink does not receive unexpected duplicates. Note, however, that this does not solve the duplicate output caused by automatic failover while the job is running.

Hive integration

Hive has long been an important force in the Hadoop ecosystem. To better promote Flink's batch capabilities, integration with Hive is essential. During the development of version 1.9, we were also fortunate to have two Apache Hive PMC members help drive the integration of Flink and Hive.

The first problem to solve is reading Hive data with Flink. With the help of FLIP-30's unified Catalog API, Flink now has full access to the Hive Metastore. We have also added a Hive connector, which currently supports CSV, SequenceFile, ORC, Parquet, and other formats. Users only need to configure HMS access to read Hive tables with Flink. On top of this, Flink adds compatibility with Hive custom functions, so UDFs, UDTFs, and UDAFs can run directly in Flink SQL.
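A rough sketch of how this looks in code is shown below; the catalog name, default database, Hive conf directory, Hive version, and table name are assumptions made for illustration.

    import org.apache.flink.table.api.EnvironmentSettings;
    import org.apache.flink.table.api.Table;
    import org.apache.flink.table.api.TableEnvironment;
    import org.apache.flink.table.catalog.hive.HiveCatalog;

    public class HiveReadExample {
        public static void main(String[] args) {
            EnvironmentSettings settings = EnvironmentSettings.newInstance()
                .useBlinkPlanner()
                .inBatchMode()
                .build();
            TableEnvironment tEnv = TableEnvironment.create(settings);

            // Register a HiveCatalog backed by the Hive Metastore.
            // Catalog name, database, hive-conf directory, and Hive version are placeholders.
            HiveCatalog hive = new HiveCatalog("myhive", "default", "/opt/hive-conf", "2.3.4");
            tEnv.registerCatalog("myhive", hive);
            tEnv.useCatalog("myhive");

            // Query an existing Hive table through Flink SQL.
            Table result = tEnv.sqlQuery("SELECT * FROM some_hive_table");
        }
    }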

Write support is still relatively simple: for now, only INSERT INTO a new table is supported. Compatibility with Hive has always been a high priority for the community and will continue to improve in future releases.

Conclusion

Flink 1.9.0 was released after half a year of intense development. Throughout the process, the Flink community not only welcomed a significant number of Chinese developers and users, but also received a large number of code contributions, which bodes well for the future. Going forward, in both functionality and ecosystem, we will continue to invest in the Flink community to make Flink widely used in China and around the world. We also hope that more developers will join us and the Flink community to make Apache Flink better and better!


By Yankert (Rooney)


This article is original content from the Yunqi Community (Alibaba Cloud) and may not be reproduced without permission.