This article is the 15th issue of Flink Weekly, compiled by Zhang Cheng and reviewed by Li Benchao. This issue covers recent community development progress, mailing list Q&A, the latest Flink community news, and recommended technical articles.
01 Flink development progress
1.Release
■ [Releases] Li Yu started the vote on Flink 1.10.1 RC #3, and the vote has passed. See the link below for the latest update.
[1] apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Releas…
■ [Releases] Tzu-Li started a release vote for Flink Stateful Functions 2.0.0 and then decided to launch a new release candidate, RC #4.
[2] apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Apache…
2.DEV
■ Alibaba has released a preview of SpillableHeapStateBackend on flink-packages.org. The contribution of this state backend to Apache Flink is tracked in FLINK-12692. SpillableHeapStateBackend is a Java-heap-based state backend (like FilesystemStateBackend) that spills the coldest state to disk before the heap is exhausted.
[3] flink-packages.org/packages/sp… Issues.apache.org/jira/browse…
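The idea of spilling the coldest state to disk, as described for SpillableHeapStateBackend above, can be illustrated with a small Python toy. This is only a sketch under stated assumptions, not the backend's actual design: the class `SpillableMap` and its `max_in_memory` parameter are hypothetical, "coldest" is approximated by least-recently-used order, and spilling is triggered by an entry count rather than by real heap usage.

```python
import os
import pickle
import tempfile
from collections import OrderedDict


class SpillableMap:
    """Toy key-value state: hot entries stay in memory, cold ones go to disk."""

    def __init__(self, max_in_memory: int, spill_dir: str):
        self.max_in_memory = max_in_memory  # assumption: count-based limit
        self.spill_dir = spill_dir
        self.hot = OrderedDict()   # least recently used entry comes first
        self.spilled = {}          # key -> path of the spilled value on disk
        self._next_id = 0          # used to build unique spill file names

    def put(self, key, value):
        path = self.spilled.pop(key, None)
        if path is not None:
            os.remove(path)        # a newer value supersedes the spilled copy
        self.hot[key] = value
        self.hot.move_to_end(key)  # mark the entry as most recently used
        self._spill_if_needed()

    def get(self, key):
        if key in self.hot:
            self.hot.move_to_end(key)
            return self.hot[key]
        path = self.spilled.pop(key)   # raises KeyError if the key is unknown
        with open(path, "rb") as f:
            value = pickle.load(f)
        os.remove(path)
        self.put(key, value)           # reading makes the entry hot again
        return value

    def _spill_if_needed(self):
        while len(self.hot) > self.max_in_memory:
            key, value = self.hot.popitem(last=False)  # coldest entry
            path = os.path.join(self.spill_dir, f"state-{self._next_id}.bin")
            self._next_id += 1
            with open(path, "wb") as f:
                pickle.dump(value, f)
            self.spilled[key] = path


with tempfile.TemporaryDirectory() as tmp:
    state = SpillableMap(max_in_memory=2, spill_dir=tmp)
    state.put("a", 1)
    state.put("b", 2)
    state.put("c", 3)                     # "a" is coldest and goes to disk
    spilled_after_puts = sorted(state.spilled)
    value_a = state.get("a")              # loaded back; "b" is spilled instead
    spilled_after_get = sorted(state.spilled)
```

Reads and writes both refresh an entry's position, so only state that has not been touched for the longest time ends up on disk.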
3.FLIP
■ [FLIP-108] Yangze Guo started a discussion on class loaders and dependencies. The problem is that the mainClassLoader cannot recognize a subclass of ExternalResourceInfo. The subclass is located in the ExternalResourceDriver JAR, which the PluginManager isolates from the mainClassLoader, so the program throws a ClassNotFoundException. Yangze Guo proposed the three options below. The discussion favored the third option, and Yangze Guo then started a vote to change the API accordingly. The vote passed.
Option 1:
Instead of using the plug-in mechanism, simply load the driver into the mainClassLoader. The downside is that users need to deal with dependency conflicts.
Option 2:
The user is forced to build two separate JARs, one for ExternalResourceDriver and one for ExternalResourceInfo, and then add the JAR containing the ExternalResourceInfo class to the /lib directory. This approach may work, but it may be tedious for users.
Option 3:
Modify the RuntimeContext#getExternalResourceInfos method to return ExternalResourceInfo, and add a method like “Properties getInfo()” to the ExternalResourceInfo interface. The return value of this method can be defined by the driver provider and the user.
[4] apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLI… [5] apache-flink-mailing-list-archive.1008284.n3.nabble.com/quot-VOTE-F… [6] apache-flink-mailing-list-archive.1008284.n3.nabble.com/RESULT-VOTE…
4.Discuss
■ [Docker] The discussion started by Ismael Mejia on whether Docker images can be published outside of the official Flink releases has new updates.
[7] apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Int…
■ [UDF/UDAF] Leerho started a discussion on integrating Flink with DataSketches. Arvid Heise suggested publishing it on flink-packages first.
[8] flink-packages.org/ [9] apache-flink-mailing-list-archive.1008284.n3.nabble.com/Integration…
■ A discussion on introducing StatefulSequenceSource in TableFactory. StatefulSequenceSource would make Flink SQL testing easier.
[10] apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Int…
■ [Connector] Leonard Xu started a discussion on refactoring the Flink JDBC Connector. He explained that the refactoring makes it easy to introduce a unified, pluggable JDBC dialect for both Table and DataStream, and gives the connector better module organization and implementation. The proposal has been agreed on so far, and Leonard Xu has created a Jira issue. The Flink HBase Connector has the same problem, which will be discussed separately in a follow-up.
[11] apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Ref… [12] issues.apache.org/jira/browse… [13] issues.apache.org/jira/browse…
■ [Configuration] Timo initiated a discussion on how to represent the configuration hierarchy in properties (Flink configuration as well as Connector properties) so that the generated file will be valid JSON/YAML.
[14] apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Hie…
■ [Hadoop] Robert Metzger led a discussion on adding support for Hadoop 3 and on whether Hadoop 3 should be supported through flink-shaded-hadoop.
[15] apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Add…
02 Mailing list Q&A
■ Kcz asked about Flink memory settings (Metaspace OOM) on the mailing list. A Metaspace OOM is usually caused by the JVM loading too many classes. Increasing the number of slots, for example, also results in more classes being loaded. There has also been feedback from the community that the default metaspace size in Flink 1.10.0 may be too small, and the default will be increased in 1.10.1. In the meantime, users can try setting taskmanager.memory.jvm-metaspace.size to 256m.
[16] apache-flink.147419.n8.nabble.com/flink-metas…
■ Lucas Wu asked how to set parallelism separately for Flink SQL jobs. Flink SQL does not currently support setting parallelism on individual operators. The global parallelism can be set through table.exec.resource.default-parallelism.
[17] apache-flink.147419.n8.nabble.com/flink-sql-j… [18] apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-SQL-H… [19] apache-flink.147419.n8.nabble.com/MySQL-td301…
■ Wang Lei asked about Flink SQL retraction on the mailing list. Michael Ran and Li Jinsong gave detailed answers. Interested readers can refer to the link below.
[20] apache-flink.147419.n8.nabble.com/FlinkSQL-Re…
■ Luan Cooper asked about “Sink Mode/Upsert Mode”. For example, the primary key cannot be specified when writing to Elasticsearch in Upsert Mode. Community members answered the question in detail. Jark Wu replied that once FLIP-95 and FLIP-105 are done, the query in the question will be supported natively. The core work of FLIP-95 and FLIP-105 is to recognize UPDATE/DELETE/INSERT messages in the binlog, not just append messages. These features are expected in 1.11.
[21] apache-flink.147419.n8.nabble.com/Streaming-S…
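To make the difference between append-only messages and the UPDATE/DELETE/INSERT messages mentioned in the answer above more concrete, here is a minimal Python sketch. It is not Flink code and does not reflect the FLIP-95 or FLIP-105 interfaces: the function `apply_changelog`, the tuple layout, and the sample rows are all hypothetical, and the sketch only shows how a sink keyed by primary key could materialize a changelog.

```python
from typing import Dict, Iterable, Tuple

# (operation, primary key, row); the operation names are illustrative.
Change = Tuple[str, str, dict]


def apply_changelog(changes: Iterable[Change]) -> Dict[str, dict]:
    """Materialize a table keyed by primary key from a stream of changes."""
    table: Dict[str, dict] = {}
    for op, key, row in changes:
        if op in ("INSERT", "UPDATE"):
            table[key] = row        # upsert: the last write per key wins
        elif op == "DELETE":
            table.pop(key, None)    # deletes for unknown keys are ignored
        else:
            raise ValueError(f"unknown operation: {op}")
    return table


binlog = [
    ("INSERT", "u1", {"name": "Ann", "city": "Berlin"}),
    ("INSERT", "u2", {"name": "Bob", "city": "Paris"}),
    ("UPDATE", "u1", {"name": "Ann", "city": "Munich"}),
    ("DELETE", "u2", {}),
]
result = apply_changelog(binlog)
```

A sink that treated the same four messages as plain appends would store four rows, while the keyed interpretation leaves a single, current row for `u1`. This is why an upsert sink needs to know the primary key.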
■ Hb asked about time zones, and Jinsong Li and Jark Wu answered. This is a bug. Blink uses timestamp without time zone by default, while procTime is currently still generated with a time zone. A Jira issue already exists for this problem, and the community will give the fix high priority.
[22] apache-flink.147419.n8.nabble.com/flink-sql-t…
■ 1193216154 asked some questions about Flink's watermark alignment logic, and Li Benchao answered. A subtask takes the minimum watermark across its input channels as its current watermark. Tang Yun added that, because the minimum is taken, if one upstream channel never receives real data, the watermark sent downstream stays at Long.MIN_VALUE and the window is never triggered. The community uses idle sources to work around the problem. FLIP-27 also has to deal with watermark alignment on the source side.
[23] apache-flink.147419.n8.nabble.com/flink-water… [24] ci.apache.org/projects/fl… [25] cwiki.apache.org/confluence/…
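The watermark alignment rule described in the answer above can be sketched in a few lines of Python. This is a simplified model rather than Flink's implementation: the `InputChannel` class, the `idle` flag, and the `subtask_watermark` function are hypothetical names, and returning the minimum value when every channel is idle is an assumption made only to keep the sketch short.

```python
from dataclasses import dataclass
from typing import List

LONG_MIN = -(2 ** 63)  # mirrors Java's Long.MIN_VALUE


@dataclass
class InputChannel:
    watermark: int = LONG_MIN  # no watermark received yet
    idle: bool = False         # set when the upstream source is marked idle


def subtask_watermark(channels: List[InputChannel]) -> int:
    """Return the minimum watermark over all non-idle input channels."""
    active = [c.watermark for c in channels if not c.idle]
    # Simplification for this sketch: with no active channel, stay at LONG_MIN.
    return min(active) if active else LONG_MIN


channels = [InputChannel(), InputChannel(), InputChannel()]
channels[0].watermark = 1_000
channels[1].watermark = 1_500

# Channel 2 never receives data, so the minimum is stuck at LONG_MIN
# and no event-time window downstream can fire.
stuck = subtask_watermark(channels)

# Marking the silent channel as idle removes it from the minimum.
channels[2].idle = True
advanced = subtask_watermark(channels)
```

With the third channel excluded, the subtask watermark becomes the minimum of the two active channels, which lets windows make progress again.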
■ Lec Ssmi asked about async IO in UDFs.
[26] apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/async-IO-in…
03 Events/blog posts/others
■ Free registration is open for the Flink community's first online Meetup live stream of 2020. Four technical experts from Kangaroo Cloud, NetEase Cloud Music, Youzan, and Alibaba will share in-depth technical content.
[27] developer.aliyun.com/live/2772
■ Fabian has published a summary of Flink Forward Virtual 2020 on the Ververica blog.
[28] www.ververica.com/blog/flink-…
■ All recordings of Flink Forward Virtual 2020 have been posted on YouTube.
[29] www.youtube.com/watch?v=NF0…
■ Marta reflects on the past few months in a community update on the Flink blog.
[30] flink.apache.org/news/2020/0… [31] flink.apache.org/news/2020/0… [32] www.meetup.com/Flink-China… [33] www.meetup.com/futureofdat…
Subscribe to the Flink Chinese mailing list in 2 minutes
Apache Flink Chinese mailing list subscription process:
- Send any email to [email protected]
- Receive the official confirmation email
- Reply to that email to confirm the subscription
Once subscribed, you will receive messages from Flink’s official Chinese mailing list. You can send your questions to [email protected] or help others answer theirs. Try it!
Flink Weekly is a weekly newsletter covering answers to user questions, community development and proposals, community news and other events, blog posts, and more. Stay tuned.
About the author:
Zhang Cheng, a basic platform development engineer in the Xiaohongshu Technology Department, mainly works on real-time computing platform development based on Flink.