August 29, 2022 -
Jingsong Lee
The Apache Flink community is pleased to announce the release of the Apache Flink Table Store (0.2.0).
Please check out the full documentation for detailed information and user guides.
What is Flink Table Store
Flink Table Store is a data lake storage system for ingesting streaming update/delete changelogs and serving high-performance queries in real time.
As a new type of updatable data lake, Flink Table Store has the following features:
...
Continue reading »
August 24, 2022 -
Danny Cranmer
The Apache Flink Community is pleased to announce the second bug fix release of the Flink 1.15 series.
This release includes 30 bug fixes, vulnerability fixes, and minor improvements for Flink 1.15. Below you will find a list of all bugfixes and improvements (excluding improvements to the build infrastructure and build stability). For a complete list of all changes see: JIRA.
We highly recommend all users upgrade to Flink 1.15.2.
...
Continue reading »
July 25, 2022 -
Gyula Fora
(@GyulaFora)
Matyas Orhidi
The community has continued to work hard on improving the Flink Kubernetes Operator's capabilities since our first production-ready release, which launched about two months ago.
With the release of Flink Kubernetes Operator 1.1.0 we are proud to announce a number of exciting new features improving the overall experience of managing Flink resources and the operator itself in production environments.
Release Highlights
A non-exhaustive list of some of the more exciting features added in the release:
...
Continue reading »
July 12, 2022 -
Zhipeng Zhang
Dong Lin
The Apache Flink community is excited to announce the release of Flink ML 2.1.0! This release focuses on improving Flink ML’s infrastructure, such as Python SDK, memory management, and benchmark framework, to facilitate the development of performant, memory-safe, and easy-to-use algorithm libraries. We validated the enhanced infrastructure by implementing, benchmarking, and optimizing 10 new algorithms in Flink ML, and confirmed that Flink ML can meet or exceed the performance of selected algorithms from alternative popular ML libraries.
...
Continue reading »
July 11, 2022 -
Yun Gao
Dawid Wysakowicz
Daisy Tsang
Motivation
Flink is a distributed processing engine for both unbounded and bounded streams of data. In recent versions, Flink has unified the DataStream API and the Table / SQL API to support both streaming and batch cases. Since most users require both types of data processing pipelines, the unification helps reduce the complexity of developing, operating, and maintaining consistency between streaming and batch backfilling jobs, as is the case at Alibaba.
...
Continue reading »
July 11, 2022 -
Yun Gao
Dawid Wysakowicz
Daisy Tsang
In the first part of this blog, we briefly introduced the work to support checkpoints after tasks have finished and the revised process of finishing. In this part, we present more details on the implementation, including how we support checkpoints with finished tasks and the revised protocol of the finish process.
Implementation of Support for Checkpointing with Finished Tasks
As described in part one, to support checkpoints after some tasks have finished, the core idea is to mark the finished operators in checkpoints and skip executing these operators after recovery.
...
Continue reading »
July 6, 2022 -
David Anderson
(@alpinegizmo)
The Apache Flink Community is pleased to announce the first bug fix release of the Flink 1.15 series.
This release includes 62 bug fixes, vulnerability fixes, and minor improvements for Flink 1.15. Below you will find a list of all bugfixes and improvements (excluding improvements to the build infrastructure and build stability). For a complete list of all changes see: JIRA.
We highly recommend all users upgrade to Flink 1.15.1.
...
Continue reading »
June 22, 2022 -
Xingbo Huang
The Apache Flink Community is pleased to announce another bug fix release for Flink 1.14.
This release includes 67 bug fixes, vulnerability fixes, and minor improvements for Flink 1.14. Below you will find a list of all bugfixes and improvements (excluding improvements to the build infrastructure and build stability). For a complete list of all changes see: JIRA.
We highly recommend all users upgrade to Flink 1.14.5.
Release Artifacts
Maven Dependencies
<dependency> <groupId>org.
...
Continue reading »
June 17, 2022 -
Lijie Wang
Zhu Zhu
Introduction
Choosing an appropriate parallelism for each operator is not easy for many users. For batch jobs, a parallelism that is too small may result in long execution times and large failover regressions, while an unnecessarily large parallelism may waste resources and add overhead in task deployment and network shuffling.
To choose an appropriate parallelism, one needs to know how much data each operator has to process. However, it can be hard to predict the data volume a job will process because it can differ every day.
...
Continue reading »
June 5, 2022 -
Gyula Fora
(@GyulaFora)
Yang Wang
In the two months since our initial preview release, the community has been hard at work stabilizing and improving the core Flink Kubernetes Operator logic. We are now proud to announce the first production-ready release of the operator project.
Release Highlights
The Flink Kubernetes Operator 1.0.0 version brings numerous improvements and new features to almost every aspect of the operator.
New v1beta1 API version & compatibility guarantees
Session Job Management support
Support for Flink 1.
...
Continue reading »