Announcing Apache Flink 1.0.0

March 8, 2016

The Apache Flink community is pleased to announce the availability of the 1.0.0 release. The community put significant effort into improving and extending Apache Flink since the last release, with a focus on the experience of writing and executing data stream processing pipelines in production.

Flink version 1.0.0 marks the beginning of the 1.x series of releases, which will maintain backwards compatibility with 1.0.0. This means that applications written against stable APIs of Flink 1.0.0 will compile and run with all Flink versions in the 1.x series. This is the first time in Flink’s history that we are formally guaranteeing compatibility, and we therefore see this release as a major milestone of the project, perhaps the most important since graduation as a top-level project.

Apart from backwards compatibility, Flink 1.0.0 brings a variety of new user-facing features, as well as tons of bug fixes. About 64 contributors provided bug fixes, improvements, and new features, resolving more than 450 JIRA issues in total.

We encourage everyone to download the release and check out the documentation. Feedback through the Flink mailing lists is, as always, very welcome!

Interface stability annotations

Flink 1.0.0 introduces interface stability annotations for API classes and methods. Interfaces defined as @Public are guaranteed to remain stable across all releases of the 1.x series. The @PublicEvolving annotation marks API features that may be subject to change in future versions.

Flink’s stability annotations will help users to implement applications that compile and execute unchanged against future versions of Flink 1.x. This greatly reduces the complexity for users when upgrading to a newer Flink release.
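As a rough illustration of how such annotations can be used, here is a minimal Java sketch. The `@Public` and `@PublicEvolving` annotations below are local stand-ins, declared so the example compiles without Flink on the classpath; giving them runtime retention is an assumption made here so that reflection can see them. `WordCounter` and `StabilityReport` are hypothetical names and not part of Flink.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Local stand-ins for the stability annotations (assumption: not Flink's own declarations).
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.METHOD})
@interface Public {}

@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.METHOD})
@interface PublicEvolving {}

// Hypothetical API class: stable as a whole, with one method that may still change.
@Public
class WordCounter {
    // Counts whitespace-separated words.
    public int count(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    // Counts words longer than minLength characters; marked as still evolving.
    @PublicEvolving
    public int countLongerThan(String text, int minLength) {
        int n = 0;
        for (String word : text.trim().split("\\s+")) {
            if (word.length() > minLength) {
                n++;
            }
        }
        return n;
    }
}

// Hypothetical helper that reports which parts of an API class are safe to rely on.
class StabilityReport {
    // True if the class itself is marked as stable.
    static boolean isStable(Class<?> api) {
        return api.isAnnotationPresent(Public.class);
    }

    // Sorted names of the methods declared on the class that are marked @PublicEvolving.
    static List<String> evolvingMethods(Class<?> api) {
        List<String> names = new ArrayList<>();
        for (Method m : api.getDeclaredMethods()) {
            if (m.isAnnotationPresent(PublicEvolving.class)) {
                names.add(m.getName());
            }
        }
        Collections.sort(names);
        return names;
    }
}
```

A check like `StabilityReport.evolvingMethods(WordCounter.class)` could be run before an upgrade to list the calls that deserve a second look, while everything else on a `@Public` class is expected to keep working.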

Out-of-core state support

Flink 1.0.0 adds a new state backend that uses RocksDB to store state (both windows and user-defined key-value state). RocksDB is an embedded key/value store, originally developed by Facebook. When using this backend, active state in streaming programs can grow well beyond memory. For backups, the RocksDB files are copied to a distributed file system such as HDFS or S3.

Savepoints and version upgrades

Savepoints are checkpoints of the state of a running streaming job that can be manually triggered by the user while the job is running. Savepoints solve several production headaches, including code upgrades (both application and framework), cluster maintenance and migration, A/B testing and what-if scenarios, as well as testing and debugging. Read more about savepoints at the data Artisans blog.

Library for Complex Event Processing (CEP)

Complex Event Processing has been one of the oldest and most important use cases for stream processing. The new CEP functionality in Flink allows you to use a distributed general-purpose stream processor instead of a specialized CEP system to detect complex patterns in event streams. Get started with CEP on Flink.

Enhanced monitoring interface: job submission, checkpoint statistics and backpressure monitoring

The web interface now allows users to submit jobs. Previous Flink releases had a separate service for submitting jobs. The new interface is part of the JobManager frontend. It also works on YARN now.

Backpressure monitoring allows users to trigger a sampling mechanism that analyzes how much time operators spend waiting for new network buffers. When senders spend most of their time waiting for new network buffers, they are experiencing backpressure from their downstream operators. Many users requested this feature for understanding bottlenecks in both batch and streaming applications.
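The sampling idea can be sketched in a few lines of Java. This is only an illustration, not Flink's implementation: `BackpressureSampler` and the probe passed to it are hypothetical, and the 0.5 cutoff is an assumption that simply encodes "most of their time" from the description above.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sampler: repeatedly asks a probe whether a task is currently
// blocked waiting for a network buffer and reports the blocked fraction.
class BackpressureSampler {

    // Polls the probe numSamples times and returns blocked samples / total samples.
    static double blockedRatio(BooleanSupplier isWaitingForBuffer, int numSamples) {
        if (numSamples <= 0) {
            throw new IllegalArgumentException("numSamples must be positive");
        }
        int blocked = 0;
        for (int i = 0; i < numSamples; i++) {
            if (isWaitingForBuffer.getAsBoolean()) {
                blocked++;
            }
        }
        return (double) blocked / numSamples;
    }

    // Assumed cutoff: a sender blocked in more than half of the samples
    // is treated as being under backpressure.
    static boolean isBackpressured(double ratio) {
        return ratio > 0.5;
    }
}
```

In a real system the probe would inspect what a running task is doing at the moment of each sample. Here it is left as a plain `BooleanSupplier` so the ratio logic can be tried in isolation.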

Improved checkpointing control and monitoring

Checkpointing now offers more fine-grained control. In previous versions, new checkpoints were triggered independently of the speed at which old checkpoints completed. This could lead to situations where checkpoints piled up because they were triggered too frequently.
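One way to keep checkpoints from piling up is to make each trigger depend on the checkpoints still in flight and on the time since the last one completed. The Java sketch below illustrates that idea under our own assumptions. `CheckpointThrottle`, its parameters, and its methods are hypothetical and are not Flink's API; consult the documentation for the actual configuration options.

```java
// Hypothetical throttle: a periodic trigger asks it whether a new checkpoint
// may start, based on in-flight checkpoints and the pause since the last completion.
class CheckpointThrottle {
    private final long minPauseMillis;
    private final int maxConcurrent;
    private int inFlight = 0;
    private boolean hasCompleted = false;
    private long lastCompletionMillis = 0L;

    CheckpointThrottle(long minPauseMillis, int maxConcurrent) {
        if (minPauseMillis < 0 || maxConcurrent < 1) {
            throw new IllegalArgumentException("need minPauseMillis >= 0 and maxConcurrent >= 1");
        }
        this.minPauseMillis = minPauseMillis;
        this.maxConcurrent = maxConcurrent;
    }

    // Called on every trigger tick; returns true if a checkpoint was started.
    boolean tryTrigger(long nowMillis) {
        boolean slotFree = inFlight < maxConcurrent;
        boolean pauseElapsed = !hasCompleted || nowMillis - lastCompletionMillis >= minPauseMillis;
        if (slotFree && pauseElapsed) {
            inFlight++;
            return true;
        }
        return false; // skip this tick instead of queueing another checkpoint
    }

    // Called when a previously started checkpoint has finished.
    void onCompleted(long nowMillis) {
        if (inFlight == 0) {
            throw new IllegalStateException("no checkpoint in flight");
        }
        inFlight--;
        hasCompleted = true;
        lastCompletionMillis = nowMillis;
    }
}
```

With a minimum pause of 500 ms and at most one concurrent checkpoint, a tick that arrives while a checkpoint is still running is skipped, and so is a tick that arrives less than 500 ms after it completed. Slow checkpoints therefore stretch the effective interval rather than stacking up.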

The checkpoint coordinator now exposes statistics through our REST monitoring API and the web interface. Users can review the checkpoint size and duration on a per-operator basis and see the last completed checkpoints. This is helpful for identifying performance issues, such as processing slowdowns caused by checkpoints.

Improved Kafka connector and support for Kafka 0.9

Flink 1.0 supports both Kafka 0.8 and 0.9. With the new release, Flink exposes Kafka metrics for the producers and the 0.9 consumer through Flink’s accumulator system. We also enhanced the existing connector for Kafka 0.8, allowing users to subscribe to multiple topics in one source.

Changelog and known issues

This release resolves more than 450 issues, including bug fixes, improvements, and new features. See the complete changelog and known issues.

List of contributors

  • Abhishek Agarwal
  • Ajay Bhat
  • Aljoscha Krettek
  • Andra Lungu
  • Andrea Sella
  • Chesnay Schepler
  • Chiwan Park
  • Daniel Pape
  • Fabian Hueske
  • Filipe Correia
  • Frederick F. Kautz IV
  • Gabor Gevay
  • Gabor Horvath
  • Georgios Andrianakis
  • Greg Hogan
  • Gyula Fora
  • Henry Saputra
  • Hilmi Yildirim
  • Hubert Czerpak
  • Jark Wu
  • Johannes
  • Jun Aoki
  • Kostas Kloudas
  • Li Chengxiang
  • Lun Gao
  • Martin Junghanns
  • Martin Liesenberg
  • Matthias J. Sax
  • Maximilian Michels
  • Márton Balassi
  • Nick Dimiduk
  • Niels Basjes
  • Omer Katz
  • Paris Carbone
  • Patrice Freydiere
  • Peter Vandenabeele
  • Piotr Godek
  • Prez Cannady
  • Robert Metzger
  • Romeo Kienzler
  • Sachin Goel
  • Saumitra Shahapure
  • Sebastian Klemke
  • Stefano Baghino
  • Stephan Ewen
  • Stephen Samuel
  • Subhobrata Dey
  • Suneel Marthi
  • Ted Yu
  • Theodore Vasiloudis
  • Till Rohrmann
  • Timo Walther
  • Trevor Grant
  • Ufuk Celebi
  • Ulf Karlsson
  • Vasia Kalavri
  • fversaci
  • madhukar
  • qingmeng.wyh
  • ramkrishna
  • rtudoran
  • sahitya-pavurala
  • zhangminglei