Apache Flink 0.6 available

August 26, 2014 -

We are happy to announce the availability of Flink 0.6. This is the first release of the system inside the Apache Incubator and under the name Flink. Releases up to 0.5 were under the name Stratosphere, the academic and open source project that Flink originates from.

Apache Flink is a general-purpose data processing engine for clusters. It runs on YARN clusters on top of data stored in Hadoop, as well as stand-alone. Flink currently has programming APIs in Java and Scala. Jobs are executed via Flink’s own runtime engine. Flink features:

Robust in-memory and out-of-core processing: once read, data stays in memory as much as possible, and is gracefully de-staged to disk in the presence of memory pressure from limited memory or other applications. The runtime is designed to perform very well both in setups with abundant memory and in setups where memory is scarce.

POJO-based APIs: when programming, you do not have to pack your data into key-value pairs or some other framework-specific data model. Rather, you can use arbitrary Java and Scala types to model your data.

Efficient iterative processing: Flink contains explicit “iterate” operators that enable very efficient loops over data sets, e.g., for machine learning and graph applications.

A modular system stack: Flink is not a direct implementation of its APIs but a layered system. All programming APIs are translated to an intermediate program representation that is compiled and optimized via a cost-based optimizer. Lower-level layers of Flink also expose programming APIs for extending the system.

Data pipelining/streaming: Flink’s runtime is designed as a pipelined data processing engine rather than a batch processing engine. Operators do not wait for their predecessors to finish in order to start processing data. This results to very efficient handling of large data sets.

Release 0.6 #

Flink 0.6 builds on the latest Stratosphere 0.5 release. It includes many bug fixes and improvements that make the system more stable and robust, as well as breaking API changes.

The full release notes are available here.

Download the release here.

Contributors #

  • Wilson Cao
  • Ufuk Celebi
  • Stephan Ewen
  • Jonathan Hasenburg
  • Markus Holzemer
  • Fabian Hueske
  • Sebastian Kunert
  • Vikhyat Korrapati
  • Aljoscha Krettek
  • Sebastian Kruse
  • Raymond Liu
  • Robert Metzger
  • Mingliang Qi
  • Till Rohrmann
  • Henry Saputra
  • Chesnay Schepler
  • Kostas Tzoumas
  • Robert Waury
  • Timo Walther
  • Daniel Warneke
  • Tobias Wiens