Running Apache Flink on Kubernetes with KUDO

November 6, 2019 - Gerred Dillon

A common use case for Apache Flink is streaming data analytics together with Apache Kafka, which provides a pub/sub model and durability for data streams. To achieve elastic scalability, both are typically deployed in clustered environments, and increasingly on top of container orchestration platforms like Kubernetes. The Operator pattern provides an extension mechanism to Kubernetes that captures human operator knowledge about an application, like Flink, in software to automate its operation. ...

Continue reading »

Apache Flink 1.9.1 Released

October 18, 2019 - Jark Wu (@JarkWu)

The Apache Flink community released the first bugfix version of the Apache Flink 1.9 series. This release includes 96 fixes and minor improvements for Flink 1.9.0. The list below details all fixes and improvements. We highly recommend that all users upgrade to Flink 1.9.1.

Updated Maven dependencies:

    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-java</artifactId>
      <version>1.9.1</version>
    </dependency>
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-streaming-java_2.11</artifactId>
      <version>1.9.1</version>
    </dependency>
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-clients_2.11</artifactId>
      <version>1.9.1</version>
    </dependency>

You can find the binaries on the updated Downloads page. ...

Continue reading »

The State Processor API: How to Read, Write and Modify the State of Flink Applications

September 13, 2019 - Seth Wiesman (@sjwiesman) Fabian Hueske (@fhueske)

Whether you are running Apache Flink® in production or have evaluated Flink as a computation framework in the past, you’ve probably found yourself asking the question: How can I access, write or update state in a Flink savepoint? Ask no more! Apache Flink 1.9.0 introduces the State Processor API, a powerful extension of the DataSet API that allows reading, writing and modifying state in Flink’s savepoints and checkpoints. In this post, we explain why this feature is a big step for Flink, what you can use it for, and how to use it. ...

Continue reading »

Apache Flink 1.8.2 Released

September 11, 2019 - Jark Wu (@JarkWu)

The Apache Flink community released the second bugfix version of the Apache Flink 1.8 series. This release includes 23 fixes and minor improvements for Flink 1.8.1. The list below details all fixes and improvements. We highly recommend that all users upgrade to Flink 1.8.2.

Updated Maven dependencies:

    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-java</artifactId>
      <version>1.8.2</version>
    </dependency>
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-streaming-java_2.11</artifactId>
      <version>1.8.2</version>
    </dependency>
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-clients_2.11</artifactId>
      <version>1.8.2</version>
    </dependency>

You can find the binaries on the updated Downloads page. ...

Continue reading »

Flink Community Update - September'19

September 5, 2019 - Marta Paes (@morsapaes)

This has been an exciting, fast-paced year for the Apache Flink community. But with over 10k messages across the mailing lists, 3k Jira tickets and 2k pull requests, it is not easy to keep up with the latest state of the project, let alone everything happening around it. With that in mind, we want to bring back regular community updates to the Flink blog. The first post in the series takes you on a little detour across the year, to freshen up and make sure you’re all up to date. ...

Continue reading »

Apache Flink 1.9.0 Release Announcement

August 22, 2019 -

The Apache Flink community is proud to announce the release of Apache Flink 1.9.0. The Apache Flink project’s goal is to develop a stream processing system to unify and power many forms of real-time and offline data processing applications as well as event-driven applications. In this release, we have made a huge step forward in that effort, by integrating Flink’s stream and batch processing capabilities under a single, unified runtime. ...

Continue reading »

Flink Network Stack Vol. 2: Monitoring, Metrics, and that Backpressure Thing

July 23, 2019 - Nico Kruber Piotr Nowojski

In a previous blog post, we presented how Flink’s network stack works, from the high-level abstractions to the low-level details. This second post in the network stack series builds on that knowledge and discusses monitoring network-related metrics to identify effects such as backpressure or bottlenecks in throughput and latency. Although this post briefly covers what to do with backpressure, the topic of tuning the network stack will be further examined in a future post. ...

Continue reading »

Apache Flink 1.8.1 Released

July 2, 2019 - Jincheng Sun (@sunjincheng121)

The Apache Flink community released the first bugfix version of the Apache Flink 1.8 series. This release includes more than 40 fixes and minor improvements for Flink 1.8.0. The list below details all improvements, sub-tasks and bug fixes. We highly recommend that all users upgrade to Flink 1.8.1.

Updated Maven dependencies:

    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-java</artifactId>
      <version>1.8.1</version>
    </dependency>
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-streaming-java_2.11</artifactId>
      <version>1.8.1</version>
    </dependency>
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-clients_2.11</artifactId>
      <version>1.8.1</version>
    </dependency>

You can find the binaries on the updated Downloads page. ...

Continue reading »

A Practical Guide to Broadcast State in Apache Flink

June 26, 2019 - Fabian Hueske (@fhueske)

Since version 1.5.0, Apache Flink features a new type of state called Broadcast State. In this post, we explain what Broadcast State is, and show an example of how it can be applied to an application that evaluates dynamic patterns on an event stream. We walk you through the processing steps and the source code to implement this application in practice.

What is Broadcast State?

Broadcast State can be used to combine and jointly process two streams of events in a specific way. ...

Continue reading »

A Deep-Dive into Flink's Network Stack

June 5, 2019 - Nico Kruber

Flink’s network stack is one of the core components that make up the flink-runtime module and sit at the heart of every Flink job. It connects individual work units (subtasks) from all TaskManagers. This is where your streamed-in data flows through, and it is therefore crucial to the performance of your Flink job, for both the throughput and the latency you observe. In contrast to the coordination channels between TaskManagers and JobManagers, which use RPCs via Akka, the network stack between TaskManagers relies on a much lower-level API using Netty. ...

Continue reading »