PyFlink: The integration of Pandas into PyFlink

04 Aug 2020 Jincheng Sun (@sunjincheng121) & Markos Sfikas (@MarkSfik)

The Apache Flink community put significant effort into integrating Pandas with PyFlink in Flink 1.11, the latest release. The added features include support for Pandas UDFs and conversion between Pandas DataFrames and PyFlink Tables. In this article, we introduce how these features work and show how to use them with a step-by-step example.

Flink Community Update - July'20

27 Jul 2020 Marta Paes (@morsapaes)

As July draws to an end, we look back at a monthful of activity in the Flink community, including two releases (!) and some work around improving the first-time contribution experience in the project. Also, events are starting to pick up again, so we've put together a list of some great events you can (virtually) attend in August!

Sharing is caring - Catalogs in Flink SQL

23 Jul 2020 Dawid Wysakowicz (@dwysakowicz)

With an ever-growing number of people working with data, it is common practice for companies to build self-service platforms that democratize data access across teams and, especially, enable users from any background to be independent in their data needs. In such environments, metadata management becomes crucial. Without it, users often work blindly, spending too much time on cumbersome tasks such as searching for datasets and their locations or figuring out data formats.

Application Deployment in Flink: Current State and the new Application Mode

14 Jul 2020 Kostas Kloudas (@kkloudas)

With the rise of stream processing and real-time analytics as critical tools for modern businesses, an increasing number of organizations build platforms with Apache Flink at their core and offer it internally as a service. Many talks on related topics from companies like Uber, Netflix and Alibaba in the latest editions of Flink Forward further illustrate this trend.

Apache Flink 1.11.0 Release Announcement

06 Jul 2020 Marta Paes (@morsapaes)

The Apache Flink community is proud to announce the release of Flink 1.11.0! More than 200 contributors worked on over 1.3k issues to bring significant improvements to usability as well as new features to Flink users across the whole API stack. We're particularly excited about unaligned checkpoints to cope with high backpressure scenarios, a new source API that simplifies and unifies the implementation of (custom) sources, and support for Change Data Capture (CDC) and other common use cases in the Table API/SQL. Read on for all major new features and improvements, important changes to be aware of and what to expect moving forward!
