Apache Flink supports a broad ecosystem and works seamlessly with many other data processing projects and frameworks.
Connectors provide code for interfacing with various third-party systems.
Currently these systems are supported:
To run an application that uses one of these connectors, you usually need to install and start additional third-party components, such as the servers for the message queues. The corresponding subsections give further instructions.
This is a list of third-party packages (i.e., libraries, system extensions, or examples) built on Flink. The Flink community collects links to these packages but does not maintain them. They are therefore not part of the Apache Flink project, and the community cannot provide support for them. Is your project missing? Please let us know on the user/dev mailing list.
Apache Zeppelin is a web-based notebook that enables interactive data analytics and can use Flink as an execution engine (alongside other engines). See also Jim Dowling’s Flink Forward talk about Zeppelin on Flink.
Cascading enables a user to build complex workflows easily on Flink and other execution engines. Cascading on Flink is built by dataArtisans and Driven, Inc. See Fabian Hueske’s Flink Forward talk for more details.
Apache Beam is an open-source, unified programming model that you can use to create a data processing pipeline. Flink is one of the back-ends supported by the Beam programming model.
Alluxio is an open-source, memory-speed virtual distributed storage system that lets applications share data efficiently and access data across different storage systems in a unified namespace. An example of using Flink to access data through Alluxio is available.
Python Examples on Flink
A collection of examples using Apache Flink’s Python API.
WordCount Example in Clojure
A small WordCount example on how to write a Flink program in Clojure.
Anomaly Detection and Prediction in Flink
flink-htm is a library for anomaly detection and prediction in Apache Flink. The algorithms are based on Hierarchical Temporal Memory (HTM) as implemented by the Numenta Platform for Intelligent Computing (NuPIC).
Apache Ignite is a high-performance, integrated, and distributed in-memory platform for computing and transacting on large-scale data sets in real time. See the Flink sink streaming connector, which injects data into an Ignite cache.
Tink temporal graph library
Tink is a temporal graph library built on top of Flink. It supports temporal graph analytics, such as different interpretations of the shortest temporal path algorithm, and metrics such as temporal betweenness and temporal closeness. The library resulted from the thesis of Wouter Ligtenberg.