Apache Flink® — Stateful Computations over Data Streams

All streaming use cases
  • Event-driven Applications
  • Stream & Batch Analytics
  • Data Pipelines & ETL
Guaranteed correctness
  • Exactly-once state consistency
  • Event-time processing
  • Sophisticated late data handling
Layered APIs
  • SQL on Stream & Batch Data
  • DataStream API & DataSet API
  • ProcessFunction (Time & State)
Operational Focus
  • Flexible deployment
  • High-availability setup
  • Savepoints
Scales to any use case
  • Scale-out architecture
  • Support for very large state
  • Incremental checkpointing
Excellent Performance
  • Low latency
  • High throughput
  • In-Memory computing

Latest Blog Posts

Apache Flink Kubernetes Operator 1.5.0 Release Announcement
The Apache Flink community is excited to announce the release of Flink Kubernetes Operator 1.5.0! The release focuses on improvements to the job autoscaler that was introduced in the previous release and general operational hardening of the operator. We encourage you to download the release and share your feedback with the community through the Flink mailing lists or JIRA! We hope you like the new release and we’d be eager to learn about your experience with it.

Howto test a batch source with the new Source framework
The Flink community recently designed a new Source framework based on FLIP-27. This article continues the "Howto create a batch source with the new Source framework" article; now it is time to test the created source. Like the previous article, this one was written while implementing the Flink batch source for Cassandra. It begins with unit testing the source, starting with the serializers, using the Cassandra SplitSerializer and SplitEnumeratorStateSerializer as examples.
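Serializer tests of the kind mentioned above usually follow a round-trip pattern: serialize an object, deserialize the bytes, and check that the result equals the original. The following is a minimal sketch of that pattern, not the Cassandra connector's actual test. `RangeSplit`, `RangeSplitSerializer`, and `VersionedSerializer` are hypothetical names introduced for illustration; the interface is a local stand-in that mirrors the method shape of Flink's `SimpleVersionedSerializer` so the example runs without Flink on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Objects;

public class SplitSerializerRoundTrip {

    // Local stand-in (assumption): same method shape as Flink's
    // SimpleVersionedSerializer, declared here so the sketch needs no Flink dependency.
    interface VersionedSerializer<E> {
        int getVersion();
        byte[] serialize(E obj) throws IOException;
        E deserialize(int version, byte[] serialized) throws IOException;
    }

    // Hypothetical split describing a range [start, end] of the source data.
    static final class RangeSplit {
        final long start;
        final long end;

        RangeSplit(long start, long end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof RangeSplit)) return false;
            RangeSplit other = (RangeSplit) o;
            return start == other.start && end == other.end;
        }

        @Override
        public int hashCode() {
            return Objects.hash(start, end);
        }

        @Override
        public String toString() {
            return "RangeSplit(" + start + ", " + end + ")";
        }
    }

    // Hypothetical serializer: writes the two bounds as big-endian longs.
    static final class RangeSplitSerializer implements VersionedSerializer<RangeSplit> {
        static final int VERSION = 1;

        @Override
        public int getVersion() {
            return VERSION;
        }

        @Override
        public byte[] serialize(RangeSplit split) throws IOException {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                out.writeLong(split.start);
                out.writeLong(split.end);
            }
            return bytes.toByteArray();
        }

        @Override
        public RangeSplit deserialize(int version, byte[] serialized) throws IOException {
            // Reject versions this serializer does not know how to read.
            if (version != VERSION) {
                throw new IOException("Unsupported version: " + version);
            }
            try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(serialized))) {
                long start = in.readLong();
                long end = in.readLong();
                return new RangeSplit(start, end);
            }
        }
    }

    // Serializes and immediately deserializes a split with the current version.
    static RangeSplit roundTrip(RangeSplit split) throws IOException {
        RangeSplitSerializer serializer = new RangeSplitSerializer();
        byte[] bytes = serializer.serialize(split);
        return serializer.deserialize(serializer.getVersion(), bytes);
    }

    public static void main(String[] args) throws IOException {
        RangeSplit original = new RangeSplit(-42L, 1_000L);
        RangeSplit copy = roundTrip(original);
        if (!original.equals(copy)) {
            throw new AssertionError("Round trip changed the split: " + copy);
        }
        System.out.println("Round trip preserved " + copy);
    }
}
```

In a real test suite the same check would normally live in a unit test method rather than `main`, and the split class needs a meaningful `equals` for the comparison to be useful.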

Howto migrate a real-life batch pipeline from the DataSet API to the DataStream API
The Flink community has been deprecating the DataSet API since version 1.12 as part of the work on FLIP-131: Consolidate the user-facing Dataflow SDKs/APIs (and deprecate the DataSet API). This blog article illustrates the migration of a real-life batch DataSet pipeline to a batch DataStream pipeline. All the code presented in this article is available in the tpcds-benchmark-flink repo. The use case shown here is extracted from a broader effort comparing the performance of different Flink APIs by implementing TPC-DS queries with each of them.