Apache Flink 2.2.0: Advancing Real-Time Data + AI and Empowering Stream Processing for the AI Era

December 4, 2025 - Hang Ruan

The Apache Flink PMC is proud to announce the release of Apache Flink 2.2.0. Flink 2.2.0 further enriches AI capabilities, enhances materialized tables and the Connector framework, and improves batch processing and PyFlink support. This release brings together 73 global contributors, implements 9 FLIPs (Flink Improvement Proposals), and resolves over 220 issues.

Apache Flink 2.2 ushers in the AI era by seamlessly integrating real-time data processing with artificial intelligence. It introduces ML_PREDICT for large language model inference and VECTOR_SEARCH for real-time vector similarity search, empowering streaming AI applications. Enhanced features like Materialized Tables, Delta Joins, balanced task scheduling, and smarter connectors (including rate limiting and skew-aware split assignment) significantly boost performance, scalability, and reliability—laying a robust foundation for intelligent, low-latency data pipelines. We extend our gratitude to all contributors for their invaluable support!

Let’s dive into the highlights.

Flink SQL Improvements #

Realtime AI Function #

Apache Flink has supported leveraging LLM capabilities through the ML_PREDICT function in Flink SQL since version 2.1, enabling users to perform semantic analysis in a simple and efficient way. In Flink 2.2, the Table API also supports model inference operations that allow you to integrate machine learning models directly into your data processing pipelines. You can create models with specific providers (like OpenAI) and use them to make predictions on your data.

Example:

  • Creating and Using Models
// 1. Set up the local environment
EnvironmentSettings settings = EnvironmentSettings.inStreamingMode();
TableEnvironment tEnv = TableEnvironment.create(settings);

// 2. Create a source table from in-memory data
Table myTable = tEnv.fromValues(
    ROW(FIELD("text", STRING())),
    row("Hello"),
    row("Machine Learning"),
    row("Good morning")
);

// 3. Create model
tEnv.createModel(
    "my_model",
    ModelDescriptor.forProvider("openai")
        .inputSchema(Schema.newBuilder().column("input", STRING()).build())
        .outputSchema(Schema.newBuilder().column("output", STRING()).build())
        .option("endpoint", "https://api.openai.com/v1/chat/completions")
        .option("model", "gpt-4.1")
        .option("system-prompt", "translate to chinese")
        .option("api-key", "<your-openai-api-key-here>")
        .build()
);

Model model = tEnv.fromModel("my_model");

// 4. Use the model to make predictions
Table predictResult = model.predict(myTable, ColumnList.of("text"));

// 5. Async prediction example
Table asyncPredictResult = model.predict(
    myTable, 
    ColumnList.of("text"), 
    Map.of("async", "true")
);
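
For comparison, the same kind of inference can also be expressed directly in SQL via the ML_PREDICT function available since Flink 2.1. A rough sketch, assuming the input data and the model above are registered in the catalog as input_table and my_model (see the documentation for the exact options):

SELECT * FROM ML_PREDICT(
  TABLE input_table,
  MODEL my_model,
  DESCRIPTOR(`text`)
);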

More Information

The ML_PREDICT integration has already been technically validated in scenarios such as log classification and real-time question-answering systems. However, with ML_PREDICT alone, Flink can only use embedding models to convert unstructured data (e.g., text, images) into high-dimensional vector features, which are then persisted to downstream storage systems; it lacks real-time online querying and similarity analysis capabilities over vector spaces. Flink 2.2 provides the VECTOR_SEARCH function to enable users to perform streaming vector similarity searches and real-time context retrieval directly within Flink.

Take the following SQL statements as an example:

-- Basic usage
SELECT * FROM 
input_table, LATERAL VECTOR_SEARCH(
  TABLE vector_table,
  input_table.vector_column,
  DESCRIPTOR(index_column),
  10
);

-- With configuration options
SELECT * FROM 
input_table, LATERAL VECTOR_SEARCH(
  TABLE vector_table,
  input_table.vector_column,
  DESCRIPTOR(index_column),
  10,
  MAP['async', 'true', 'timeout', '100s']
);

-- Using named parameters
SELECT * FROM 
input_table, LATERAL VECTOR_SEARCH(
  SEARCH_TABLE => TABLE vector_table,
  COLUMN_TO_QUERY => input_table.vector_column,
  COLUMN_TO_SEARCH => DESCRIPTOR(index_column),
  TOP_K => 10,
  CONFIG => MAP['async', 'true', 'timeout', '100s']
);

-- Searching with a constant value
SELECT * 
FROM VECTOR_SEARCH(
  TABLE vector_table,
  ARRAY[10, 20],
  DESCRIPTOR(index_column),
  10
);

More Information

Materialized Table #

Materialized Table is a table type introduced in Flink SQL that aims to simplify both batch and stream data pipelines while providing a consistent development experience. By specifying the data freshness and the query when creating a Materialized Table, the engine automatically derives the schema for the materialized table and creates a corresponding data refresh pipeline to achieve the specified freshness.

Starting with Flink 2.2, the FRESHNESS clause is no longer a mandatory part of the CREATE MATERIALIZED TABLE and CREATE OR ALTER MATERIALIZED TABLE DDL statements. Flink 2.2 also introduces a new MaterializedTableEnricher interface, which provides a formal extension point for customizable default logic, allowing advanced users and vendors to implement “smart” default behaviors (e.g., inferring freshness from upstream tables).

Besides this, users can use DISTRIBUTED BY or DISTRIBUTED INTO to apply the bucketing concept to Materialized Tables, and SHOW MATERIALIZED TABLES to list all Materialized Tables.

Take the following SQL statements as an example:

CREATE MATERIALIZED TABLE my_materialized_table
    PARTITIONED BY (ds)
    DISTRIBUTED INTO 5 BUCKETS
    FRESHNESS = INTERVAL '1' HOUR
    AS SELECT
        ds
    FROM
     ...
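
Since FRESHNESS is now optional, a similar table could also be declared without it, leaving the default freshness to the configured MaterializedTableEnricher. A minimal sketch, assuming a placeholder source table named source_table:

CREATE MATERIALIZED TABLE my_materialized_table_with_default_freshness
    PARTITIONED BY (ds)
    DISTRIBUTED INTO 5 BUCKETS
    AS SELECT ds FROM source_table;

-- List all Materialized Tables
SHOW MATERIALIZED TABLES;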

More Information

SinkUpsertMaterializer V2 #

SinkUpsertMaterializer is an operator in Flink that reconciles out-of-order changelog events before sending them to an upsert sink. The performance of this operator degrades exponentially in some cases. Flink 2.2 introduces a new implementation that is optimized for such cases.

More Information

Delta Join #

Apache Flink 2.1 introduced a new delta join operator to mitigate the challenges caused by large state in regular joins. It replaces the large state maintained by regular joins with a bidirectional lookup-based join that directly reuses data from the source tables.

Flink 2.2 enhances support for converting more SQL patterns into delta joins. Delta joins now support consuming CDC sources without DELETE operations, and allow projection and filter operations after the source. Additionally, delta joins include support for caching, which helps reduce requests to external storage.

Currently, tables backed by Apache Fluss (Incubating) can be used as source tables for delta joins. You can find the table definitions and an example query that can be optimized into a delta join in the Fluss documentation.
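
As a rough illustration (all table and column names here are hypothetical), the candidate pattern is a simple equi-join between two source tables on their indexed keys, which the optimizer may rewrite into a delta join instead of a regular join with large state:

-- Hypothetical tables backed by Fluss; the join key corresponds to the tables' indexed keys
SELECT o.order_id, o.amount, c.customer_name
FROM orders AS o
JOIN customers AS c
  ON o.customer_id = c.customer_id;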

More Information

SQL Types #

Before Flink 2.2, row types defined in SQL, e.g. SELECT CAST(f AS ROW<i INT NOT NULL>), ignored the NOT NULL constraint. This was more aligned with the SQL standard but caused many type inconsistencies and cryptic error messages when working on nested data. For example, it prevented using rows in computed columns or join keys. The new behavior takes nullability into consideration. The config option table.legacy-nested-row-nullability allows restoring the old behavior if required, but it is recommended to update existing queries that previously relied on the ignored constraints.
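
A small illustration (f and t are placeholder names) of the new behavior and of the legacy switch mentioned above:

-- The NOT NULL constraint on the nested field is now part of the resulting type
SELECT CAST(f AS ROW<i INT NOT NULL>) FROM t;

-- Optionally restore the pre-2.2 behavior (not recommended)
SET 'table.legacy-nested-row-nullability' = 'true';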

Casting to the TIME type now considers the correct precision (0-3). Casting invalid strings to TIME (e.g. where the hour component is greater than 23) now leads to a runtime exception. Casting between BINARY and VARBINARY now correctly considers the target length.
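
A few illustrative casts showing the expected behavior:

-- The target precision (0-3) is now respected
SELECT CAST('12:34:56.789' AS TIME(3));  -- keeps the milliseconds
SELECT CAST('12:34:56.789' AS TIME(0));  -- drops the fractional seconds

-- Invalid time strings now fail with a runtime exception instead of being accepted
SELECT CAST('25:00:00' AS TIME);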

More Information

Use UniqueKeys instead of UpsertKeys for state management #

This is a considerable optimization and a breaking change for the StreamingMultiJoinOperator. As noted in the release notes, the operator was launched in an experimental state in Flink 2.1, since we are still working on relevant optimizations that could be breaking changes.

More Information

Runtime #

Balanced Tasks Scheduling #

Flink 2.2 introduces a balanced task scheduling strategy to achieve task load balancing across TaskManagers and reduce job bottlenecks.

More Information

Enhanced Job History Retention Policies for HistoryServer #

Before Flink 2.2, the HistoryServer supported only a quantity-based job archive retention policy, which is insufficient for scenarios requiring time-based retention or combined rules. Users can now combine the new configuration option historyserver.archive.retained-ttl with historyserver.archive.retained-jobs to cover more scenarios.
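
For example, the two options could be combined in the Flink configuration file roughly as follows (the values are purely illustrative):

# Keep at most 100 archived jobs ...
historyserver.archive.retained-jobs: 100
# ... and additionally drop any archive older than 7 days
historyserver.archive.retained-ttl: 7 d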

More Information

Metrics #

Since 2.2.0, users can assign custom metric variables to each operator/transformation used in a job. These variables are later converted into tags/labels by the metric reporters, allowing users to tag/label a specific operator's metrics. For example, you can use this to name and differentiate sources.

Users can now control the level of detail of checkpoint spans via traces.checkpoint.span-detail-level. The highest levels report a tree of spans for each task and subtask. Reported custom spans can now contain child spans. See more details in Traces.

More Information

Connectors #

Introduce RateLimiter for Scan Source #

Flink jobs frequently exchange data with external systems, which consumes those systems' network bandwidth and CPU. When these resources are scarce, pulling data too aggressively can disrupt other workloads on the external system; a MySQL CDC connector reading from a busy database is a typical example. In Flink 2.2, we introduce a RateLimiter interface to provide request rate limiting for Scan Sources, so that connector developers can integrate with rate limiting frameworks to implement their own read restriction strategies. This feature is currently only available in the DataStream API.

More Information

Balanced splits assignment #

The SplitEnumerator is responsible for assigning splits, but it lacks visibility into the actual runtime status or distribution of these splits. This makes it impossible for the SplitEnumerator to guarantee that splits are evenly distributed, so data skew is very likely to occur. From Flink 2.2, the SplitEnumerator has information about the split distribution and can evenly assign splits at runtime. For example, this feature can be used to address data skew issues in the Kafka connector.

More Information

Other Improvements #

In Flink 2.2, we have added support for asynchronous functions in the Python DataStream API. This enables Python users to efficiently query external services from their Flink jobs, e.g. a large LLM that is typically deployed in a standalone GPU cluster.

Furthermore, we have provided comprehensive support to ensure the stability of external service access. On one hand, we support limiting the number of concurrent requests sent to the external service to avoid overwhelming it. On the other hand, we have also added retry support to tolerate temporary unavailability, which may be caused by network jitter or other transient issues.

Here is a simple example showing how to use it:

from typing import List
from ollama import AsyncClient

from pyflink.common import Types, Time, Row
from pyflink.datastream import (
    StreamExecutionEnvironment,
    AsyncDataStream,
    AsyncFunction,
    RuntimeContext,
    CheckpointingMode,
)


class AsyncLLMRequest(AsyncFunction[Row, str]):

    def __init__(self, host, port):
        self._host = host
        self._port = port
  
    def open(self, runtime_context: RuntimeContext):
        self._client = AsyncClient(host='{}:{}'.format(self._host, self._port))

    async def async_invoke(self, value: Row) -> List[str]:
        message = {"role": "user", "content": value.question}
        question_id = value.id
        ollama_response = await self._client.chat(model="qwen3:4b", messages=[message])
        return [
            f"Question ID {question_id}: response: {ollama_response['message']['content']}"
        ]

    def timeout(self, value: Row) -> List[str]:
        # return a default value in case of timeout
        return [f"Timeout for this question: {value.question}"]


def main():
    env = StreamExecutionEnvironment.get_execution_environment()
    env.enable_checkpointing(30000, CheckpointingMode.EXACTLY_ONCE)
    ds = env.from_collection(
        [
            ("Who are you?", 0),
            ("Tell me a joke", 1),
            ("Tell me the result of comparing 0.8 and 0.11", 2),
        ],
        type_info=Types.ROW_NAMED(["question", "id"], [Types.STRING(), Types.INT()]),
    )

    result_stream = AsyncDataStream.unordered_wait(
        data_stream=ds,
        async_function=AsyncLLMRequest("localhost", 11434),  # host and port of the Ollama server (11434 is Ollama's default)
        timeout=Time.seconds(100),
        capacity=1000,
        output_type=Types.STRING(),
    )

    # define the sink
    result_stream.print()

    # submit for execution
    env.execute()


if __name__ == "__main__":
    main()

More Information

Upgrade commons-lang3 to version 3.18.0 #

Upgrade commons-lang3 from 3.12.0 to 3.18.0 to mitigate CVE-2025-48924.

More Information

Upgrade protobuf-java from 3.x to 4.32.1 with compatibility patch for parquet-protobuf #

Flink now uses protobuf-java 4.32.1 (corresponding to Protocol Buffers version 32), upgrading from protobuf-java 3.21.7 (Protocol Buffers version 21). This major upgrade enables:

  • Protobuf Editions Support: Full support for the new edition = "2023" and edition = "2024" syntax introduced in Protocol Buffers v27+. Editions provide a unified approach that combines proto2 and proto3 functionality with fine-grained feature control.
  • Improved Proto3 Field Presence: Better handling of optional fields in proto3 without the limitations of older protobuf versions, eliminating the need to set protobuf.read-default-values to true for field presence checking.
  • Enhanced Performance: Leverages performance improvements and bug fixes from 11 Protocol Buffers releases (versions 22-32).
  • Modern Protobuf Features: Access to newer protobuf capabilities including Edition 2024 features and improved runtime behavior.

Existing proto2 and proto3 .proto files will continue to work without changes.

More Information

Upgrade Notes #

The Flink community tries to ensure that upgrades are as seamless as possible. However, certain changes may require users to make adjustments to certain parts of the program when upgrading to version 2.2. Please refer to the release notes for a comprehensive list of adjustments to make and issues to check during the upgrading process.

List of Contributors #

The Apache Flink community would like to express gratitude to all the contributors who made this release possible:

Alan Sheinberg, Aleksandr Iushmanov, AlexYinHan, Arvid Heise, CuiYanxiang, David Hotham, David Radley, Dawid Wysakowicz, Dian Fu, Etienne Chauchot, Ferenc Csaky, Gabor Somogyi, Gustavo de Morais, Hang Ruan, Hao Li, Hongshun Wang, Jackeyzhe, Jakub Stejskal, Jiaan Geng, Jinkun Liu, Juntao Zhang, Kaiqi Dong, Khaled Hammouda, Kumar Mallikarjuna, Kunni, Laffery, Mario Petruccelli, Martijn Visser, Mate Czagany, Maximilian Michels, Mika Naylor, Mingliang Liu, Myracle, Naci Simsek, Natea Eshetu Beshada, Niharika Sakuru, Pan Yuepeng, Piotr Nowojski, Poorvank, Ramin Gharib, Roc Marshal, Roman Khachatryan, Ron, Rosa Kang, Rui Fan, Sergey Nuyanzin, Shengkai, Stefan Richter, Stepan Stepanishchev, Swapnil Aher, Timo Walther, Xingcan Cui, Xuyang, Yuepeng Pan, Yunfeng Zhou, Zakelly, Zhanghao Chen, dylanhz, gong-flying, hejufang, lincoln lee, lincoln-lil, mateczagany, morvenhuang, noorall, r-sidd, sxnan, voonhous, xia rui, xiangyu0xf, yangli1206, yunfengzhou-hub, zhou