Apache Flink 1.11.3 Released

December 18, 2020 - Xintong Song

The Apache Flink community released the third bugfix version of the Apache Flink 1.11 series.

This release includes 151 fixes and minor improvements for Flink 1.11.2. The list below includes a detailed list of all fixes and improvements.

We highly recommend all users to upgrade to Flink 1.11.3.

Updated Maven dependencies:

<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-java</artifactId>
  <version>1.11.3</version>
</dependency>
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-streaming-java_2.11</artifactId>
  <version>1.11.3</version>
</dependency>
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-clients_2.11</artifactId>
  <version>1.11.3</version>
</dependency>

You can find the binaries on the updated Downloads page.

List of resolved issues:

Sub-task

[FLINK-17393] - Improve the `FutureCompletingBlockingQueue` to wakeup blocking put() more elegantly.
[FLINK-18604] - HBase ConnectorDescriptor can not work in Table API
[FLINK-18673] - Calling ROW() in a UDF results in UnsupportedOperationException
[FLINK-18680] - Improve RecordsWithSplitIds API
[FLINK-18916] - Add a "Operations" link(linked to dev/table/tableApi.md) under the "Python API" -> "User Guide" -> "Table API" section
[FLINK-18918] - Add a "Connectors" document under the "Python API" -> "User Guide" -> "Table API" section
[FLINK-18922] - Add a "Catalogs" link (linked to dev/table/catalogs.md) under the "Python API" -> "User Guide" -> "Table API" section
[FLINK-18926] - Add a "Environment Variables" document under the "Python API" -> "User Guide" -> "Table API" section
[FLINK-19162] - Allow Split Reader based sources to reuse record batches
[FLINK-19205] - SourceReaderContext should give access to Configuration and Hostbame
[FLINK-20397] - Pass checkpointId to OperatorCoordinator.resetToCheckpoint().

Bug

[FLINK-9992] - FsStorageLocationReferenceTest#testEncodeAndDecode failed in CI
[FLINK-13733] - FlinkKafkaInternalProducerITCase.testHappyPath fails on Travis
[FLINK-15170] - WebFrontendITCase.testCancelYarn fails on travis
[FLINK-16246] - Exclude "SdkMBeanRegistrySupport" from dynamically loaded AWS connectors
[FLINK-16268] - Failed to run rank over window with Hive built-in functions
[FLINK-16768] - HadoopS3RecoverableWriterITCase.testRecoverWithStateWithMultiPart hangs
[FLINK-17341] - freeSlot in TaskExecutor.closeJobManagerConnection cause ConcurrentModificationException
[FLINK-17458] - TaskExecutorSubmissionTest#testFailingScheduleOrUpdateConsumers
[FLINK-17677] - FLINK_LOG_PREFIX recommended in docs is not always available
[FLINK-17825] - HA end-to-end gets killed due to timeout
[FLINK-18128] - CoordinatedSourceITCase.testMultipleSources gets stuck
[FLINK-18196] - flink throws `NullPointerException` when executeCheckpointing
[FLINK-18222] - "Avro Confluent Schema Registry nightly end-to-end test" unstable with "Kafka cluster did not start after 120 seconds"
[FLINK-18815] - AbstractCloseableRegistryTest.testClose unstable
[FLINK-18818] - HadoopRenameCommitterHDFSTest.testCommitOneFile[Override: false] failed with "java.io.IOException: The stream is closed"
[FLINK-18836] - Python UDTF doesn't work well when the return type isn't generator
[FLINK-18915] - FIXED_PATH(dummy Hadoop Path) with WriterImpl may cause ORC writer OOM
[FLINK-19022] - AkkaRpcActor failed to start but no exception information
[FLINK-19121] - Avoid accessing HDFS frequently in HiveBulkWriterFactory
[FLINK-19135] - (Stream)ExecutionEnvironment.execute() should not throw ExecutionException
[FLINK-19138] - Python UDF supports directly specifying input_types as DataTypes.ROW
[FLINK-19140] - Join with Table Function (UDTF) SQL example is incorrect
[FLINK-19151] - Flink does not normalize container resource with correct configurations when Yarn FairScheduler is used
[FLINK-19154] - Application mode deletes HA data in case of suspended ZooKeeper connection
[FLINK-19170] - Parameter naming error
[FLINK-19201] - PyFlink e2e tests is instable and failed with "Connection broken: OSError"
[FLINK-19227] - The catalog is still created after opening failed in catalog registering
[FLINK-19237] - LeaderChangeClusterComponentsTest.testReelectionOfJobMaster failed with "NoResourceAvailableException: Could not allocate the required slot within slot request timeout"
[FLINK-19244] - CSV format can't deserialize null ROW field
[FLINK-19250] - SplitFetcherManager does not propagate errors correctly
[FLINK-19253] - SourceReaderTestBase.testAddSplitToExistingFetcher hangs
[FLINK-19258] - Fix the wrong example of "csv.line-delimiter" in CSV documentation
[FLINK-19280] - The option "sink.buffer-flush.max-rows" for JDBC can't be disabled by set to zero
[FLINK-19281] - LIKE cannot recognize full table path
[FLINK-19291] - Fix exception for AvroSchemaConverter#convertToSchema when RowType contains multiple row fields
[FLINK-19295] - YARNSessionFIFOITCase.checkForProhibitedLogContents found a log with prohibited string
[FLINK-19300] - Timer loss after restoring from savepoint
[FLINK-19321] - CollectSinkFunction does not define serialVersionUID
[FLINK-19338] - New source interface cannot unregister unregistered source
[FLINK-19361] - Make HiveCatalog thread safe
[FLINK-19398] - Hive connector fails with IllegalAccessError if submitted as usercode
[FLINK-19401] - Job stuck in restart loop due to excessive checkpoint recoveries which block the JobMaster
[FLINK-19423] - Fix ArrayIndexOutOfBoundsException when executing DELETE statement in JDBC upsert sink
[FLINK-19433] - An Error example of FROM_UNIXTIME function in document
[FLINK-19448] - CoordinatedSourceITCase.testEnumeratorReaderCommunication hangs
[FLINK-19535] - SourceCoordinator should avoid fail job multiple times.
[FLINK-19557] - Issue retrieving leader after zookeeper session reconnect
[FLINK-19585] - UnalignedCheckpointCompatibilityITCase.test:97->runAndTakeSavepoint: "Not all required tasks are currently running."
[FLINK-19587] - Error result when casting binary type as varchar
[FLINK-19618] - Broken link in docs
[FLINK-19629] - Fix NullPointException when deserializing map field with null value for Avro format
[FLINK-19675] - The plan of is incorrect when Calc contains WHERE clause, composite fields access and Python UDF at the same time
[FLINK-19695] - Writing Table with RowTime Column of type TIMESTAMP(3) to Kafka fails with ClassCastException
[FLINK-19717] - SourceReaderBase.pollNext may return END_OF_INPUT if SplitReader.fetch throws
[FLINK-19740] - Error in to_pandas for table containing event time: class java.time.LocalDateTime cannot be cast to class java.sql.Timestamp
[FLINK-19741] - InternalTimeServiceManager fails to restore due to corrupt reads if there are other users of raw keyed state streams
[FLINK-19748] - KeyGroupRangeOffsets#KeyGroupOffsetsIterator should skip key groups that don't have a defined offset
[FLINK-19750] - Deserializer is not opened in Kafka consumer when restoring from state
[FLINK-19755] - Fix CEP documentation error of the example in 'After Match Strategy' section
[FLINK-19775] - SystemProcessingTimeServiceTest.testImmediateShutdown is instable
[FLINK-19777] - Fix NullPointException for WindowOperator.close()
[FLINK-19790] - Writing MAP<STRING, STRING> to Kafka with JSON format produces incorrect data.
[FLINK-19806] - Job may try to leave SUSPENDED state in ExecutionGraph#failJob()
[FLINK-19816] - Flink restored from a wrong checkpoint (a very old one and not the last completed one)
[FLINK-19852] - Managed memory released check can block IterativeTask
[FLINK-19867] - Validation fails for UDF that accepts var-args
[FLINK-19894] - Use iloc for positional slicing instead of direct slicing in from_pandas
[FLINK-19901] - Unable to exclude metrics variables for the last metrics reporter.
[FLINK-19906] - Incorrect result when compare two binary fields
[FLINK-19907] - Channel state (upstream) can be restored after emission of new elements (watermarks)
[FLINK-19909] - Flink application in attach mode could not terminate when the only job is canceled
[FLINK-19948] - Calling NOW() function throws compile exception
[FLINK-20013] - BoundedBlockingSubpartition may leak network buffer if task is failed or canceled
[FLINK-20018] - pipeline.cached-files option cannot escape ':' in path
[FLINK-20033] - Job fails when stopping JobMaster
[FLINK-20050] - SourceCoordinatorProviderTest.testCheckpointAndReset failed with NullPointerException
[FLINK-20063] - File Source requests an additional split on every restore.
[FLINK-20064] - Broken links in the documentation
[FLINK-20065] - UnalignedCheckpointCompatibilityITCase.test failed with AskTimeoutException
[FLINK-20068] - KafkaSubscriberTest.testTopicPatternSubscriber failed with unexpected results
[FLINK-20069] - docs_404_check doesn't work properly
[FLINK-20076] - DispatcherTest.testOnRemovedJobGraphDoesNotCleanUpHAFiles does not test the desired functionality
[FLINK-20079] - Modified UnalignedCheckpointITCase...MassivelyParallel fails
[FLINK-20081] - ExecutorNotifier should run handler in the main thread when receive an exception from the callable.
[FLINK-20143] - use `yarn.provided.lib.dirs` config deploy job failed in yarn per job mode
[FLINK-20165] - YARNSessionFIFOITCase.checkForProhibitedLogContents: Error occurred during initialization of boot layer java.lang.IllegalStateException: Module system already initialized
[FLINK-20175] - Avro Confluent Registry SQL format does not support adding nullable columns
[FLINK-20183] - Fix the default PYTHONPATH is overwritten in client side
[FLINK-20193] - SourceCoordinator should catch exception thrown from SplitEnumerator.start()
[FLINK-20194] - KafkaSourceFetcherManager.commitOffsets() should handle the case when there is no split fetcher.
[FLINK-20200] - SQL Hints are not supported in "Create View" syntax
[FLINK-20213] - Partition commit is delayed when records keep coming
[FLINK-20221] - DelimitedInputFormat does not restore compressed filesplits correctly leading to dataloss
[FLINK-20222] - The CheckpointCoordinator should reset the OperatorCoordinators when fail before the first checkpoint.
[FLINK-20223] - The RecreateOnResetOperatorCoordinator and SourceCoordinator executor thread should use the user class loader.
[FLINK-20243] - Remove useless words in documents
[FLINK-20262] - Building flink-dist docker image does not work without python2
[FLINK-20266] - New Sources prevent JVM shutdown when running a job
[FLINK-20270] - Fix the regression of missing ExternallyInducedSource support in FLIP-27 Source.
[FLINK-20277] - flink-1.11.2 ContinuousFileMonitoringFunction cannot restore from failure
[FLINK-20284] - Error happens in TaskExecutor when closing JobMaster connection if there was a python UDF
[FLINK-20285] - LazyFromSourcesSchedulingStrategy is possible to schedule non-CREATED vertices
[FLINK-20333] - Flink standalone cluster throws metaspace OOM after submitting multiple PyFlink UDF jobs.
[FLINK-20351] - Execution.transitionState does not properly log slot location
[FLINK-20382] - Exception thrown from JobMaster.startScheduling() may be ignored.
[FLINK-20396] - Add "OperatorCoordinator.resetSubtask()" to fix order problems of "subtaskFailed()"
[FLINK-20404] - ZooKeeper quorum fails to start due to missing log4j library
[FLINK-20413] - Sources should add splits back in "resetSubtask()", rather than in "subtaskFailed()".
[FLINK-20418] - NPE in IteratorSourceReader
[FLINK-20442] - Fix license documentation mistakes in flink-python.jar
[FLINK-20492] - The SourceOperatorStreamTask should implement cancelTask() and finishTask()
[FLINK-20554] - The Checkpointed Data Size of the Latest Completed Checkpoint is incorrectly displayed on the Overview page of the UI

New Feature

[FLINK-19934] - [FLIP-27 source] add new API: SplitEnumeratorContext.runInCoordinatorThread(Runnable)

Improvement

[FLINK-16753] - Exception from AsyncCheckpointRunnable should be wrapped in CheckpointException
[FLINK-18139] - Unaligned checkpoints checks wrong channels for inflight data.
[FLINK-18500] - Make the legacy planner exception more clear when resolving computed columns types for schema
[FLINK-18545] - Sql api cannot specify flink job name
[FLINK-18715] - add cpu usage metric of jobmanager/taskmanager
[FLINK-19193] - Recommend stop-with-savepoint in upgrade guidelines
[FLINK-19225] - Improve code and logging in SourceReaderBase
[FLINK-19245] - Set default queue capacity for FLIP-27 source handover queue to 2
[FLINK-19251] - Avoid confusing queue handling in "SplitReader.handleSplitsChanges()"
[FLINK-19252] - Jaas file created under io.tmp.dirs - folder not created if not exists
[FLINK-19265] - Simplify handling of 'NoMoreSplitsEvent'
[FLINK-19339] - Support Avro's unions with logical types
[FLINK-19523] - Hide sensitive command-line configurations
[FLINK-19569] - Upgrade ICU4J to 67.1+
[FLINK-19677] - TaskManager takes abnormally long time to register with JobManager on Kubernetes
[FLINK-19698] - Add close() method and onCheckpointComplete() to the Source.
[FLINK-19892] - Replace __metaclass__ field with metaclass keyword
[FLINK-20049] - Simplify handling of "request split".
[FLINK-20055] - Datadog API Key exposed in Flink JobManager logs
[FLINK-20142] - Update the document for CREATE TABLE LIKE that source table from different catalog is supported
[FLINK-20152] - Document which execution.target values are supported
[FLINK-20156] - JavaDocs of WatermarkStrategy.withTimestampAssigner are wrong wrt Java 8
[FLINK-20169] - Move emitting MAX_WATERMARK out of SourceOperator processing loop
[FLINK-20207] - Improve the error message printed when submitting the pyflink jobs via 'flink run'
[FLINK-20296] - Explanation of keyBy was broken by find/replace of deprecated forms of keyBy

Test

[FLINK-18725] - "Run Kubernetes test" failed with "30025: provided port is already allocated"

Task

[FLINK-20455] - Add check to LicenseChecker for top level /LICENSE files in shaded jars