Apache Flink 1.1.4 Released

December 21, 2016 -

The Apache Flink community released the next bugfix version of the Apache Flink 1.1 series.

This release includes major robustness improvements for checkpoint cleanup on failures and consumption of intermediate streams. We highly recommend all users to upgrade to Flink 1.1.4.


You can find the binaries on the updated Downloads page.

Note for RocksDB Backend Users #

We updated Flink’s RocksDB dependency version from 4.5.1 to 4.11.2. Between these versions some of RocksDB’s internal configuration defaults changed that would affect the memory footprint of running Flink with RocksDB. Therefore, we manually reset them to the previous defaults. If you want to run with the new Rocks 4.11.2 defaults, you can do this via:

RocksDBStateBackend backend = new RocksDBStateBackend("...");
// Use the new default options. Otherwise, the default for RocksDB 4.5.1
// `PredefinedOptions.DEFAULT_ROCKS_4_5_1` will be used.

Sub-task #

  • [FLINK-4510] - Always create CheckpointCoordinator
  • [FLINK-4984] - Add Cancellation Barriers to BarrierTracker and BarrierBuffer
  • [FLINK-4985] - Report Declined/Canceled Checkpoints to Checkpoint Coordinator

Bug #

  • [FLINK-2662] - CompilerException: "Bug: Plan generation for Unions picked a ship strategy between binary plan operators."
  • [FLINK-3680] - Remove or improve (not set) text in the Job Plan UI
  • [FLINK-3813] - YARNSessionFIFOITCase.testDetachedMode failed on Travis
  • [FLINK-4108] - NPE in Row.productArity
  • [FLINK-4506] - CsvOutputFormat defaults allowNullValues to false, even though doc and declaration says true
  • [FLINK-4581] - Table API throws "No suitable driver found for jdbc:calcite"
  • [FLINK-4586] - NumberSequenceIterator and Accumulator threading issue
  • [FLINK-4619] - JobManager does not answer to client when restore from savepoint fails
  • [FLINK-4727] - Kafka 0.9 Consumer should also checkpoint auto retrieved offsets even when no data is read
  • [FLINK-4862] - NPE on EventTimeSessionWindows with ContinuousEventTimeTrigger
  • [FLINK-4932] - Don't let ExecutionGraph fail when in state Restarting
  • [FLINK-4933] - ExecutionGraph.scheduleOrUpdateConsumers can fail the ExecutionGraph
  • [FLINK-4977] - Enum serialization does not work in all cases
  • [FLINK-4991] - TestTask hangs in testWatchDogInterruptsTask
  • [FLINK-4998] - ResourceManager fails when num task slots > Yarn vcores
  • [FLINK-5013] - Flink Kinesis connector doesn't work on old EMR versions
  • [FLINK-5028] - Stream Tasks must not go through clean shutdown logic on cancellation
  • [FLINK-5038] - Errors in the "cancelTask" method prevent closeables from being closed early
  • [FLINK-5039] - Avro GenericRecord support is broken
  • [FLINK-5040] - Set correct input channel types with eager scheduling
  • [FLINK-5050] - JSON.org license is CatX
  • [FLINK-5057] - Cancellation timeouts are picked from wrong config
  • [FLINK-5058] - taskManagerMemory attribute set wrong value in FlinkShell
  • [FLINK-5063] - State handles are not properly cleaned up for declined or expired checkpoints
  • [FLINK-5073] - ZooKeeperCompleteCheckpointStore executes blocking delete operation in ZooKeeper client thread
  • [FLINK-5075] - Kinesis consumer incorrectly determines shards as newly discovered when tested against Kinesalite
  • [FLINK-5082] - Pull ExecutionService lifecycle management out of the JobManager
  • [FLINK-5085] - Execute CheckpointCoodinator's state discard calls asynchronously
  • [FLINK-5114] - PartitionState update with finished execution fails
  • [FLINK-5142] - Resource leak in CheckpointCoordinator
  • [FLINK-5149] - ContinuousEventTimeTrigger doesn't fire at the end of the window
  • [FLINK-5154] - Duplicate TypeSerializer when writing RocksDB Snapshot
  • [FLINK-5158] - Handle ZooKeeperCompletedCheckpointStore exceptions in CheckpointCoordinator
  • [FLINK-5172] - In RocksDBStateBackend, set flink-core and flink-streaming-java to "provided"
  • [FLINK-5173] - Upgrade RocksDB dependency
  • [FLINK-5184] - Error result of compareSerialized in RowComparator class
  • [FLINK-5193] - Recovering all jobs fails completely if a single recovery fails
  • [FLINK-5197] - Late JobStatusChanged messages can interfere with running jobs
  • [FLINK-5214] - Clean up checkpoint files when failing checkpoint operation on TM
  • [FLINK-5215] - Close checkpoint streams upon cancellation
  • [FLINK-5216] - CheckpointCoordinator's 'minPauseBetweenCheckpoints' refers to checkpoint start rather then checkpoint completion
  • [FLINK-5218] - Eagerly close checkpoint streams on cancellation
  • [FLINK-5228] - LocalInputChannel re-trigger request and release deadlock
  • [FLINK-5229] - Cleanup StreamTaskStates if a checkpoint operation of a subsequent operator fails
  • [FLINK-5246] - Don't discard unknown checkpoint messages in the CheckpointCoordinator
  • [FLINK-5248] - SavepointITCase doesn't catch savepoint restore failure
  • [FLINK-5274] - LocalInputChannel throws NPE if partition reader is released
  • [FLINK-5275] - InputChanelDeploymentDescriptors throws misleading Exception if producer failed/cancelled
  • [FLINK-5276] - ExecutionVertex archiving can throw NPE with many previous attempts
  • [FLINK-5285] - CancelCheckpointMarker flood when using at least once mode
  • [FLINK-5326] - IllegalStateException: Bug in Netty consumer logic: reader queue got notified by partition about available data, but none was available
  • [FLINK-5352] - Restore RocksDB 1.1.3 memory behavior

Improvement #

  • [FLINK-3347] - TaskManager (or its ActorSystem) need to restart in case they notice quarantine
  • [FLINK-3787] - Yarn client does not report unfulfillable container constraints
  • [FLINK-4445] - Ignore unmatched state when restoring from savepoint
  • [FLINK-4715] - TaskManager should commit suicide after cancellation failure
  • [FLINK-4894] - Don't block on buffer request after broadcastEvent
  • [FLINK-4975] - Add a limit for how much data may be buffered during checkpoint alignment
  • [FLINK-4996] - Make CrossHint @Public
  • [FLINK-5046] - Avoid redundant serialization when creating the TaskDeploymentDescriptor
  • [FLINK-5123] - Add description how to do proper shading to Flink docs.
  • [FLINK-5169] - Make consumption of input channels fair
  • [FLINK-5192] - Provide better log config templates
  • [FLINK-5194] - Log heartbeats on TRACE level
  • [FLINK-5196] - Don't log InputChannelDescriptor
  • [FLINK-5198] - Overwrite TaskState toString
  • [FLINK-5199] - Improve logging of submitted job graph actions in HA case
  • [FLINK-5201] - Promote loaded config properties to INFO
  • [FLINK-5207] - Decrease HadoopFileSystem logging
  • [FLINK-5249] - description of datastream rescaling doesn't match the figure
  • [FLINK-5259] - wrong execution environment in retry delays example
  • [FLINK-5278] - Improve Task and checkpoint logging

New Feature #

  • [FLINK-4976] - Add a way to abort in flight checkpoints

Task #

  • [FLINK-4778] - Update program example in /docs/setup/cli.md due to the change in FLINK-2021