Apache Flink 1.3.2 Released

05 Aug 2017

The Apache Flink community released the second bugfix version of the Apache Flink 1.3 series.

This release includes more than 60 fixes and minor improvements for Flink 1.3.1. The list below includes a detailed list of all fixes.

We highly recommend all users to upgrade to Flink 1.3.2.

Important Notice:

A user reported a bug in the FlinkKafkaConsumer (FLINK-7143) that is causing incorrect partition assignment in large Kafka deployments in the presence of inconsistent broker metadata. In that case multiple parallel instances of the FlinkKafkaConsumer may read from the same topic partition, leading to data duplication. In Flink 1.3.2 this bug is fixed but incorrect assignments from Flink 1.3.0 and 1.3.1 cannot be automatically fixed by upgrading to Flink 1.3.2 via a savepoint because the upgraded version would resume the wrong partition assignment from the savepoint. If you believe you are affected by this bug (seeing messages from some partitions duplicated) please refer to the JIRA issue for an upgrade path that works around that.

Before attempting the more elaborate upgrade path, we would suggest to check if you are actually affected by this bug. We did not manage to reproduce it in various testing clusters and according to the reporting user, it only appeared in rare cases on their very large setup. This leads us to believe that most likely only a minority of setups would be affected by this bug.

Notable changes:

  • The default Kafka version for Flink Kafka Consumer 0.10 was bumped from 0.10.0.1 to 0.10.2.1.
  • Some default values for configurations of AWS API call behaviors in the Flink Kinesis Consumer were adapted for better default consumption performance: 1) SHARD_GETRECORDS_MAX default changed to 10,000, and 2) SHARD_GETRECORDS_INTERVAL_MILLIS default changed to 200ms.

Updated Maven dependencies:

<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-java</artifactId>
  <version>1.3.2</version>
</dependency>
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-streaming-java_2.10</artifactId>
  <version>1.3.2</version>
</dependency>
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-clients_2.10</artifactId>
  <version>1.3.2</version>
</dependency>

You can find the binaries on the updated Downloads page.

List of resolved issues:

Sub-task

  • [FLINK-6665] - Pass a ScheduledExecutorService to the RestartStrategy
  • [FLINK-6667] - Pass a callback type to the RestartStrategy, rather than the full ExecutionGraph
  • [FLINK-6680] - App & Flink migration guide: updates for the 1.3 release

Bug

  • [FLINK-5488] - yarnClient should be closed in AbstractYarnClusterDescriptor for error conditions
  • [FLINK-6376] - when deploy flink cluster on the yarn, it is lack of hdfs delegation token.
  • [FLINK-6541] - Jar upload directory not created
  • [FLINK-6654] - missing maven dependency on "flink-shaded-hadoop2-uber" in flink-dist
  • [FLINK-6655] - Misleading error message when HistoryServer path is empty
  • [FLINK-6742] - Improve error message when savepoint migration fails due to task removal
  • [FLINK-6774] - build-helper-maven-plugin version not set
  • [FLINK-6806] - rocksdb is not listed as state backend in doc
  • [FLINK-6843] - ClientConnectionTest fails on travis
  • [FLINK-6867] - Elasticsearch 1.x ITCase still instable due to embedded node instability
  • [FLINK-6918] - Failing tests: ChainLengthDecreaseTest and ChainLengthIncreaseTest
  • [FLINK-6945] - TaskCancelAsyncProducerConsumerITCase.testCancelAsyncProducerAndConsumer instable test case
  • [FLINK-6964] - Fix recovery for incremental checkpoints in StandaloneCompletedCheckpointStore
  • [FLINK-6965] - Avro is missing snappy dependency
  • [FLINK-6987] - TextInputFormatTest fails when run in path containing spaces
  • [FLINK-6996] - FlinkKafkaProducer010 doesn't guarantee at-least-once semantic
  • [FLINK-7005] - Optimization steps are missing for nested registered tables
  • [FLINK-7011] - Instable Kafka testStartFromKafkaCommitOffsets failures on Travis
  • [FLINK-7025] - Using NullByteKeySelector for Unbounded ProcTime NonPartitioned Over
  • [FLINK-7034] - GraphiteReporter cannot recover from lost connection
  • [FLINK-7038] - Several misused "KeyedDataStream" term in docs and Javadocs
  • [FLINK-7041] - Deserialize StateBackend from JobCheckpointingSettings with user classloader
  • [FLINK-7132] - Fix BulkIteration parallelism
  • [FLINK-7133] - Fix Elasticsearch version interference
  • [FLINK-7137] - Flink table API defaults top level fields as nullable and all nested fields within CompositeType as non-nullable
  • [FLINK-7143] - Partition assignment for Kafka consumer is not stable
  • [FLINK-7154] - Missing call to build CsvTableSource example
  • [FLINK-7158] - Wrong test jar dependency in flink-clients
  • [FLINK-7177] - DataSetAggregateWithNullValuesRule fails creating null literal for non-nullable type
  • [FLINK-7178] - Datadog Metric Reporter Jar is Lacking Dependencies
  • [FLINK-7180] - CoGroupStream perform checkpoint failed
  • [FLINK-7195] - FlinkKafkaConsumer should not respect fetched partitions to filter restored partition states
  • [FLINK-7216] - ExecutionGraph can perform concurrent global restarts to scheduling
  • [FLINK-7225] - Cutoff exception message in StateDescriptor
  • [FLINK-7226] - REST responses contain invalid content-encoding header
  • [FLINK-7231] - SlotSharingGroups are not always released in time for new restarts
  • [FLINK-7234] - Fix CombineHint documentation
  • [FLINK-7241] - Fix YARN high availability documentation
  • [FLINK-7255] - ListStateDescriptor example uses wrong constructor
  • [FLINK-7258] - IllegalArgumentException in Netty bootstrap with large memory state segment size
  • [FLINK-7266] - Don't attempt to delete parent directory on S3
  • [FLINK-7268] - Zookeeper Checkpoint Store interacting with Incremental State Handles can lead to loss of handles
  • [FLINK-7281] - Fix various issues in (Maven) release infrastructure

Improvement

  • [FLINK-6365] - Adapt default values of the Kinesis connector
  • [FLINK-6575] - Disable all tests on Windows that use HDFS
  • [FLINK-6682] - Improve error message in case parallelism exceeds maxParallelism
  • [FLINK-6789] - Remove duplicated test utility reducer in optimizer
  • [FLINK-6874] - Static and transient fields ignored for POJOs
  • [FLINK-6898] - Limit size of operator component in metric name
  • [FLINK-6937] - Fix link markdown in Production Readiness Checklist doc
  • [FLINK-6940] - Clarify the effect of configuring per-job state backend
  • [FLINK-6998] - Kafka connector needs to expose metrics for failed/successful offset commits in the Kafka Consumer callback
  • [FLINK-7004] - Switch to Travis Trusty image
  • [FLINK-7032] - Intellij is constantly changing language level of sub projects back to 1.6
  • [FLINK-7069] - Catch exceptions for each reporter separately
  • [FLINK-7149] - Add checkpoint ID to 'sendValues()' in GenericWriteAheadSink
  • [FLINK-7164] - Extend integration tests for (externalised) checkpoints, checkpoint store
  • [FLINK-7174] - Bump dependency of Kafka 0.10.x to the latest one
  • [FLINK-7211] - Exclude Gelly javadoc jar from release
  • [FLINK-7224] - Incorrect Javadoc description in all Kafka consumer versions
  • [FLINK-7228] - Harden HistoryServerStaticFileHandlerTest
  • [FLINK-7233] - TaskManagerHeapSizeCalculationJavaBashTest failed on Travis
  • [FLINK-7287] - test instability in Kafka010ITCase.testCommitOffsetsToKafka
  • [FLINK-7290] - Make release scripts modular