The State of Flink on Docker

August 20, 2020 - Robert Metzger (@rmetzger_)

With over 50 million downloads from Docker Hub, the Flink Docker images are a very popular deployment option.

The Flink community recently put some effort into improving the Docker experience for our users, with the goal of reducing confusion and improving usability.

Let’s quickly break down the recent improvements:

Looking into the future, there are already some interesting potential improvements lined up:

How do I get started?

This is a short tutorial on how to start a Flink Session Cluster with Docker.

A Flink Session cluster can be used to run multiple jobs. Each job needs to be submitted to the cluster after the cluster has been deployed. To deploy a Flink Session cluster with Docker, you need to start a JobManager container and at least one TaskManager container. To enable communication between the containers, we first set a required Flink configuration property and create a network:

FLINK_PROPERTIES="jobmanager.rpc.address: jobmanager"
docker network create flink-network

Then we launch the JobManager:

docker run \
       --rm \
       --name=jobmanager \
       --network flink-network \
       -p 8081:8081 \
       --env FLINK_PROPERTIES="${FLINK_PROPERTIES}" \
       flink:1.11.1 jobmanager

and one or more TaskManager containers (give each additional container a unique --name):

docker run \
      --rm \
      --name=taskmanager \
      --network flink-network \
      --env FLINK_PROPERTIES="${FLINK_PROPERTIES}" \
      flink:1.11.1 taskmanager
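The command above runs in the foreground and uses a fixed container name, so starting several TaskManagers by hand means one terminal and one name per container. As a minimal sketch, the loop below starts them in the background instead. The function name start_taskmanagers, the taskmanager-N naming scheme, and the use of --detach are our own assumptions for illustration, not part of the official images or documentation:

```shell
# Same property as set earlier in the tutorial; repeated so the sketch is self-contained.
FLINK_PROPERTIES="jobmanager.rpc.address: jobmanager"

# Hypothetical helper: start N TaskManager containers in the background.
# Each container gets a unique name (taskmanager-1, taskmanager-2, ...).
start_taskmanagers() {
  local count="${1:-2}"
  local i
  for i in $(seq 1 "${count}"); do
    docker run \
      --rm \
      --detach \
      --name="taskmanager-${i}" \
      --network flink-network \
      --env FLINK_PROPERTIES="${FLINK_PROPERTIES}" \
      flink:1.11.1 taskmanager
  done
}
```

Calling start_taskmanagers 3 would then launch three TaskManagers that register with the JobManager on flink-network.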

You now have a fully functional Flink cluster running! You can access the web front end at localhost:8081.
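If you prefer to check from the command line, the same port also serves Flink's REST API. The sketch below polls the /overview endpoint until the JobManager answers. The function name wait_for_flink, the FLINK_REST_URL variable, and the retry count are assumptions introduced here for illustration:

```shell
# Hypothetical variable: base URL of the JobManager REST endpoint published with -p 8081:8081.
FLINK_REST_URL="${FLINK_REST_URL:-http://localhost:8081}"

# Hypothetical helper: poll the REST API until it responds, or give up after N attempts.
# On success, prints the JSON cluster overview returned by the JobManager.
wait_for_flink() {
  local attempts="${1:-30}"
  local i
  for i in $(seq 1 "${attempts}"); do
    if curl --silent --fail "${FLINK_REST_URL}/overview"; then
      echo
      return 0
    fi
    sleep 1
  done
  echo "Flink REST API not reachable at ${FLINK_REST_URL}" >&2
  return 1
}
```

Running wait_for_flink should print a small JSON document once the cluster is up. Among other fields, it reports the number of registered TaskManagers and the total number of task slots.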

Let’s now submit one of Flink’s example jobs:

# 1: (optional) Download the Flink distribution, and unpack it
wget https://archive.apache.org/dist/flink/flink-1.11.1/flink-1.11.1-bin-scala_2.12.tgz
tar xf flink-1.11.1-bin-scala_2.12.tgz
cd flink-1.11.1

# 2: Start the Flink job
./bin/flink run ./examples/streaming/TopSpeedWindowing.jar
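Once the job is running, you may want to check on it and, when you are done, clean up. The following is a small sketch under the assumption that the containers and network carry the names used in this tutorial. The function names list_jobs and teardown_cluster are hypothetical and only serve as illustration:

```shell
# Hypothetical helper: list running and scheduled jobs.
# Run it from the unpacked flink-1.11.1 directory.
list_jobs() {
  ./bin/flink list
}

# Hypothetical helper: stop the tutorial's containers and remove the network.
# Because the containers were started with --rm, stopping them also removes them.
teardown_cluster() {
  docker stop jobmanager taskmanager
  docker network rm flink-network
}
```

If you started TaskManagers under different names, pass those names to docker stop instead.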

The main steps of the tutorial are also recorded in this short screencast:

Demo video

Next steps: Now that you’ve successfully completed this tutorial, we recommend checking out the full Flink on Docker documentation to implement more advanced deployment scenarios, such as Job Clusters, Docker Compose, or our native Kubernetes integration.

Conclusion

We encourage all readers to try out Flink on Docker and to give the community feedback so that we can further improve the experience. Please use the user@flink.apache.org mailing list (remember to subscribe first) for general questions, and our issue tracker for specific bugs, improvements, or ideas for contributions!