Does anyone know where the official Docker images for Hadoop (e.g. YARN, HDFS) are? I'd like to use them within a Docker image.
There is currently no official Hadoop Docker image, but there are many user-contributed images on Docker Hub, including this one with over 100K pulls.
I don't know if it is an official image, but you can take a look at https://github.com/big-data-europe/docker-hadoop.
This blog post explains how to use it.
It's important to check whether the chosen image includes only Hadoop.
(I'm not sure about the Cloudera image mentioned in another answer.)
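One rough way to check what an image bundles is to look at its build history and environment variables. This is a minimal sketch, not a definitive check: the function name `inspect_hadoop_image` and the `DOCKER` override variable are made up for illustration, and the image has to be pulled locally before the commands return anything.

```shell
# Hypothetical helper: show how an image was built and what it sets up.
# Setting DOCKER=echo previews the commands instead of running them.
inspect_hadoop_image() {
    image="$1"
    # Layer-by-layer build commands (which packages were installed, etc.)
    "${DOCKER:-docker}" history --no-trunc "$image"
    # Environment variables baked into the image (HADOOP_HOME, SPARK_HOME, ...)
    "${DOCKER:-docker}" image inspect --format '{{json .Config.Env}}' "$image"
}
```

For example, `inspect_hadoop_image bde2020/hadoop-base` after pulling that image. Extra variables or install steps for other frameworks would suggest the image ships more than Hadoop.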
Check out the alternatives below:
Sequenceiq:
Image (1M+ pulls)
GitHub repo
Site
Pull with:
docker pull sequenceiq/hadoop-docker
Uhopper:
Image (1M+ pulls)
Bitbucket repo
Site
Pull with:
docker pull uhopper/hadoop
Big Data Europe:
Image (10K+ pulls)
GitHub repo
Site
Pull with:
docker pull bde2020/hadoop-base
Parrot Stream:
Image (1.2K+ pulls)
GitHub repo
Site
Pull with:
docker pull parrotstream/hadoop
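To compare the four images side by side, the pull commands above can be wrapped in a small loop. This is only a sketch: the image names are copied from the list, while the function name `pull_hadoop_images` and the `DOCKER` override variable are assumptions added for illustration.

```shell
# Image names copied from the list above.
HADOOP_IMAGES="sequenceiq/hadoop-docker uhopper/hadoop bde2020/hadoop-base parrotstream/hadoop"

# Hypothetical helper: pull every image in the list.
# Setting DOCKER=echo previews the commands instead of running them.
pull_hadoop_images() {
    for image in $HADOOP_IMAGES; do
        # Fetch the image's default tag; report a failure and keep going
        "${DOCKER:-docker}" pull "$image" || echo "failed to pull $image" >&2
    done
}
```

Run `pull_hadoop_images` to fetch all four, then `docker images` to compare their sizes. Some of these repositories may no longer be maintained, so a failed pull is reported and the loop continues.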
Bonus:
Check out this tutorial on how to build a Hadoop Docker image.
Cloudera now provide their Quickstart VM as a Docker image for single-node deployments: