Purpose of specifying several UNIX commands in a single RUN instruction in a Dockerfile

心在旅途 2021-01-28 06:21

I have noticed that many Dockerfiles try to minimize the number of instructions by combining several UNIX commands in a single RUN instruction. So I

2 Answers
  • 2021-01-28 07:00

    In addition to the space savings, it's also about correctness.

    Consider your first Dockerfile (a common mistake when working with Debian-like systems that use apt):

    FROM ubuntu 
    MAINTAINER demousr@example.com 
    
    RUN apt-get update 
    RUN apt-get install -y nginx
    CMD ["echo", "Image created"] 
    

    If two or more images follow this pattern, a cache hit can make the image unbuildable because of stale cached apt metadata:

    • Let's say I built an image that looks similar to that a few weeks ago.
    • Now I'm building this image today. There's a cache present up to and including the RUN apt-get update line.
    • The docker build will reuse that cached layer (since the Dockerfile and base image are identical) for the RUN apt-get update step.
    • When the RUN apt-get install line runs, it will use the cached apt metadata, which is now weeks out of date and will likely cause errors (for example, by pointing at package versions that are no longer on the mirrors).
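
    A minimal sketch of the usual fix, assuming the same ubuntu base image and nginx package as above: put the update and the install in one RUN instruction, so they always share a single cache entry and a stale package list can never be reused on its own.

    ```dockerfile
    FROM ubuntu

    # update and install run in the same layer: if this instruction is
    # re-executed, the package lists are refreshed before the install
    RUN apt-get update && apt-get install -y nginx

    CMD ["echo", "Image created"]
    ```

    With this layout, changing the package list changes the text of the RUN instruction, so the cache for that step is invalidated and apt-get update runs again.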
  • 2021-01-28 07:05

    Roughly speaking, a Docker image contains some metadata and an array of layers. A running container is built on top of these layers by adding a read-write container layer; the layers from the underlying image are read-only at that point.

    These layers can be stored on disk in different ways depending on the configured storage driver. For example, the official Docker documentation includes a diagram illustrating how the files changed in these different layers are taken into account with the OverlayFS storage driver.

    Next, the Dockerfile instructions RUN, COPY, and ADD create layers, and the best practices on the Docker website specifically recommend merging consecutive RUN commands into a single RUN command, to reduce the number of layers and thereby the size of the final image:

    https://docs.docker.com/develop/dev-best-practices/

    […] try to reduce the number of layers in your image by minimizing the number of separate RUN commands in your Dockerfile. You can do this by consolidating multiple commands into a single RUN line and using your shell’s mechanisms to combine them together. […]

    See also: https://docs.docker.com/develop/develop-images/dockerfile_best-practices/
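
    As a hedged illustration of where the size saving comes from (a sketch, not taken from the linked pages): files removed in a later RUN instruction still occupy space in the earlier layer, so any cleanup only shrinks the image if it happens in the same RUN. The cleanup path and the --no-install-recommends flag below are common conventions assumed for this example, not part of the question.

    ```dockerfile
    FROM ubuntu

    # one layer: refresh the lists, install, then delete the lists again,
    # so the downloaded package lists never end up in the final image
    RUN apt-get update \
     && apt-get install -y --no-install-recommends nginx \
     && rm -rf /var/lib/apt/lists/*
    ```

    Here the shell's && chains the commands so the layer is only created if every step succeeds, and the trailing backslashes keep the single instruction readable.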

    Moreover, in your example:

    RUN apt-get update -y -q
    RUN apt-get install -y nginx
    

    If you run docker build -t your-image-name . on this Dockerfile, edit the Dockerfile after a while to add another package besides nginx, and then run docker build -t your-image-name . again, the Docker cache mechanism means that apt-get update -y -q won't be executed again, so the APT package lists will be stale. This is another reason to merge the two RUN commands.
