How can I prevent a Dockerfile instruction from being cached?

Solution 1

A build-time argument can be specified to forcibly break the cache from that step onwards. For example, in your Dockerfile, put

ARG CACHE_DATE=not_a_date

and then give this argument a fresh value on every new build. A timestamp is the obvious choice:

docker build --build-arg CACHE_DATE=$(date +%Y-%m-%d:%H:%M:%S) ...

Make sure the value is a single string without any spaces; otherwise the docker client will interpret it as multiple arguments.

See a detailed discussion on Issue 22832.
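As a rough sketch of the timestamp approach above, the two steps can be wrapped in small shell functions. The names cache_bust_value and build_fresh are hypothetical helpers, not part of Docker, and the sketch assumes the Dockerfile declares ARG CACHE_DATE above the instructions that should not be cached.

```shell
#!/bin/sh
# Hypothetical helper: print a UTC timestamp that contains no spaces,
# so it can be passed as a single --build-arg value.
cache_bust_value() {
    date -u +%Y-%m-%dT%H:%M:%SZ
}

# Hypothetical wrapper: forward all arguments to `docker build` and add
# a fresh CACHE_DATE on every call. The quotes keep the whole
# NAME=value pair together as one argument.
build_fresh() {
    docker build --build-arg "CACHE_DATE=$(cache_bust_value)" "$@"
}
```

It would be called like docker build itself, for example build_fresh -t myimage . where myimage is a placeholder tag.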

Solution 2

docker build --no-cache disables the cache for every instruction in the Dockerfile.

The Dockerfile ADD instruction used to always invalidate the cache, although this has been improved in recent Docker versions:

Docker is supposed to checksum any file added through ADD and then decide whether it should use the cache or not.

So if the file added has changed, the cache should be invalidated for the ADD command.


Issue 1326 mentions other tips:

This worked.

RUN yum -y install firefox #redo

So it looks like Docker will re-run the step (and all the steps below it) if the string I am passing to the RUN command changes in any way, even if it's just a comment.

The Docker cache is used if, and only if, none of the step's ancestors has changed (this behavior makes sense, as the next command adds its change on top of the previous layer).

The cache is used only if not a single character has changed (so even an extra space is enough to invalidate it).

Author: Henrik Sachse

Software Developer :: Java :: Spring :: Containers :: Security :: Linux

Updated on October 23, 2020

Comments

  • Henrik Sachse over 3 years

    In my Dockerfile I use curl or ADD to download the latest version of an archive like:

    FROM debian:jessie
    ...
    RUN apt-get install -y curl
    ...
    RUN curl -sL http://example.com/latest/archive.tar.gz --output archive.tar.gz
    ...
    ADD http://example.com/latest/archive2.tar.gz
    ...
    

    The RUN statement that uses curl, like the ADD statement, creates its own image layer. That layer will be used as a cache for future executions of docker build.

    Question: How can I disable caching for those instructions?

    It would be great to get something like cache invalidation working there, e.g. by using HTTP ETags or by querying the Last-Modified header field. That would make it possible to do a quick check based on the HTTP headers to decide whether a cached layer can be used or not.

    I know that some dirty tricks could help, e.g. executing a download shell script in the RUN statement instead. Our build system would change its filename before triggering docker build, and I could do the HTTP checks inside that script. But then I would need to store either the last used ETag or the last modified date in a file somewhere. I am wondering whether there is a cleaner, native Docker feature that I could use here.

  • Adrian Mouat over 8 years
    Recent docker version? That comment was from over a year ago :)
  • VonC over 8 years
    @Adr I agree. In "docker time", that seems so long ago.
  • Siyuan Zhang almost 8 years
    Just tried with Docker version 1.12.0-rc2; it's not working, the instruction is still cached.
  • d33tah almost 6 years
    Or better yet, feed it with $RANDOM?
  • Ruifeng Ma almost 6 years
    With a timestamp we can rest 100% assured that the value will always be unique; besides, the timestamp information can be handy if it's going to be used somewhere in the container.
  • Ruifeng Ma over 5 years
    I haven't followed the new releases of Docker. But the idea behind this solution is to provide a new value that breaks the cache each time the build is run, which can be implemented in lots of ways. As long as Docker's caching mechanism does not change, a similar way can be found.
  • the-bug about 4 years
    A hint: RUN yum -y install firefox #redo will only work with RUN, as #redo is not a Dockerfile comment. I accidentally broke stuff using this with e.g. WORKDIR...
  • Torsten Bronger about 3 years
    A decent random generator should never yield collisions (within the lifespan of the universe), and has an arbitrarily high change frequency.
  • Torsten Bronger about 3 years
    FWIW, ARG CACHE_DATE= (without not_a_date) was sufficient for me.
  • Benyamin Limanto about 2 years
    This also works with podman-compose on the rootless podman engine, not only with docker. Thanks @RuifengMa !
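The ETag check asked about in the question could be combined with the build-argument technique from Solution 1: instead of a timestamp, pass the remote file's ETag, so the value only changes when the server reports a new version. This is a minimal sketch, not a built-in Docker feature. The names extract_etag, build_with_etag and ARCHIVE_ETAG are hypothetical, and it assumes the server sends a stable ETag header and that the Dockerfile declares ARG ARCHIVE_ETAG above the download step.

```shell
#!/bin/sh
# Read HTTP response headers on stdin and print the ETag value, if any.
# Header names are matched case-insensitively; if several responses are
# present (redirects), the last ETag wins.
extract_etag() {
    tr -d '\r' | sed -n 's/^[Ee][Tt][Aa][Gg]:[[:space:]]*//p' | tail -n 1
}

# Hypothetical wrapper: $1 is the archive URL, the remaining arguments
# are forwarded to `docker build`.
build_with_etag() {
    url="$1"; shift
    # -I sends a HEAD request, -L follows redirects, -s silences progress.
    etag="$(curl -sIL "$url" | extract_etag)"
    # Without an ETag, fall back to a timestamp, which always rebuilds.
    [ -n "$etag" ] || etag="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
    docker build --build-arg "ARCHIVE_ETAG=${etag}" "$@"
}
```

A call might look like build_with_etag http://example.com/latest/archive.tar.gz . using the URL from the question. Nothing has to be stored between builds, because the previous value is effectively remembered by the build cache itself.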