This blog post will help you understand the difference between two similar Dockerfile directives, ADD and COPY: how each came to be what it is, and our advice on which one you should use. When building a Docker image from a Dockerfile, there are two instructions you can use to add directories and files to the image: ADD and COPY. Both follow the same basic form and accomplish almost the same thing:
ADD <src>... <dest>
COPY <src>... <dest>
In both cases, directories or files (<src>) are copied into the container's file system at the specified path (<dest>).
So if both directives do the same thing, why do they both exist, and which one should you use? Read on to find out.
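As a concrete illustration, a minimal Dockerfile might use the form above like this (the base image and file names are just placeholders):

```dockerfile
FROM alpine:3.19
# Copy a single file from the build context into the image
COPY app.conf /etc/app/
# Copy an entire directory tree; the trailing "/" marks the dest as a directory
COPY src/ /opt/app/src/
```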
If you’re not interested in the nuances of ADD and COPY and just want an answer to the question “Which one should I use?”, all you need to know is: use COPY.
Origin story
The ADD directive has been part of Docker since the beginning (unlike COPY) and supports several tricks beyond simply copying files from the build context.
For one, ADD accepts a URL as its source argument. When a URL is provided, the file is downloaded from it and copied to the destination path.
ADD http://foo.com/bar.go /tmp/main.go
The file above will be downloaded from the specified URL and added to the container's file system as /tmp/main.go. Another form lets you simply specify a destination directory for the downloaded file:
ADD http://foo.com/bar.go /tmp/
Because the destination argument ends with a trailing “/”, Docker infers the filename from the URL and appends it to the specified directory. In this case, a file named /tmp/bar.go will be added to the container's file system.
Another feature of ADD is its ability to automatically extract compressed files. If the source argument is a local file in a recognized compression format (tar, gzip, bzip2, etc.), it will be unpacked to the specified path in the container's file system.
ADD /foo.tar.gz /tmp/
The command above causes the contents of the foo.tar.gz archive to be unpacked into the container's /tmp directory. Interestingly, the URL-download and auto-extraction features do not work together: an archive fetched from a URL is not automatically extracted.
Magic
Obviously, there's a lot of functionality packed into the single ADD instruction. That makes ADD very flexible, but not particularly predictable. Here's a quote from a December 2013 Docker issue discussing problems with the ADD command:
In my opinion, the current ADD directive is full of magic. It can add local and remote files. It will sometimes extract a file, and sometimes it won't. If a file is an archive that you want copied as-is, it gets accidentally extracted. If it's an archive in a compressed format that you want extracted, it gets accidentally copied. - amluto
The consensus was that ADD was trying to do too much and confusing users. Nobody wanted to break backward compatibility with existing uses of ADD, so it was decided to add a new, more predictable directive instead.
Like ADD, but simpler
When Docker 1.0 was released, it included the new COPY directive. Unlike ADD, COPY does nothing more than copy files and folders from the build context into the container.
COPY does not support URLs as arguments, so it cannot be used to download files from remote locations. Anything you want to copy into a container must exist in the local build context.
In addition, COPY does nothing special with compressed files. If you COPY an archive, it lands in the container exactly as it exists in the build context; no attempt is made to extract it.
COPY is really just a stripped-down version of ADD, designed to satisfy the most common "copy files into the container" use case without any side effects.
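The difference can be sketched side by side in a Dockerfile (the archive name and base image are hypothetical; both instructions receive the same source):

```dockerfile
FROM alpine:3.19
# ADD auto-extracts a recognized local archive:
# /tmp/ ends up containing the unpacked contents, not the archive itself
ADD foo.tar.gz /tmp/
# COPY transfers it byte-for-byte:
# /opt/foo.tar.gz is still a compressed archive inside the image
COPY foo.tar.gz /opt/
```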
Which one?
If it’s not obvious by now, the Docker team recommends using COPY in almost all cases.
Really, the only reason to use ADD is when you have a local archive that you definitely want automatically extracted into the image. Ideally, ADD would be renamed something like EXTRACT to make that purpose explicit (though, again, this is unlikely to happen for backward-compatibility reasons). Okay, but isn't fetching packages from remote URLs still useful? Technically yes, but in most cases you'd be better off running curl or wget instead. Consider the following example:
ADD http://foo.com/package.tar.bz2 /tmp/
RUN tar -xjf /tmp/package.tar.bz2 \
&& make -C /tmp/package \
&& rm /tmp/package.tar.bz2
Here we have an ADD directive that retrieves a package from a URL, and then a RUN directive that unpacks it, builds it, and then tries to clean up the downloaded archive.
Unfortunately, because the package download and the rm command happen in separate image layers, we don't actually save any space in the final image (see my article on optimizing Docker images for a more detailed explanation of this phenomenon).
In this case, you might as well do the following:
RUN curl http://foo.com/package.tar.bz2 \
| tar -xjC /tmp \
&& make -C /tmp/package
Here curl downloads the compressed package and pipes it straight to tar, so no archive is ever left on the file system to clean up, and everything happens in a single layer.
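The same pipe technique can be tried locally without a network (all paths and file names below are made up for the demonstration; cat stands in for curl, and gzip is used instead of bzip2 purely for portability):

```shell
set -e
# Build a throwaway archive to stand in for the remote package
workdir=$(mktemp -d)
mkdir -p "$workdir/package"
echo "hello" > "$workdir/package/greeting.txt"
tar -czf "$workdir/package.tar.gz" -C "$workdir" package

# Stream the archive straight into tar, as curl would in the Dockerfile;
# the compressed file is never written inside the target directory
target=$(mktemp -d)
cat "$workdir/package.tar.gz" | tar -xzC "$target"

ls "$target/package"   # lists greeting.txt
```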
There may still be legitimate reasons to add remote files to the image, but this should be an explicit decision and not your default choice.
Ultimately, here’s the rule: Use COPY (unless you’re sure you need ADD).
Original article: https://www.ctl.io/developers/blog/post/dockerfile-add-vs-copy/