Originally posted on my blog: lailin.xyz/post/51252.

Why do we still need to slim down Docker images when storage is so cheap?

Advantages of small images

  1. Speed up build/deployment

    Although storage is relatively cheap, network I/O is limited. When bandwidth is constrained, pulling a 1 GB image can take minutes where a 10 MB image takes seconds. That time is especially valuable when a failure occurs and the service has to be rescheduled to another node.

  2. Improve security and reduce the attack surface

    A smaller image contains fewer unneeded programs, which greatly reduces the number of potential attack targets.

  3. Reduce storage overhead

Principles for building small images

  1. Choose the smallest base image

  2. Reduce layers and remove unnecessary files

    When building real images, it is not advisable to merge layers blindly. We should make full use of Docker's cache mechanism, extracting common layers to speed up builds.

    • Put dependency files and the actual code files in separate layers
    • Share common base images across the team or company, and so on
  3. Use multi-stage builds

    For example, a program written in Go needs only a single binary at run time. A Node application may end up running only a few bundled JS files, with no need to ship the thousands of dependencies in node_modules.
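
As a rough sketch of the Node case, a multi-stage Dockerfile might look like the following. It assumes a hypothetical project whose package.json defines a build script that bundles everything into a self-contained dist/main.js; those names are illustrative assumptions, not taken from a real project.

```dockerfile
# Build stage: install all dependencies and bundle the app.
# Assumption: package.json has a "build" script that writes dist/main.js.
FROM node:alpine AS builder
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build

# Final stage: only the bundled output is copied, node_modules stays behind.
FROM node:alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
CMD ["node", "dist/main.js"]
```

This only works if the bundle really contains everything it needs at run time; otherwise the production dependencies have to be installed in the final stage as well.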

Base image

  • distroless

    “Distroless” images contain only your application and its runtime dependencies. They do not contain package managers, shells or any other programs you would expect to find in a standard Linux distribution.

    Distroless is a Google project whose images contain only the runtime environment, with no package manager, shell, or similar programs. It is a good option if your program has no other dependencies.

  • alpine

    Alpine Linux is a security-oriented, lightweight Linux distribution based on musl libc and busybox.

    Alpine is a security-oriented Linux distribution based on musl libc and BusyBox. It is small (under 10 MB) but complete: it includes a package manager and a shell, which are very useful for debugging in practice.

    Note, however, that Alpine uses the smaller musl libc instead of glibc, so some applications will not run on it and must be recompiled.

  • scratch

    scratch is an empty image that is typically used to build base images. For example, the Dockerfile for the Alpine image starts from scratch:

    FROM scratch
    ADD alpine-minirootfs-20190228-x86_64.tar.gz /
    CMD ["/bin/sh"]
  • busybox

Distroless is generally more secure, but in practice you may run into problems with missing dependencies and with debugging. Alpine is smaller, has a built-in package manager, and is more comfortable to work with, but musl libc can cause compatibility issues. In general, I choose Alpine as the base image.

In addition, on Docker Hub the commonly used Debian image also offers smaller variants that contain only basic functionality.
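
A minimal sketch of building on such a variant, assuming the debian:stable-slim tag and using ca-certificates only as a stand-in for whatever packages your program actually needs:

```dockerfile
# Assumption: debian:stable-slim is the slim variant you want to build on.
FROM debian:stable-slim

# Install without recommended extras and drop the apt package lists
# in the same RUN, so they never end up in a layer.
RUN apt-get update \
    && apt-get install -y --no-install-recommends ca-certificates \
    && rm -rf /var/lib/apt/lists/*
```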

Base image comparison

Here we pull the base images directly and check their sizes. The alpine image is only around 5 MB, roughly 1/20th the size of debian.

alpine   latest   5cb3aa00f899   3 weeks ago    5.53MB
debian   latest   0af60a5c6dd0   3 weeks ago    101MB
ubuntu   18.04    47b19964fb50   7 weeks ago    88.1MB
ubuntu   latest   47b19964fb50   7 weeks ago    88.1MB
alpine   3.8      3f53bb00af94   3 months ago   4.41MB

In practice, the official images for different languages provide tags built on different base images. Take the ruby image as an example of how much the base image matters: the default latest image is 881 MB, while the alpine variant is just under 50 MB, a significant difference.

ruby   latest   a5d26127d8d0   4 weeks ago   881MB
ruby   alpine   8d8f7d19d1fa   4 weeks ago   47.1MB
ruby   slim     5dd4d3c99da    4 weeks ago   125MB

Reduce layers and remove unnecessary files

  1. Do not delete files in a separate instruction
# dockerfile 1
FROM alpine

RUN wget https://github.com/mohuishou/scuplus-wechat/archive/1.0.0.zip

# dockerfile 2
FROM alpine

RUN wget https://github.com/mohuishou/scuplus-wechat/archive/1.0.0.zip
RUN rm 1.0.0.zip

# dockerfile 3
FROM alpine

RUN wget https://github.com/mohuishou/scuplus-wechat/archive/1.0.0.zip && rm 1.0.0.zip

Image sizes after building each one:

test   3  351a80e99c22        5 seconds ago        5.53MB
test   2  ad27e625b8e5        49 seconds ago       6.1MB
test   1  165e2e0df1d3        About a minute ago   6.1MB

Images 1 and 2 are the same size, while image 3 is about 0.5 MB smaller. The reason is that almost every instruction in a Dockerfile creates a layer, and the lower layers are read-only. When a file from a lower layer is deleted, AUFS uses its whiteout mechanism: it creates a corresponding hidden whiteout file in the writable upper layer. Deleting a file in a later layer therefore only hides it; the data is still stored in the earlier layer and still counts toward the image size.

  2. Combine commands into a single RUN instruction

Deletions are not the only statements that belong on a single line. Because of the layer mechanism, the commands that install common dependencies are also best combined into one RUN instruction, which reduces the final number of layers.

  3. Separate dependencies from source code to take full advantage of the layer cache

    This is a best practice. During development, dependencies tend to change rarely, while the source code we are working on changes frequently. Suppose the code is only 10 MB but the dependencies are 1 GB. If everything is copied with a single COPY instruction, every code change invalidates the cache for that layer, and time is wasted both on the COPY and on pushing the layer to the image registry. With separate COPY instructions, each push only has to upload the frequently changing code layer, not the dependencies along with it.

  4. Use .dockerignore

    Just as Git uses .gitignore to ignore files, a Docker build can use .dockerignore to exclude files from the build context. This not only keeps unnecessary files out of the image but also improves security, because it prevents configuration files from being packed into the image.
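
The points in this list can be sketched together in one Dockerfile. This is only an illustration: it assumes a hypothetical Node project with package.json, package-lock.json, and an index.js entry point, and a .dockerignore that excludes node_modules and local configuration files.

```dockerfile
# Assumption: hypothetical Node project; file names are illustrative.
FROM node:alpine
WORKDIR /app

# Dependency manifests change rarely, so copy them on their own first.
# This layer and the install layer below stay cached across code changes.
COPY package.json package-lock.json ./

# Install and clean the npm cache in a single RUN,
# so the cache is never stored in a layer.
RUN npm ci --omit=dev && npm cache clean --force

# Source code changes often, so copy it last. Only this layer is rebuilt
# and pushed when the code changes. Relies on .dockerignore to keep the
# local node_modules and config files out of the build context.
COPY . .

CMD ["node", "index.js"]
```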

Multi-stage builds

A multi-stage build is also a way of reducing layers. With it, the final image can contain only the executable produced at the end plus the runtime dependencies it needs, which greatly reduces the image size.

Take Go as an example. At run time you only need the compiled binary; the Go toolchain, the third-party packages, and the source files are unnecessary. They are required at compile time, however, and this is where a multi-stage build can be used to shrink the final image.

# Use the golang image as the builder image
FROM golang:1.12 as builder

WORKDIR /go/src/github.com/go/helloworld/

COPY app.go .

# CGO_ENABLED=0 produces a statically linked binary, so it does not depend on glibc, which Alpine lacks
RUN CGO_ENABLED=0 go build -o app .

# Use the Alpine image as the final base image after compiling
FROM alpine:latest as prod

RUN apk --no-cache add ca-certificates

WORKDIR /root/

# Copy the compiled binary from builder
COPY --from=builder /go/src/github.com/go/helloworld/app .

CMD ["./app"]

For reasons of length, multi-stage builds are not covered in more detail here; see a dedicated reference on multi-stage builds for details.

Tips and tricks

  1. Use dive to inspect the layers of a Docker image; it helps you analyze where the image size can be reduced

  2. Use docker-slim to shrink images automatically; it is especially useful for web applications

  3. Skip or remove unneeded dependencies when installing software

# ubuntu
apt-get install -y --no-install-recommends ...

# alpine
apk add --no-cache ... && apk del build-dependencies

# centos
yum install -y ... && yum clean all
  4. Use the --flatten parameter to reduce layers (not recommended)

  5. Compress layers using docker-squash

Examples in different languages

More to be added…

Ruby(Rails)

  1. Install only the dependencies needed for production

  2. Delete unneeded dependency files

# Install production gems only, then remove unneeded files (cached *.gem, *.o, *.c)
bundle install --without development:test:assets -j4 --retry 3 --path=vendor/bundle \
    && rm -rf vendor/bundle/ruby/2.5.0/cache/*.gem \
    && find vendor/bundle/ruby/2.5.0/gems/ -name "*.c" -delete \
    && find vendor/bundle/ruby/2.5.0/gems/ -name "*.o" -delete
  3. Delete the front-end node_modules and cache files
rm -rf node_modules tmp/cache app/assets vendor/assets spec

The steps above can be combined with a multi-stage build.

Golang

Once a multi-stage build has reduced a Go image to a single binary, about the only optimization left is to compress the binary itself with a tool such as UPX.
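
A minimal sketch of adding UPX to the build stage. It assumes app.go is a self-contained program that imports only the standard library, and that the upx package is available for your architecture in the Alpine repositories; the paths are illustrative.

```dockerfile
# Build stage: compile a static binary, then compress it with UPX.
# Assumption: app.go needs nothing beyond the standard library.
FROM golang:alpine AS builder
RUN apk add --no-cache upx
WORKDIR /src
COPY app.go .
RUN CGO_ENABLED=0 go build -o app app.go \
    && upx --best app

# Final stage: only the compressed binary is copied over.
FROM alpine:latest
WORKDIR /root/
COPY --from=builder /src/app .
CMD ["./app"]
```

A UPX-packed binary unpacks itself at startup, so the smaller image comes at the cost of some extra startup time; whether that trade is worth it depends on the service.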


License

  • Author: Mohuishou
  • Link to this article: lailin.xyz/post/51252…
  • Copyright notice: All articles on this blog are licensed under CC BY-NC-SA unless otherwise stated. Please credit the source when reprinting!