This article is reprinted from Liu Yue’s technology blog: v3u.cn/a_id_173
When choosing a base image for a Python development environment, most people pick Alpine. Why? Because it is tiny: only about 5 MB (compared with the Ubuntu series, which is closer to 100 MB). But we do not choose a base image just to try out Python syntax. We also need to debug, install extensions, and often pull in many third-party dependencies. Alpine is a poor choice for that kind of work, so let’s compare installing and compiling Python packages on Alpine and on Ubuntu.
First, pull the Alpine and Ubuntu images:
docker pull ubuntu:18.04
docker pull alpine
Once both images are pulled, the difference in size is obvious:
REPOSITORY TAG IMAGE ID CREATED SIZE
ubuntu 18.04 6526a1858e5d 2 weeks ago 64.2MB
alpine latest a24bb4013296 3 months ago 5.57MB
Ubuntu takes up 64MB, while Alpine takes up just 5.57 MB.
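If your Python application does scientific computing and needs to display data graphically, you will need the matplotlib and pandas libraries. Write Dockerfile.ubuntu, using the official python:3.7-slim image (which is Debian-based and, like Ubuntu, built on glibc):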
FROM python:3.7-slim
RUN pip install --no-cache-dir matplotlib pandas
Then build the image:
docker build -f Dockerfile.ubuntu -t 'ubuntu-mat' .
As you can see, the built image comes to 263 MB, up from roughly 64 MB for the bare Ubuntu image:
liuyue:blog liuyue$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
ubuntu-mat latest 401f0425ce63 About a minute ago 263MB
The image works without any problems.
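One way to confirm this from inside the container is to check that the interpreter can locate both packages. The sketch below is only an illustration and not part of the original article; the helper name missing_modules is hypothetical, and it relies solely on the standard library's importlib.util.find_spec:

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that the interpreter cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Inside an image where both packages installed correctly,
    # this is expected to print an empty list.
    print(missing_modules(["matplotlib", "pandas"]))
```

Saved under a hypothetical name such as check.py, it could be run with something like: docker run --rm -v "$PWD":/app ubuntu-mat python /app/check.py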
Now let’s try Alpine and see whether it has any speed or size advantage over Ubuntu.
Write Dockerfile.alpine:
FROM python:3.7-alpine
RUN pip install --no-cache-dir matplotlib pandas
Build the image:
docker build -f Dockerfile.alpine -t 'alpine-mat' .
During the build, we run into an error:
liuyue:blog liuyue$ docker build -f Dockerfile.alpine -t 'alpine-mat' .
Sending build context to Docker daemon 112.1kB
Step 1/2 : FROM python:3.7-alpine
3.7-alpine: Pulling from library/python
df20fa9351a1: Pull complete
36b3adc4ff6f: Pull complete
4db9de03f499: Pull complete
cd38a04a61f4: Pull complete
6bbb0c43b470: Pull complete
Digest: sha256:d1375bf0b889822c603622dc137b24fb7064e6c1863de8cc4262b61901ce4390
Status: Downloaded newer image for python:3.7-alpine
 ---> 078114edb6be
Step 2/2 : RUN pip install --no-cache-dir matplotlib pandas
 ---> Running in 6d3c44420e5c
Collecting matplotlib
  Downloading matplotlib-3.3.1.tar.gz (38.8 MB)
    ERROR: Command errored out with exit status 1:
     command: /usr/local/bin/python -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-40p0g06u/matplotlib/setup.py'"'"'; __file__='"'"'/tmp/pip-install-40p0g06u/matplotlib/setup.py'"'"'; f=getattr(tokenize, '"'"'open'"'"', open)(__file__); code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"'); f.close(); exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-pip-egg-info-zk64hzam
         cwd: /tmp/pip-install-40p0g06u/matplotlib/
How did this happen? If you look closely at the output of the Ubuntu-based build, you will see that pip downloads the third-party library as matplotlib-3.1.2-cp38-cp38-manylinux1_x86_64.whl, which is a precompiled binary package (a wheel). On Alpine, pip can only download the source archive (matplotlib-3.1.2.tar.gz). This is Alpine’s fatal flaw: the standard Linux binary packages do not work on Alpine Linux at all.
Most Linux distributions use the GNU implementation of the standard C library (glibc), which almost every program written in C depends on, including the Python interpreter. Alpine Linux uses musl instead. The binary packages on PyPI are compiled against glibc, so support for these Linux binary packages is disabled on Alpine. Most Python packages now ship binary wheels on PyPI, which greatly speeds up installation. But if you are using Alpine Linux, you have to compile all the C source code of every Python package you use.
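The compatibility rule can be read straight off the wheel file name, whose last dash-separated field is the platform tag. The following is a rough sketch of that idea, not pip's actual resolution logic; the function names wheel_platform_tag and usable_on are hypothetical, and the check deliberately covers only the manylinux and pure-Python cases discussed here:

```python
import platform


def wheel_platform_tag(filename: str) -> str:
    """Return the platform tag, i.e. the last dash-separated field of a wheel name."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    return filename[: -len(".whl")].split("-")[-1]


def usable_on(filename: str, libc: str) -> bool:
    """Simplified check: can this wheel be installed on a system using `libc`?"""
    tag = wheel_platform_tag(filename)
    if tag == "any":
        return True  # pure-Python wheel, no compiled code inside
    if tag.startswith("manylinux"):
        return libc == "glibc"  # manylinux wheels are built against glibc
    return False  # other platform tags are out of scope for this sketch


if __name__ == "__main__":
    wheel = "matplotlib-3.1.2-cp38-cp38-manylinux1_x86_64.whl"
    print(wheel_platform_tag(wheel), usable_on(wheel, "glibc"), usable_on(wheel, "musl"))
    # On glibc systems the first element is "glibc"; on musl it is usually empty.
    print(platform.libc_ver())
```

On Alpine every such wheel is rejected, so pip falls back to the source archive and has to compile it.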
This means you have to work out the system library dependencies of each package yourself. Update Dockerfile.alpine:
FROM python:3.7-alpine
RUN apk --update add gcc build-base freetype-dev libpng-dev openblas-dev
RUN pip install --no-cache-dir matplotlib pandas
Rebuild the image:
docker build -f Dockerfile.alpine -t 'alpine-mat' .
The build takes a long time, about half an hour, because compiling and installing from source is much slower than installing a prebuilt package. Now look at the resulting image:
REPOSITORY TAG IMAGE ID CREATED SIZE
alpine-mat latest 601f0425ce63 About a minute ago 873MB
At 873 MB, Alpine’s advantage of being small and light is gone.
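To put the figures from the listings above side by side, here is a small calculation. The dictionary layout and the ratio helper are illustrative only; the sizes are the ones reported by docker images in this article:

```python
# Image sizes in MB, as reported by `docker images` earlier in the article.
SIZES_MB = {
    "ubuntu:18.04": 64.2,
    "alpine": 5.57,
    "ubuntu-mat": 263.0,
    "alpine-mat": 873.0,
}


def ratio(larger: str, smaller: str) -> float:
    """Return how many times bigger one image is than another."""
    return SIZES_MB[larger] / SIZES_MB[smaller]


if __name__ == "__main__":
    print(f"alpine-mat vs ubuntu-mat: {ratio('alpine-mat', 'ubuntu-mat'):.1f}x")
    print(f"alpine-mat vs bare alpine: {ratio('alpine-mat', 'alpine'):.0f}x")
```

The Alpine-based image ends up more than three times the size of the Ubuntu-based one, even though the bare Alpine image started out far smaller.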
Although musl, the C library Alpine uses, is technically mostly compatible with the glibc used by other Linux distributions, in practice the differences can cause problems, and when they do occur they are not easy to fix. For example, Alpine’s default thread stack size is smaller, which can cause Python to crash, and Python applications can also run more slowly on Alpine.
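If a small thread stack does turn out to be the cause, one possible mitigation (an assumption on our part, not something the original article proposes) is to request a larger stack through the standard library's threading.stack_size before creating threads. A minimal sketch, in which the 8 MiB value and the helper names depth and run_with_stack are hypothetical:

```python
import threading

STACK_BYTES = 8 * 1024 * 1024  # assumed value; tune for your workload


def depth(n: int) -> int:
    """Recurse n levels deep and return n, to exercise the thread's stack."""
    return 0 if n == 0 else 1 + depth(n - 1)


def run_with_stack(target, arg, stack_bytes=STACK_BYTES):
    """Run target(arg) in a new thread created with the requested stack size."""
    result = {}

    def worker():
        result["value"] = target(arg)

    # stack_size() affects threads created afterwards and returns the old setting.
    previous = threading.stack_size(stack_bytes)
    try:
        thread = threading.Thread(target=worker)
        thread.start()
        thread.join()
    finally:
        threading.stack_size(previous)  # restore the earlier setting
    return result.get("value")


if __name__ == "__main__":
    print(run_with_stack(depth, 500))
```

Whether this helps depends on where the crash originates, so treat it as something to try rather than a guaranteed fix.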
Conclusion: for local experimentation, if you just want to “play around,” there is nothing wrong with Alpine as a base image. But if you are deploying a Python application to production, especially a distributed system that has to be rebuilt many times, the tried-and-tested Ubuntu is the more sensible choice.