The book by the Huawei Docker practice team is excellent. This article is mainly my excerpted notes, split into two parts and updated over time. Comments and discussion are welcome.
Introduction

Clarifying the concepts
What work Docker has done on top of LXC

The meaning of LXC

LXC can refer to two things: the user-mode tool lxc, or Linux Container, shorthand for the kernel's container technology. Usually it refers to the second meaning. On top of that, Docker adds cross-host deployment + an application-centric approach + automated builds + version management + component reuse + sharing + a tool ecosystem.
Containers and VMs
- Virtual machine: the classic solution for partitioning hardware resources. It uses hardware virtualization technology and achieves complete resource isolation through a hypervisor layer.
- Container: virtualization at the operating-system level. It uses the kernel's Cgroup and Namespace features to isolate a process from other processes purely in software, without any extra assistance.
Summary of the comparison with virtual machines
Feature | Container | Virtual machine |
---|---|---|
Startup time | Seconds | Minutes |
Disk usage | Generally MB | Generally GB |
Performance | Close to native | Weaker than native |
Density per host | A single machine supports thousands of containers | Usually dozens |
The past and present of container technology

About container technology
Container technology mainly consists of Cgroup and Namespace.

- Namespace: namespaces, mainly used for access isolation. The principle is to abstract a class of resources, encapsulate them, and hand them to a container. Each container has its own abstract view of these resources and cannot see the others', so access isolation is achieved.
- Cgroup: short for control group, mainly used for resource control. The principle is to place a group of processes into a control group and, by assigning a specified amount of resources to that control group, control the resources available to that group of processes.
Understand containers in a minute
Composition of a container
Container = Cgroup + Namespace + RootFS + Container Engine (user-mode tool)
- Cgroup: resource control
- Namespace: access isolation
- rootfs: the file system
- Container engine (user-mode tool): lifecycle control
How containers are created
Code 1:
```
pid = clone(fun, stack, flags, clone_arg);
(flags: CLONE_NEWPID | CLONE_NEWNS |
        CLONE_NEWUSER | CLONE_NEWNET |
        CLONE_NEWIPC | CLONE_NEWUTS |
        ...)
```
Through the clone system call, passing in the clone flags corresponding to each Namespace, a new child process is created that has its own PID, mount, user, network, IPC, and UTS Namespaces.
Code 2:
```
echo $pid > /sys/fs/cgroup/cpu/tasks
echo $pid > /sys/fs/cgroup/cpuset/tasks
echo $pid > /sys/fs/cgroup/blkio/tasks
echo $pid > /sys/fs/cgroup/memory/tasks
echo $pid > /sys/fs/cgroup/devices/tasks
echo $pid > /sys/fs/cgroup/freezer/tasks
```
Write the PID of the process created in Code 1 into each Cgroup subsystem, so that the process is controlled by the corresponding Cgroup subsystems.
Code 3:
```
fun() {
    ...
    pivot_root("path_of_rootfs/", path);
    ...
    exec("/bin/bash");
    ...
}
```
The fun function is executed by the new process created above. Inside fun, the process switches into the new rootfs through the pivot_root system call, and then runs the /bin/bash program through the exec system call, so the program executes inside the new Namespace, Cgroup, and rootfs.
Introduction to Cgroup

What Cgroup is
Cgroup is short for control group. It is a feature provided by the Linux kernel, used to limit and isolate a group of processes' use of system resources, that is, to apply QoS to resources including CPU, memory, block I/O, and network bandwidth.
The subsystems implemented in Cgroup and their functions are as follows:
- devices: device access control
- cpuset: binds to specified CPUs and memory nodes
- cpu: controls CPU usage
- cpuacct: produces CPU usage statistics
- memory: limits the upper bound of memory usage
- freezer: freezes (pauses) the processes in a Cgroup
- net_cls: works with tc (traffic controller) to limit network bandwidth
- net_prio: sets the network traffic priority of processes
- hugetlb: limits the use of HugeTLB
- perf_event: allows the perf tool to do performance monitoring per Cgroup group
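To make the resource-control side concrete, here is a minimal sketch (my own example, not from the book) of driving the memory subsystem by hand, assuming cgroup v1 is mounted under /sys/fs/cgroup and the commands are run as root; the group name "demo" is arbitrary:

```bash
# Create a new control group, cap its memory at 64 MB, and move the current shell into it
mkdir /sys/fs/cgroup/memory/demo
echo $((64 * 1024 * 1024)) > /sys/fs/cgroup/memory/demo/memory.limit_in_bytes
echo $$ > /sys/fs/cgroup/memory/demo/tasks
# Every process started from this shell is now limited to 64 MB of memory
```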
Introduction to Namespace

What Namespace is
Namespace wraps the kernel's global resources so that each Namespace has its own independent copy of the resource; as a result, different processes using the same resource in their own Namespaces do not interfere with each other.
Currently, the Linux kernel implements a total of six namespaces
- IPC: isolates System V IPC and POSIX message queues
- Network: isolates network resources
- Mount: isolates file system mount points
- PID: isolates process IDs
- UTS: isolates the hostname and domain name
- User: isolates user IDs and group IDs
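A quick way to see Namespaces in action without Docker is the unshare tool; a rough sketch (my own example, not from the book), which creates new Namespaces much like a container engine does with clone():

```bash
sudo unshare --pid --mount --uts --ipc --net --fork /bin/bash
# Inside the new shell:
hostname demo-container      # changes the hostname only in the new UTS Namespace
mount -t proc proc /proc     # remount /proc so process tools see the new PID Namespace
ps aux                       # now only bash (PID 1) and ps are visible
```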
Understanding Docker images

The Docker image

Introduction to the concept
A Docker image is a read-only template used to start containers; it is the rootfs required to start a container, similar to the disk image used by a virtual machine.
How a Docker image is represented: remote-dockerhub.com/namespace/bar:latest

- remote-dockerhub.com: the address of the remote image hub, the web server that centrally stores images (if omitted, the default image library, i.e. Docker's official registry, is used)
- namespace: the namespace, similar to a user or organization on GitHub; it is the collection of all images belonging to that user or organization
- bar: the repository, similar to a repository on GitHub; a repository can hold many images, distinguished from each other by tag
- latest: the tag, similar to a tag in a Git repository, commonly used to distinguish different versions of the same image
- Layer: an image consists of a series of layers, each identified by a 64-character hexadecimal string, similar to commits in a Git repository
- Image ID: the ID of the top layer of an image is the image's ID. repo:tag gives an image a name that humans can easily recognize, while the ID is convenient for scripting and image operations
Using Docker images
Docker has a built-in set of commands for creating, managing, uploading, and downloading images. You can call the REST API to send the relevant commands to the Docker daemon, or use the CLI commands provided by the client.
List local images
```
docker images
```
Build: create an image
- Direct download: `docker pull busybox`
- Export and import an image: `docker save -o busybox.tar busybox` exports the busybox image to busybox.tar; `docker load -i busybox.tar` imports the image
- Make a new image: `docker import` imports an archive of a root file system and turns it into a Docker image; it is often used to make Docker base images. By contrast, `docker export` exports a container's root file system as an archive (a round-trip sketch follows this list)
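A small sketch of the export/import round trip mentioned above (my own example; the container and image names are arbitrary):

```bash
# Note: docker export works on a container, docker import on an archive
docker run --name export-demo busybox true   # create a short-lived container
docker export export-demo > rootfs.tar       # container root file system -> archive
docker import rootfs.tar mybase:latest       # archive -> new base image "mybase"
docker rm export-demo                         # clean up the container
```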
Ship: transfer an image
Image transfer is the bridge between development and deployment. Images can be transferred through a Docker image registry, through the tar packages produced by docker export / docker save, or indirectly through the image template file, the Dockerfile.
Run: start a container from an image template
To start the container, use the docker run command
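For instance, a minimal invocation (my own example) looks like this:

```bash
# Start an interactive busybox container and remove it automatically on exit
docker run -it --rm busybox /bin/sh
```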
- Note the logical inconsistency between the commands: `docker ps` lists containers, `docker images` lists images, and `docker inspect` shows details of both containers and images
Organizational structure of a Docker image
A Docker image contains data and the necessary metadata. The data consists of the image layers; the metadata is a set of JSON files that describe the relationships between the image layers and some of the container's configuration information.
Copy-on-write
When a parent process forks a child, the kernel does not immediately allocate memory for the child (although the basic process control block and stack are still needed); instead, the child shares the parent's memory. Only when one of the two modifies the shared memory does a page fault occur and a real memory allocation take place. This not only speeds up the creation of child processes but also reduces memory consumption.
Repositories

What a repository is

A repository is used to centrally store Docker images and supports image distribution and updates.
Composition of a repository

A repository name usually consists of two parts separated by a slash: the user name in front and the image name behind.
Images in a repository

A repository contains a set of images that are distinguished by tag. A complete image path usually consists of the server address, the repository name, and the tag.
```
registry.hub.docker.com/official/ubuntu:14.04
```
It represents the official Ubuntu image on Docker Hub with release 14.04
- `docker push localhost:5000/official/ubuntu:14.04` uploads an image to a local private registry; if no server address is given, the image is pushed to the default official registry, Docker Hub
- `docker pull ubuntu:14.04` downloads an image; without a server address it pulls from the default official Docker Hub
- `docker search localhost:5000/ubuntu` queries for images
A look at Docker Hub

Advantages of Docker Hub
- Provides developers with a large number of Docker images, free to download, study, and use
- Has a sound account management system and offers paid upgrades to users
- Servers are deployed in a distributed manner and support load balancing
- Images can be uploaded, downloaded, queried, deleted, and have their attributes set
- Supports online image building
- The backend uses distributed storage for disaster recovery and backup
- Its core is Docker Distribution, designed and maintained by the open source community and continuously updated and improved
- An enterprise edition of Docker Hub is available, offering a one-stop solution for enterprise users
Account management system
- Users can edit their own registration information, such as password and email
- Users can create and delete their own images
- Paid users can create and manage private images
- Users can create and maintain organizations and add team members
- Users can follow each other
The repository service: Registry

Functions and architecture of Registry
Registry is designed to create, store, distribute, and update images
- Image storage: image data is stored in the Registry backend, similar to local image storage; an image is split into multiple layers, each stored in tar format in its own directory. The manifest file and the image signature files are also kept
- Image creation, distribution, and update: local users and the Registry transfer images through the Registry API, a series of HTTP/HTTPS requests used to send user requests to the Registry and receive its responses; these requests and responses carry the exchanged image data
Registry characteristics
- Fast image upload and download
- A novel, high-performance design
- Easy to deploy
- Detailed and complete Registry API documentation
- The backend supports several distributed cloud storage solutions (S3, Azure) as well as the local file system; the storage interfaces exist as plug-ins and are easy to configure
- The manifest file records image information as metadata
- A notification system implemented through webhooks
- TLS is implemented locally to support HTTPS secure access
- A configurable authentication module
- A health-check module
- The manifest and digest files used to manage images have a clear format, labeling images more clearly
- A complete image cache mechanism, making image downloads faster
Registry API
API description: the Registry API follows the REST design style. It is used for communication between the Registry and the Docker Engine and implements image distribution for the Registry; it is an important part of Docker Registry.
The objects transmitted by the API are mainly the block data (blobs) of the image layers and the manifests.

The manifest file is a JSON file that records the metadata of an image and is compatible with V1 image information.
{ "name": <name>, "tag": <tag>, "fsLayers": [ { "blobSum": <tarsum> }, ... ] , "history": [...] , "signatures": [...] }Copy the code
Authentication mechanism
The authentication mechanism is a new feature added in Registry V2 for verifying the permissions of user requests. The Docker Engine, the Registry, and the auth service work together to verify and control access permissions.
Deploying a private repository
Running a private repository
Docker Private Registry is used to build a private registry and manage private Docker images. The advantages of deploying a private registry:
- You can independently develop and operate the private registry
- It saves bandwidth
- It has a separate account management system
- Custom functionality can be added
The prerequisite for setting up a private repository is to deploy the Docker Private Registry:

```
docker run -d --hostname localhost --name registry-v2 \
  -v /opt/data/distribution:/var/lib/registry/docker/registry/v2 \
  -p 5000:5000 registry:2.0
```
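Once the registry container is up, images can be pushed to it by tagging them with the local address; a small sketch (my own example, names are arbitrary):

```bash
# Tag an image with the private registry's address, push it, and pull it back
docker pull busybox
docker tag busybox localhost:5000/test/busybox
docker push localhost:5000/test/busybox
docker pull localhost:5000/test/busybox
```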
Building a reverse proxy
In practice, exposing the host port directly is not secure. If the Registry is not behind an access proxy, any user can reach it directly through the port, so an HTTPS reverse proxy should be designed in. With this approach, a proxy server accepts HTTPS requests from users, forwards them to the Registry server on the internal network, and returns the Registry's responses to the users.
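A minimal sketch of such a proxy (my own example, not from the book): the nginx directives are standard, but the domain name, certificate paths, config location, and container names are all hypothetical.

```bash
# Write a minimal nginx TLS-termination config and run it in front of the registry
cat > /opt/nginx/registry.conf <<'EOF'
server {
    listen 443 ssl;
    server_name registry.example.com;
    ssl_certificate     /etc/nginx/certs/domain.crt;
    ssl_certificate_key /etc/nginx/certs/domain.key;
    client_max_body_size 0;                    # image layers can be large
    location /v2/ {
        proxy_pass http://registry-v2:5000;    # the registry container started above
        proxy_set_header Host $http_host;
    }
}
EOF
docker run -d --name nginx-proxy --link registry-v2:registry-v2 -p 443:443 \
  -v /opt/nginx/registry.conf:/etc/nginx/conf.d/registry.conf:ro \
  -v /opt/nginx/certs:/etc/nginx/certs:ro \
  nginx
```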
Index and advanced repository functions

Functions and composition of the Index
The functions of the Index:

- Manages registered users of the Docker private hub and authenticates users' access permissions
- Saves and updates user information as well as verification data such as tokens
- Stores Docker metadata
- Records the history of image operations performed by users
- Provides a Web UI so that users can conveniently access and update resources
Index consists of control unit, authentication module, database, health check, and log system
Docker networking

The current state of Docker networking
Libnetwork proposes the new Container Network Model (CNM), which defines a standard API for configuring networks for containers; underneath, it can adapt to a variety of network drivers. CNM has three important concepts:
- Sandbox: an isolated network runtime environment that holds the configuration of the container's network stack, including the management of its network interfaces, routing table, and DNS settings
- Endpoint: joins a Sandbox to a Network. An Endpoint can be implemented as a veth pair or an OVS internal port; Libnetwork currently uses veth pairs
- Network: a group of Endpoints that can communicate with each other. A Network can be implemented as a Linux bridge, a VLAN, and so on
Seen through the concepts of CNM, the arrival of Libnetwork gives Docker the ability to span hosts with multiple subnets: different containers in the same subnet can run on different hosts.
Libnetwork drivers
- bridge: Docker's default container network driver
- host: the container shares the host's Network Namespace, i.e. the same network stack, routing table, iptables rules, and so on; the container and the host see the same network view
- null: the network inside the container is empty, and network interfaces and routes must be configured for the container manually
- remote: the mechanism for Docker network plug-ins. The remote driver lets Libnetwork connect to third-party network solutions through an HTTP RESTful API
An SDN solution such as SocketPlane can replace Docker's native network implementation as long as it implements the agreed HTTP URL handlers and the underlying network interface configuration.
- overlay: Docker's native cross-host, multi-subnet network solution
Basic network configuration

Learning about Docker networking

Docker's five container network modes
- none: no network functions are configured for the container
- container: shares the Network Namespace of another, already running container, so they share the same network view
- host: shares the host's root Network Namespace; this is considered insecure because the container has full permission to manipulate the host's network stack, routing table, firewall, and so on
- bridge: the NAT network model designed by Docker
- overlay: Docker's native cross-host, multi-subnet model
Advanced network solutions

- Weave
- Flannel
- SocketPlane
Container Volume Management
Docker
Volume Management Basics
Add a new data volume
`docker run -d -v /tmp/data --name busyboxtest busybox` creates a container with a new data volume mounted at /tmp/data. You can run the docker inspect command to view the location of the data volume on the host.
Mount the host directory as a data volume
The -v parameter can also be used to mount a host directory as a data volume: `docker run -d -v /host/data:/data --name busyboxtest busybox`. The command above mounts the host's /host/data directory onto the container's /data directory.
It can also be mounted read-only
docker run -d -v /host/data:/data:ro --name busyboxtest busybox
If the /data path already exists in the container, Docker overwrites the directory with the contents of /host/data, as the mount command does
Data volume backup, dump, and migration
```
docker run --rm --volumes-from dbdata -v $(pwd):/backup ubuntu tar cvf /backup/backup.tar /dbdata
```
The above command creates a container that mounts the DBData volume and the current directory of the host to the /backup directory of the container. Then run the tar command in the container to package the contents of the dbdata volume and save it to the backup.tar file in the /backup directory.
After the container is executed, the backup file is displayed in the current directory of the host. You can then restore the backup files to the current container or a newly created container to complete data backup and migration
Problems with Docker volume management
- Only local data volumes are supported
- Lack of effective management of the data volume life cycle
Using volume plug-ins
Introduction to Volume Plug-in
Developers can write volume plug-ins to suit their own needs; plug-ins can mount local or remote storage volumes into containers more conveniently and flexibly, and provide richer functionality (such as snapshots and backup) than Docker's built-in volume management.
Existing volume plug-ins
- Convoy: a standalone plug-in based on local storage
- Flocker: supports multiple backend storage drivers
Docker API

About the Docker API

Introduction to REST
REST stands for Representational State Transfer. Generally speaking, an architecture that satisfies the REST design constraints is called a RESTful architecture.
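The Docker daemon exposes such a RESTful API on a Unix socket; a quick way to poke at it (my own example, assuming a curl new enough to support --unix-socket):

```bash
# Ask the local Docker daemon for its version and the list of running containers
curl --unix-socket /var/run/docker.sock http://localhost/version
curl --unix-socket /var/run/docker.sock http://localhost/containers/json
```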
Docker security

Understanding Docker security in depth

The security of Docker

Docker security is mainly reflected in the following aspects:
- Docker container security: whether a container can endanger the host or other containers
- Image security: how users can be sure that a downloaded image is trusted and has not been tampered with
- Docker daemon security: how to ensure that commands sent to the daemon were issued by trusted users
Docker container security

Security policies
Cgroup
How does Docker use cgroups
- Limit CPU
- Limit memory
- Limit block device I/O
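These limits are exposed as docker run flags; a small sketch (my own example, the values are arbitrary):

```bash
# Cap memory at 512 MB, halve the default CPU share weight,
# and lower the relative block-I/O weight
docker run -d -m 512m --cpu-shares 512 --blkio-weight 300 --name limited busybox sleep 3600
```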
ulimit
Linux has a ulimit command that limits certain types of resources, including the size of core dump files, the size of the process data segment, the maximum size of created files, the resident set size, the number of open files, the process stack size, CPU time, the maximum number of processes a single user may create, and the maximum virtual memory of a process.
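Docker exposes the same knob per container through the --ulimit flag of docker run; a quick sketch (my own example):

```bash
# Limit open files to 1024 (soft) / 2048 (hard) inside the container and print the soft limit
docker run --rm --ulimit nofile=1024:2048 busybox sh -c "ulimit -n"
```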
Container networking
If the access isolation between containers is insufficient, trusted and untrusted containers can be placed on different networks to reduce the risk.
Container + full virtualization
If a container runs inside a fully virtualized environment (for example, a container running in a virtual machine), then even if the container is broken out of, the virtual machine still protects the host.
Image signature
When a publisher pushes an image to a remote repository, Docker signs the image with a private key. Later, when someone pulls the image, Docker uses the publisher's public key to verify that the image is consistent with what the publisher released, has not been tampered with, and is the latest version.
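In Docker this is surfaced through the content trust mechanism; a minimal sketch (my own example) of turning it on for a shell session:

```bash
# With content trust enabled, pull/push operations verify and produce image signatures
export DOCKER_CONTENT_TRUST=1
docker pull busybox:latest    # fails if the tag has no trusted signature
```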
Log audit
The currently supported log drivers are none, json-file, syslog, gelf, and fluentd; the default is json-file.
monitoring
When using containers, you should monitor the information about containers. If anomalies are found, you can take measures to rectify them in time
File system-level protection
Docker can set the container’s root file system to read-only mode. The advantage of read-only mode is that even if the container and host use the same file system, there is no need to worry about affecting or even destroying the host’s root file system
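This is controlled by the --read-only flag of docker run; a quick sketch (my own example):

```bash
# The root file system is mounted read-only, so writes to it fail
docker run --rm --read-only busybox sh -c "touch /newfile || echo rootfs is read-only"
```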
capability
Capabilities break up the Linux division between the superuser and ordinary users, allowing an ordinary user to be granted individual pieces of the work that otherwise only the superuser could do.
SELinux
Security-Enhanced Linux, the NSA's implementation of mandatory access control, in which a process may only access the files required for its task.
AppArmor
Its main function is to set the access control permission of an executable program
Seccomp
Secure computing mode (seccomp) is a security feature provided by the Linux kernel. Seccomp can sandbox an application, restricting the system calls it may make through a whitelist or blacklist.
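Docker applies a seccomp profile to containers and lets you override it through --security-opt; a rough sketch (my own example; requires a Docker version with seccomp support, and the profile path is hypothetical):

```bash
# Disable the default seccomp profile, or supply a custom one
docker run --rm --security-opt seccomp=unconfined busybox true
docker run --rm --security-opt seccomp=/path/to/profile.json busybox true
```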
grsecurity
It can be used to control resource access permissions
A few Docker-security-related projects

- Notary: ensures that server and client interact over trusted connections
- docker-bench-security: checks whether the user's production environment follows Docker security best practices
Security reinforcement
Escape to the host

Exploiting vulnerabilities in the virtualization software, or in software running inside the virtual machine, to attack or take control of the host operating system.
Afterword.
Link to the book: Docker Advanced and Practice
If you find any mistakes in the article, feel free to point them out in the comments, or add me on WeChat (GM4118679254) to discuss. If you need the PDF version of this book, you can message me privately.