These notes are mainly excerpts from the book written by the Huawei Docker practice team. They are divided into two parts and will be updated over time. Discussion is welcome.

Introduction

Clarifying concepts

What Docker adds on top of LXC

What LXC means

  • LXC, the user-mode tool
  • Linux Container, short for the kernel's container technology

LXC usually refers to the second meaning. On top of those kernel features, Docker adds cross-host deployment, an application-centric model, automated builds, version management, component reuse, sharing, and a tool ecosystem.

Docker containers and virtual machines

  • Virtual machine: a complete solution for partitioning hardware resources. It uses hardware virtualization and a hypervisor layer to achieve full isolation of resources.
  • Container: virtualization at the operating system level. It uses the kernel's Cgroup and Namespace features to isolate a process from other processes purely in software, without any additional assistance.

Summary comparison with virtual machines

Feature          Container                                  Virtual machine
Startup          Seconds                                    Minutes
Disk usage       Typically MB                               Typically GB
Performance      Close to native                            Weaker than native
System support   Thousands of containers on a single host   Usually dozens

About Container Technology

The past and present of container technology

About Container Technology

Container technology mainly consists of Cgroup and Namespace.

  • Namespace: mainly provides access isolation. The idea is to abstract a class of resources, wrap them together, and present them to a container. Each container has its own abstraction of these resources, invisible to the others, which is how access isolation is achieved.
  • Cgroup: short for control group, mainly provides resource control. The idea is to put a group of processes into a control group and then control the resources available to those processes by allocating specific resources to the group.

Understand containers in a minute

Composition of a container

Container = Cgroup + Namespace + rootfs + container engine (user-mode tool)

  • Cgroup: resource control
  • Namespace: access isolation
  • rootfs: file system
  • Container engine: lifecycle control

How containers are created

Code 1:

/* One CLONE_NEW* flag per namespace the child process should get. */
flags = CLONE_NEWPID | CLONE_NEWNS | CLONE_NEWUSER | CLONE_NEWNET |
        CLONE_NEWIPC | CLONE_NEWUTS | ...;
pid = clone(fun, stack, flags, clone_arg);

The clone system call, given the clone flag for each Namespace, creates a new child process that has its own PID, mount, user, network, IPC, and UTS Namespaces.
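As a small illustration of the flags argument in the clone call above, the sketch below combines the six namespace flags into one bitmask. The numeric values are taken from the Linux header linux/sched.h; the names NAMESPACE_FLAGS, combine_flags, and namespaces_in are made up for this example. It only computes the mask and does not call clone.

```python
# Namespace flag values as defined in <linux/sched.h>.
NAMESPACE_FLAGS = {
    "mount": 0x00020000,  # CLONE_NEWNS
    "uts": 0x04000000,    # CLONE_NEWUTS
    "ipc": 0x08000000,    # CLONE_NEWIPC
    "user": 0x10000000,   # CLONE_NEWUSER
    "pid": 0x20000000,    # CLONE_NEWPID
    "net": 0x40000000,    # CLONE_NEWNET
}


def combine_flags(namespaces):
    """OR together the clone flags for the requested namespaces."""
    flags = 0
    for name in namespaces:
        flags |= NAMESPACE_FLAGS[name]
    return flags


def namespaces_in(flags):
    """List which namespaces a flags value asks clone to create."""
    return sorted(name for name, bit in NAMESPACE_FLAGS.items() if flags & bit)


if __name__ == "__main__":
    all_flags = combine_flags(NAMESPACE_FLAGS)
    print(hex(all_flags))
    print(namespaces_in(all_flags))
```

Keeping one bit per namespace is what lets a caller pick any subset, for example a new PID and network Namespace only.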

Code 2:

echo $pid > /sys/fs/cgroup/cpu/tasks
echo $pid > /sys/fs/cgroup/cpuset/tasks
echo $pid > /sys/fs/cgroup/blkio/tasks
echo $pid > /sys/fs/cgroup/memory/tasks
echo $pid > /sys/fs/cgroup/devices/tasks
echo $pid > /sys/fs/cgroup/freezer/tasks

Write the PID of the process created in code 1 into each Cgroup subsystem so that the process is controlled by the corresponding subsystem.
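The same step can be sketched in Python. This is only an illustration under the assumption of a cgroup v1 layout like the one in code 2; the function name attach_to_cgroups and the root parameter are made up here. The root parameter exists so the sketch can be tried against a temporary directory without root privileges.

```python
import os
import tempfile

# The subsystems written to in code 2.
SUBSYSTEMS = ["cpu", "cpuset", "blkio", "memory", "devices", "freezer"]


def attach_to_cgroups(pid, root="/sys/fs/cgroup", subsystems=SUBSYSTEMS):
    """Write pid into <root>/<subsystem>/tasks for each subsystem."""
    written = []
    for subsystem in subsystems:
        tasks_file = os.path.join(root, subsystem, "tasks")
        # Mirrors `echo $pid > tasks`.
        with open(tasks_file, "w") as f:
            f.write(f"{pid}\n")
        written.append(tasks_file)
    return written


if __name__ == "__main__":
    # Dry run against a temporary directory instead of the real cgroup tree.
    with tempfile.TemporaryDirectory() as fake_root:
        for subsystem in SUBSYSTEMS:
            os.makedirs(os.path.join(fake_root, subsystem))
        for path in attach_to_cgroups(os.getpid(), root=fake_root):
            print(path)
```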

Code 3:

fun()
{
    ...
    pivot_root("path_of_rootfs/", path);
    ...
    exec("/bin/bash");
    ...
}

The new process created above executes the function fun. Inside fun, the process enters a new rootfs through the pivot_root system call and then, through the exec system call, runs the "/bin/bash" program in the new Namespace, Cgroup, and rootfs.

Introduction to Cgroup

What is Cgroup

Cgroup is short for Control Group. It is a feature of the Linux kernel used to limit and isolate the system resources used by a group of processes, that is, to apply QoS to resources, including CPU, memory, block I/O, and network bandwidth.

The subsystems implemented in Cgroup and their functions are as follows:

  • devices: Device access control
  • cpuset: Assigns specific CPUs and memory nodes
  • cpu: Controls the share of CPU usage
  • cpuacct: Reports CPU usage statistics
  • memory: Limits the maximum memory usage
  • freezer: Freezes (pauses) the processes in a Cgroup
  • net_cls: Works with tc (traffic controller) to limit network bandwidth
  • net_prio: Sets the network traffic priority of processes
  • hugetlb: Limits the use of HugeTLB
  • perf_event: Allows the perf tool to do performance monitoring per Cgroup
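To see which subsystem controls a given process, Linux exposes the file /proc/<pid>/cgroup, where each line has the form hierarchy-ID:controller-list:cgroup-path. The sketch below parses that format. The sample text and the container ID abc123 in it are made up, and the function name parse_proc_cgroup is hypothetical.

```python
def parse_proc_cgroup(text):
    """Map each controller (subsystem) name to the cgroup path of the process."""
    membership = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split at most twice so that a path containing ":" stays intact.
        _hierarchy_id, controllers, path = line.split(":", 2)
        for controller in controllers.split(","):
            if controller:
                membership[controller] = path
    return membership


# Made-up sample in the cgroup v1 format.
SAMPLE = """\
4:memory:/docker/abc123
3:cpu,cpuacct:/docker/abc123
2:freezer:/
"""

if __name__ == "__main__":
    for controller, path in sorted(parse_proc_cgroup(SAMPLE).items()):
        print(controller, path)
```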

Introduction to Namespace

What is Namespace

A Namespace wraps a global kernel resource so that each Namespace has its own independent instance of it. Processes in different Namespaces therefore do not interfere with each other when they use the same kind of resource.

The Linux kernel implemented six namespaces when the book was written (later kernels added the cgroup and time namespaces):

  • IPC: Isolates System V IPC and POSIX message queues
  • Network: Isolates network resources
  • Mount: Isolates file system mount points
  • PID: Isolates process IDs
  • UTS: Isolates the host name and domain name
  • User: Isolates user IDs and group IDs
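On Linux, the Namespaces a process belongs to can be read from /proc/<pid>/ns, where each entry is a symbolic link whose target identifies the Namespace. The sketch below reads those links and compares two processes. It is a minimal illustration with hypothetical function names, and the identifiers in the usage note are invented.

```python
import os


def read_namespaces(pid="self"):
    """Return {namespace name: identifier} read from /proc/<pid>/ns (Linux only)."""
    ns_dir = f"/proc/{pid}/ns"
    return {
        name: os.readlink(os.path.join(ns_dir, name))
        for name in sorted(os.listdir(ns_dir))
    }


def shared_namespaces(a, b):
    """Names of the namespaces whose identifiers match in both mappings."""
    return sorted(name for name in a if name in b and a[name] == b[name])


if __name__ == "__main__":
    if os.path.isdir("/proc/self/ns"):
        for name, ident in read_namespaces().items():
            print(name, ident)
```

For example, a host process with {"pid": "pid:[1]", "net": "net:[7]"} and a container process with {"pid": "pid:[2]", "net": "net:[7]"} share only the network Namespace, which is what the host network mode would look like.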

Understanding Docker images

Introduction to the Docker image concept

A Docker image is a read-only template used to start containers. It provides the rootfs required for container startup, similar to the image used by a virtual machine.

A Docker image is written in the form remote-dockerhub.com/namespace/bar:latest

  • remote-dockerhub.com: the remote image hub, the address of the web server that stores images centrally (if omitted, the default image registry is used, that is, the official Docker registry)
  • namespace: similar to a user or organization on GitHub, the collection of all images that belong to one user or organization
  • bar: the repository, similar to a GitHub repository. One repository can hold many images, which are distinguished by tag
  • latest: the tag, similar to a tag in a Git repository, commonly used to distinguish versions of the same image
  • Layer: an image consists of a series of layers, each identified by a 64-character hexadecimal ID, similar to a commit in a Git repository
  • Image ID: the ID of the topmost layer of an image is the image's ID. repo:tag gives a name that is easy for people to recognize, while the ID is convenient for scripts and image operations
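The naming scheme above can be sketched as a small parser. This is a simplified illustration, not Docker's own implementation: the function name parse_image_reference is made up, the rule for spotting a server address (a first component that contains "." or ":" or equals "localhost") is a heuristic, and digest references are not handled.

```python
def parse_image_reference(ref, default_tag="latest"):
    """Split [server/][namespace/]repository[:tag] into its parts."""
    parts = ref.split("/")
    registry = None
    # Heuristic: a leading component that looks like a host name is the server.
    first = parts[0]
    if len(parts) > 1 and ("." in first or ":" in first or first == "localhost"):
        registry = parts.pop(0)
    repository, sep, tag = parts.pop().partition(":")
    if not sep:
        tag = default_tag
    namespace = "/".join(parts) or None
    return {
        "registry": registry,
        "namespace": namespace,
        "repository": repository,
        "tag": tag,
    }


if __name__ == "__main__":
    print(parse_image_reference("remote-dockerhub.com/namespace/bar:latest"))
    print(parse_image_reference("busybox"))
```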

Using Docker images

Docker provides a set of built-in commands to create, manage, upload, and download images. You can call the REST API to send commands to the Docker daemon, or use the CLI commands provided by the client.

List local images

docker images

Build: create an image

  • Download directly: docker pull busybox
  • Import an image
    • docker save -o busybox.tar busybox exports the busybox image to busybox.tar
    • docker load -i busybox.tar imports the image
  • Make a new image: docker import imports a root file system archive and turns it into a Docker image. It is often used to make Docker base images. By contrast, docker export exports a container's file system as a root file system archive

Ship: transfer an image

Image transfer is the bridge between development and deployment. Images can be transferred through a Docker image repository, as tar packages generated by docker export or docker save, or indirectly through a Dockerfile, the template file for building an image.

Run: start a container with an image as the template

To start a container, use the docker run command.

  • The command names are not fully consistent: docker ps lists containers, while docker images lists images
  • docker inspect shows details about containers and images

Organizational structure of a Docker image

A Docker image contains data and the necessary metadata. The data consists of image layers. The metadata is a set of JSON files that describe the relationships between the layers and some container configuration.

Copy-on-write

When a parent process forks a child, the kernel does not allocate memory for the child (apart from the basic process control block and stack). The child shares the parent's memory instead. When either of them modifies the shared memory, a page fault is triggered and the memory is actually allocated. This speeds up the creation of child processes and also reduces memory consumption.
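The effect of copy-on-write can be observed from user space, although the page sharing itself cannot. The following is a minimal sketch, assuming a Unix-like system where os.fork is available; the function name fork_and_modify is made up. It shows that a write in the child does not change what the parent sees, which is the guarantee the kernel's copy-on-write preserves.

```python
import os


def fork_and_modify():
    """Fork, let the child modify a value, and return (parent view, child view)."""
    data = ["shared"]
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: this write lands on the child's private copy of the memory.
        os.close(read_fd)
        data[0] = "changed by child"
        os.write(write_fd, data[0].encode())
        os._exit(0)
    # Parent: read what the child saw, then look at our own copy.
    os.close(write_fd)
    child_view = os.read(read_fd, 1024).decode()
    os.close(read_fd)
    os.waitpid(pid, 0)
    return data[0], child_view


if __name__ == "__main__":
    parent_view, child_view = fork_and_modify()
    print("parent sees:", parent_view)
    print("child saw:", child_view)
```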

Repository

What is a repository

A repository stores Docker images centrally and supports image distribution and updates.

Composition of a repository

A repository name usually has two parts separated by a slash: the user name comes first, followed by the image name.

Repository images

A repository contains a set of images, distinguished by tags. A complete image path usually consists of the server address, the repository name, and the tag:

registry.hub.docker.com/official/ubuntu:14.04

It refers to the official Ubuntu image on Docker Hub, release 14.04.

  • docker push localhost:5000/official/ubuntu:14.04 uploads an image to a local private repository. If no server address is given, the image is uploaded to the official Docker Hub by default
  • docker pull ubuntu:14.04 downloads an image. If no server address is given, the image is downloaded from the official Docker Hub by default
  • docker search localhost:5000/ubuntu searches for images

A look at Docker Hub

Advantages of Docker Hub

  • Offers developers a large number of Docker images that are free to download, study, and use
  • Has a mature account management system and offers paid expansion to users
  • Servers are deployed in a distributed fashion and support load balancing
  • Supports uploading, downloading, searching, and deleting images and setting image properties
  • Supports online image builds
  • The backend uses distributed storage, which supports disaster recovery and backup
  • Its core is Docker Distribution, which is designed and maintained by the open source community and is continuously updated and improved
  • An enterprise edition of Docker Hub is available, offering a one-stop solution for enterprise users

Account management system

  • Users can edit their registration information, such as password and email
  • Users can create and delete images
  • Paid users can create and configure private images
  • Users can create and maintain organizations and add team members
  • Users can follow each other

Repository service

Registry functions and architecture

The Registry is designed to create, store, distribute, and update images.

  • Image storage: image data is stored in the Registry backend. As with local image storage, an image is split into multiple layers, each stored in its own directory in tar format. The manifest file (manifest) and the image signature file (signature) are kept as well.
  • Image creation, distribution, and update: local users and the Registry transfer images through the Registry API. The Registry API is a set of HTTP/HTTPS requests that send user requests to the Registry and receive its responses. The requests and responses carry the image data
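To make the Registry API requests more concrete, the sketch below builds the URLs for fetching a manifest and a layer blob. The /v2/<name>/manifests/<reference> and /v2/<name>/blobs/<digest> routes follow the Registry HTTP API V2 layout, which is an addition here and not spelled out in these notes. The helper names are hypothetical, and the sketch only builds strings without contacting a server.

```python
def manifest_url(registry, name, reference, scheme="https"):
    """URL of an image manifest; reference is a tag or a digest."""
    return f"{scheme}://{registry}/v2/{name}/manifests/{reference}"


def blob_url(registry, name, digest, scheme="https"):
    """URL of one layer blob, addressed by its digest."""
    return f"{scheme}://{registry}/v2/{name}/blobs/{digest}"


if __name__ == "__main__":
    # localhost:5000 and official/ubuntu are the example names used in these notes;
    # the digest below is a placeholder, not a real value.
    print(manifest_url("localhost:5000", "official/ubuntu", "14.04", scheme="http"))
    print(blob_url("localhost:5000", "official/ubuntu", "sha256:<digest>", scheme="http"))
```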

Registry features

  • Fast image upload and download
  • Novel design and high performance
  • Easy to deploy
  • Detailed and complete Registry API documentation
  • The backend supports several distributed cloud storage solutions (s3, azure) as well as the local file system. The interfaces are plug-ins and are easy to configure
  • The manifest file (Manifest) records image information as metadata
  • A notification system implemented with Webhooks
  • Native TLS support for secure HTTPS access
  • A configurable authentication module
  • A health check module
  • Clearly formatted manifest and digest files for managing images, which makes image labeling clearer
  • A well-developed image cache that makes image downloads faster

Registry API

The Registry API follows REST design principles. It is used for communication between the Registry and the Docker Engine and implements image distribution, which makes it an important part of the Docker Registry.

The objects transferred through the API are mainly the blobs of image layers and manifests.

The Manifest file is a JSON file that records the metadata of an image and stays compatible with V1 image information.

{
    "name": <name>,
    "tag": <tag>,
    "fsLayers": [
        {
            "blobSum": <tarsum>
        },
        ...
    ],
    "history": [...],
    "signatures": [...]
}
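As an illustration of how a client might use such a manifest, the sketch below collects the layer digests that would have to be downloaded. The sample manifest is invented: the name, tag, and blobSum values are placeholders, and layer_digests is a hypothetical helper.

```python
import json

# Invented sample in the shape of the manifest above.
SAMPLE_MANIFEST = """
{
  "name": "official/ubuntu",
  "tag": "14.04",
  "fsLayers": [
    {"blobSum": "sha256:aaaa"},
    {"blobSum": "sha256:bbbb"},
    {"blobSum": "sha256:aaaa"}
  ],
  "history": [],
  "signatures": []
}
"""


def layer_digests(manifest_text):
    """Return the unique blobSum values in the order they first appear."""
    manifest = json.loads(manifest_text)
    seen = []
    for layer in manifest.get("fsLayers", []):
        digest = layer["blobSum"]
        if digest not in seen:
            seen.append(digest)
    return seen


if __name__ == "__main__":
    for digest in layer_digests(SAMPLE_MANIFEST):
        print(digest)
```

Dropping repeated digests matters because the same layer can appear more than once in a manifest and only needs to be fetched once.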

Authentication mechanism

The authentication mechanism is a new feature introduced with Registry V2. It checks whether a user request has the required permissions. The Docker Engine, the Registry, and the Auth Service work together to verify and control access rights.
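In the Registry V2 token scheme, an unauthenticated request is answered with a 401 response whose WWW-Authenticate header tells the Docker Engine which Auth Service to ask for a token. The sketch below parses such a header and builds the token request URL. The host names and helper names are hypothetical, and the header layout is stated from the Registry token authentication specification rather than from these notes.

```python
import re
from urllib.parse import urlencode


def parse_bearer_challenge(header):
    """Parse 'Bearer realm="...",service="...",scope="..."' into a dict."""
    scheme, _, params = header.partition(" ")
    if scheme.lower() != "bearer":
        raise ValueError("not a Bearer challenge")
    return dict(re.findall(r'(\w+)="([^"]*)"', params))


def token_request_url(challenge):
    """Build the URL the client would request from the Auth Service."""
    query = {key: challenge[key] for key in ("service", "scope") if key in challenge}
    return challenge["realm"] + "?" + urlencode(query)


if __name__ == "__main__":
    # Hypothetical challenge, as a Registry might send it in a 401 response.
    header = (
        'Bearer realm="https://auth.example.com/token",'
        'service="registry.example.com",'
        'scope="repository:official/ubuntu:pull"'
    )
    print(token_request_url(parse_bearer_challenge(header)))
```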

Deploying a private repository

Running a private repository

Docker Private Registry is used to build private repositories and manage private Docker images. Advantages of deploying a private service:

  • The private repository can be developed and operated independently
  • Saves bandwidth
  • Has its own account management system
  • Allows added custom features

Setting up a private repository requires deploying the Docker Private Registry first:

docker run -d --hostname localhost --name registry-v2 -v /opt/data/distribution:/var/lib/registry/docker/registry/v2 -p 5000:5000 registry:2.0

Building a reverse proxy

In practice, exposing the host port directly is not secure. If the Registry has no access proxy in front of it, any user can reach it directly through the port, so an HTTPS reverse proxy should be placed in front of it. The proxy server accepts HTTPS requests from users, forwards them to the Registry server on the internal network, and returns the Registry's responses to the users.

Index and advanced repository features

Index functions and composition

Index functions:

  • Manages registered Docker Private Hub users and authenticates their access rights
  • Stores and updates user information and verification data such as tokens
  • Stores Docker metadata
  • Records the history of image operations performed by users
  • Provides a Web UI through which users can easily access and update resources

The Index consists of a control unit, an authentication module, a database, health checks, and a logging system.

Docker networking

Current state of Docker networking

Libnetwork proposes a new Container Network Model (CNM), which defines a standard API for configuring container networks. Underneath, the API can adapt to various network drivers. CNM has three important concepts:

  • Sandbox: an isolated network runtime environment that holds the configuration of the container's network stack, including management of network interfaces, the routing table, and DNS settings.
  • Endpoint: connects a Sandbox to a Network. An Endpoint can be implemented as a veth pair or an OVS internal port. Libnetwork currently uses a veth pair
  • Network: a group of Endpoints that can communicate with each other. A Network can be implemented as a Linux bridge, a vlan, and so on.

From the CNM point of view, Libnetwork gives Docker multi-subnet networking across hosts: containers in the same subnet can run on different hosts.
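The relationship between the three CNM concepts can be sketched as a toy data model. This is only an illustration of the concepts, not Libnetwork's API: every class, method, and name below is made up, and "can communicate" is reduced to "has an Endpoint on the same Network".

```python
from dataclasses import dataclass, field


@dataclass
class Endpoint:
    name: str
    network: str  # name of the Network this Endpoint belongs to
    sandbox: str  # container whose Sandbox this Endpoint is attached to


@dataclass
class Network:
    name: str
    endpoints: list = field(default_factory=list)


@dataclass
class Sandbox:
    container: str
    endpoints: list = field(default_factory=list)

    def join(self, network, endpoint_name):
        """Create an Endpoint that connects this Sandbox to the Network."""
        endpoint = Endpoint(endpoint_name, network.name, self.container)
        network.endpoints.append(endpoint)
        self.endpoints.append(endpoint)
        return endpoint


def can_communicate(a, b):
    """Two Sandboxes can talk if they have Endpoints on a common Network."""
    networks_a = {endpoint.network for endpoint in a.endpoints}
    networks_b = {endpoint.network for endpoint in b.endpoints}
    return bool(networks_a & networks_b)


if __name__ == "__main__":
    frontend, backend = Network("frontend"), Network("backend")
    web, db = Sandbox("web"), Sandbox("db")
    web.join(frontend, "web-front")
    web.join(backend, "web-back")
    db.join(backend, "db-back")
    print(can_communicate(web, db))
```

One Sandbox can hold several Endpoints, which is how a single container ends up attached to more than one Network.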

Libnetwork drivers

  • bridge: Docker's default container network driver
  • host: the container shares the host's Network Namespace, and with it the same network protocol stack, routing table, iptables rules, and so on, so the container and the host see the same network view
  • null: the network inside the container is empty. Network interfaces and routes must be configured for the container manually
  • remote: Docker's network plug-in implementation. The remote driver lets Libnetwork connect to third-party network solutions through an HTTP RESTful API. An SDN solution such as SocketPlane can replace Docker's native network implementation as long as it implements the agreed HTTP URL handlers and the underlying network interface configuration
  • overlay: Docker's native cross-host multi-subnet network solution

Basic Network Configuration

A first look at Docker networking

Docker's five container network modes

  • none: no network functions are configured for the container
  • container: shares a Network Namespace, and therefore the same network view, with another running container
  • host: shares the host's root Network Namespace. This mode is considered insecure because the container has full permission to manipulate the host's protocol stack, routing table, firewall, and so on
  • bridge: the NAT network model designed by Docker
  • overlay: Docker's native cross-host multi-subnet model

Advanced network solutions

  • Weave
  • Flannel
  • SocketPlane

Container Volume Management

Docker volume management basics

Add a new data volume

docker run -d -v /tmp/data --name busyboxtest busybox

You can run the docker inspect command to find the location of the data volume on the host.

Mount a host directory as a data volume

The -v parameter can also mount a host directory as a data volume:

docker run -d -v /host/data:/data --name busyboxtest busybox

The preceding command mounts the /host/data directory of the host at the /data directory of the container.

A directory can also be mounted read-only:

docker run -d -v /host/data:/data:ro --name busyboxtest busybox

If the /data path already exists in the container, Docker covers its contents with the contents of /host/data, just as the mount command does.
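The three forms of the -v argument used above (container path only, host:container, and host:container:ro) can be told apart with a small parser. This is a simplified sketch with a made-up function name. It covers only the forms shown in these notes and is not how Docker itself parses the option.

```python
def parse_volume_spec(spec):
    """Parse 'container', 'host:container', or 'host:container:mode'."""
    parts = spec.split(":")
    if len(parts) == 1:
        # Only a container path: Docker picks the location on the host.
        host, container, mode = None, parts[0], "rw"
    elif len(parts) == 2:
        host, container = parts
        mode = "rw"
    elif len(parts) == 3:
        host, container, mode = parts
    else:
        raise ValueError(f"unsupported volume spec: {spec}")
    if mode not in ("ro", "rw"):
        raise ValueError(f"unknown mode: {mode}")
    return {"host": host, "container": container, "read_only": mode == "ro"}


if __name__ == "__main__":
    for spec in ("/tmp/data", "/host/data:/data", "/host/data:/data:ro"):
        print(spec, parse_volume_spec(spec))
```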

Data volume backup, dump, and migration

docker run --rm --volumes-from dbdata -v $(pwd):/backup ubuntu tar cvf /backup/backup.tar ...

The above command creates a container that mounts the dbdata volume, and mounts the current directory of the host at the /backup directory of the container. It then runs the tar command in the container to pack the contents of the dbdata volume into the backup.tar file in the /backup directory.

After the container exits, the backup file is in the current directory of the host. You can then restore the backup into the current container or a newly created one to complete data backup and migration.
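What the tar step does, and what a later restore would do, can be sketched with Python's tarfile module. This is an illustration only: the directory names stand in for the data volume and the backup directory, the function names are made up, and no container is involved.

```python
import os
import tarfile
import tempfile


def backup_dir(source_dir, archive_path):
    """Pack source_dir into a tar archive, like `tar cvf archive source_dir`."""
    with tarfile.open(archive_path, "w") as tar:
        tar.add(source_dir, arcname=os.path.basename(os.path.normpath(source_dir)))
    return archive_path


def restore_dir(archive_path, target_dir):
    """Unpack the archive under target_dir, like `tar xvf archive -C target_dir`."""
    os.makedirs(target_dir, exist_ok=True)
    with tarfile.open(archive_path, "r") as tar:
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions expect an explicit extraction filter.
            tar.extractall(target_dir, filter="data")
        else:
            tar.extractall(target_dir)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as work:
        volume = os.path.join(work, "dbdata")  # stands in for the data volume
        os.makedirs(volume)
        with open(os.path.join(volume, "records.txt"), "w") as f:
            f.write("hello\n")
        archive = backup_dir(volume, os.path.join(work, "backup.tar"))
        restore_dir(archive, os.path.join(work, "restored"))
        print(os.listdir(os.path.join(work, "restored", "dbdata")))
```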

Problems with Docker volume management

  • Only local data volumes are supported
  • The life cycle of data volumes is not managed effectively

Using volume plug-ins

Introduction to volume plug-ins

Developers can write volume plug-ins for their own needs, which makes it more convenient and flexible to mount local or remote storage volumes into local containers and provides richer functions (such as snapshots and backups) than Docker's own volume management.

Existing volume plug-ins

  • Convoy: a standalone plug-in based on local storage
  • Flocker: supports multiple backend storage drivers

Docker API

About the Docker API

Introduction to REST

REST stands for Representational State Transfer. Generally speaking, an architecture whose design satisfies the REST constraints is called a RESTful architecture.

Docker security

A deeper understanding of Docker security

Docker security

Docker security mainly covers the following aspects:

  • Docker container security: whether a container can harm the host or other containers
  • Image security: how users can make sure that downloaded images are trusted and have not been tampered with
  • Docker daemon security: how to make sure that commands sent to the daemon come from trusted users.

Docker container security

Security policies

Cgroup

How Docker uses Cgroups:

  • Limit CPU
  • Limit memory
  • Limit block device I/O

ulimit

Linux has a ulimit command that limits certain types of resources, including the size of core dump files, the size of the process data segment, the size of files that can be created, the resident set size, the number of open files, the process stack size, CPU time, the maximum number of threads for a single user, and the maximum virtual memory of a process.
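The limits that ulimit manages can be read from a program through the standard getrlimit interface. The sketch below uses Python's resource module on a Unix-like system to print a few of the limits listed above as (soft, hard) pairs. The dictionary of labels and the helper names are made up, and only limits that the module exposes on common Unix platforms are included.

```python
import resource

# A subset of the limits named above, mapped to their getrlimit constants.
LIMITS = {
    "core file size": resource.RLIMIT_CORE,
    "data segment size": resource.RLIMIT_DATA,
    "file size": resource.RLIMIT_FSIZE,
    "open files": resource.RLIMIT_NOFILE,
    "stack size": resource.RLIMIT_STACK,
    "cpu time": resource.RLIMIT_CPU,
    "virtual memory": resource.RLIMIT_AS,
}


def format_limit(value):
    """Render a limit value the way ulimit does for unlimited resources."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def current_limits():
    """Return {label: (soft, hard)} for the current process."""
    return {label: resource.getrlimit(const) for label, const in LIMITS.items()}


if __name__ == "__main__":
    for label, (soft, hard) in current_limits().items():
        print(f"{label}: soft={format_limit(soft)} hard={format_limit(hard)}")
```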

Container networking

If the isolation between containers is considered insufficient, trusted and untrusted containers can be placed on different networks to reduce risk.

Container + full virtualization

If containers run in a fully virtualized environment (for example, inside a VM), then even if a container is compromised, the VM still provides a layer of protection.

Image signature

When a publisher pushes an image to a remote repository, Docker signs the image with the publisher's private key. When someone later pulls the image, Docker uses the publisher's public key to verify that the image matches what was published, has not been tampered with, and is the latest version.

Log audit

The currently supported log driver types are none, json-file, syslog, gelf, and fluentd. The default is json-file.

Monitoring

When using containers, monitor them so that anomalies can be detected and handled in time.

File system-level protection

Docker can set a container's root file system to read-only mode. The advantage is that even if the container and the host use the same file system, there is no need to worry about the container affecting or even damaging the host's root file system.

capability

Capabilities break up the traditional Linux split between superuser and normal user, allowing a normal user to perform tasks that otherwise only the superuser could.

SELinux

Security-Enhanced Linux is the NSA's implementation of mandatory access control, under which a process can access only the files its task requires.

AppArmor

Its main function is to set the access control permissions of an executable program.

Seccomp

Secure Computing Mode (Seccomp) is a security feature provided by the Linux kernel. Seccomp can implement the sandbox mechanism of application programs and restrict system calls by whitelist or blacklist
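The difference between the two policies can be shown with a toy model. The sketch below does not install a real Seccomp filter. It only models the decision "is this system call allowed" for a whitelist and a blacklist, and the function name and the chosen system call names are just for illustration.

```python
def make_syscall_filter(names, mode):
    """Return a function that says whether a system call name is allowed."""
    names = frozenset(names)
    if mode == "whitelist":
        # Only the listed calls are allowed; everything else is denied.
        return lambda syscall: syscall in names
    if mode == "blacklist":
        # The listed calls are denied; everything else is allowed.
        return lambda syscall: syscall not in names
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    allow_only = make_syscall_filter({"read", "write", "exit"}, "whitelist")
    deny_some = make_syscall_filter({"mount", "reboot"}, "blacklist")
    for name in ("read", "mount", "open"):
        print(name, "whitelist:", allow_only(name), "blacklist:", deny_some(name))
```

A whitelist fails closed: a system call nobody thought about is denied. A blacklist fails open, which is why whitelists are generally the stricter choice.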

grsecurity

grsecurity can be used to control resource access permissions.

A few Docker security-related projects

  • Notary: ensures that server and client interact over trusted connections
  • docker-bench-security: checks whether the user's production environment follows Docker security practices

Security reinforcement

Host escape

Host escape means exploiting vulnerabilities in the virtualization software, or in software running inside the VM, to attack or take control of the host operating system.

Afterword

The book these notes are based on: Docker Advanced and Practical

If you find a mistake in the article, please point it out in the comments, or add me on WeChat (GM4118679254) to discuss. If you need the PDF version of this book, send me a private message.