Kubernetes-CRI

1 Containers

1.1 A Brief History of Containers

[Figure: container_timeline]

1.2 Docker Components

docker (/usr/bin/docker): In early versions (before 1.11.x), this single binary provided all of Docker's functionality, acting as both docker-cli and docker-daemon. From 1.11.x onward, docker acts only as docker-cli.

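dockerd (/usr/bin/dockerd): Serves the Docker Engine API. The API can be called over three kinds of socket: unix, tcp, and fd (unix is the default). In practice dockerd is only a middle layer: container and image lifecycle management is actually carried out by containerd, and dockerd talks to containerd over gRPC.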
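To make the unix-socket transport concrete, here is a minimal sketch that sends a plain HTTP request to the Docker Engine API over a unix domain socket, using only the Python standard library. The class name `UnixHTTPConnection` and the helper `engine_get` are made up for this illustration. The socket path `/var/run/docker.sock` is the usual default, but treat it as an assumption about your host.

```python
import http.client
import json
import socket

# Assumed default location of the dockerd unix socket; adjust for your host.
DOCKER_SOCKET = "/var/run/docker.sock"


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that connects to a unix domain socket instead of TCP."""

    def __init__(self, socket_path, timeout=5.0):
        # The host name is only used to fill in the Host header.
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def engine_get(path, socket_path=DOCKER_SOCKET):
    """Issue GET <path> over the unix socket and return (status, raw body)."""
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        return resp.status, resp.read()
    finally:
        conn.close()
```

On a host where dockerd is running and the socket is readable, `engine_get("/version")` should return the status code and a JSON body that `json.loads` can decode. If the socket is missing or not accessible, the call raises `OSError`.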

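containerd (/usr/bin/docker-containerd): containerd is a component introduced in 1.11.x. It is a container runtime built around the OCI specifications. Its responsibilities roughly include:

  1. Image management
  2. Container management (in practice delegated to RunC)
  3. Storage management
  4. Network management
  5. Namespace management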

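RunC (/usr/bin/docker-runc): Unlike the other components, RunC is an executable (a CLI tool for creating and running containers) rather than a long-running daemon. Its main job is to run containers. RunC interacts directly with the cgroups, namespaces, and other kernel facilities that containers depend on: it sets up the cgroups, namespaces, and the rest of the environment a container needs, then creates the processes that start the container.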
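The namespaces and cgroup limits that RunC sets up are described to it in an OCI runtime-spec `config.json`. The sketch below builds an illustrative subset of such a file as a Python dict. The function name `minimal_oci_config` and the concrete values are assumptions made for this example, and the result is not a complete, runnable configuration (`runc spec` generates a full one).

```python
import json


def minimal_oci_config(args, memory_limit_bytes, cpu_shares):
    """Return an illustrative subset of an OCI runtime-spec config.json."""
    return {
        "ociVersion": "1.0.2",
        # The process to start inside the container.
        "process": {
            "args": list(args),
            "cwd": "/",
            "user": {"uid": 0, "gid": 0},
        },
        # Root filesystem, relative to the bundle directory.
        "root": {"path": "rootfs", "readonly": True},
        "linux": {
            # One new namespace of each listed type.
            "namespaces": [
                {"type": t} for t in ("pid", "network", "ipc", "uts", "mount")
            ],
            # cgroup limits applied to the container.
            "resources": {
                "memory": {"limit": memory_limit_bytes},
                "cpu": {"shares": cpu_shares},
            },
        },
    }


config = minimal_oci_config(["sh"], 64 * 1024 * 1024, 512)
print(json.dumps(config, indent=2))
```

The `linux.namespaces` list corresponds to the namespace setup and `linux.resources` to the cgroup setup described above.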

containerd-ctr (/usr/bin/docker-containerd-ctr): A client that talks to containerd directly; its relationship to containerd resembles that of docker-cli to dockerd.

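containerd-shim (/usr/bin/docker-containerd-shim): containerd does not operate on containers directly. It operates on them, and communicates with them, indirectly through containerd-shim. In terms of process relationships, containerd starts the containerd-shim process, and containerd-shim starts the container process (each container process has its own containerd-shim process). containerd-shim mainly serves the following purposes:

  1. Because containerd-shim is the parent of the container process, it has access to the container's stdout and stderr and can write them to a log file (/var/lib/docker/containers/<container id>/<container id>-json.log). This is what backs docker logs <container> and kubectl logs <pod> -c <container>.
  2. For the same reason, it has access to the container's stdin, stdout, and stderr, and can expose them to the outside world over a socket to provide streaming. This is what backs docker exec -it <container> and kubectl exec -it <pod>.
  3. It tracks the container's exit code. In detached mode, RunC exits as soon as it has started the container, and the container process is then re-parented. Without containerd-shim, the container's parent would become containerd; if containerd restarted or exited unexpectedly, the parent would change again to the process with pid = 1, and all of the container's state information would be lost. With containerd-shim, the container's parent is containerd-shim, which waits for the container until it exits and can therefore capture its exit code.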
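The first and third roles can be modelled in a few lines. The following is a toy sketch, not containerd-shim's actual implementation: a parent process starts a child, copies the child's stdout and stderr into a JSON-lines log (one `log`/`stream`/`time` object per line, in the style of Docker's json-file logs), and waits for the child so it can report the exit code. The names `run_under_shim` and `_pump` are hypothetical.

```python
import json
import subprocess
import sys
import threading
from datetime import datetime, timezone


def _pump(stream, name, log_file, lock):
    """Copy each line from one child stream into the shared JSON-lines log."""
    for raw in iter(stream.readline, b""):
        entry = {
            "log": raw.decode("utf-8", "replace"),
            "stream": name,
            "time": datetime.now(timezone.utc).isoformat(),
        }
        with lock:
            log_file.write(json.dumps(entry) + "\n")
            log_file.flush()
    stream.close()


def run_under_shim(argv, log_path):
    """Start argv as a child, log its output, and return its exit code."""
    lock = threading.Lock()
    with open(log_path, "a", encoding="utf-8") as log_file:
        child = subprocess.Popen(
            argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE
        )
        pumps = [
            threading.Thread(target=_pump, args=(stream, name, log_file, lock))
            for stream, name in ((child.stdout, "stdout"), (child.stderr, "stderr"))
        ]
        for pump in pumps:
            pump.start()
        # The parent stays alive until the child exits, so the exit code
        # is never lost.
        exit_code = child.wait()
        for pump in pumps:
            pump.join()
    return exit_code
```

For example, `run_under_shim([sys.executable, "-c", "print('hello')"], "demo-json.log")` runs a short child process, appends its output to `demo-json.log`, and returns the child's exit code.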

1.3 OCI

Wikipedia defines OCI as follows:

The Open Container Initiative (OCI) is a Linux Foundation project to design open standards for operating-system-level virtualization, most importantly Linux containers. There are currently two specifications in development and in use: Runtime Specification (runtime-spec) and the Image Specification (image-spec).

OCI develops runc, a container runtime that implements their specification and serves as a basis for other higher-level tools. runC was first released in July 2015 as version 0.0.1

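In other words:

OCI is a Linux Foundation project that designs open standards for operating-system-level virtualization, chiefly Linux containers. Its two specifications are the runtime specification (runtime-spec) and the image specification (image-spec).

OCI also develops runc, a container runtime that implements the OCI specification and underpins higher-level tools.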

2 CRI

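CRI was introduced in Kubernetes 1.5. Before that, Kubernetes was tightly coupled to docker (calls to the docker API were hard-coded in the source). Although docker is the most prominent project in the container space, it is not the only option, and different container implementations each have their own strengths. To gain flexibility in choosing a container runtime, Kubernetes needed to decouple itself from docker. The usual way to decouple software is to add a layer of interface, and that layer is CRI. docker thus becomes just one implementation that satisfies CRI, and as long as each container solution implements the CRI interface, Kubernetes can switch freely between container runtimes.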
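The decoupling idea can be sketched with an abstract interface and two interchangeable backends. Everything here is hypothetical and deliberately tiny: `ContainerRuntime` stands in for the CRI boundary, `DockerBackend` and `ContainerdBackend` stand in for runtimes that implement it, and `Kubelet` is a caller that only knows the interface. The real CRI is a gRPC API, not a Python class.

```python
from abc import ABC, abstractmethod


class ContainerRuntime(ABC):
    """Hypothetical stand-in for the CRI boundary (not the real gRPC API)."""

    @abstractmethod
    def start_container(self, image):
        """Start a container from image and return a description of it."""


class DockerBackend(ContainerRuntime):
    def start_container(self, image):
        return f"docker started {image}"


class ContainerdBackend(ContainerRuntime):
    def start_container(self, image):
        return f"containerd started {image}"


class Kubelet:
    """Caller that depends only on the interface, never on a concrete backend."""

    def __init__(self, runtime):
        self.runtime = runtime

    def run_pod(self, images):
        return [self.runtime.start_container(image) for image in images]


# Swapping the backend requires no change to Kubelet.
for backend in (DockerBackend(), ContainerdBackend()):
    print(Kubelet(backend).run_pod(["nginx", "redis"]))
```

Because `Kubelet` never names a concrete backend, adding a third runtime means writing one more subclass and nothing else.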

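The overall architecture looks roughly like this (the figure shows only some of the CRI implementations and some of the OCI implementations):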

[Figure: relationship]

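In the figure:

  • cri-o is a project incubated by Kubernetes and supports CRI natively
  • cri-containerd lets containerd support the CRI specification without modifying containerd
  • docker-shim lets docker support the CRI specification without modifying docker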

For the Kubernetes definition of CRI, see kubernetes/cri-api. It consists of two main parts: RuntimeService and ImageService.

// Runtime service defines the public APIs for remote container runtimes
service RuntimeService {
    // Version returns the runtime name, runtime version, and runtime API version.
    rpc Version(VersionRequest) returns (VersionResponse) {}

    // RunPodSandbox creates and starts a pod-level sandbox. Runtimes must ensure
    // the sandbox is in the ready state on success.
    rpc RunPodSandbox(RunPodSandboxRequest) returns (RunPodSandboxResponse) {}
    // StopPodSandbox stops any running process that is part of the sandbox and
    // reclaims network resources (e.g., IP addresses) allocated to the sandbox.
    // If there are any running containers in the sandbox, they must be forcibly
    // terminated.
    // This call is idempotent, and must not return an error if all relevant
    // resources have already been reclaimed. kubelet will call StopPodSandbox
    // at least once before calling RemovePodSandbox. It will also attempt to
    // reclaim resources eagerly, as soon as a sandbox is not needed. Hence,
    // multiple StopPodSandbox calls are expected.
    rpc StopPodSandbox(StopPodSandboxRequest) returns (StopPodSandboxResponse) {}
    // RemovePodSandbox removes the sandbox. If there are any running containers
    // in the sandbox, they must be forcibly terminated and removed.
    // This call is idempotent, and must not return an error if the sandbox has
    // already been removed.
    rpc RemovePodSandbox(RemovePodSandboxRequest) returns (RemovePodSandboxResponse) {}
    // PodSandboxStatus returns the status of the PodSandbox. If the PodSandbox is not
    // present, returns an error.
    rpc PodSandboxStatus(PodSandboxStatusRequest) returns (PodSandboxStatusResponse) {}
    // ListPodSandbox returns a list of PodSandboxes.
    rpc ListPodSandbox(ListPodSandboxRequest) returns (ListPodSandboxResponse) {}

    // CreateContainer creates a new container in specified PodSandbox
    rpc CreateContainer(CreateContainerRequest) returns (CreateContainerResponse) {}
    // StartContainer starts the container.
    rpc StartContainer(StartContainerRequest) returns (StartContainerResponse) {}
    // StopContainer stops a running container with a grace period (i.e., timeout).
    // This call is idempotent, and must not return an error if the container has
    // already been stopped.
    // The runtime must forcibly kill the container after the grace period is
    // reached.
    rpc StopContainer(StopContainerRequest) returns (StopContainerResponse) {}
    // RemoveContainer removes the container. If the container is running, the
    // container must be forcibly removed.
    // This call is idempotent, and must not return an error if the container has
    // already been removed.
    rpc RemoveContainer(RemoveContainerRequest) returns (RemoveContainerResponse) {}
    // ListContainers lists all containers by filters.
    rpc ListContainers(ListContainersRequest) returns (ListContainersResponse) {}
    // ContainerStatus returns status of the container. If the container is not
    // present, returns an error.
    rpc ContainerStatus(ContainerStatusRequest) returns (ContainerStatusResponse) {}
    // UpdateContainerResources updates ContainerConfig of the container.
    rpc UpdateContainerResources(UpdateContainerResourcesRequest) returns (UpdateContainerResourcesResponse) {}
    // ReopenContainerLog asks runtime to reopen the stdout/stderr log file
    // for the container. This is often called after the log file has been
    // rotated. If the container is not running, container runtime can choose
    // to either create a new log file and return nil, or return an error.
    // Once it returns error, new container log file MUST NOT be created.
    rpc ReopenContainerLog(ReopenContainerLogRequest) returns (ReopenContainerLogResponse) {}

    // ExecSync runs a command in a container synchronously.
    rpc ExecSync(ExecSyncRequest) returns (ExecSyncResponse) {}
    // Exec prepares a streaming endpoint to execute a command in the container.
    rpc Exec(ExecRequest) returns (ExecResponse) {}
    // Attach prepares a streaming endpoint to attach to a running container.
    rpc Attach(AttachRequest) returns (AttachResponse) {}
    // PortForward prepares a streaming endpoint to forward ports from a PodSandbox.
    rpc PortForward(PortForwardRequest) returns (PortForwardResponse) {}

    // ContainerStats returns stats of the container. If the container does not
    // exist, the call returns an error.
    rpc ContainerStats(ContainerStatsRequest) returns (ContainerStatsResponse) {}
    // ListContainerStats returns stats of all running containers.
    rpc ListContainerStats(ListContainerStatsRequest) returns (ListContainerStatsResponse) {}

    // UpdateRuntimeConfig updates the runtime configuration based on the given request.
    rpc UpdateRuntimeConfig(UpdateRuntimeConfigRequest) returns (UpdateRuntimeConfigResponse) {}

    // Status returns the status of the runtime.
    rpc Status(StatusRequest) returns (StatusResponse) {}
}

// ImageService defines the public APIs for managing images.
service ImageService {
    // ListImages lists existing images.
    rpc ListImages(ListImagesRequest) returns (ListImagesResponse) {}
    // ImageStatus returns the status of the image. If the image is not
    // present, returns a response with ImageStatusResponse.Image set to
    // nil.
    rpc ImageStatus(ImageStatusRequest) returns (ImageStatusResponse) {}
    // PullImage pulls an image with authentication config.
    rpc PullImage(PullImageRequest) returns (PullImageResponse) {}
    // RemoveImage removes the image.
    // This call is idempotent, and must not return an error if the image has
    // already been removed.
    rpc RemoveImage(RemoveImageRequest) returns (RemoveImageResponse) {}
    // ImageFSInfo returns information of the filesystem that is used to store images.
    rpc ImageFsInfo(ImageFsInfoRequest) returns (ImageFsInfoResponse) {}
}
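The comments in the service definition spell out a few behavioural rules: a sandbox is ready once RunPodSandbox succeeds, StopPodSandbox and RemovePodSandbox are idempotent and forcibly deal with running containers, and PodSandboxStatus fails for an unknown sandbox. The in-memory fake below is a sketch of those rules only. `FakeRuntime` and its snake_case methods are hypothetical, no gRPC is involved, and the state strings are labels chosen to resemble the CRI enum names.

```python
import itertools


class FakeRuntime:
    """In-memory model of a few RuntimeService rules (illustration only)."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.sandboxes = {}   # sandbox id -> state label
        self.containers = {}  # container id -> {"sandbox", "image", "state"}

    def run_pod_sandbox(self, pod_name):
        # On success the sandbox must already be in the ready state.
        sandbox_id = f"{pod_name}-sandbox-{next(self._ids)}"
        self.sandboxes[sandbox_id] = "SANDBOX_READY"
        return sandbox_id

    def create_container(self, sandbox_id, image):
        if sandbox_id not in self.sandboxes:
            raise KeyError(f"unknown sandbox: {sandbox_id}")
        container_id = f"container-{next(self._ids)}"
        self.containers[container_id] = {
            "sandbox": sandbox_id, "image": image, "state": "CONTAINER_CREATED",
        }
        return container_id

    def start_container(self, container_id):
        self.containers[container_id]["state"] = "CONTAINER_RUNNING"

    def stop_pod_sandbox(self, sandbox_id):
        # Idempotent: stopping an unknown or already stopped sandbox is not an error.
        if sandbox_id not in self.sandboxes:
            return
        for container in self.containers.values():
            if container["sandbox"] == sandbox_id and container["state"] == "CONTAINER_RUNNING":
                container["state"] = "CONTAINER_EXITED"  # forcibly terminated
        self.sandboxes[sandbox_id] = "SANDBOX_NOTREADY"

    def remove_pod_sandbox(self, sandbox_id):
        # Idempotent: removing an already removed sandbox is not an error.
        if sandbox_id not in self.sandboxes:
            return
        self.stop_pod_sandbox(sandbox_id)
        self.containers = {
            cid: c for cid, c in self.containers.items() if c["sandbox"] != sandbox_id
        }
        del self.sandboxes[sandbox_id]

    def pod_sandbox_status(self, sandbox_id):
        # Unlike the calls above, status for a missing sandbox is an error.
        if sandbox_id not in self.sandboxes:
            raise KeyError(f"unknown sandbox: {sandbox_id}")
        return self.sandboxes[sandbox_id]
```

A typical call order against this fake is `run_pod_sandbox`, then `create_container` and `start_container` for each container, then `stop_pod_sandbox` (possibly more than once) before `remove_pod_sandbox`, which matches the ordering the StopPodSandbox comment describes for kubelet.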

3 References