Container runtimes
OCI Runtime spec
Every OCI-compatible container runtime must implement the runtime-spec.
The mandatory components of an OCI container are:
- the configuration: a config.json file (JSON schema)
- the root filesystem: a Unix filesystem tree
Together, these two make up the filesystem bundle.
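As a rough illustration of these two components, the Python sketch below writes a directory holding a pared-down config.json next to an empty rootfs directory. The field names (ociVersion, process, root, linux.namespaces) come from the runtime-spec; the values (spec version string, hostname, namespace selection) and the helper names minimal_config and write_bundle are assumptions made for this example, and the result is far smaller than what a real runtime generates.

```python
import json
import tempfile
from pathlib import Path


def minimal_config(args, rootfs="rootfs"):
    """Return a pared-down OCI runtime config as a dict (illustrative only)."""
    return {
        "ociVersion": "1.0.2",  # assumed spec version
        "process": {
            "terminal": False,
            "user": {"uid": 0, "gid": 0},
            "args": list(args),
            "env": ["PATH=/usr/sbin:/usr/bin:/sbin:/bin"],
            "cwd": "/",
        },
        # Path of the root filesystem, relative to the bundle directory.
        "root": {"path": rootfs, "readonly": True},
        "hostname": "demo",  # assumed value
        "linux": {
            "namespaces": [
                {"type": t} for t in ("pid", "mount", "ipc", "uts", "network")
            ]
        },
    }


def write_bundle(bundle_dir, args):
    """Create <bundle_dir>/rootfs and <bundle_dir>/config.json; return the config path."""
    bundle = Path(bundle_dir)
    (bundle / "rootfs").mkdir(parents=True, exist_ok=True)
    config_path = bundle / "config.json"
    config_path.write_text(json.dumps(minimal_config(args), indent=2))
    return config_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(write_bundle(tmp, ["sh"]).read_text())
```

The rootfs directory is left empty here; a usable container would need an actual filesystem tree in it.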
runc
The most basic, low-level runtime (donated by Docker to the OCI).
Run a container with runc
1) runc spec: generates the config.json file with all the cgroups / namespaces / mounts to create in order to run the container
2) mkdir rootfs; docker export $(docker create busybox) | tar -xf - -C rootfs: creates the root filesystem of the bundle
3) sudo runc run toto: runs a container named toto based on the config + root filesystem
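The steps above can be scripted. The sketch below only assembles the same commands, in the order they are listed, and runs them on request. The names runc_steps and prepare_and_run, their parameters and the dry_run flag are hypothetical; actually executing the commands assumes docker, tar, runc and sudo are available on the machine.

```python
import subprocess


def runc_steps(name="toto", image="busybox"):
    """Return the shell commands from the walkthrough, in execution order."""
    return [
        "runc spec",
        f"mkdir rootfs; docker export $(docker create {image}) | tar -xf - -C rootfs",
        f"sudo runc run {name}",
    ]


def prepare_and_run(bundle_dir, name="toto", image="busybox", dry_run=True):
    """Run the steps inside bundle_dir, or just return them when dry_run is set."""
    steps = runc_steps(name, image)
    if not dry_run:
        for cmd in steps:
            # shell=True is needed for the pipe and $(...); pass trusted values only.
            subprocess.run(cmd, shell=True, cwd=bundle_dir, check=True)
    return steps


if __name__ == "__main__":
    for cmd in prepare_and_run(".", dry_run=True):
        print(cmd)
```

Keeping dry_run=True by default makes it possible to inspect the commands before anything is created or started.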
containerd
Built upon runc (containerd drives runc through a per-container containerd-shim process, which talks back to containerd over a socket).
Adds new features like: pushing and pulling images (registries), volume management, network management (CNI), gRPC API, CLI client (ctr).
ctr oci spec: generates the config.json. Note that this config has more features than the basic one generated by runc.
containerd and dockerd
dockerd is built upon containerd. dockerd is not a container runtime! It's a developer-facing tool (building images, etc.).
This is roughly the same split as between podman and buildah: podman can run containers but not build them directly, buildah can only build.
podman
Built upon runc.
Built upon libpod (golang library + CLI): does not require any daemon (unlike Docker with dockerd). See Docker vs Podman.
- podman CLI is modeled after Docker CLI and can be used as a drop-in replacement for docker
- podman also exposes a RESTful API when it's running as a service (podman system service)
nvidia-container-runtime
nvidia-container-runtime is a low-level runtime (modified runc).
It allows GPU-enabled containers for CUDA processing. Enjoy melting your bare metal cluster.
Kubernetes CRI
CRI (Container Runtime Interface) is a spec for the Kubernetes kubelet (the daemon running on every K8S node). The goal: making K8S extensible without the dev needing to know the kubelet internals, and enabling almost any container runtime to work with the kubelet.
To be more specific: CRI is a gRPC/Protobuf API. (Example: to start a pod, the kubelet makes gRPC requests to the CRI API to start the pod's containers.)
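To make that flow concrete, here is a toy model of the calls a kubelet-like client issues when starting a pod with several containers. RunPodSandbox, CreateContainer and StartContainer are RPCs of the CRI RuntimeService, and PullImage belongs to its ImageService; everything else (the FakeRuntime class, its in-memory bookkeeping, the start_pod helper, the ID format, the simplified arguments) is hypothetical, and no gRPC or Protobuf is involved.

```python
class FakeRuntime:
    """In-memory stand-in for a CRI runtime; records calls instead of using gRPC."""

    def __init__(self):
        self.calls = []
        self._next_id = 0

    def _new_id(self, prefix):
        self._next_id += 1
        return f"{prefix}-{self._next_id}"

    def RunPodSandbox(self, pod_name):
        self.calls.append(("RunPodSandbox", pod_name))
        return self._new_id("sandbox")

    def PullImage(self, image):
        self.calls.append(("PullImage", image))
        return image

    def CreateContainer(self, sandbox_id, name, image):
        self.calls.append(("CreateContainer", name))
        return self._new_id("container")

    def StartContainer(self, container_id):
        self.calls.append(("StartContainer", container_id))


def start_pod(runtime, pod_name, containers):
    """Start every (name, image) pair inside one sandbox; return the container IDs."""
    sandbox_id = runtime.RunPodSandbox(pod_name)
    started = []
    for name, image in containers:
        runtime.PullImage(image)
        container_id = runtime.CreateContainer(sandbox_id, name, image)
        runtime.StartContainer(container_id)
        started.append(container_id)
    return started


if __name__ == "__main__":
    rt = FakeRuntime()
    start_pod(rt, "web", [("app", "busybox"), ("sidecar", "busybox")])
    for call in rt.calls:
        print(call)
```

The point of the model is the ordering: one sandbox per pod first, then a pull/create/start cycle per container.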
CRI-O
With CRI-O runtime, Kubernetes can run any OCI container.
Docker
Historically, K8S supported only Docker as a runtime (not containerd, just docker, when docker itself was a monolithic application).
This is the component (the dockershim) whose deprecation is announced in the K8S 1.20 release notes (the actual removal came in 1.24).
Annexes
Docker vs Podman
Docker uses a client-server architecture
docker run --rm -it busybox
The process list clearly shows the docker-cli -> dockerd -> containerd architecture (and the containerd socket passed to containerd-shim, which drives runc).
$ ps faux | grep docker
root 18297 0.0 0.0 708436 6284 ? Sl 23:56 0:00 \_ containerd-shim -namespace moby -workdir /var/lib/containerd/io.containerd.runtime.v1.linux/moby/51cbef396fdf91d549d15011ff8ed12c3fe444a6e7046f1cf702b529edb11f16 -address /run/containerd/containerd.sock -containerd-binary /usr/bin/containerd -runtime-root /var/run/docker/runtime-runc
root 453 0.0 1.5 1452072 118728 ? Ssl 09:25 0:15 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
hed 18281 0.3 0.6 1285368 54000 pts/4 Sl+ 23:56 0:00 | \_ docker run --rm -it busybox
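The client-server split can be illustrated with a few lines of Python: a long-lived "daemon" listens on a Unix socket and a short-lived "client" sends it a request, much as the docker CLI hands its work to dockerd. This is a generic sketch, not the Docker API: the socket path, the one-line text protocol and the names DaemonHandler, send_request and demo are all made up for illustration, and the daemon runs in a background thread only to keep the example self-contained.

```python
import os
import socket
import socketserver
import tempfile
import threading


class DaemonHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # Read one request line and answer it.
        request = self.rfile.readline().decode().strip()
        self.wfile.write(f"daemon handled: {request}\n".encode())


def send_request(sock_path, request):
    """Client side: connect to the daemon's socket, send one line, read one line."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect(sock_path)
        client.sendall((request + "\n").encode())
        with client.makefile() as reply:
            return reply.readline().strip()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        sock_path = os.path.join(tmp, "demo.sock")
        with socketserver.UnixStreamServer(sock_path, DaemonHandler) as server:
            # The "daemon" keeps serving in the background while clients come and go.
            thread = threading.Thread(target=server.serve_forever, daemon=True)
            thread.start()
            try:
                return send_request(sock_path, "run busybox")
            finally:
                server.shutdown()


if __name__ == "__main__":
    print(demo())
```

The client holds no state of its own: if the daemon is not listening, the connect call fails and nothing runs, which is the dependency the daemonless design avoids.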
Podman manages to run without a central daemon by using a fork() architecture.
podman run --rm -it busybox
We see that there's a parent-child relationship between the podman processes. (We also see that podman containers live in /var/lib/containers.)
$ ps faux | grep podman
root 17900 0.0 0.0 15560 7160 pts/4 S+ 23:55 0:00 | \_ sudo podman run --rm -it busybox
root 17901 3.0 0.7 1493832 62204 pts/4 Sl+ 23:55 0:00 | \_ podman run --rm -it busybox
hed 14097 0.0 0.2 50428 23172 ? S 21:46 0:00 podman
root 18026 0.0 0.0 81312 1908 ? Ssl 23:55 0:00 /usr/bin/conmon --api-version 1 -c f2a418363107e7a7e6bbaeb2a499d11d0939e20dae8350627b43babb887c07ec -u f2a418363107e7a7e6bbaeb2a499d11d0939e20dae8350627b43babb887c07ec -r /usr/bin/runc -b /var/lib/containers/storage/overlay-containers/f2a418363107e7a7e6bbaeb2a499d11d0939e20dae8350627b43babb887c07ec/userdata -p /var/run/containers/storage/overlay-containers/f2a418363107e7a7e6bbaeb2a499d11d0939e20dae8350627b43babb887c07ec/userdata/pidfile -n priceless_brahmagupta --exit-dir /var/run/libpod/exits --socket-dir-path /var/run/libpod/socket -s -l k8s-file:/var/lib/containers/storage/overlay-containers/f2a418363107e7a7e6bbaeb2a499d11d0939e20dae8350627b43babb887c07ec/userdata/ctr.log --log-level error --runtime-arg --log-format=json --runtime-arg --log --runtime-arg=/var/run/containers/storage/overlay-containers/f2a418363107e7a7e6bbaeb2a499d11d0939e20dae8350627b43babb887c07ec/userdata/oci-log -t --conmon-pidfile /var/run/containers/storage/overlay-containers/f2a418363107e7a7e6bbaeb2a499d11d0939e20dae8350627b43babb887c07ec/userdata/conmon.pid --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /var/lib/containers/storage --exit-command-arg --runroot --exit-command-arg /var/run/containers/storage --exit-command-arg --log-level --exit-command-arg error --exit-command-arg --cgroup-manager --exit-command-arg systemd --exit-command-arg --tmpdir --exit-command-arg /var/run/libpod --exit-command-arg --runtime --exit-command-arg runc --exit-command-arg --storage-driver --exit-command-arg overlay --exit-command-arg --storage-opt --exit-command-arg overlay.mountopt=nodev --exit-command-arg --storage-opt --exit-command-arg overlay.mountopt=nodev --exit-command-arg --events-backend --exit-command-arg journald --exit-command-arg container --exit-command-arg cleanup --exit-command-arg --rm --exit-command-arg f2a418363107e7a7e6bbaeb2a499d11d0939e20dae8350627b43babb887c07ec
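The parent-child relationship in this process list is what a plain fork/exec launch produces. The sketch below is a generic illustration, not podman's code: it spawns a child process directly and compares the parent PID reported by the child with the launcher's own PID, with no intermediate daemon involved. The helper name launch_child is hypothetical.

```python
import os
import subprocess
import sys


def launch_child():
    """Spawn a child directly; return (our pid, parent pid reported by the child)."""
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.getppid())"],
        capture_output=True,
        text=True,
        check=True,
    )
    return os.getpid(), int(result.stdout.strip())


if __name__ == "__main__":
    parent_pid, childs_ppid = launch_child()
    print(f"launcher pid: {parent_pid}, child's parent pid: {childs_ppid}")
```

On a typical Linux system the two PIDs match, which is the same shape as the podman entries in the ps output: each process is a direct child of the one that launched it.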