Sunday, April 5, 2020

Kubernetes series - Part II - virtualization vs containerization



The main goal of virtualization and containerization is to isolate the host resources (CPU, memory, network, filesystem, etc.) used by each service or application running on the host.
Before we go into the details of containers and virtualization, let's look at where virtualization came from.

Chroot environments

The UNIX/Linux chroot command (short for "change root") creates filesystem-level isolation on a host. It changes the root of the filesystem for a process and its child processes. This makes it possible to run an unprivileged service such as a web server in a protected environment (a chroot jail): the web server process can see only the files under the root directory of the chroot environment.

But this provides only a crude level of filesystem isolation. The web server can still hog other host resources such as CPU, memory and network.
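
To make the idea concrete, here is a minimal Python sketch of a chroot jail. The names host_path and run_jailed and the /srv/jail directory are made up for illustration. host_path only shows how a path seen inside the jail maps to a real path on the host. run_jailed calls the real chroot system call through os.chroot, which needs root privileges and a jail directory that already contains the program to run.

```python
import os
import posixpath


def host_path(jail_root: str, path_in_jail: str) -> str:
    """Map a path as seen inside the jail to the real path on the host."""
    # Inside a chroot, "/" is the jail root and ".." cannot climb above it,
    # so treat the path as absolute and normalise it first.
    inner = posixpath.normpath("/" + path_in_jail.lstrip("/"))
    return posixpath.normpath(posixpath.join(jail_root, inner.lstrip("/")))


def run_jailed(jail_root: str, argv: list[str]) -> int:
    """Run argv with jail_root as its filesystem root and return its exit code.

    Needs root privileges, and argv[0] must exist inside the jail.
    """
    pid = os.fork()
    if pid == 0:
        # Child: this process and all of its children get the new root.
        try:
            os.chroot(jail_root)
            os.chdir("/")  # do not keep a working directory outside the jail
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if chroot or exec failed
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)
```

For example, host_path("/srv/jail", "/etc/passwd") gives "/srv/jail/etc/passwd". Note that nothing in this sketch limits CPU, memory or network use, which is exactly the gap described above.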

The chroot feature has been available since 1979, when it appeared in Version 7 UNIX. It was added to BSD in 1982.

FreeBSD jail

FreeBSD enhanced the chroot environment and introduced the jail feature in 2000. This feature enabled lightweight virtualization: each virtual environment running on a shared host has its own processes, files, network and users. The overhead of a jail is less than 10 MB.

A FreeBSD jail doesn't provide true virtualization, because the OS kernel is still shared by all virtual environments. The original jail implementation also had no way to limit the CPU or memory used by each jail. Still, FreeBSD jails are a very popular solution used in production.

Virtualization

Virtualization provides stronger isolation than FreeBSD jails. Instead of installing an operating system directly on the host, you install a software layer called a hypervisor. You then log in to the hypervisor and create virtual machines, each with its own operating system. So two virtual machines on the same host share no operating system components. They share only the underlying hardware: the hypervisor interacts with the hardware resources of the host machine and allocates them to the virtual machines.

Containerization

A container can be seen as a more advanced form of chroot and FreeBSD jails. On Linux, it is built on top of two kernel features: control groups (cgroups) and namespaces.

Google engineers originally developed control groups (cgroups), which were merged into the Linux kernel in 2008. Cgroups let you limit and account for the resources (CPU, memory, network and block I/O) used by a group of processes. They also let you stop and start (control) a group of processes. Namespaces isolate what a process can see, such as the process tree, network interfaces and mount points. Together, cgroups and namespaces are the foundation for containers.
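
As a rough sketch of how cgroups are driven in practice, the Python below writes to the cgroup filesystem. It assumes the newer cgroup v2 layout mounted at /sys/fs/cgroup, root privileges, and that the cpu and memory controllers are enabled for the parent group. The function names and the "demo" group name are made up for illustration. The file names memory.max, cpu.max, cgroup.procs and cgroup.freeze are the ones cgroup v2 exposes.

```python
from pathlib import Path

# Usual cgroup v2 mount point; an assumption, check your own system.
CGROUP_ROOT = Path("/sys/fs/cgroup")


def create_group(name, memory_max_bytes, cpu_quota_us,
                 cpu_period_us=100_000, root=CGROUP_ROOT):
    """Create a cgroup and cap the memory and CPU time of its processes."""
    group = Path(root) / name
    group.mkdir(parents=True, exist_ok=True)
    # Hard memory limit for all processes in the group, in bytes.
    (group / "memory.max").write_text(f"{memory_max_bytes}\n")
    # CPU time: at most cpu_quota_us microseconds in every cpu_period_us.
    (group / "cpu.max").write_text(f"{cpu_quota_us} {cpu_period_us}\n")
    return group


def add_process(group, pid):
    """Move a process (and its future children) into the group."""
    (Path(group) / "cgroup.procs").write_text(f"{pid}\n")


def set_frozen(group, frozen):
    """Stop (True) or resume (False) every process in the group."""
    (Path(group) / "cgroup.freeze").write_text("1\n" if frozen else "0\n")
```

A call such as create_group("demo", 256 * 1024 * 1024, 50_000) would cap the group at 256 MiB of memory and half of one CPU. A container runtime does roughly this for every container it starts, in addition to setting up namespaces.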

Docker built on cgroups and namespaces to make containers easy to use. Docker is the most popular containerization technology. The Docker ecosystem makes it easy to build and share images, and to create and manage containers from those images.

Docker makes it very easy to create an image. All you need to do is create a file (a Dockerfile) and specify all the dependencies in it. You then use the docker command to build an image from the Dockerfile, and create a container from that image. The container runs under the Docker engine.
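
Here is a small Python sketch of that workflow. It writes a Dockerfile and assembles the docker build and docker run commands. The application files (app.py, requirements.txt), the base image and the helper names are assumptions chosen for illustration, not something prescribed by Docker.

```python
import shutil
import subprocess
from pathlib import Path

# Hypothetical Dockerfile for a small Python app (app.py, requirements.txt).
DOCKERFILE = """\
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "app.py"]
"""


def write_dockerfile(context_dir):
    """Write the Dockerfile into the build context directory."""
    path = Path(context_dir) / "Dockerfile"
    path.write_text(DOCKERFILE)
    return path


def build_command(tag, context_dir):
    """Command that builds an image named tag from context_dir."""
    return ["docker", "build", "-t", tag, str(context_dir)]


def run_command(tag):
    """Command that starts a container from the image and removes it on exit."""
    return ["docker", "run", "--rm", tag]


def build_and_run(tag, context_dir):
    """Build the image and run one container; needs Docker installed."""
    if shutil.which("docker") is None:
        raise RuntimeError("docker CLI not found on PATH")
    write_dockerfile(context_dir)
    subprocess.run(build_command(tag, context_dir), check=True)
    subprocess.run(run_command(tag), check=True)
```

Calling build_and_run("demo:1.0", ".") from the application directory would be equivalent to typing docker build -t demo:1.0 . followed by docker run --rm demo:1.0 in a shell.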

The Docker engine is a client-server application. The server part is the Docker daemon (dockerd), which builds images and runs containers. Two types of clients are available:

The Docker command line interface (docker). This client is used to build images, create containers from those images, and manage the lifecycle of the containers.

The Docker REST API and SDKs. These interfaces let programs interact with the Docker engine directly.
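
To illustrate the client-server split, the sketch below talks to the daemon's REST API using only the Python standard library. It assumes a Linux host where the daemon listens on the Unix socket /var/run/docker.sock and the current user is allowed to access it. UnixHTTPConnection and engine_get are helper names invented for this example.

```python
import http.client
import json
import socket

# Default daemon socket on Linux; an assumption, your setup may differ.
DOCKER_SOCKET = "/var/run/docker.sock"


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTP connection that goes over a Unix domain socket instead of TCP."""

    def __init__(self, socket_path, timeout=5.0):
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        try:
            sock.connect(self.socket_path)
        except OSError:
            sock.close()
            raise
        self.sock = sock


def engine_get(path, socket_path=DOCKER_SOCKET):
    """Send a GET request to the Docker daemon and decode the JSON reply."""
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        body = response.read()
        if response.status != 200:
            raise RuntimeError(f"Docker API returned HTTP {response.status}")
        return json.loads(body)
    finally:
        conn.close()
```

With a running daemon, engine_get("/version") returns version details and engine_get("/containers/json") lists the running containers. The docker command line client sends the same kind of requests to the daemon.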

After creating the image, you can upload it to Docker Hub, a registry for storing and sharing images. You can also version your images using tags.
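
As a small illustration of versioning with tags, the Python sketch below assembles the docker tag and docker push commands for one image. The repository name alice/web and the helper names are hypothetical, and pushing requires a prior docker login.

```python
def tag_command(local_image, repository, version):
    """Command that gives a local image the name repository:version."""
    return ["docker", "tag", local_image, f"{repository}:{version}"]


def push_command(repository, version):
    """Command that uploads repository:version to the registry."""
    return ["docker", "push", f"{repository}:{version}"]


def publish_commands(local_image, repository, versions):
    """Tag and push the same image under several version tags."""
    commands = []
    for version in versions:
        commands.append(tag_command(local_image, repository, version))
        commands.append(push_command(repository, version))
    return commands
```

For example, publish_commands("demo:1.0", "alice/web", ["1.0", "latest"]) yields the commands to publish the image both under a fixed version and under the moving latest tag. Each command can be passed to subprocess.run to execute it.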

Containers are lightweight compared to virtual machines. Containers share the host operating system kernel, whereas each virtual machine has its own operating system. A good analogy is process vs. thread. Spawning a new process carries some overhead. Similarly, starting a virtual machine carries the overhead of booting its operating system. Threads are lightweight by comparison: they are created inside a process and share its resources, so there is less overhead. Likewise, containers share the host operating system kernel, so starting a container is faster than starting a virtual machine.
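
If you want to get a feel for the process vs. thread side of the analogy, the rough Python sketch below times both. It is only an illustration: the process case starts a whole new Python interpreter, so it overstates the cost of creating a bare process, and the numbers vary from machine to machine.

```python
import subprocess
import sys
import threading
import time


def time_thread_start():
    """Seconds needed to start a do-nothing thread and wait for it."""
    start = time.perf_counter()
    worker = threading.Thread(target=lambda: None)
    worker.start()
    worker.join()
    return time.perf_counter() - start


def time_process_start():
    """Seconds needed to start a do-nothing Python process and wait for it."""
    start = time.perf_counter()
    subprocess.run([sys.executable, "-c", "pass"], check=True)
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"thread:  {time_thread_start():.6f} s")
    print(f"process: {time_process_start():.6f} s")
```

On most machines the thread finishes far sooner than the process, for the same reason a container starts faster than a virtual machine: it reuses what is already running.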


Let's look at the definition from docker.com: “A container is a standard unit of software that packages up code and all its dependencies.”

Packaging an application and its dependencies together helps keep the dev, stage, pre-prod and prod environments as similar as possible. This is very important for applications based on a microservice architecture.


