Introduction to Docker

Docker is software that implements OS-level virtualization. Such virtualization allows applications to run within so-called containers. A container includes the application together with all system and library dependencies required to run it. The sole purpose of a container is to run the single included application in a highly reproducible way. In contrast, an application running normally (on the system, outside a container) may fail if its dependencies were updated after the application was designed and compiled. This risk poses a significant problem, as software would then degrade over time. OS-level virtualization is an efficient solution to the problem, as the application is always accompanied by a known environment.

When working with microservices for IoT and robotics, OS-level virtualization is a crucial tool, as it allows each microservice to function properly within its own container.

An added benefit of OS-level virtualization, and of the Docker implementation in particular, is the possibilities it opens for software deployment: containers, with the application and its known environment, are spawned from an image file. This image can easily be transferred, and even hosted on a public online registry.

Installing Docker

Note! If you are using OpenDLV Desktop, Docker is already installed and you can skip to the next section.

First, you need to install Docker on your Linux system. Start by opening the terminal and installing a small tool called curl, used to download resources from the internet:

sudo apt-get install curl

Next, download and install the Docker key, used to verify that later downloads are genuine (the key is piped into the program apt-key, which is run as the super user):

curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -

Next, add the Ubuntu software catalog from Docker:

sudo add-apt-repository \
  "deb [arch=amd64,arm64,armhf] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"

Here, the backslash (\) is used to break the command across two lines. It is not required but might increase readability, so you could just write it all on one line if you want.

The architectures amd64, arm64, and armhf refer to different types of processors (CPUs), and $(lsb_release -cs) is a command substitution: the shell runs the command lsb_release -cs and inserts its output, in this case the codename of your current Ubuntu release. You can type lsb_release -cs into another terminal to see the value.
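The $(...) construct is not specific to lsb_release; it is generic shell command substitution, and trying it with other commands makes the mechanism clear. A small sketch (the focal fallback is only an assumption, for illustration on systems where lsb_release is not installed; focal is the Ubuntu 20.04 codename):

```shell
# Command substitution: the shell runs the command inside $(...) and
# replaces the whole expression with the command's standard output.
codename=$(lsb_release -cs 2>/dev/null || echo focal)  # "focal" is an assumed fallback
echo "Detected release codename: $codename"

# The same mechanism works with any command:
year=$(date +%Y)
echo "The current year is $year"
```

The substituted value can be used anywhere a plain string could, which is exactly how it ends up inside the quoted repository line above.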

Next, go ahead and install Docker:

sudo apt-get update
sudo apt-get install docker-ce
sudo usermod -aG docker $USER

The last command adds your Linux user to the group docker, allowing it to use Docker. For the change to take effect, you need to log out of Ubuntu and log in again.
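After logging back in, you can verify that the group change took effect by listing your groups with the standard id command. A minimal check (a sketch; the printed messages are just illustrative):

```shell
# Print the names of all groups the current user belongs to.
id -nG

# Check specifically for the docker group.
if id -nG | grep -qw docker; then
  echo "docker group active: Docker can be used without sudo"
else
  echo "docker group not active yet: log out and log in again"
fi
```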

Test that Docker works

You can test Docker by using a hello-world image that exists in the official Docker image registry. Testing with this image also demonstrates one of the purposes of Docker: an easy way of distributing images online and being able to download and run them on different computers.

docker run hello-world

If you get the message Hello from Docker!, your installation was successful.

Install and run applications inside Docker containers

Now it is time to use Docker with the prime checker application. Open a terminal, go into the opendlv-logic-primechecker folder, and run the following command:

Note! If using OpenDLV Desktop 16.0 there is a problem running these steps. Please read them through carefully, and then continue with the next section.

docker run --rm -ti -v ${PWD}:/opt/sources ubuntu:20.04 /bin/bash

Here, docker run means that you start a new container: a new Linux system that runs inside an isolated environment. The container uses a pre-defined Docker image as a starting point, in this case an image called ubuntu:20.04. The image is a static representation of the system, a collection of files, libraries, and executables forming a runnable Linux environment. As soon as Docker starts a container from the image, the run-time environment leaves the static image state and forms the dynamic container state, where an application runs inside the container with the ability to change its internal state.

The argument --rm means that the container (and its state) is automatically removed as soon as the application running inside the container stops. The next argument, -ti, means that the container should run in terminal and interactive mode, allowing printouts and user input. The -v argument means volume, and it maps files outside the container into the container. In this case, the current folder, given by the pre-defined ${PWD} variable, is mapped onto the folder /opt/sources inside the running container. Such a mapping allows files created or modified inside the container to be saved outside, and vice versa. Note that the /opt/sources folder does not exist on your normal Linux system, only inside the running Docker container.

Finally, /bin/bash is the program that should be started inside the container. A container always starts exactly one program, and that single program may in turn start additional processes. The /bin/bash program is a so-called shell, the type of program that runs inside a terminal window, so the Docker container will act as a terminal running from inside the container.

With the above command, you are now running a terminal from inside the isolated Docker container, started from the ubuntu:20.04 Docker image. The container contains all files and programs expected from an Ubuntu 20.04 system, even though the maintainers of the Docker image have slimmed it down slightly to make it more suitable as a containerized system. Remember that this environment is completely fresh and separated from your normal Linux system, so none of the applications, tools, or files that you had outside are present here, except for the files that you mapped into the /opt/sources folder. To verify that the files were mapped properly, run the following:

ls /opt/sources

The file list should show the files from the prime checker folder. The next step is to compile and run the program within the Docker container. Since no tools, such as the compiler or Libcluon, are installed inside the container yet, the first step is to install them. Proceed by running the following commands:

apt-get update
apt-get install build-essential cmake software-properties-common

add-apt-repository -y ppa:chrberger/libcluon
apt-get update
apt-get install libcluon

Note that you do not need to run these commands with sudo (super user) as you did on the host system. This is because the Docker container by default has no regular users defined, so everything runs as the root user with the maximum level of permissions. The reason this simplification, which in a normal system would pose a huge security problem, is acceptable is that the scope of the Docker container is already minimized and isolated, so the root user's influence is very limited (the root user is still king, but of a very limited kingdom with no influence on the real world outside). With the needed tools installed, the next step is to compile and run the prime checker program. Run the following commands:

mkdir /tmp/build && cd /tmp/build
cmake /opt/sources
make && make test

Tip! Using && is a way to run a second command after the first one. It could just as well have been done as separate commands.

The first command creates a folder inside the /tmp folder (a folder always present on Linux systems, dedicated to temporary files), and then changes into that folder using cd. The second command initializes CMake for the prime checker program, now keeping the build folder separated from the source folder. With the last two commands, the program is compiled and tested, and you can then start it by running the resulting binary, ./helloworld.
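The && operator only runs the right-hand command if the left-hand one succeeded (exited with status 0), which is also why a chain like make && make test stops if the compilation fails. A small sketch of this short-circuit behavior (the folder name is just an example):

```shell
# mkdir succeeds, so cd runs, and pwd prints the new working directory.
mkdir -p /tmp/demo-build && cd /tmp/demo-build && pwd

# A failing first command short-circuits the chain: "unreachable" is never
# printed, because false always fails. The || branch runs instead.
false && echo "unreachable" || echo "chain stopped because false failed"
```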

While the program is still running, go into a second terminal (not connected to a Docker container) and run the following:

cd opendlv-logic-primechecker/build
./helloworld

This starts a new instance of the program, and as before it should be possible to send data between the programs by using the integrated UDP multicast features. However, you will quickly conclude that no data flows between the programs, further demonstrating the isolation principle of Docker containers. This is in general a very good feature, since Docker is often used for running a large number of unrelated applications on the same computer (typically a server), and normally one does not want any unexpected interference between applications running in different containers.

The case of running microservices on a cyber-physical system, such as for IoT or a robot, is however different. In such a case there should typically be no restrictions on the network level, so that microservices and hardware, such as sensors connected to the network, can work together. To test this, start by stopping the running Docker container: first close the program using Ctrl+c, and then stop /bin/bash by pressing Ctrl+d. Then spawn a new Docker container using the following command:

docker run --rm -ti --net=host -v $PWD:/opt/sources ubuntu:20.04 /bin/bash

This time, the argument --net=host was added, meaning that the network is completely non-isolated from the host. However, as you might have suspected, this new Docker container does not contain any of the tools or files that were prepared in the previous environment. In fact, everything that was done before was lost the moment you hit Ctrl+d to close the previous container (as the --rm option was given). This is in principle a good thing, and exactly how one should use Docker, always forcing a known state of the underlying Linux system. So, go ahead and redo the previous steps to get back to the desired container state, with the following commands:

apt-get update
apt-get install build-essential cmake software-properties-common

add-apt-repository -y ppa:chrberger/libcluon
apt-get update
apt-get install libcluon

mkdir /tmp/build && cd /tmp/build
cmake /opt/sources
make && make test

Again, test this against the program running in the other terminal to verify that it now works. When done, press Ctrl+d in the terminal to close /bin/bash and thereby the container itself.

Tip! The two commands docker images and docker ps are useful for listing the available Docker images and the running Docker containers, respectively.

Automating software engineering with Docker

So far it has been demonstrated how a container can be started and how it can be used to build and run the application from a known system state. However, as noticed, it is quite tedious to do this manually every time, and not very useful, as the resulting environment is lost as soon as the container is stopped. The solution is to automate the initialization using a so-called Dockerfile. As the purpose is now to go full Docker, it is time to leave local builds behind. First, in a terminal, go to the opendlv-logic-primechecker folder and run rm -r build to remove the local build folder. Then, create a new file called Dockerfile (note the capital D) alongside all the source files and fill it with the following content (gedit Dockerfile):

# Build
FROM alpine:3.17 as builder
RUN apk update && \
    apk --no-cache add \
        ca-certificates \
        cmake \
        g++ \
        make
RUN apk add libcluon --no-cache --repository \
    --allow-untrusted
ADD . /opt/sources
WORKDIR /opt/sources
RUN mkdir /tmp/build && cd /tmp/build && \
    cmake /opt/sources && \
    make && make test && cp helloworld /tmp

# Deploy
FROM alpine:3.17
RUN apk update && \
    apk --no-cache add libstdc++
COPY --from=builder /tmp/helloworld /usr/bin
CMD ["/usr/bin/helloworld"]

The purpose of a Dockerfile is to act as a recipe for how to build a Docker image. This is slightly different from the previous example, where all changes were kept inside a running container. However, the principles are very similar: Docker internally fires up a container based on the image named after the keyword FROM, and then runs each command starting with RUN, in order, to change the state of the container. In the end, the new container state is stored into a Docker image that can later be used to spawn new containers. In this way, the shortcoming of losing the state, as in the previous example, is gone, as the resulting image can be used at any time to bring up a new container, ready to run the helloworld program.

Another big difference in this example is that the ubuntu:20.04 base image is no longer used, but rather an image called alpine:3.17. Alpine is a Linux variant often used as a base image for Docker images due to its small size: the Alpine image is around 7 MB, compared to around 73 MB for Ubuntu. The difference in size is important when thinking about distribution, where the end user (or cyber-physical system) needs to download less data for run-time software.

Another difference compared to Ubuntu is the use of apk rather than apt-get. The two are just different package management systems; different variants of Linux use different ones based on each variant's design goals. Note that the specific names of packages may also differ, but common packages such as cmake are likely named the same across systems. Related to installing packages, note also that external packages such as Libcluon are installed differently with apk than with apt-get.

Then, the ADD command maps a folder on the host system into a folder inside the container that builds the new image, similar to what was seen with the -v flag for docker run. The WORKDIR command tells Docker that all following commands should run starting from the /opt/sources folder, the folder where all source files from the host system are accessible. The following RUN command then compiles, builds, and tests the prime checker program. In the end it also, perhaps a bit unexpectedly, copies the resulting binary helloworld to the /tmp folder.

This is connected to the fact that the Dockerfile is divided into two parts, one titled Build and one titled Deploy. The task of the first part is only to build the program, and the second part describes how the resulting Docker image should be constructed. The key insight is that the final image, intended for running the program and expected to be distributed over the Internet, should be as small as possible. This means that compilation tools such as the compiler and CMake should not be included. The second part of the Dockerfile uses the same start image as the first, alpine:3.17, but then installs a single package called libstdc++, containing the bare minimum needed to run C++ programs. Then, importantly, the COPY command copies the already compiled program helloworld over from the previous Docker container, which was only used for building. After the copying, the only thing left is to instruct Docker what command should run by default when a container is started from the newly created image, using the CMD command. In this case, the helloworld program is started.

In summary, the result of the Dockerfile recipe is a very small Docker image containing the bare minimum to run the prime checker program. In the creation process, two Docker containers were started: one that installed and used all needed compilation tools to build and test the executable file, and one that installed the minimum set of software to run it, into which the newly compiled program was copied.

It is now time to see if the Dockerfile can result in a new Docker image. Save and close the file, and use it with the following command (inside the opendlv-logic-primechecker folder):

docker build -t myrepository/mydockerimage .

Note the small dot (.) at the end of the command. This instructs Docker to look for a Dockerfile in the folder where the docker build command was run. The -t myrepository/mydockerimage flag states the name of the resulting Docker image. After the build is done, it is time to test the program from within a Docker container created from the new image. This is done with docker run, similar to before:

docker run --rm -ti --net=host myrepository/mydockerimage

In another terminal, start an additional container from the image using the same command. You should see that the two programs start to communicate. Now, in a third terminal, run the following command to see some information about the running containers:

docker ps

Finally, make sure to commit the Dockerfile into the code repository by running the following commands:

git status
git add Dockerfile
git commit -m "Adding Dockerfile"
git push

Distributing Docker images

In the last section it was shown how Docker can be used to generate small packages (Docker images) containing everything needed to run a single application. Such a package can then be used to spawn run-time environments (Docker containers) where the program runs in an isolated way, without risk of unexpected interference or incompatibilities from the host system. Naturally, the next step is to learn how to distribute Docker images so that they can be used on different systems. To store a Docker image to a file, use the following command:

docker save myrepository/mydockerimage > myImage.tar

This stores the output of the docker save command into a .tar file using the redirection operator >. The resulting file myImage.tar contains the full Docker image with the prime checker and the bare minimum system (based on Alpine) to run it. You can view the file size using the ls -lh command.
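The > operator is not specific to Docker; it is ordinary shell output redirection, writing the standard output of any command to a file. A small sketch using echo as a stand-in for docker save (the file name and contents are just illustrative):

```shell
# Redirect standard output into a file (created, or overwritten if it exists).
echo "pretend this is image data" > /tmp/myImage.tar

# Inspect the result: -l lists details and -h prints a human-readable size.
ls -lh /tmp/myImage.tar
```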

The file can then be sent to another computer running Docker, and be loaded with the following command:

cat myImage.tar | docker load

The command cat prints the content of the file, which is piped into the docker load command using the pipe operator |. In this way, the Docker image can easily be distributed and loaded into other systems, and it is guaranteed to work since everything is included in the image.
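The pipe operator is equally generic: the standard output of the left command becomes the standard input of the right command, which is exactly how cat feeds the saved image into docker load. A small sketch with wc standing in for docker load (the file is just an illustrative stand-in; docker load can also read the file directly via its -i flag):

```shell
# Create a small file to play the role of the saved image.
printf 'line one\nline two\n' > /tmp/pipe-demo.txt

# cat writes the file to standard output; wc -l reads standard input
# through the pipe and counts the lines it receives.
cat /tmp/pipe-demo.txt | wc -l
```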