How to Split Docker Images for Slow Networking Connections

February 17, 2016. 576 words.

When running docker at small scale, it is often not economically feasible to set up a performant docker registry. Nevertheless, operators want to keep certain parts of a docker image private.

Combining docker containers with data-only docker images, mounted as volumes, makes it possible to keep the large public images at a public registry or on GitHub. The (often much smaller) private images can then be served from a private registry, eliminating the need for a highly performant - and expensive - network connection.

Rationale

Imagine, for instance, that you are responsible for packaging a website. If you use docker, the image for the web or application server is already public. There is no need to keep a private copy of nginx.

Configuration and data are not necessarily secret - a secret website tends to be a contradiction in terms. However, you might wish to keep pre-production data private and choose not to publish your configuration.

Using docker, you can package both configuration and data into the docker image for your webserver. Such an image tends to grow in size, though, quickly saturating a not-so-performant upstream network link.

Docker Images for Data or Configuration

Using the --volumes-from option, it is possible to mount volumes from one docker container into another. It is rather easy to package some files into a docker image using the special (empty!) image scratch:

FROM scratch
COPY _site/ /usr/share/nginx/html/
VOLUME /usr/share/nginx/html/

This results in a docker image containing only those files: a data-only image. Likewise, you can use such images to package configuration, analogous to the configuration packages used with Red Hat Kickstart or Solaris JumpStart.
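
The same recipe can be reused for configuration. The following is only a sketch: the function name, the file name Dockerfile.conf and the image tag conf_image are hypothetical, and it assumes a local directory nginx/ holding a complete nginx configuration, since a volume mounted at /etc/nginx hides whatever the nginx image ships there.

```shell
#!/bin/sh
# Sketch: write a data-only Dockerfile for any source directory and
# target path. All names used here are illustrative assumptions.

# write_volume_dockerfile SRC_DIR TARGET_PATH OUTFILE
write_volume_dockerfile() {
    src_dir=$1
    target=$2
    outfile=$3
    # Same three directives as the data image above, parameterised.
    cat > "$outfile" <<EOF
FROM scratch
COPY ${src_dir}/ ${target}/
VOLUME ${target}/
EOF
}

# Usage (needs a docker daemon, so it is not run here):
#   write_volume_dockerfile nginx /etc/nginx Dockerfile.conf
#   docker build -f Dockerfile.conf -t conf_image .
```

SRC_DIR has to be a path relative to the build context, as with any COPY source.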

Creating a Data or Configuration Container

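From such an image, containers are created using the primitive docker create. Using docker run is problematic: it fails on images built from scratch, which contain no binaries to execute. It is therefore not possible to chain docker run properly in a unit file, where the return status of each step matters. The /bin/true argument below is only a placeholder command, which docker create requires because the image defines none; it is never executed.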

ExecStartPre=/usr/bin/docker create  \
                  --name data_prod_cruwe_de \
                  -v /usr/share/nginx/html \
                  <dataimage> /bin/true
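
Because return statuses matter here, it helps to know that docker create also returns a non-zero status when a container with the requested name already exists, for example on a second start of the unit. One possible way around this, sketched under the assumption that reusing the existing container is acceptable, is to check for the name first. The function name is hypothetical.

```shell
#!/bin/sh
# Sketch: create a volume container only if it does not exist yet.
# The function name and its arguments are illustrative assumptions.

# ensure_volume_container NAME IMAGE VOLUME_PATH
ensure_volume_container() {
    name=$1
    image=$2
    volume_path=$3
    # docker inspect exits non-zero when no such container exists.
    if docker inspect --type=container "$name" >/dev/null 2>&1; then
        echo "container $name already exists, reusing it"
        return 0
    fi
    # /bin/true is a placeholder command; it is never executed.
    docker create --name "$name" -v "$volume_path" "$image" /bin/true
}

# Usage (needs a docker daemon, so it is not run here):
#   ensure_volume_container data_prod_cruwe_de <dataimage> /usr/share/nginx/html
```

Note that a reused container keeps its old content; publishing a new data image still requires removing the old container first.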

Using Docker Volumes in Other Docker Containers

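The volume created (/usr/share/nginx/html) can then be used by a container running nginx. The second --volumes-from refers to a configuration container, conf_prod_cruwe_de, which is created in the same way in the full unit file of the next section: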

ExecStart=/usr/bin/docker run \
               --name nginx_prod_cruwe_de \
               -p 8080:80 \
               --volumes-from data_prod_cruwe_de \
               --volumes-from conf_prod_cruwe_de \
               nginx
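
To confirm that a volume actually arrived in the nginx container, the mount list reported by docker inspect can be checked. This is a small sketch; the function name is hypothetical, and it assumes a docker version whose inspect output contains a Mounts list.

```shell
#!/bin/sh
# Sketch: succeed if CONTAINER has a mount at exactly PATH.
# The function name is an illustrative assumption.

# has_mount CONTAINER PATH
has_mount() {
    # Print one mount destination per line, then look for an exact match.
    docker inspect --format '{{range .Mounts}}{{println .Destination}}{{end}}' "$1" \
        | grep -qxF "$2"
}

# Usage (needs a docker daemon, so it is not run here):
#   has_mount nginx_prod_cruwe_de /usr/share/nginx/html && echo "data volume mounted"
```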

Chaining Volume Images for Container Startup

Chaining is then trivial. nginx is pulled from the public docker registry over the public internet, while data and configuration are taken from a private, low-bandwidth docker registry, which two or three megabytes will not saturate. In the unit file below, <dockerimage> stands for the respective private data or configuration image.

[Service]
ExecStartPre=/usr/bin/docker pull nginx

ExecStartPre=/usr/bin/docker create  \
                  --name data_prod_cruwe_de \
                  -v /usr/share/nginx/html \
                  <dockerimage> /bin/true

ExecStartPre=/usr/bin/docker create  \
                  --name conf_prod_cruwe_de \
                  -v /etc/nginx \
                  <dockerimage> /bin/true

ExecStart=/usr/bin/docker run \
                  --name nginx_prod_cruwe_de \
                  -p 8080:80 \
                  --volumes-from data_prod_cruwe_de \
                  --volumes-from conf_prod_cruwe_de \
                  nginx
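
Before committing the chain to a unit file, it can be rehearsed from a shell. The following sketch mirrors the steps above and stops at the first failing command, much as a failing ExecStartPre step prevents the later ones from running. The function name and the two image arguments are assumptions standing in for the <dockerimage> placeholders.

```shell
#!/bin/sh
# Sketch: the unit file's start-up chain as a shell function.
# Container names follow the article; the function name and the
# image arguments are illustrative assumptions.

# start_site DATA_IMAGE CONF_IMAGE
start_site() {
    data_image=$1
    conf_image=$2

    # Each step runs only if the previous one succeeded.
    docker pull nginx &&
    docker create --name data_prod_cruwe_de \
        -v /usr/share/nginx/html "$data_image" /bin/true &&
    docker create --name conf_prod_cruwe_de \
        -v /etc/nginx "$conf_image" /bin/true &&
    docker run --name nginx_prod_cruwe_de -p 8080:80 \
        --volumes-from data_prod_cruwe_de \
        --volumes-from conf_prod_cruwe_de \
        nginx
}

# Usage (needs a docker daemon, so it is not run here):
#   start_site <dataimage> <confimage>
```

As in the unit file, docker run stays in the foreground, so the function returns only when nginx exits.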

Constructing more complex examples is left as an exercise to the reader.