Docker volumes: How to protect your app’s data

Docker containers are designed to be short-lived. If a new version of an application is released, we don’t log in to our container and run apt upgrade to install the newer version. We simply replace it with a different container running the more recent version of the application. Docker volumes are essential to keeping containers short-lived while keeping their data persistent.

Docker is designed to package an application and its dependencies in a new way: the container. Before containers were the norm on many servers, people often had issues with upgrades. For example, a developer upgraded their site from PHP 5 to PHP 7 but failed to update all the required PHP modules, and then things fell apart. Developers also struggled with having different environments on their local machine and the deployment server due to slightly different packages.

Docker is supposed to bridge these differences. For a PHP 7 environment, you start with a clean-slate environment where you develop and test your application. You then package that into a Docker image, which spawns the same container regardless of where you run it. Dropping into a container and upgrading software from inside is an anti-pattern: You no longer have a clean slate, and you defeat the very purpose of using Docker.

An application is more than its environment and code, however. There’s valuable data, media files, a user database, posts, and more, all of which need to be kept around across upgrades. If you remove an old container, you also lose all its data unless you decouple the data from the container. Docker volumes are how you do that.

What are Docker volumes?

Docker volumes persistently store data, which can be made available to containers. For example, if you create a volume named my-vol, Docker creates a directory for it at /var/lib/docker/volumes/my-vol on your host system (the default location on Linux). You don’t need to know the exact location on the host, as Docker manages the volume for you and allows you to mount it on any new container.
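If you are curious where a volume lives on the host, one way to check is docker volume inspect. The sketch below wraps it in a small helper. The function name volume_mountpoint is made up for this example, and the path it prints depends on how Docker is installed on your machine.

```shell
# Hypothetical helper (not part of Docker): print the host directory
# that backs a named volume, as reported by `docker volume inspect`.
volume_mountpoint() {
    docker volume inspect --format '{{ .Mountpoint }}' "$1"
}

# Usage (assumes the volume already exists):
#   volume_mountpoint my-vol
```

You rarely need this path in day-to-day work, but it can help when you want to confirm that a volume really exists outside any container.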

The container can write data to this volume, which persists even after you remove the container. Let’s create a couple of containers, one without a volume and another with one, to see the difference.

Ephemeral storage without volumes

To simulate user data, we will log in to the container, create a directory /app-v1/content/, and in it write a file data.txt with the message “Hello, World!” in it.

$ docker run -dit --name container1 ubuntu:16.04
$ docker attach container1
# mkdir -p /app-v1/content
# echo 'Hello, World!' >> /app-v1/content/data.txt

If you stop and restart this container using docker stop container1 and docker start container1, the data survives, and you can see that for yourself. However, if you remove the container with docker rm -f container1, the data.txt file is lost forever. Even if you create a new container1, you won’t get the same data back.
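The same experiment can be scripted without attaching to the container. This is a rough sketch: the function name ephemeral_demo and the default container name scratch1 are assumptions made for this example (chosen so it does not clash with container1), and it relies on docker exec returning a non-zero status when the command inside the container fails.

```shell
# Hypothetical helper: write a file in a container, remove the container,
# create a new one with the same name, and check whether the file is still there.
ephemeral_demo() {
    name="${1:-scratch1}"   # assumed throwaway container name
    docker run -dit --name "$name" ubuntu:16.04
    docker exec "$name" sh -c 'mkdir -p /app-v1/content && echo "Hello, World!" >> /app-v1/content/data.txt'
    docker rm -f "$name"
    docker run -dit --name "$name" ubuntu:16.04
    # The new container starts from the image again, so this cat is expected to fail.
    if docker exec "$name" cat /app-v1/content/data.txt; then
        echo "data survived"
    else
        echo "data is gone"
    fi
    # Clean up the throwaway container.
    docker rm -f "$name"
}

# Usage:
#   ephemeral_demo
```

Without a volume, the last line printed should report that the data is gone, since nothing links the second container to the files written in the first.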

Persistent storage with volumes

We can create volumes while creating the containers themselves, or ahead of time using the docker volume create command:

$ docker volume create my-vol
$ docker volume ls

The second command lists all the volumes on your Docker host. You can then mount this volume to a given location inside the container’s filesystem, while creating the container. To do this, we use the -v flag followed by a list of parameters separated by a colon (:). These parameters are:

  • The volume name as shown by docker volume ls
  • The mount point inside the container file system. If that directory doesn’t exist, Docker will create it.
  • (Optional) Flags such as ro, which gives the container read-only access to the data that resides inside this volume.

$ docker run -dit --name container2 -v my-vol:/app-v1/content ubuntu:16.04
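To make the colon-separated format concrete, here is a minimal sketch that assembles the value passed to -v from its parts. The helper name volume_spec is hypothetical and only does string formatting; Docker itself does the mounting.

```shell
# Hypothetical helper: join the -v parameters with colons.
#   $1 = volume name
#   $2 = mount point inside the container
#   $3 = optional flags, for example ro
volume_spec() {
    if [ -n "${3:-}" ]; then
        printf '%s:%s:%s\n' "$1" "$2" "$3"
    else
        printf '%s:%s\n' "$1" "$2"
    fi
}

# Usage: list the volume's contents from a throwaway container, mounted read-only.
#   docker run --rm -v "$(volume_spec my-vol /app-v1/content ro)" ubuntu:16.04 ls /app-v1/content
```

With the ro flag in place, attempts to write under /app-v1/content from inside that container should be rejected, which is handy when a container only needs to read the data.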

If the volume my-vol exists, that particular volume and its data are made available to the new container at the /app-v1/content directory. If the volume doesn’t exist, then a new one is created by Docker. You can try creating a sample data.txt file inside this container at this location.

$ docker attach container2
# echo "Hello, World! From a Docker volume." >> /app-v1/content/data.txt
# exit

Let’s upgrade container2 from Ubuntu 16.04 to Ubuntu 18.04. To do so, stop and remove this container, then create a new one with the same volume. We’ll mount my-vol at /app-v2/content to indicate a higher version.

$ docker stop container2
$ docker rm container2
$ docker run -dit --name container3 -v my-vol:/app-v2/content ubuntu:18.04
$ docker attach container3
# cat /app-v2/content/data.txt
Hello, World! From a Docker volume.

You can see that the data.txt file and its contents are preserved. Now let’s look at some real-world scenarios where volumes can help.

Where to use Docker volumes

The most obvious use case for Docker volumes would involve storing a website’s contents. For example, if you are planning on running a WordPress site, then create a volume for the site’s configuration, themes, plugins, and more, and mount it at /var/www/wordpress/content. You’ll also want to create a volume for the database that stores user data and your posts.

By attaching these volumes to the WordPress container, you can freely remove the container and spin up a new one without losing any of your data.

There are a few other good uses for Docker volumes. One is configuration files for an Apache/Nginx web server, or for any other application, really. You can create a new volume and mount it at /etc/apache2 on your Apache2 container. This saves valuable time, as you won’t have to reconfigure Apache every time a new version is released.

Databases are another great example. Since the entire point of a database is to store user information, volumes are of enormous importance here. If your MySQL container is writing data to a given volume, and you wish to try MariaDB, you can mount the same volume inside a new MariaDB container, provided the two versions use compatible data formats. The databases, users, and passwords will all stick around, and you can get going with MariaDB in no time.
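As a rough sketch of that swap, the helper below replaces a MySQL container with a MariaDB one on the same volume. Everything named here is an assumption for illustration: the volume db-data, the container names db-mysql and db-mariadb, and the mariadb:10.3 tag. It also assumes the original container was started from the official mysql image with the volume mounted at /var/lib/mysql, which is where both official images keep their data. Whether a given MariaDB release can read a given MySQL data directory depends on the versions involved, so try this on a copy of your data first.

```shell
# Hypothetical helper: swap a MySQL container for a MariaDB one that
# reuses the same named volume. Assumes the old container was started
# along the lines of:
#   docker run -d --name db-mysql -e MYSQL_ROOT_PASSWORD=... \
#       -v db-data:/var/lib/mysql mysql:5.7
swap_mysql_for_mariadb() {
    volume="${1:-db-data}"   # assumed volume name
    # Stop MySQL cleanly and remove its container before another server
    # touches the files; the volume itself is left in place.
    docker stop db-mysql &&
        docker rm db-mysql &&
        docker run -d --name db-mariadb -v "$volume":/var/lib/mysql mariadb:10.3
}

# Usage:
#   swap_mysql_for_mariadb db-data
```

Because the data directory is already initialized, the new container should pick up the existing databases and accounts rather than creating fresh ones.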

It’s also important to put your WordPress database in a volume, so you don’t accidentally delete all your posts!

Using Docker volumes with Ghost CMS

Let’s try a quick example that closely mirrors a real-world scenario. This involves running Ghost, a content management system with an elegant design and easy-to-use interface. The official Docker image for Ghost is configured to listen on port 2368, which you can map onto port 80 of your VPS if you like. Be sure to check out our Ghost installation tutorial, too.

Ghost version 1.0 and above stores all of its content at /var/lib/ghost/content, so we can mount a new volume ghost-content at that location. Begin with:

$ docker run -dit --name ghost-web-v1 -p 80:2368 -v ghost-content:/var/lib/ghost/content ghost:1

You can then visit http://IP_ADDRESS/ghost for initial setup. Here, IP_ADDRESS is just a placeholder for the actual IP address of your VPS. Create a new user and add some content to this new website.

I created a sample post as user “John Doe” titled “Docker Volumes”:

[Image: Docker volumes: An example using the Ghost CMS]

Let’s stop this container and pull a different image. As mentioned in the official documentation, the default image is built on a Debian image, but we can use a much smaller Alpine-based image as well. It still expects the data to be at /var/lib/ghost/content, so we mount the same volume ghost-content at this mount point.

$ docker stop ghost-web-v1
$ docker run -dit --name ghost-web-alpine -p 80:2368 -v ghost-content:/var/lib/ghost/content ghost:1-alpine
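Before logging back in, it can be worth confirming that the new container really is using the ghost-content volume, since a typo in the volume name would make Docker quietly create a fresh, empty volume instead. One way to check is docker inspect. The helper name container_mounts below is made up for this sketch.

```shell
# Hypothetical helper: print a container's mounts as JSON so you can
# check which volume is attached and where it is mounted.
container_mounts() {
    docker inspect --format '{{ json .Mounts }}' "$1"
}

# Usage:
#   container_mounts ghost-web-alpine
```

In the output, look for a "Name" of ghost-content and a "Destination" of /var/lib/ghost/content.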

Now when you log back into your website, you will notice that your user credentials are unchanged. You will use the same email and password to log in, and the sample post you created (like the one I created above) will still exist.

You changed the image underneath the website without losing any data! You can try upgrading to version 2.0.3 yourself, which its developers released in August 2018.

Docker volumes are crucial to any Docker-based workflow. You can configure and automatically deploy them with docker-compose files, but it’s good to spend some time creating and connecting them to containers manually—that way, you’ll be better prepared in case anything goes wrong. Have fun and experiment!

Oh, and make sure you back them up. More on that soon.
