How to dockerize a PostgreSQL database

In this article, we will walk through how to dockerize a postgres database. If you are unfamiliar with docker, see this quick start.

We will work off an existing postgres database installed locally. First, let’s connect to our local database and create a dump.sql backup of it using pg_dump, which takes the following form:

$ pg_dump -h {host} -p {port} -U {username} {database} > ./dump.sql

For this tutorial, I’m using a database called topstack on localhost port 5432 with user postgres. We’ll create a backup SQL script with:

$ pg_dump -h localhost -p 5432 -U postgres topstack > ./dump.sql
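Before moving on, it can be worth checking that the dump actually contains something: because the shell creates dump.sql before pg_dump runs, a failed connection or login can leave an empty file behind. Here is a minimal sketch of such a check. The helper name check_dump is made up for this example, and looking for a CREATE statement is only a rough sign that the schema was dumped.

```shell
# check_dump is a hypothetical helper, not part of pg_dump or docker.
# It succeeds only if the file exists, is non-empty, and contains at
# least one line starting with CREATE.
check_dump() {
  dump_file="$1"
  if [ ! -s "$dump_file" ]; then
    echo "missing or empty: $dump_file" >&2
    return 1
  fi
  if ! grep -q '^CREATE ' "$dump_file"; then
    echo "no CREATE statements found in: $dump_file" >&2
    return 1
  fi
  echo "ok: $dump_file"
}
```

With the function defined, the check for our backup would be: check_dump ./dump.sql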

We now have a backup of our existing database in the form of a SQL script, dump.sql. We’ll use this to import the data into our postgres docker container. First, let’s define our docker image with a Dockerfile that will sit in the same directory as dump.sql:

FROM postgres:9.6

ENV POSTGRES_DB topstack

ADD ./dump.sql /docker-entrypoint-initdb.d/

VOLUME [ "/var/lib/postgresql/data" ]

EXPOSE 5432

Let’s look a bit closer at what this Dockerfile is doing. The first line, FROM postgres:9.6, defines the base image that we will build on. In our case we are using the official postgres image with postgres version 9.6. If the version is omitted, docker will pull the image tagged latest.

ENV POSTGRES_DB topstack sets an environment variable that the postgres image uses for the name of the database it creates on first start.

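ADD ./dump.sql /docker-entrypoint-initdb.d/ copies our SQL script backup from our host machine into the directory that the postgres image’s startup script runs files from when it initializes a new database. For us, this will import our backed up data into the postgres database on our docker container. Note that these scripts only run when the data directory is empty, so they are skipped if the container starts with a volume that already holds a database.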

VOLUME [ "/var/lib/postgresql/data" ] tells docker this directory is a volume, meaning that we can mount the directory along with its data on our host machine. We will use this to persist data from our container on the host machine so if the container is removed, the data will remain on our host machine as a docker volume.

Finally, EXPOSE 5432 tells docker that the container listens on port 5432, the default port for postgres. EXPOSE on its own does not make the port reachable from the host; we will publish it with the -p flag when we run the container.
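As a side note, the same Dockerfile can be written with COPY instead of ADD and with the key=value form of ENV. The Dockerfile documentation generally suggests COPY for plain local files, since ADD has extra behavior for URLs and archives that we do not need here. The sketch below writes that variant from the shell; the helper name write_dockerfile is made up for this example, and the result is meant to behave the same as the file above.

```shell
# write_dockerfile is a hypothetical helper that writes an equivalent
# Dockerfile into the given directory (default: current directory).
write_dockerfile() {
  target_dir="${1:-.}"
  cat > "$target_dir/Dockerfile" <<'EOF'
FROM postgres:9.6

# key=value is the current ENV syntax; the space-separated form is legacy
ENV POSTGRES_DB=topstack

# COPY is enough for a plain local file
COPY ./dump.sql /docker-entrypoint-initdb.d/

VOLUME [ "/var/lib/postgresql/data" ]

EXPOSE 5432
EOF
}
```

Calling write_dockerfile . would overwrite the Dockerfile in the current directory, so only use it if you want this variant in place of the one above.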

We can now build our docker image using the Dockerfile:

$ docker build -t topstack-db-image -f ./Dockerfile .

Let’s break down this command. docker build tells docker to build an image, -t topstack-db-image tags that image with the name ‘topstack-db-image’, and -f ./Dockerfile tells docker to build the image from the file called Dockerfile that we created above. The final . sets the build context to our current directory, which is where docker looks for dump.sql.

We now have a built docker image called topstack-db-image. We should be able to see the image listed, along with our base images, if we run the command:

$ docker images

Before we create a container from our image, let’s create a docker volume that will keep our postgres data on our host machine so that it will not be lost if the container is removed. We will name the volume ‘topstack’.

$ docker volume create topstack

Now that we have our volume made and our image is ready to go, we are ready to spin up a docker container.

$ docker run -d -v topstack:/var/lib/postgresql/data --name topstack-db-container -p 12345:5432 topstack-db-image

The docker run command tells docker to start up a container from the image we built from our Dockerfile. The distinction between a docker image and a docker container can be thought of like using VMware or VirtualBox to run a virtual machine. The ISO file that defines the virtual machine ‘image’ is like the docker image. When you open that ISO file and run an instance of the virtual machine, it is like executing the docker run command and telling docker to run an instance of the docker image in a new container. In short, a docker container is a particular instance of a docker image.

The -d flag tells docker to start the container in detached mode, which will run it in the background. -v topstack:/var/lib/postgresql/data is how we map the volume we created on our host machine, topstack, to the volume we defined in our image, /var/lib/postgresql/data. This way all our postgres data in the container is stored in the docker volume on our host machine.

--name topstack-db-container gives the container a name. -p 12345:5432 tells docker to map port 12345 of our host machine to port 5432 in our container, which we exposed in the Dockerfile. This will be our entry point for the database in the container. Finally, we must specify the name of the image we are running the container from, here it’s topstack-db-image. NOTE: you don’t have to name your image with -image or your container with -container, I only did that to make the distinction between the two clear for this tutorial.
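On its first start the container needs a moment to initialize the database and import dump.sql, so connecting right away may fail. Here is a minimal sketch of a retry helper that waits until a command succeeds. The helper name wait_until is made up for this example; pg_isready is a real postgres client tool that ships in the official image, but treat its answer as a rough signal, since the server may be briefly reachable inside the container while the import is still running.

```shell
# wait_until is a hypothetical retry helper: it runs the given command
# up to <tries> times, one second apart, and succeeds as soon as the
# command does. Output of the command is discarded.
wait_until() {
  tries="$1"
  shift
  attempt=1
  while [ "$attempt" -le "$tries" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    if [ "$attempt" -lt "$tries" ]; then
      sleep 1
    fi
    attempt=$((attempt + 1))
  done
  return 1
}
```

For our container, that could be used as: wait_until 30 docker exec topstack-db-container pg_isready -U postgres

If the container exits right after starting, docker logs topstack-db-container will show why. As far as I know, recent builds of the official postgres image refuse to initialize a new database unless a superuser password is passed, for example with -e POSTGRES_PASSWORD=... on the docker run command.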

Now we can see our container running:

$ docker ps

From the output of that command, you can see that docker is forwarding port 12345 on our host machine to port 5432 on the container. This means that our newly dockerized postgres database is listening on localhost:12345. Note that this is not an http address, since postgres speaks its own protocol.

For example, we can use the following JDBC URL:

jdbc:postgresql://localhost:12345/topstack
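The same host, port and database name work from the command line with psql, which accepts a connection URI of the form postgresql://user@host:port/database. Here is a small sketch that assembles that URI. The helper name pg_uri is made up for this example, and it assumes psql is installed on the host, which should be the case if postgres is installed locally.

```shell
# pg_uri is a hypothetical helper that assembles a postgres connection
# URI from a user, host, port and database name.
pg_uri() {
  user="$1"
  host="$2"
  port="$3"
  db="$4"
  printf 'postgresql://%s@%s:%s/%s\n' "$user" "$host" "$port" "$db"
}
```

To list the tables imported into the container, we could then run: psql "$(pg_uri postgres localhost 12345 topstack)" -c '\dt'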

We can add/edit data in the postgres database on the container and it will persist to our host machine.

There are many advantages to using a dockerized database. We can run multiple postgres databases and map each one to a different host port. We can quickly spin up snapshots of a database for testing or rapid development. We have an isolated instance of our database. Plus much more!