Dockerize & Backup A Postgres Database

What is this?

This is a technical guide for doing common management tasks with a Postgres database running in Docker.

Why make this?

As a quick reference for taking db backups and restoring databases.

Steps

R…



What is this?

This is a technical guide for doing common management tasks with a Postgres database running in Docker.



Why make this?

As a quick reference for taking db backups and restoring databases.



Steps



Run The Postgres Image

Docker makes it incredibly easy to get going with a Postgres database with persistent storage. Let’s try it out.

  • Let’s just run the Postgres container pointing its volume location to a spot on disk we like. For me that’s: ~/docker/volumes/postgres.
# In your terminal
docker run \
 --name pg-db \
 -e POSTGRES_PASSWORD=shiny_metal_pass \
 -d \
 -p 6543:5432 \
 -v $HOME/docker/volumes/postgres:/var/lib/postgresql/data \
 postgres
  • Let’s dissect this line by line:

    • docker run — Runs a specified image (in our case it’s postgres, see last line of command). Source
    • --name pg-db — Sets the name of the container running the postgres image. Source
    • -e POSTGRES_PASSWORD=shiny_metal_pass — Sets an environment variable which will be the default password for the superuser. Note: This should be a secret 🤫. Source
    • -d — Detach from the terminal instance. i.e. run in the background.
    • -p 6543:5432 — Map the local machine’s port 6543 to the container’s port 5432. So from our machine to connect we will use 6543 which in turn will hit the port in the container running on 5432.
    • -v $HOME/docker/volumes/postgres:/var/lib/postgresql/data — Map any data created in the container at /var/lib/postgresql/data to a desired location on local disk: $HOME/docker/volumes/postgres. Source
    • postgres — The name of the image in Docker hub we want to create a container from.
  • When I run it I see the following

create-container

  • When I check our volume location (note: for me this was an empty location prior to invoking the command above) I see this:

check-mount

  • Finally, let’s see if we can connect to our database. I am using TablePlus but the same concepts apply with any general purpose database IDE or psql (found here). For me the connection looks like this:

connect-sql

  • And when we connect we can run the following to get a lay of the land.
-- From a SQL Editor (PSQL / TablePlus Session)
SELECT datname FROM pg_database;
-- RESULTS:
-- postgres
-- template1
-- template0



Create Some Persistent Data

So, we’ve got a database! What now? Well let’s store some data in it!

  • Start by creating a new database called todo_it_well. You guessed it this, YATA (yet another todo app). This time we are doing it in SQL later we will also show you how to use the createdb tool.
-- From a SQL Editor
CREATE DATABASE todo_it_well;
-- Query 1 OK: CREATE DATABASE
  • Switch to our new database. For me, in TablePlus, this just means editing my connection to use todo_it_well instead of postgres for the chosen database.

connect-todo-sql

  • Next, create some tables. We will, for this demonstration, create a users table and tasks table so that users can track their tasks or “todos”. The only almost complex bit of this is declaring a relationship between tasks and users. This post is not about SQL so don’t think too hard on these commands for now just know that they make a viable todo application.
-- From a SQL Editor
CREATE TABLE "user" (
   id INT SERIAL KEY NOT NULL,
   created_at TIMESTAMP NOT NULL DEFAULT NOW(),
   updated_at TIMESTAMP NOT NULL DEFAULT NOW(),
   email TEXT NOT NULL,
   "password" TEXT NOT NULL,
   display_name TEXT
);

CREATE TABLE "task" (
   id INT SERIAL KEY NOT NULL,
   created_at TIMESTAMP NOT NULL DEFAULT NOW(),
   updated_at TIMESTAMP NOT NULL DEFAULT NOW(),
   owner_id INT NOT NULL,
   description TEXT NOT NULL,
   label TEXT,
   completed_on TIMESTAMP,
   removed_on TIMESTAMP,
   CONSTRAINT fk_owner 
    FOREIGN KEY(owner_id) 
    REFERENCES "user"(id)
);
-- Query 1 OK: CREATE TABLE
-- Query 2 OK: CREATE TABLE
  • Before we finish this step let’s create some real data in here! Let’s create some users first and then some tasks for them!
-- From a SQL Editor
INSERT INTO "user"
  ("email", "password", "display_name")
VALUES 
  ('matt@email.fun', 'password-gasworld', 'matt-the-magic-monkey'),
  ('jennie@yahoo.fun', 'password-sassworld', 'dat-girl-jen'),
  ('timothy24@email.fun', 'pw-vw', 'yo-boi-tom-o-tee');
-- users created with ids: 1, 2 and 3
  • Next, the tasks:
-- From a SQL Editor
INSERT INTO "task"
   ("owner_id", "description", "label")
VALUES 
   (1, 'read this awesome post', 'tech'),
   (1, 'do dishes', NULL),
   (2, 'read chapter of lotr', 'lit'),
   (2, 'sign up to volunteer', NULL),
   (3, 'call brother', NULL),
   (3, 'make grocery list', 'health'),
   (3, 'rake leaves', 'house');
  • Validate that our data looks as expected:
-- From a SQL Editor
SELECT 
   u.display_name, 
   t.description
FROM 
   "user" u 
INNER JOIN
   "task" t ON u.id = t.owner_id
WHERE 
   u.id = 2
   AND t.completed_on IS NULL;

-- RESULTS:
-- display_name description
-- dat-girl-jen read chapter of lotr
-- dat-girl-jen sign up to volunteer



Create A Backup

So here we are with some valuable customer data that we really do not wanna lose. A good practice for any developer is to take regular (hopefully automated) database backups. Doing this in Postgres isn’t too hard so let’s get to it. I should state there are several ways to do this. This is just one of many ways to get this job done. I think it is the most straightforward way; though, there are definitely faster ways.

This approach consists of issuing a set of commands to the container that could likely be CRON job-ified without too much effort.

  • First let’s create a backups directory at a spot that we share with our container’s filesystem and our host machine:
# In your terminal
docker exec -t pg-db bash -c 'mkdir /var/lib/postgresql/data/backups'
  • This command (run from the host machine) executes a command (-c) mkdir on the container and creates a directory called backups which is located at the spot on the container that is shared with the host machine. You can verify this on the host by running:
# In your terminal
ls $HOME/docker/volumes/postgres | grep backups
# my machine outputs: backups
  • Next, let’s create our backup
# In your terminal
docker exec -t pg-db bash -c 'pg_dump todo_it_well -U postgres  --file=/var/lib/postgresql/data/backups/todo_it_well_db-$(date +%Y-%m-%d).sql'

Let’s breakdown this command:

  • docker exec -t pg-db bash -c

    • We tell Docker we want to run a command on the pg-db container
  • pg_dump todo_it_well -U postgres

    • We invoke the pg_dump tool on the container telling it to target the todo_it_well db and run commands as the postgres user.
  • --file=/var/lib/postgresql/data/backups/todo_it_well_db-$(date +%Y-%m-%d).sql

    • This is the spot on the container we want to place our back up at and as you can see it is the shared backups directory we just made. There is a little date logic that interpolates the current date as YEAR-MONTH-DAY.
  • If this went as planned we should have made our database backup and we can verify it on our host machine by running:

# In your terminal
ls $HOME/docker/volumes/postgres/backups
# my machine outputs: todo_it_well_db-2021-11-07.sql
cat $HOME/docker/volumes/postgres/backups/todo_it_well_db-2021-11-07.sql
# output includes: -- PostgreSQL database dump
  • If you see some SQL in your output then congratulations! You have successfully created a database backup and your data strategy for your awesome app is just the tiniest bit more durable. 🎉



Drop Data And Restore It

Great, so our data has been backed up! This means that you can restore folks back to a particular point in time when things were as they should be. So long as you have an application that can reliably serve this data (ruminates on articles using Goose to accomplish database versioning) you can restore functionality and value to your users.

BUT, backing up your data is only half the battle. Actually restoring it is another matter. Let’s practice the act of restoring our data.

First, I should confess that this article commits a sin of simplicity in that I wanted to get going as simply as possible. You more than likely should not place your backups in the exact same spot as the rest of your mounted volume data, but it was useful in getting us going faster.

  • So to start, we are going to put our backup temporarily out of this spot so that we can pretend we had mounted two volumes:
# In your terminal
# Let's move our data to our home directory and pretend this is where we safely mounted our backups when starting the container, ok?
cp \
$HOME/docker/volumes/postgres/backups/todo_it_well_db-2021-11-07.sql \
$HOME/todo_it_well_db-2021-11-07.sql
  • Next, verify that our data is where we expect it to be:
# In your terminal
ls $HOME | grep todo_it_well_db-2021-11-07.sql
# my machine outputs todo_it_well_db-2021-11-07.sql
  • Ok cool, we are pretending the we have a spot outside the normal Postgres files where we keep our data safely backed up. Now, the fun part. OMG! We just got attacked 😰. All records were destroyed by the evil hacker! Let’s stop our container and delete all of our persistent data.
# In your terminal
# Stop the container forcibly
docker rm -f pg-db
# Delete our persistent data
rm -rf $HOME/docker/volumes/postgres
  • Verify the data is indeed gone.
# In your terminal
ls $HOME/docker/volumes/postgres
# my machine outputs: No such file or directory
  • Now we have discovered the attack and want to bring our application back online. So we start up our container as before:
# In your terminal
docker run \
 --name pg-db \
 -e POSTGRES_PASSWORD=shiny_metal_pass \
 -d \
 -p 6543:5432 \
 -v $HOME/docker/volumes/postgres:/var/lib/postgresql/data \
 postgres
  • However, sadly, our data, including our database, is gone and we cannot connect to the todo_it_well database.

connect-fail-sql

  • Fear not, 🦸‍♀️ we have a backup! Simply copy the backup into the container:
# In your terminal
docker cp $HOME/todo_it_well_db-2021-11-07.sql pg-db:/var
  • Next, you will need to recreate the database (so your backup data has a place to restore to):
# In your terminal
docker exec -t pg-db bash -c "createdb \
-h localhost \
-p 5432 \
-U postgres \
todo_it_well"
  • Finally, run the psql SQL file command and take a deep breath, you are recovering from a very stressful attack 😉.
docker exec -t pg-db bash -c \
"psql \
-U postgres \
-d todo_it_well \
-f /var/todo_it_well_db-2021-11-07.sql"
  • Lastly, let’s verify that our data was properly restored by running our query from earlier:
-- From a SQL Editor
SELECT 
   u.display_name, 
   t.description
FROM 
   "user" u 
INNER JOIN
   "task" t ON u.id = t.owner_id
WHERE 
   u.id = 2
   AND t.completed_on IS NULL;

-- RESULTS
-- display_name description
-- dat-girl-jen read chapter of lotr
-- dat-girl-jen sign up to volunteer

There you have it! You have done a ton! You set up a dockerized Postgres DB, you created you schemas and planted some data in your database, you took a back up of the data and then restored the data from back up after losing it. You rock 🎸.



Closing Thoughts

There are some issues with our set up and certainly there are some ease of use and durability improvements we could make, but this is a fully working solution. Some of the areas we should checkout:

  • Custom Dockerfile that includes a backups location.
  • A run command that references the previously mentioned backups location mount.
  • A cron job run on the host or on the container that creates backups.
  • A secure, but accessible spot for these backups to stored. Maybe DigitalOcean Spaces, why not 🦀?



Resources


Print Share Comment Cite Upload Translate
APA
Matthew Cale | Sciencx (2024-03-28T15:34:40+00:00) » Dockerize & Backup A Postgres Database. Retrieved from https://www.scien.cx/2021/11/17/dockerize-backup-a-postgres-database/.
MLA
" » Dockerize & Backup A Postgres Database." Matthew Cale | Sciencx - Wednesday November 17, 2021, https://www.scien.cx/2021/11/17/dockerize-backup-a-postgres-database/
HARVARD
Matthew Cale | Sciencx Wednesday November 17, 2021 » Dockerize & Backup A Postgres Database., viewed 2024-03-28T15:34:40+00:00,<https://www.scien.cx/2021/11/17/dockerize-backup-a-postgres-database/>
VANCOUVER
Matthew Cale | Sciencx - » Dockerize & Backup A Postgres Database. [Internet]. [Accessed 2024-03-28T15:34:40+00:00]. Available from: https://www.scien.cx/2021/11/17/dockerize-backup-a-postgres-database/
CHICAGO
" » Dockerize & Backup A Postgres Database." Matthew Cale | Sciencx - Accessed 2024-03-28T15:34:40+00:00. https://www.scien.cx/2021/11/17/dockerize-backup-a-postgres-database/
IEEE
" » Dockerize & Backup A Postgres Database." Matthew Cale | Sciencx [Online]. Available: https://www.scien.cx/2021/11/17/dockerize-backup-a-postgres-database/. [Accessed: 2024-03-28T15:34:40+00:00]
rf:citation
» Dockerize & Backup A Postgres Database | Matthew Cale | Sciencx | https://www.scien.cx/2021/11/17/dockerize-backup-a-postgres-database/ | 2024-03-28T15:34:40+00:00
https://github.com/addpipe/simple-recorderjs-demo