Though there are PostgreSQL Operators that fully manage PostgreSQL High Availability (HA) installations in Kubernetes (K8S), sometimes you don’t need HA. For many small applications, a single PostgreSQL instance with a daily backup is enough. For these cases, a simple K8S Deployment with a persistent volume, plus a backup every 24 hours via a K8S CronJob, works fine. In this post, I will show you how to run a single instance of PostgreSQL with daily backups.

How to Install PostgreSQL and PostGIS in Google Kubernetes Engine (GKE)

For this tutorial, I assume you have a K8S cluster on GCP and kubectl configured on your local machine. If you don’t have one, please refer to the Quickstart in the Google Kubernetes Engine docs. Google also gives you $300 in credit to explore Google Cloud services, which should be more than sufficient to try out this tutorial.

Namespace

I am using a namespace called prod.

Filename: namespace.yaml

apiVersion: v1
kind: Namespace
metadata:
  name: prod

To create the namespace, run the command:

kubectl apply -f namespace.yaml

This should create the namespace for you. 
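To verify that the namespace exists, you can run:

kubectl get namespace prod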

K8S Secret

Now we need to create a K8S secret so that we don’t have to specify the PostgreSQL user password in plain text. 

Create a file named prod-postgresql-12-password.txt (placed in a passwords/ directory in the example below) and add a sufficiently complex password to it. Make sure that the file contains only one line, and you DO NOT put a line break (Enter) at the end of the line, or the newline will become part of the password. Once the file is ready, you can run the following command to create the Kubernetes Secret:

kubectl create secret --namespace=prod generic prod-pgsql-12-password \
  --from-file=password=./passwords/prod-postgresql-12-password.txt
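A reliable way to write the file without a trailing newline is printf (a small sketch; replace the placeholder with your actual password). Afterwards, you can confirm the Secret exists:

mkdir -p passwords
printf '%s' 'your-sufficiently-complex-password' > ./passwords/prod-postgresql-12-password.txt

kubectl get secret prod-pgsql-12-password --namespace=prod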

K8S Deployment and Service

Next, we need to create the Persistent Volume, Deployment, and Service for PostgreSQL. We will put all of these in the same file so that we can refer to them together later. 

Filename: postgresql-12.yaml

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pgsql12-volumeclaim
  namespace: prod
spec:
  storageClassName: standard
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pgsql12
  namespace: prod
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pgsql12
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: pgsql12
      namespace: prod
    spec:
      containers:
        - name: postgres
          image: postgis/postgis:12-3.1
          resources:
            requests:
              memory: "1024Mi"
              cpu: "250m"
            limits:
              memory: "1024Mi"
              cpu: "250m"
          imagePullPolicy: "IfNotPresent"
          ports:
            - containerPort: 5432
          env:
            - name: POSTGRES_DB
              value: "postgresdb"
            - name: POSTGRES_USER
              value: "pguser"
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: prod-pgsql-12-password
                  key: password
          volumeMounts:
            - mountPath: /var/lib/postgresql/data
              name: postgredb
              subPath: postgres
      volumes:
        - name: postgredb
          persistentVolumeClaim:
            claimName: pgsql12-volumeclaim
---
apiVersion: v1
kind: Service
metadata:
  name: pgsql12
  namespace: prod
  labels:
    app: pgsql12
spec:
  type: ClusterIP
  ports:
    - protocol: TCP
      port: 5451
      targetPort: 5432
  selector:
    app: pgsql12

Here we are creating three K8S objects:

  1. A PersistentVolumeClaim of 20 GiB to hold our PostgreSQL data.
  2. A Deployment, which runs the postgis/postgis:12-3.1 image so that we can use the wonderful PostGIS extension for geospatial analysis along with PostgreSQL. Here we have assigned 1 GiB of RAM and a quarter of a CPU core to this DB. For better performance, you might need to increase these values depending on the kind of nodes you have in your K8S cluster.
  3. A Service, which exposes PostgreSQL on port 5451 using a ClusterIP. Since the type is ClusterIP, the database can’t be reached from outside the cluster.

To create these objects, you can run the following command:

kubectl apply -f postgresql-12.yaml

At this point, you should have PostgreSQL running inside your GKE cluster. 
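To confirm that everything came up correctly, you can check the created objects and the container logs (pod names will differ in your cluster):

kubectl get pvc,deployment,service,pods --namespace=prod
kubectl logs deployment/pgsql12 --namespace=prod --tail=20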

Port Forwarding

To connect to the database from your local machine, you need to set up temporary port forwarding. You can do so with the following command:

kubectl port-forward service/pgsql12 5451:5451 --namespace=prod

At this point, the PostgreSQL server should be available at localhost:5451. You can connect to it using the psql command-line utility or a GUI tool like DBeaver. Use pguser as the username and the password you used while creating the K8S secret.
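For example, with the port forward active, you can connect with psql and enable PostGIS (assuming the postgresdb database name from the Deployment; psql will prompt for the password):

psql -h localhost -p 5451 -U pguser -d postgresdb

Then, inside the psql session:

CREATE EXTENSION IF NOT EXISTS postgis;
SELECT postgis_full_version();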


Backup using K8S CronJob

Now that we have the PostgreSQL server running, we should set up a regular backup for our data. For that, we will use a K8S CronJob. However, the backup needs a custom script, so we will first create a Docker image containing the backup script and then schedule it as a CronJob.

We will store the backups in a GCP storage bucket. To write to this bucket, we will need a GCP service account and its credentials as a JSON file. Setting up a GCP service account is out of the scope of this tutorial; you can read more about service accounts in the GCP documentation.
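As a rough sketch, the steps look like this (pg-backup-writer, your-project, and your-bucket are placeholders, and your organization’s IAM policies may require different roles):

gcloud iam service-accounts create pg-backup-writer
gsutil mb gs://your-bucket
gsutil iam ch serviceAccount:pg-backup-writer@your-project.iam.gserviceaccount.com:objectCreator gs://your-bucket
gcloud iam service-accounts keys create gcp-service-account.json \
  --iam-account=pg-backup-writer@your-project.iam.gserviceaccount.com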

PgBackup Docker Image

To create the Docker image, I have set up a folder structure like the following:

pgbackup-k8s
|
| -- files
|    | -- gcp-service-account.json
|    | -- pgbackups.sh
|
| -- Dockerfile  

Filename: pgbackups.sh

#!/bin/bash
#
# Back up each PostgreSQL database into a daily file and
# upload the compressed archive to a GCP storage bucket.
#

TIMESTAMP=$(date +"%Y-%m-%d-%H-%M")
BACKUP_DIR=/tmp/pgsql-backup/"$TIMESTAMP"
FILE_SUFFIX=-pg.sql

HOST="$PGSQL_HOST"
PORT="$PGSQL_PORT"
USER="$PGSQL_USER"
export PGPASSWORD="$PGSQL_PASS"

# GCLOUD_KEY_FILE and BACKUP_BUCKET_LOC are taken directly from the environment.

# Remove the backup dir if it already exists, then recreate it
rm -rf "$BACKUP_DIR"
mkdir -p "$BACKUP_DIR"

echo "Listing db names"

# List all databases, dropping the psql header row and template0
DBNAMES=$(/usr/bin/psql -h "$HOST" -p "$PORT" -U "$USER" -l -A -F: | sed -ne "/:/ { /Name:Owner/d; /template0/d; s/:.*$//; p }")

echo "$DBNAMES"

for DATABASE in $DBNAMES
do
        FILE=${TIMESTAMP}-${DATABASE}${FILE_SUFFIX}
        OUTPUT_FILE=${BACKUP_DIR}/${FILE}

        # Dump the database as plain SQL
        /usr/bin/pg_dump -h "$HOST" -p "$PORT" -U "$USER" -F p -f "$OUTPUT_FILE" "$DATABASE"

        # Show the user the result
        echo "${OUTPUT_FILE} was created:"
        ls -lh "$OUTPUT_FILE"
done

# Compress all dumps into a single archive
cd "$BACKUP_DIR" || exit 1
BACKUP_FILE=${PGSQL_HOST}-${PGSQL_PORT}-backup-${TIMESTAMP}.tar.gz
tar -zcvf "$BACKUP_FILE" *.sql

# Authenticate with the service account and upload the archive
gcloud auth activate-service-account --key-file="$GCLOUD_KEY_FILE"
gsutil cp "$BACKUP_FILE" "$BACKUP_BUCKET_LOC"

Filename: Dockerfile

FROM gcr.io/google.com/cloudsdktool/cloud-sdk:alpine

# psql and pg_dump come from the postgresql-client package
RUN apk add --no-cache postgresql-client
RUN mkdir -p /srv/jobs

# Defaults; override these via the CronJob environment
ENV PGSQL_HOST "127.0.0.1"
ENV PGSQL_PORT 5432
ENV PGSQL_USER "postgres"
ENV PGSQL_PASS ""
ENV GCLOUD_KEY_FILE ""
ENV BACKUP_BUCKET_LOC ""

COPY files/* /srv/jobs/
RUN chmod +x /srv/jobs/pgbackups.sh

ENTRYPOINT ["/srv/jobs/pgbackups.sh"]

To build the Docker image and push it to your GCP private container registry, run the following commands:

docker build -t pgbackup-k8s .
docker tag pgbackup-k8s gcr.io/your-repo/pgbackup-k8s
docker push gcr.io/your-repo/pgbackup-k8s
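If docker push fails with an authentication error, you may first need to register gcloud as a Docker credential helper:

gcloud auth configure-docker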

Note: If you want to learn how to set up a GCP Container Registry, please refer to the Container Registry documentation.
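Before scheduling the job in the cluster, you can smoke-test the image locally (a sketch only: it assumes the port forward from earlier is still running, and host.docker.internal resolves to your host on Docker Desktop; on Linux, use your host’s IP instead):

docker run --rm \
  -e PGSQL_HOST=host.docker.internal \
  -e PGSQL_PORT=5451 \
  -e PGSQL_USER=pguser \
  -e PGSQL_PASS='your-password' \
  -e GCLOUD_KEY_FILE=/srv/jobs/gcp-service-account.json \
  -e BACKUP_BUCKET_LOC=gs://your-bucket/pgsql12-backups/ \
  pgbackup-k8s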

PostgreSQL backup K8S CronJob

Once the image is available in our private GCP container registry, we need to set up the K8S CronJob to run the backups periodically. For that, create the following file:

Filename: pgsql-12-backup-cronjob.yaml

apiVersion: batch/v1
kind: CronJob
metadata:
  name: pgsql12-k8s-backup
  namespace: prod
spec:
  schedule: "0 0 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: pgsql12-k8s-backup-prod
            image: gcr.io/your-repo/pgbackup-k8s
            resources:
              requests:
                memory: "256Mi"
                cpu: "10m"
              limits:
                memory: "512Mi"
                cpu: "30m"
            env:
            - name: PGSQL_HOST
              value: pgsql12.prod
            - name: PGSQL_PORT
              value: "5451"
            - name: PGSQL_USER
              value: pguser
            - name: GCLOUD_KEY_FILE
              value: "/srv/jobs/gcp-service-account.json"
            - name: BACKUP_BUCKET_LOC
              value: "gs://your-bucket/pgsql12-backups/"
            - name: PGSQL_PASS
              valueFrom:
                secretKeyRef:
                  name: prod-pgsql-12-password
                  key: password
            imagePullPolicy: IfNotPresent
          restartPolicy: OnFailure

Pay close attention to the environment variable values here and make sure they match your deployment (host, port, user, bucket location, and image path).

To create the CronJob run:

kubectl apply -f pgsql-12-backup-cronjob.yaml

This will create the CronJob in your K8S cluster, and it will run every day at midnight. If you want to run it right away to check how it works, you can log in to your GKE dashboard, go to the CronJob, and run it by clicking the Run Now button in the top row.
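Alternatively, you can trigger a one-off run from the command line (the job name manual-backup-1 below is arbitrary):

kubectl create job --from=cronjob/pgsql12-k8s-backup manual-backup-1 --namespace=prod
kubectl logs job/manual-backup-1 --namespace=prod --follow

Once the job completes, the backup archive should appear in your GCS bucket (gsutil ls gs://your-bucket/pgsql12-backups/).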

That’s it. This concludes our tutorial on how to run PostgreSQL on GKE along with PostGIS and a daily backup. 

In Conclusion

Take advantage of our extensive experience and a decade of technical expertise to get started with this powerful and open-source container management system. At Mobisoft, we have seasoned technical professionals to help you utilize resources effectively using Kubernetes. You can get in touch with us for further deployment, development, and consulting for Kubernetes.

Author's Bio:

Pritam Barhate

Pritam Barhate, with 14+ years of experience in technology, heads Technology Innovation at Mobisoft Infotech. He has rich experience in design and development and has consulted for a variety of industries and startups. At Mobisoft Infotech, he primarily focuses on technology resources and developing advanced solutions.