Though there are PostgreSQL Operators that fully manage PostgreSQL High Availability (HA) installations in Kubernetes (K8S), sometimes you don't need HA. For many small applications, a single instance of PostgreSQL with a daily backup is enough. For these cases, a simple K8S Deployment with a persistent volume, plus a backup every 24 hours via a K8S CronJob, works fine. In this post, I will show you how to run a single instance of PostgreSQL with daily backups.
For this tutorial, I assume you have a K8S cluster on GCP and kubectl configured on your local machine. If you don't have one, please refer to the Quickstart in the Google Kubernetes Engine docs. Google also gives you $300 in credit to explore Google Cloud services, which should be more than sufficient to try out this tutorial.
Namespace
I am using a namespace called prod.
Filename: namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: prod
To create the namespace, run the command:
kubectl apply -f namespace.yaml
This should create the namespace for you.
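You can confirm that the namespace exists with:

kubectl get namespace prod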
K8S Secret
Now we need to create a K8S secret so that we don’t have to specify the PostgreSQL user password in plain text.
Create a file named prod-postgresql-12-password.txt (the command below assumes it is stored in a passwords/ directory) and add a sufficiently complex password to it. Make sure that the file contains only one line and that you DO NOT put a line break (Enter) at the end of the line; a trailing newline would otherwise become part of the password. Once the file is ready, you can run the following command to create the Kubernetes Secret:
kubectl create secret --namespace=prod generic prod-pgsql-12-password \
  --from-file=password=./passwords/prod-postgresql-12-password.txt
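If you want to generate a random password without a trailing newline, a sketch like the following works (assuming openssl is available; unlike echo, printf does not append a newline):

mkdir -p passwords
printf '%s' "$(openssl rand -base64 32)" > passwords/prod-postgresql-12-password.txt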
K8S Deployment and Service
Next, we need to create the Persistent Volume, Deployment, and Service for PostgreSQL. We will put all of these in the same file so that we can refer to them together later.
Filename: postgresql-12.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pgsql12-volumeclaim
  namespace: prod
spec:
  storageClassName: standard
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pgsql12
  namespace: prod
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pgsql12
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: pgsql12
      namespace: prod
    spec:
      containers:
        - name: postgres
          image: postgis/postgis:12-3.1
          resources:
            requests:
              memory: "1024Mi"
              cpu: "250m"
            limits:
              memory: "1024Mi"
              cpu: "250m"
          imagePullPolicy: "IfNotPresent"
          ports:
            - containerPort: 5432
          env:
            - name: POSTGRES_DB
              value: "postgresdb"
            - name: POSTGRES_USER
              value: "pguser"
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: prod-pgsql-12-password
                  key: password
          volumeMounts:
            - mountPath: /var/lib/postgresql/data
              name: postgredb
              subPath: postgres
      volumes:
        - name: postgredb
          persistentVolumeClaim:
            claimName: pgsql12-volumeclaim
---
apiVersion: v1
kind: Service
metadata:
  name: pgsql12
  namespace: prod
  labels:
    app: pgsql12
spec:
  type: ClusterIP
  ports:
    - protocol: TCP
      port: 5451
      targetPort: 5432
  selector:
    app: pgsql12
Here we are creating 3 K8S objects:
- A PersistentVolumeClaim of 20Gi to hold our PostgreSQL data.
- A Deployment, which runs the postgis/postgis:12-3.1 image so that we can use the wonderful PostGIS extension for geospatial analysis along with PostgreSQL. Here we have assigned 1GB of RAM and 1/4th of a CPU core to this DB. To get better performance, you might need to increase these values depending on the kind of nodes you have in your K8S cluster.
- A Service of type ClusterIP, which exposes PostgreSQL on port 5451. Since the Service type is ClusterIP, the database cannot be reached from outside the cluster; in-cluster clients connect through the Service, as shown below.
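Inside the cluster, other pods can reach the database through the Service's DNS name. A connection URL would look like this (the password placeholder is illustrative):

postgresql://pguser:<your-password>@pgsql12.prod.svc.cluster.local:5451/postgresdb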
To create these objects, you can run the following command:
kubectl apply -f postgresql-12.yaml
At this point, you should have PostgreSQL running inside your GKE cluster.
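You can check that the pod came up and inspect the server logs with:

kubectl get pods --namespace=prod
kubectl logs deployment/pgsql12 --namespace=prod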
Port Forwarding
To connect to the database from your local machine, you need to set up temporary port forwarding. You can do so with the following command:
kubectl port-forward service/pgsql12 5451:5451 --namespace=prod
At this point, the PostgreSQL server should be available on your localhost:5451 port. You can connect to it using the psql command-line utility or using a GUI tool like DBeaver. You will need to use pguser as the username and the password you used while creating the K8S secret.
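For example, with the psql client:

psql -h localhost -p 5451 -U pguser -d postgresdb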
Backup using K8S CronJob
Now that we have the PostgreSQL server running, we should set up a regular backup of our data. For that, we will use a K8S CronJob. The backup itself needs a custom script, so we will first create a Docker image containing that script and then schedule it as a cron job.
We will store the backups in a GCP storage bucket. To write to this bucket, we will need a GCP service account and its credentials as a JSON file. The process of setting up a GCP service account is out of the scope of this tutorial; you can read more about service accounts in the GCP IAM documentation.
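As a rough sketch, the steps look something like this (the service-account name, project ID, and bucket are placeholders; substitute your own):

# Create the service account
gcloud iam service-accounts create pg-backup-writer --project=your-project

# Allow it to write objects into the backup bucket
gsutil iam ch \
  serviceAccount:pg-backup-writer@your-project.iam.gserviceaccount.com:objectCreator \
  gs://your-bucket

# Download its credentials as a JSON key file
gcloud iam service-accounts keys create gcp-service-account.json \
  --iam-account=pg-backup-writer@your-project.iam.gserviceaccount.com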
PgBackup Docker Image
To create a docker image, I have set up a folder structure like the following:
pgbackup-k8s
|-- files
|   |-- gcp-service-account.json
|   |-- pgbackups.sh
|-- Dockerfile
Filename: pgbackups.sh
#!/bin/bash
#
# Backup all PostgreSQL databases on a server into daily files.
#

TIMESTAMP=`date +"%Y-%m-%d-%H-%M"`
BACKUP_DIR=/tmp/pgsql-backup/"$TIMESTAMP"
FILE_SUFFIX=-pg.sql

# Connection and upload settings come from the environment
HOST=$PGSQL_HOST
PORT=$PGSQL_PORT
USER=$PGSQL_USER
export PGPASSWORD="$PGSQL_PASS"
GCLOUD_KEY_FILE=$GCLOUD_KEY_FILE
BACKUP_BUCKET_LOC=$BACKUP_BUCKET_LOC

# Remove backup dir if it already exists
rm -rf $BACKUP_DIR

echo "Listing db names"
DBNAMES="`/usr/bin/psql -h "$HOST" -p "$PORT" -U "$USER" -l -A -F: | sed -ne "/:/ { /Name:Owner/d; /template0/d; s/:.*$//; p }"`"
echo $DBNAMES

for MDB in $DBNAMES
do
    DATABASE=$MDB
    FILE=`date +"%Y-%m-%d-%H-%M"`-$DATABASE${FILE_SUFFIX}
    OUTPUT_DIR=${BACKUP_DIR}
    mkdir -p ${OUTPUT_DIR}
    OUTPUT_FILE=${OUTPUT_DIR}/${FILE}

    # Do the database backup (dump)
    /usr/bin/pg_dump -h $HOST -p $PORT -U $USER -F p -f ${OUTPUT_FILE} ${DATABASE}

    # Show the user the result
    echo "${OUTPUT_FILE} was created:"
    ls -lh ${OUTPUT_FILE}
done

# Zip the backups into a single archive
cd $BACKUP_DIR
BACKUP_FILE=$PGSQL_HOST-$PGSQL_PORT-backup-${TIMESTAMP}.tar.gz
tar -zcvf $BACKUP_FILE *.sql

# Upload the archive to the GCS bucket
gcloud auth activate-service-account --key-file=$GCLOUD_KEY_FILE
gsutil cp $BACKUP_FILE $BACKUP_BUCKET_LOC
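Note that the script dumps each database in pg_dump's plain format (-F p), so an individual dump can be restored with psql into an existing database; for example (the filename here is illustrative):

psql -h localhost -p 5451 -U pguser -d postgresdb -f 2021-01-01-00-00-postgresdb-pg.sql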
Filename: Dockerfile
FROM gcr.io/google.com/cloudsdktool/cloud-sdk:alpine

RUN apk --update add --no-cache postgresql-client
RUN mkdir /srv/jobs

ENV PGSQL_HOST "127.0.0.1"
ENV PGSQL_PORT 5432
ENV PGSQL_USER "postgres"
ENV PGSQL_PASS ""
ENV GCLOUD_KEY_FILE ""
ENV BACKUP_BUCKET_LOC ""

COPY files/* /srv/jobs/

# Ensure the backup script is executable inside the image
RUN chmod +x /srv/jobs/pgbackups.sh

ENTRYPOINT ["/srv/jobs/pgbackups.sh"]
To build the docker image and to push it to your GCP private container registry, run the following commands:
docker build -t pgbackup-k8s .
docker tag pgbackup-k8s gcr.io/your-repo/pgbackup-k8s
docker push gcr.io/your-repo/pgbackup-k8s
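Before pushing, you may want to smoke-test the image locally against the port-forwarded database (host.docker.internal resolves to the host machine on Docker Desktop; all values here are placeholders):

docker run --rm \
  -e PGSQL_HOST=host.docker.internal \
  -e PGSQL_PORT=5451 \
  -e PGSQL_USER=pguser \
  -e PGSQL_PASS=your-password \
  -e GCLOUD_KEY_FILE=/srv/jobs/gcp-service-account.json \
  -e BACKUP_BUCKET_LOC=gs://your-bucket/pgsql12-backups/ \
  pgbackup-k8s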
Note: Setting up a GCP Container Registry is out of the scope of this tutorial; please refer to the Container Registry documentation.
PostgreSQL backup K8S CronJob
Once the image is available in our private GCP container registry, we need to set up the K8S CronJob to run the backups periodically. For that, create the following file:
Filename: pgsql-12-backup-cronjob.yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: pgsql12-k8s-backup
  namespace: prod
spec:
  schedule: "0 0 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: pgsql12-k8s-backup-prod
              image: gcr.io/your-repo/pgbackup-k8s
              resources:
                requests:
                  memory: "256Mi"
                  cpu: "10m"
                limits:
                  memory: "512Mi"
                  cpu: "30m"
              env:
                - name: PGSQL_HOST
                  value: pgsql12.prod
                - name: PGSQL_PORT
                  value: "5451"
                - name: PGSQL_USER
                  value: pguser
                - name: GCLOUD_KEY_FILE
                  value: "/srv/jobs/gcp-service-account.json"
                - name: BACKUP_BUCKET_LOC
                  value: "gs://your-bucket/pgsql12-backups/"
                - name: PGSQL_PASS
                  valueFrom:
                    secretKeyRef:
                      name: prod-pgsql-12-password
                      key: password
              imagePullPolicy: IfNotPresent
          restartPolicy: OnFailure
Pay close attention to the environment variable values and make sure they are correct for your deployment. In particular, PGSQL_HOST is set to pgsql12.prod (the Service name qualified with its namespace), PGSQL_PORT to the Service port 5451, and the image and BACKUP_BUCKET_LOC to your own registry and bucket paths.
To create the CronJob run:
kubectl apply -f pgsql-12-backup-cronjob.yaml
This will create the CronJob in your K8S cluster, and it will run every midnight. If you want to run it right away to check how it works, you can log in to your GKE dashboard, go to the CronJob, and run it by clicking the Run Now button in the top row.
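Alternatively, you can trigger a one-off run from the command line (the job name is arbitrary):

kubectl create job --namespace=prod --from=cronjob/pgsql12-k8s-backup pgsql12-backup-manual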
That’s it. This concludes our tutorial on how to run PostgreSQL on GKE along with PostGIS and a daily backup.
In Conclusion
Take advantage of our decade of technical expertise to get started with this powerful, open-source container management system. At Mobisoft, we have seasoned technical professionals to help you utilize resources effectively using Kubernetes. You can get in touch with us for Kubernetes deployment, development, and consulting.
Author's Bio:
Pritam Barhate, with 14+ years of experience in technology, heads Technology Innovation at Mobisoft Infotech. He has rich experience in design and development and has consulted for a variety of industries and startups. At Mobisoft Infotech, he primarily focuses on technology resources and developing the most advanced solutions.