Set Up Kubernetes Persistent Volumes on Docker Desktop
Kubernetes provides a powerful platform for managing containerized applications. In Kubernetes, Persistent Volumes (PVs) and Persistent Volume Claims (PVCs) enable dynamic and scalable storage management, ensuring that stateful applications can retain their data even if containers are deleted and recreated.
PVs and PVCs are crucial for managing persistent data in stateful applications like databases, content management systems, and CI/CD pipelines. They ensure that important data, such as database records, application logs, and media files, remains intact even when containers are restarted or moved. This capability is particularly valuable for maintaining data consistency and durability in applications like MySQL, WordPress, and Elasticsearch, and in big data processing tools like Hadoop and Spark. PVs and PVCs support various storage backends, including cloud storage, NFS, and local disks, which makes it easier to run stateful applications in a dynamic Kubernetes environment.
This article will walk you through the steps to set up PostgreSQL in a Kubernetes cluster on Docker Desktop with persistent storage, ensuring your data remains intact even when the deployment is deleted, terminated, or restarted.
Prerequisites
- Docker Desktop: Ensure Docker Desktop is installed with Kubernetes enabled.
- kubectl: The Kubernetes command-line tool must be installed.
- File Sharing: Ensure Docker Desktop's file sharing settings include the directory where data will be stored.
Create Persistent Directory
Create a directory on your host machine that Kubernetes will use to store persistent data. For example, if you're setting up a PostgreSQL database:
mkdir -p data/postgres
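This directory will be mounted inside your Kubernetes pods. Note its absolute path, because the hostPath entry in the Persistent Volume definition must point to it.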
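As a small sketch of the step above, the following creates the directory and resolves its absolute path. The variable name DATA_DIR is only an illustration and is not used elsewhere in this article.

```shell
# Create the directory (same command as above) and resolve its absolute path.
mkdir -p data/postgres
DATA_DIR="$(cd data/postgres && pwd)"

# This is the value to use for hostPath.path in the Persistent Volume.
echo "Persistent data directory: ${DATA_DIR}"
```

The printed path is the one to paste into the Persistent Volume manifest in place of the example path used in this article.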
Create Persistent Volume
Next, define a StorageClass, a Persistent Volume (PV), and a Persistent Volume Claim (PVC) so that Kubernetes can allocate storage and bind it to your pods.
postgres-sc-pv-pvc.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: docker-sc
provisioner: docker.io/hostpath
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: postgres-pv
spec:
  storageClassName: docker-sc # hostpath
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: "/Users/saurav/Tech/Kubernetes/pv_pvc/data/postgres" # Host path on MacOS
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-pvc
spec:
  storageClassName: docker-sc
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
Apply the configuration:
kubectl apply -f postgres-sc-pv-pvc.yaml
This creates a StorageClass, a Persistent Volume, and a Persistent Volume Claim. Pods in the same namespace can then reference the claim by name.
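The host path in the manifest above is specific to one machine. One possible way to avoid editing it by hand is to generate the Persistent Volume from a shell function. This is only a sketch: render_pv is a hypothetical helper, not part of kubectl, and it covers the PV only, so the StorageClass and PVC would still come from the YAML file.

```shell
# Hypothetical helper: print a PersistentVolume manifest for a given host directory.
# The field values mirror the postgres-pv definition in postgres-sc-pv-pvc.yaml.
render_pv() {
  host_dir="$1"
  cat <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: postgres-pv
spec:
  storageClassName: docker-sc
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: "${host_dir}"
EOF
}

# Example (requires a running cluster):
#   render_pv "$(pwd)/data/postgres" | kubectl apply -f -
```

Piping the output to kubectl apply -f - applies the manifest straight from standard input, so no machine-specific path has to be committed to the file.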
Verify Persistent Volume
Confirm that the StorageClass exists and that the Persistent Volume and Persistent Volume Claim both show a STATUS of Bound.
kubectl get sc
kubectl get pv
kubectl get pvc
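For a scripted check instead of reading the table output, the claim's phase can be queried with a jsonpath expression. The function name pvc_is_bound is a hypothetical helper introduced here for illustration.

```shell
# Hypothetical helper: succeed only if the named PVC reports phase "Bound".
pvc_is_bound() {
  phase="$(kubectl get pvc "$1" -o jsonpath='{.status.phase}')"
  [ "$phase" = "Bound" ]
}

# Example (requires a running cluster):
#   pvc_is_bound postgres-pvc && echo "postgres-pvc is bound"
```

If the claim stays in Pending, the PV and PVC most likely disagree on storageClassName, access mode, or requested size.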
Deploy PostgreSQL
Now, we’ll deploy PostgreSQL using a Kubernetes Deployment and connect it to the Persistent Volume.
postgres-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:13
          ports:
            - containerPort: 5432
          env:
            - name: POSTGRES_PASSWORD
              value: "Password1234"
          volumeMounts:
            - mountPath: "/var/lib/postgresql/data"
              name: postgres-storage
      volumes:
        - name: postgres-storage
          persistentVolumeClaim:
            claimName: postgres-pvc
Apply the deployment:
kubectl apply -f postgres-deployment.yaml
This will create a PostgreSQL Pod using the specified Docker image (postgres:13) and mount the Persistent Volume to /var/lib/postgresql/data within the container.
Verify Deployment
kubectl get deployments
kubectl get pods
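The two commands above show a snapshot. To block until the Pod is actually ready, something like the following could be used. The function name wait_for_postgres and the 120-second timeouts are arbitrary choices for this sketch.

```shell
# Hypothetical helper: wait for the Deployment rollout, then for the Pod to be Ready.
wait_for_postgres() {
  kubectl rollout status deployment/postgres-deployment --timeout=120s &&
    kubectl wait --for=condition=Ready pod -l app=postgres --timeout=120s
}

# Example (requires a running cluster):
#   wait_for_postgres && echo "PostgreSQL Pod is ready"
```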
Check Local Directory
Since we’re using a hostPath for the Persistent Volume, the files should be stored on our local machine at /Users/saurav/Tech/Kubernetes/pv_pvc/data/postgres.
ls -ltra /Users/saurav/Tech/Kubernetes/pv_pvc/data/postgres
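Once PostgreSQL has initialized its data directory, that directory normally contains a PG_VERSION file, so its presence is a quick sign that the mount is working. The helper below is a hypothetical sketch of that check and assumes the directory is readable from the host.

```shell
# Hypothetical helper: succeed if the directory looks like an initialized
# PostgreSQL data directory (it contains a PG_VERSION file).
has_pg_data() {
  [ -f "$1/PG_VERSION" ]
}

# Example:
#   has_pg_data /Users/saurav/Tech/Kubernetes/pv_pvc/data/postgres && echo "data directory initialized"
```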
Create Service for PostgreSQL
Now we need to create a Service for the PostgreSQL deployment. This Service will route traffic to the PostgreSQL Pod on port 5432.
postgres-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: postgres-service
spec:
  type: ClusterIP
  selector:
    app: postgres
  ports:
    - port: 5433
      targetPort: 5432
This YAML file defines a ClusterIP Service named postgres-service, which exposes port 5433 within the Kubernetes cluster and forwards traffic to port 5432 on the Pod. Apply the Service configuration to the cluster:
kubectl apply -f postgres-service.yaml
Verify Service
kubectl get svc
Access PostgreSQL Locally
Now that the Service is created, you can use the kubectl port-forward command to access PostgreSQL on your local machine. The command below forwards local port 5433 to port 5433 of the Service:
kubectl port-forward svc/postgres-service 5433:5433
Verify Data Persistence
Connect to PostgreSQL
With port forwarding in place, you can connect to PostgreSQL using a client like psql or a GUI tool like pgAdmin.
psql -h 127.0.0.1 -p 5433 -U postgres -W
To test data persistence, create a simple table and insert a couple of rows.
CREATE TABLE test_table(id SERIAL PRIMARY KEY, name VARCHAR(50));
INSERT INTO test_table(name) VALUES ('Kubernetes Persistent Volume');
INSERT INTO test_table(name) VALUES ('Kubernetes Persistent Volume Claim');
SELECT * FROM test_table;
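The same statements can also be run without port forwarding by executing psql inside the Pod. This is a sketch only: run_sql is a hypothetical helper, and it assumes the image's default configuration allows the postgres user to connect locally inside the container without a password prompt.

```shell
# Hypothetical helper: run one SQL statement with psql inside the PostgreSQL Pod.
run_sql() {
  kubectl exec deploy/postgres-deployment -- psql -U postgres -c "$1"
}

# Example (requires a running cluster):
#   run_sql "SELECT * FROM test_table;"
```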
Simulate Failure
To demonstrate that the data persists even when the Pod is killed, we’ll delete the PostgreSQL Pod. The Pod name below is an example; replace it with the name shown by kubectl get pods.
kubectl delete pod postgres-deployment-5f667db685-2dt5q
Kubernetes will automatically create a replacement Pod, since the Deployment maintains one replica. Once the new Pod is up and running, restart the port-forward command if it has exited, reconnect to PostgreSQL, and verify that the data is still there.
SELECT * FROM test_table;
You should see that the data still exists, demonstrating that the Persistent Volume is working correctly.
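Because the generated Pod name differs on every cluster, it can be looked up by label rather than typed by hand. The function name postgres_pod_name is a hypothetical helper for this sketch, and it assumes exactly one Pod carries the app=postgres label.

```shell
# Hypothetical helper: print the name of the first Pod labelled app=postgres.
postgres_pod_name() {
  kubectl get pods -l app=postgres -o jsonpath='{.items[0].metadata.name}'
}

# Example (requires a running cluster):
#   kubectl delete pod "$(postgres_pod_name)"
```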
By following the steps above, you’ve successfully set up PostgreSQL in a Kubernetes cluster on Docker Desktop with persistent storage. This setup ensures that your PostgreSQL data is not lost even if the container crashes or is deleted. Persistent Volumes in Kubernetes provide a reliable way to manage stateful applications in an otherwise stateless environment, making them essential for databases like PostgreSQL.