Storage and Persistence in Kubernetes and OpenShift: An Introduction
One of the biggest mental shifts when moving to Kubernetes and OpenShift is understanding what happens to your data.
Containers are ephemeral.
Pods are disposable.
Nodes come and go.
So where does your data live?
This post introduces how storage and persistence work in Kubernetes and OpenShift, and why they’re so different from traditional virtual machines.
The Problem Kubernetes Had to Solve
In traditional environments:
- Applications run on servers or VMs
- Disks are attached directly
- Data lives where the app runs
In Kubernetes:
- Pods are rescheduled
- Nodes can disappear
- Containers are recreated constantly
Without a storage abstraction, any data written inside a container would be lost every time its pod was deleted or rescheduled.
Kubernetes solved this by separating:
Compute from Storage
Persistent Volumes: Storage That Lives Beyond Pods
Kubernetes introduces Persistent Volumes (PVs).
A PV is:
- A piece of storage
- Provisioned from a storage system (NFS, iSCSI, Ceph, cloud disks, etc.)
- Independent of any specific pod
Think of it as:
“A disk in the cluster”
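As a rough illustration, a statically defined PV backed by NFS might look like the sketch below. The name example-pv, the server nfs.example.com, and the export path are placeholders chosen for this example, not values from any real cluster.

```yaml
# Sketch of a statically provisioned PersistentVolume.
# All names, hosts, and paths here are hypothetical.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv
spec:
  capacity:
    storage: 10Gi                        # size advertised to claims
  accessModes:
    - ReadWriteOnce                      # mountable read-write by a single node
  persistentVolumeReclaimPolicy: Retain  # keep the data after the claim is deleted
  nfs:
    server: nfs.example.com              # placeholder NFS server
    path: /exports/data                  # placeholder export path
```

Note that the manifest mentions no pod at all, which is the point: the volume exists on its own.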
Persistent Volume Claims (PVCs)
Applications don’t use PVs directly.
They request storage using a Persistent Volume Claim (PVC).
A PVC says:
“I need 10Gi of storage that supports ReadWriteOnce”
Kubernetes then:
- Finds a matching PV
- Or dynamically creates one
- Binds it to that claim
The application never needs to know where the storage came from.
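The request quoted above could be written as the following manifest. This is a minimal sketch: the name my-data is reused from the pod example later in this post, and leaving out storageClassName assumes the cluster has a default StorageClass (or a matching pre-created PV).

```yaml
# Sketch of a PersistentVolumeClaim asking for 10Gi with ReadWriteOnce access.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  # storageClassName is omitted on purpose; the cluster default is assumed.
```

Once the claim is bound, its status shows which PV satisfied it, but the application only ever refers to the claim name.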
StorageClasses: How Storage Is Provisioned
A StorageClass defines:
- What backend to use (EBS, Ceph, NFS, SAN, etc.)
- Performance characteristics
- Replication and durability
- How volumes are created
In OpenShift, StorageClasses typically map to backends such as:
- ODF (OpenShift Data Foundation, based on Ceph)
- Cloud block storage
- SANs
- Virtualization-backed volumes
- Other backends exposed through CSI drivers
StorageClasses make storage self-service.
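To make the idea concrete, here is a sketch of a StorageClass, assuming a cluster on AWS with the EBS CSI driver installed. The class name fast-block is made up for this example; other backends use a different provisioner and different parameters.

```yaml
# Sketch of a StorageClass for dynamically provisioned block storage.
# Assumes the AWS EBS CSI driver; "fast-block" is a hypothetical name.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-block
provisioner: ebs.csi.aws.com             # which CSI driver creates the volumes
parameters:
  type: gp3                              # backend-specific performance tier
reclaimPolicy: Delete                    # remove the volume when the claim is deleted
volumeBindingMode: WaitForFirstConsumer  # provision only once a pod is scheduled
allowVolumeExpansion: true               # allow claims to be resized later
```

A PVC selects this class by setting storageClassName: fast-block, and the volume is then created on demand without anyone filing a storage ticket.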
How Pods Use Storage
A pod references a PVC in its volumes list like this (each container then mounts the volume by name through volumeMounts):

    volumes:
      - name: data
        persistentVolumeClaim:
          claimName: my-data

If the pod dies:
- Kubernetes recreates it (when a controller such as a Deployment or StatefulSet manages the pod)
- The same PVC is reattached
- The data is still there
This is how Kubernetes makes state possible.
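For context, a complete pod using the my-data claim might look like the sketch below. The pod name, the image, and the mount path are placeholders invented for this example.

```yaml
# Sketch of a Pod that mounts the "my-data" claim.
# Pod name, image, and mountPath are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: example-app
spec:
  containers:
    - name: app
      image: registry.example.com/example-app:1.0
      volumeMounts:
        - name: data                # must match the volume name below
          mountPath: /var/lib/app   # where the storage appears in the container
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: my-data          # the PVC to attach
```

Anything the application writes under /var/lib/app lands on the volume rather than in the container's own filesystem, so it survives a replacement pod.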
Why This Matters for Workloads
Stateless workloads
- Web apps
- APIs
- Workers
→ Often don’t need persistent storage
Stateful workloads
- Databases
- Message queues
- Identity services
- CI systems
→ Almost always need persistent storage
Storage is what turns:
“A container” into “A system”
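One common way to give a stateful workload its own storage is a StatefulSet with a volume claim template, which creates one PVC per replica. The sketch below assumes a single-replica database; every name and the image are placeholders.

```yaml
# Sketch of a StatefulSet that requests storage per replica.
# Names, image, and mountPath are hypothetical.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: example-db
spec:
  serviceName: example-db        # headless Service expected to exist
  replicas: 1
  selector:
    matchLabels:
      app: example-db
  template:
    metadata:
      labels:
        app: example-db
    spec:
      containers:
        - name: db
          image: registry.example.com/example-db:1.0
          volumeMounts:
            - name: data
              mountPath: /var/lib/db
  volumeClaimTemplates:          # one PVC is created for each replica
    - metadata:
        name: data
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi
```

Each replica keeps its own claim across restarts, so a recreated pod finds the data it wrote before.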
How OpenShift Makes This Better
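OpenShift builds on Kubernetes storage by shipping these capabilities preintegrated and operator-managed (several of them are upstream Kubernetes features):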
- CSI driver integration
- Storage operators
- Dynamic provisioning
- Snapshot support
- Volume cloning
- Policy enforcement
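Snapshots and clones are requested declaratively, like everything else. The sketch below assumes a CSI driver that supports both features and a VolumeSnapshotClass named example-snapclass; that name, and the snapshot and clone names, are made up for this example.

```yaml
# Sketch: take a snapshot of the "my-data" claim.
# "example-snapclass" is a hypothetical VolumeSnapshotClass.
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: my-data-snapshot
spec:
  volumeSnapshotClassName: example-snapclass
  source:
    persistentVolumeClaimName: my-data
---
# Sketch: clone "my-data" into a new claim in the same namespace.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-data-clone
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi              # must be at least the size of the source
  dataSource:
    kind: PersistentVolumeClaim
    name: my-data                # existing claim to copy from
```

Whether these requests succeed depends on the storage backend; not every driver implements snapshots or cloning.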
OpenShift can support:
- Cloud disks
- SANs
- Ceph
- Hyperconverged storage
- Virtualization workloads
This allows OpenShift to run:
- Databases
- Virtual machines
- AI/ML pipelines
- CI/CD platforms
…on the same platform.
Common Misconceptions
“Containers can’t have storage.”
They can — Kubernetes just manages it differently.
“Stateful workloads don’t belong on Kubernetes.”
They do — when storage is designed correctly.
“PVs are tied to pods.”
They aren’t. Pods come and go; volumes remain.
What’s Coming Next
Future posts will cover:
- StorageClasses in OpenShift
- ReadWriteOnce vs ReadWriteMany
- Local vs network storage
- Running databases on OpenShift
- Persistent storage for virtual machines
Final Thoughts
Kubernetes changed how we think about compute.
Persistent Volumes changed how we think about storage.
Once you understand that:
- Pods are temporary
- Volumes are durable
- Claims connect the two
…everything about stateful workloads in OpenShift starts to make sense.
