Storage (typically) plays a large role in software systems. With Kubernetes that is, of course, no different. The way Kubernetes handles storage can be a little confusing, especially in cloud-provided scenarios. This post will explain some basics of storage in Kubernetes and also the options that are available in the Azure Kubernetes Service (AKS).
How storage works in Kubernetes
There are a few layers of storage in Kubernetes. What you provide to the pods is a PersistentVolumeClaim (PVC). This PVC is a claim to a PersistentVolume (PV). A PV is created either manually or it can be created automatically through the use of a StorageClass.
There is a lot more complexity to how this works in Kubernetes, but it is beyond the scope of this blog post. If you’re interested in learning more, take a look at the following resources:
This post will be focusing on StorageClass objects. A StorageClass will either be created by the Kubernetes cluster user, or sometimes (like we’ll see in this blog post) there are StorageClass objects that already ship with the cloud provider. Let’s focus on what AKS provides users for storage.
Provisioners available for Azure
Azure Disk uses a either a managed or an unmanaged disk as the storage backing. You can specify the storage account type which will provide you with the required performance that you need for this storage.
Azure File is backed by an SMB share from a storage account (either standard with HDD or premium with SSD).
Built-in StorageClass objects in AKS
One of the benefits of running Kubernetes in Azure with AKS is that it ships with out-of-the-box StorageClass objects.
1 2 3 4 5 6 $ kubectl get storageclass NAME PROVISIONER AGE azurefile kubernetes.io/azure-file 8m31s azurefile-premium kubernetes.io/azure-file 8m31s default (default) kubernetes.io/azure-disk 3m30s managed-premium kubernetes.io/azure-disk 3m30s
You can see here which are Azure Disk and which are Azure File by the
PROVISIONER column. To see the specifics of these StorageClass objects you can run
kubectl get storageclass <sc_name> -o yaml.
As of the writing of this post, here are the differences between the respective StorageClass objects in the same provisioniner.
default has a storage account type of StandardSSD_LRS. It is a managed disk.
managed-premium has a storage account type of Premium_LRS and is also a managed disk.
azurefile has a Standard_LRS storage account type.
azurefile-premium has a Premium_LRS storage account type.
Create your own StorageClass
We have talked about the StorageClass objects that are provided from AKS, but that doesn’t mean you are restricted to only using those. Like other resources in Kubernetes, you are free to define your own custom StorageClass (and are recommended in doing this, especially if no other StorageClass fits your software’s needs).
Note: Any changes made to the default StorageClass objects will be overwritten! If you need to slightly modify the behavior of a built-in StorageClass then a separate one should be created.
Choosing between different StorageClass objects
There are a couple of factors involved in determining which StorageClass you should be using for your Azure storage needs in your cluster. The primary one is if the storage needs to be accessed by a single pod or by multiple pods. If you require multiple pods to access the storage then you must use a StorageClass with the Azure File provisioner. But if you require only a single pod to access the storage then you can use an Azure Disk.
And of course your application performance requirements will dictate the underlying storage that you need to use. For more information on selecting the right storage, see this official documentation (Best practices for storage and backups in Azure Kubernetes Service).
Here’s a quick example of using Azure storage in an AKS cluster. My (fake) requirements are that I need multiple pods to be able to access the storage (
ReadWriteMany), so that means I’m going to have to use a StorageClass with the Azure File provisioniner. I’ll choose
azurefile, as I don’t need the performance of premium storage.
Like explained above, this is what tells the cluster to reserve (“claim”) the storage from the StorageClass (in this case,
azurefile). Once this PVC is created and provisioned, it can be used by pods.
1 2 3 4 5 6 7 8 9 10 11 kind: PersistentVolumeClaim apiVersion: v1 metadata: name: pvc1 spec: accessModes: - ReadWriteMany storageClassName: azurefile resources: requests: storage: 1Gi
This pod mounts the PVC and just loops while reading it.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 kind: Pod apiVersion: v1 metadata: name: pod1 spec: containers: - name: pod1container image: debian:latest command: ["/bin/bash"] args: ["-c", "while true; do date; cat /var/local/sharedstorage/file1; sleep 5; done"] volumeMounts: - name: sharedstorage mountPath: /var/local/sharedstorage volumes: - name: sharedstorage persistentVolumeClaim: claimName: pvc1
This pod also mounts the PVC (which is possible because the PVC is set to be
ReadWriteMany) and writes to a file in a loop.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 kind: Pod apiVersion: v1 metadata: name: pod2 spec: containers: - name: pod2container image: debian:latest command: ["/bin/bash"] args: ["-c", "while true; do sleep 15; echo hi from pod2 $(date) >> /var/local/sharedstorage/file1; done"] volumeMounts: - name: sharedstorage mountPath: /var/local/sharedstorage volumes: - name: sharedstorage persistentVolumeClaim: claimName: pvc1
This is a small example to show how to work with storage that is provisioned from Azure into an AKS cluster, and having multiple pods reading to it and writing from it.
The logs from the reader pod (pod1) show the expected output data from pod2:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 $ kubectl logs pod1 Mon Aug 31 15:46:03 UTC 2020 cat: /var/local/sharedstorage/file1: No such file or directory Mon Aug 31 15:46:08 UTC 2020 cat: /var/local/sharedstorage/file1: No such file or directory Mon Aug 31 15:46:13 UTC 2020 hi from pod2 Mon Aug 31 15:46:12 UTC 2020 Mon Aug 31 15:46:18 UTC 2020 hi from pod2 Mon Aug 31 15:46:12 UTC 2020 Mon Aug 31 15:46:23 UTC 2020 hi from pod2 Mon Aug 31 15:46:12 UTC 2020 Mon Aug 31 15:46:28 UTC 2020 hi from pod2 Mon Aug 31 15:46:12 UTC 2020 hi from pod2 Mon Aug 31 15:46:27 UTC 2020
Through the usage of built-in StorageClass objects in AKS, you can be up-and-running with cloud-provided storage in your Kubernetes cluster quickly and easily. With some basic understanding of the moving parts of storage, you can make the right decisions for your Kubernetes-hosted software.