Kubernetes : Using Ceph RBD as Container Storage Interface (CSI)

Fajlinuxblog
6 min read · May 31, 2020

Kubernetes Pods with persistent volumes are important for data persistence because containers are constantly created and destroyed, depending on load and on what developers specify. Pods and containers can self-heal and replicate; they are, in essence, ephemeral.

Before starting this post, I published some hands-on and conceptual docs that cover this solution from the ground up.

1. About Container Storage Interface

The Container Storage Interface (CSI) is a unifying effort created by the CNCF Storage Working Group, aimed at defining a standard container storage interface that enables storage drivers to work on any container orchestrator.

We are using the RBD driver from ceph-csi for this example, so developers can access storage exposed by a CSI-compatible volume driver through the csi volume type on Kubernetes.

I’ve used this configuration in my laboratory, with 3 Ceph monitors and 3 OSD nodes:

  • 1 GB RAM
  • 1 vCPU
  • 15 GB for the OS disk and 12 GB for each OSD disk

Don’t use this configuration for a production environment; for a production-ready deployment, use physical hardware and study the workload before any setup.

Ceph Ansible deployment how to : https://fajlinux.com/cloud/ceph-cluster-deployment/

As described by the support matrix in the official repository, these versions are required:

  • Ceph: Mimic or later
  • Kubernetes: 1.14 or later

2. Ceph configurations

Create a kubernetes pool inside the Ceph cluster:

[root@ceph-mon1 ~]# ceph osd pool create kubernetes 256 256

I’ve used 256 as the PG number inside my cluster because this is a test. Read the placement group documentation and understand the PG calculation before creating pools for production.
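As a rough sketch of the usual rule of thumb (roughly 100 PGs per OSD, divided by the replica count, rounded up to the next power of two) — `suggest_pg_count` is a hypothetical helper for illustration, not a Ceph tool:

```python
def suggest_pg_count(num_osds: int, replicas: int = 3, target_per_osd: int = 100) -> int:
    """Rule-of-thumb PG sizing: ~target_per_osd PGs per OSD divided by
    the replica count, rounded up to the next power of two."""
    raw = num_osds * target_per_osd / replicas
    pg = 1
    while pg < raw:
        pg *= 2
    return pg

# For this 3-OSD, 3-replica lab cluster the rule of thumb suggests 128;
# the 256 used above simply overshoots, which is harmless in a test setup.
print(suggest_pg_count(3))  # → 128
```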

On a Ceph monitor, initialize the kubernetes pool for use by RBD:

rbd pool init kubernetes

Create the Kubernetes auth user :

[root@ceph-mon1 ~]# ceph auth get-or-create client.kubernetes mon 'profile rbd' osd 'profile rbd pool=kubernetes' mgr 'profile rbd pool=kubernetes'
[client.kubernetes]
key = AQDXs9JeQXtqFRAAGSU9alHjNm+CORLwBk9qQg==
[root@ceph-mon1 ~]#

The ceph-csi driver requires a ConfigMap object stored in Kubernetes that defines the Ceph monitor addresses for the Ceph cluster. Collect both the Ceph cluster's unique fsid and the monitor addresses:

[root@ceph-mon1 ~]# ceph mon dump
dumped monmap epoch 1
dumped monmap epoch 1
epoch 1
fsid 31d15197-ddf7-418f-8f07-3e6744c6af80
last_changed 2020-05-30 17:11:10.563775
created 2020-05-30 17:11:10.563775
min_mon_release 14 (nautilus)
0: [v2:192.168.15.140:3300/0,v1:192.168.15.140:6789/0] mon.ceph-mon1
1: [v2:192.168.15.141:3300/0,v1:192.168.15.141:6789/0] mon.ceph-mon2
2: [v2:192.168.15.142:3300/0,v1:192.168.15.142:6789/0] mon.ceph-mon3
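The monitor list needed later for the CSI ConfigMap can be pulled out of this dump programmatically. A minimal Python sketch, assuming the v1 endpoints on port 6789 shown above (the sample text is copied from the dump):

```python
import re

# The three monitor lines from the `ceph mon dump` output above.
mon_dump = """\
0: [v2:192.168.15.140:3300/0,v1:192.168.15.140:6789/0] mon.ceph-mon1
1: [v2:192.168.15.141:3300/0,v1:192.168.15.141:6789/0] mon.ceph-mon2
2: [v2:192.168.15.142:3300/0,v1:192.168.15.142:6789/0] mon.ceph-mon3
"""

# Capture the "<ip>:<port>" part of each v1 address, which is the
# "<monitor IP>:<monitor port>" format the CSI ConfigMap expects.
monitors = re.findall(r"v1:([\d.]+:\d+)/", mon_dump)
print(monitors)
# → ['192.168.15.140:6789', '192.168.15.141:6789', '192.168.15.142:6789']
```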

Create a 1 GiB RBD image and list the images in the pool:

# Create a RBD image
$ rbd create rbd-myvol --size 1G -p kubernetes
# Disable some features
$ rbd feature disable -p kubernetes rbd-myvol exclusive-lock object-map fast-diff deep-flatten
# Validate with the rbd list command
$ rbd -p kubernetes ls

After this, install the ceph-common package on the Kubernetes nodes; in my example I will use Ansible:

# RHEL / Fedora / CentOS
$ ansible rbdclients -m yum -a "name=ceph-common state=present"
# Ubuntu
$ ansible rbdclients -m apt -a "name=ceph-common state=present"

Ansible inventory file

[rbdclients]
master1
master2
master3
worker1
worker2
worker3

3. RBD Kubernetes

  • Requires Kubernetes 1.14+

Get the values from the previous “ceph mon dump” command and create the csi-config-map.yaml file:

  • clusterID: the fsid of the Ceph cluster
  • monitors: each Ceph monitor IP in the format “<monitor IP>:<monitor port>”
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: ceph-csi-config
data:
  config.json: |-
    [
      {
        "clusterID": "31d15197-ddf7-418f-8f07-3e6744c6af80",
        "monitors": [
          "192.168.15.140:6789",
          "192.168.15.141:6789",
          "192.168.15.142:6789"
        ]
      }
    ]

Apply the csi config map:

$ kubectl apply -f csi-config-map.yaml

Get the auth key from the “ceph auth get-or-create” output above and create the csi-rbd-secret.yaml file:

---
apiVersion: v1
kind: Secret
metadata:
  name: csi-rbd-secret
  namespace: default
stringData:
  userID: kubernetes
  userKey: AQDXs9JeQXtqFRAAGSU9alHjNm+CORLwBk9qQg==
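The post applies the ConfigMap but never shows this Secret being applied, and the plugin pods need it to authenticate. A sketch of the missing step, assuming the file above was saved as csi-rbd-secret.yaml and kubectl points at your cluster:

```shell
# Apply the CSI auth secret (assumes csi-rbd-secret.yaml from the step above)
kubectl apply -f csi-rbd-secret.yaml
```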

Clone the ceph-csi repository :

$ git clone https://github.com/ceph/ceph-csi.git

Apply the RBAC rules :

$ kubectl apply -f ceph-csi/deploy/rbd/kubernetes/csi-provisioner-rbac.yaml
serviceaccount/rbd-csi-provisioner created
serviceaccount/rbd-csi-provisioner created
clusterrole.rbac.authorization.k8s.io/rbd-external-provisioner-runner created
clusterrole.rbac.authorization.k8s.io/rbd-external-provisioner-runner-rules created
clusterrolebinding.rbac.authorization.k8s.io/rbd-csi-provisioner-role created
role.rbac.authorization.k8s.io/rbd-external-provisioner-cfg created
rolebinding.rbac.authorization.k8s.io/rbd-csi-provisioner-role-cfg created
$ kubectl apply -f ceph-csi/deploy/rbd/kubernetes/csi-nodeplugin-rbac.yaml
serviceaccount/rbd-csi-nodeplugin created
clusterrole.rbac.authorization.k8s.io/rbd-csi-nodeplugin created
clusterrole.rbac.authorization.k8s.io/rbd-csi-nodeplugin-rules created
clusterrolebinding.rbac.authorization.k8s.io/rbd-csi-nodeplugin created

Apply the Vault KMS configuration referenced by the plugin manifests:

$ kubectl apply -f ceph-csi/examples/kms/vault/kms-config.yaml

Finally, create the ceph-csi provisioner and node plugins.

$ wget https://raw.githubusercontent.com/ceph/ceph-csi/master/deploy/rbd/kubernetes/csi-rbdplugin-provisioner.yaml
$ wget https://raw.githubusercontent.com/ceph/ceph-csi/master/deploy/rbd/kubernetes/csi-rbdplugin.yaml
$ kubectl apply -f csi-rbdplugin-provisioner.yaml
$ kubectl apply -f csi-rbdplugin.yaml

Create a storage class for RBD, changing clusterID to your Ceph fsid, in the csi-rbd-sc.yaml file:

---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-rbd-sc
provisioner: rbd.csi.ceph.com
parameters:
  clusterID: 31d15197-ddf7-418f-8f07-3e6744c6af80
  pool: kubernetes
  imageFeatures: layering
  csi.storage.k8s.io/provisioner-secret-name: csi-rbd-secret
  csi.storage.k8s.io/provisioner-secret-namespace: default
  csi.storage.k8s.io/controller-expand-secret-name: csi-rbd-secret
  csi.storage.k8s.io/controller-expand-secret-namespace: default
  csi.storage.k8s.io/node-stage-secret-name: csi-rbd-secret
  csi.storage.k8s.io/node-stage-secret-namespace: default
  csi.storage.k8s.io/fstype: ext4
reclaimPolicy: Delete
allowVolumeExpansion: true
mountOptions:
  - discard

Apply the storage class configuration :

$ kubectl apply -f csi-rbd-sc.yaml

Check the storage class :

$ kubectl get storageclass
NAME         PROVISIONER        RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
csi-rbd-sc   rbd.csi.ceph.com   Delete          Immediate           false                  40m

Check all pods, services and deployments:

$ kubectl get pod
NAME                                         READY   STATUS    RESTARTS   AGE
csi-rbdplugin-hvrkk                          3/3     Running   0          7m59s
csi-rbdplugin-provisioner-7b564c8c67-7thps   6/6     Running   0          7m59s
csi-rbdplugin-provisioner-7b564c8c67-mhj6r   6/6     Running   0          82m
csi-rbdplugin-provisioner-7b564c8c67-qhff5   6/6     Running   0          7m59s
csi-rbdplugin-vl2c4                          3/3     Running   0          7m59s
$ kubectl get deployments
NAME                        READY   UP-TO-DATE   AVAILABLE   AGE
csi-rbdplugin-provisioner   3/3     3            3           82m
$ kubectl get svc
NAME                        TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE
csi-metrics-rbdplugin       ClusterIP   10.233.8.205   <none>        8080/TCP   22s
csi-rbdplugin-provisioner   ClusterIP   10.233.6.202   <none>        8080/TCP   83m
kubernetes                  ClusterIP   10.233.0.1     <none>        443/TCP    114m

4. Environment Test

In the following steps we will validate two types of volume mode:

  • Filesystem mode: the volume is mounted into Pods as a directory. If the volume is backed by a block device and the device is empty, Kubernetes creates a filesystem on the device before mounting it for the first time.
  • Block mode: the volume is used as a raw block device. Such a volume is presented to a Pod as a block device, without any filesystem on it. This mode gives a Pod the fastest possible access to a volume, with no filesystem layer between the Pod and the volume. On the other hand, the application running in the Pod must know how to handle a raw block device.

4.1 Filesystem

Create the PVC YAML filesystem.yaml

vi filesystem.yaml

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rbd-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: csi-rbd-sc

Apply the filesystem YAML :

$ kubectl apply -f filesystem.yaml

Check the persistent volume claim status :

[root@kube-master1 ~]# kubectl get pvc
NAME      STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
rbd-pvc   Bound    pvc-6816306c-5fc4-42fb-8050-0b3b50963132   1Gi        RWO            csi-rbd-sc     35s

Check the RBD Image on Ceph :

[root@ceph-mon1 ~]# rbd -p kubernetes ls
csi-vol-8d7059ff-a35d-11ea-81ae-1abea9a19b7a

Create a Pod to validate this:

vi filesystem-pod.yaml

---
apiVersion: v1
kind: Pod
metadata:
  name: csi-rbd-demo-pod
spec:
  containers:
    - name: web-server
      image: nginx
      volumeMounts:
        - name: mypvc
          mountPath: /var/lib/www/html
  volumes:
    - name: mypvc
      persistentVolumeClaim:
        claimName: rbd-pvc
        readOnly: false

# Apply the filesystem pod
$ kubectl apply -f filesystem-pod.yaml
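To confirm the volume actually mounted, a hedged check (it assumes the demo pod is Running and your kubectl context points at the cluster):

```shell
# Show the filesystem backing the mount path inside the pod
kubectl exec csi-rbd-demo-pod -- df -h /var/lib/www/html
# Write and read back a file to prove the mount is usable
kubectl exec csi-rbd-demo-pod -- sh -c 'echo hello > /var/lib/www/html/test.txt && cat /var/lib/www/html/test.txt'
```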

4.2 Block Mode

Create the PVC YAML raw-block-pvc.yaml

vi raw-block-pvc.yaml

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: raw-block-pvc
spec:
  accessModes:
    - ReadWriteMany
  volumeMode: Block
  resources:
    requests:
      storage: 1Gi
  storageClassName: csi-rbd-sc

Apply the raw block YAML :

$ kubectl apply -f raw-block-pvc.yaml

Check the persistent volume claim status :

[root@kube-master1 ~]# kubectl get pvc
NAME            STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
raw-block-pvc   Bound    pvc-06f1ee78-989e-4f6f-93e3-04c6894f75bc   1Gi        RWX            csi-rbd-sc     23m
rbd-pvc1        Bound    pvc-6816306c-5fc4-42fb-8050-0b3b50963132   1Gi        RWO            csi-rbd-sc     28m

Check the RBD Image on Ceph :

[root@ceph-mon1 ~]# rbd -p kubernetes ls
csi-vol-3d10f75f-a35e-11ea-81ae-1abea9a19b7a
csi-vol-8d7059ff-a35d-11ea-81ae-1abea9a19b7a

Create a Pod to validate this:

vi raw-block-pod.yaml

---
apiVersion: v1
kind: Pod
metadata:
  name: pod-with-raw-block-volume
spec:
  containers:
    - name: fc-container
      image: fedora:26
      command: ["/bin/sh", "-c"]
      args: ["tail -f /dev/null"]
      volumeDevices:
        - name: data
          devicePath: /dev/xvda
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: raw-block-pvc

# Apply the raw-block pod
$ kubectl apply -f raw-block-pod.yaml
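To confirm the raw device landed in the Pod, a hedged check (it assumes the pod above is Running; remember there is no filesystem on the device, so the application must format or write it directly):

```shell
# The volume appears as a block device node, not a mounted directory
kubectl exec pod-with-raw-block-volume -- ls -l /dev/xvda
# Write 1 MiB of zeros straight to the device to prove it is writable
kubectl exec pod-with-raw-block-volume -- dd if=/dev/zero of=/dev/xvda bs=1M count=1
```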

4.3 Deleting all resources for PVC test

To delete the persistent volume claims and Pods, run:

# Delete pods 
kubectl delete pod <POD NAME>
# Delete PVC
kubectl delete pvc <PVC NAME>
# Delete PV
kubectl delete pv <PV NAME>

After deleting the persistent volumes, check that the volumes were removed from the Ceph cluster:

[root@ceph-mon1 ~]# rbd -p kubernetes ls
