Kubernetes : Using Ceph RBD as Container Storage Interface (CSI)
Persistent volumes matter in Kubernetes because containers are constantly created and destroyed, depending on load and on what developers specify. Pods and containers can self-heal and replicate; they are, in essence, ephemeral, so any data that must outlive them has to be stored somewhere persistent.
Before starting this post, I have already published some hands-on and conceptual articles that explain this solution from the beginning:
- Kubernetes: An Overview.
- Kubernetes: Ansible deployment.
- Ceph : An Overview.
- Ceph : Ansible deployment.
1. About Container Storage Interface
The Container Storage Interface (CSI) is a unifying effort created by the CNCF Storage Working Group, aimed at defining a standard container storage interface that enables storage drivers to work on any container orchestrator.
In this example we use the RBD driver from ceph-csi (deploy/rbd/kubernetes), so developers can access storage exposed by a CSI-compatible volume driver through the csi volume type on Kubernetes.
I've used the following configuration in my laboratory, with 3 Ceph monitors and 3 OSD nodes:
- 1 GB RAM
- 1 vCPU
- 15 GB for the OS disk and 12 GB for each OSD disk
Don't use this configuration in a production environment; for a production-ready deployment, use physical hardware and study the workload before any setup.
Ceph Ansible deployment how to : https://fajlinux.com/cloud/ceph-cluster-deployment/
As described in the support matrix from the official repository, these versions are required :
- Ceph : Mimic version +
- Kubernetes: 1.14 +
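To confirm that an existing environment matches this matrix, two read-only checks can be run (a quick sketch; the exact output depends on your deployment):

# On a Ceph monitor: show the release running on every daemon
$ ceph versions

# On a Kubernetes master: show client and server versions
$ kubectl version --short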
2. Ceph configurations
Create a kubernetes pool inside the Ceph cluster :
[root@ceph-mon1 ~]# ceph osd pool create kubernetes 256 256
I've used 256 placement groups (PGs) in my cluster because this is a test. Review the Ceph documentation and understand the PG calculation for the pool before creating it in production.
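As a rough illustration of the usual rule of thumb (the official PG calculator is the reference): total PGs per pool = (number of OSDs x 100) / replica count, rounded up to a power of two. For this lab, 3 OSDs x 100 / 3 replicas = 100, which rounds up to 128; 256 is also fine for a test. The values actually applied to the pool can be read back with:

[root@ceph-mon1 ~]# ceph osd pool get kubernetes pg_num
[root@ceph-mon1 ~]# ceph osd pool get kubernetes size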
On the Ceph monitor initialize the Kubernetes pool for use by RBD :
rbd pool init kubernetes
Create the Kubernetes auth user :
[root@ceph-mon1 ~]# ceph auth get-or-create client.kubernetes mon 'profile rbd' osd 'profile rbd pool=kubernetes' mgr 'profile rbd pool=kubernetes'
[client.kubernetes]
        key = AQDXs9JeQXtqFRAAGSU9alHjNm+CORLwBk9qQg==
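Optionally, read the capabilities back to confirm the user was created with the expected profiles :

[root@ceph-mon1 ~]# ceph auth get client.kubernetes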
The ceph-csi driver requires a ConfigMap object stored in Kubernetes that defines the Ceph monitor addresses for the Ceph cluster. Collect both the Ceph cluster's unique fsid and the monitor addresses :
[root@ceph-mon1 ~]# ceph mon dump
dumped monmap epoch 1
epoch 1
fsid 31d15197-ddf7-418f-8f07-3e6744c6af80
last_changed 2020-05-30 17:11:10.563775
created 2020-05-30 17:11:10.563775
min_mon_release 14 (nautilus)
0: [v2:192.168.15.140:3300/0,v1:192.168.15.140:6789/0] mon.ceph-mon1
1: [v2:192.168.15.141:3300/0,v1:192.168.15.141:6789/0] mon.ceph-mon2
2: [v2:192.168.15.142:3300/0,v1:192.168.15.142:6789/0] mon.ceph-mon3
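If you only need the two values used in the ConfigMap below, these shortcuts print them directly (same information as the dump above):

# Cluster fsid only
[root@ceph-mon1 ~]# ceph fsid
31d15197-ddf7-418f-8f07-3e6744c6af80

# Monitor map in JSON, easier to parse in scripts
[root@ceph-mon1 ~]# ceph mon dump --format json-pretty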
Create an RBD image of 1 GB and then list the images with rbd ls :
# Create a RBD image
$ rbd create rbd-myvol --size 1G -p kubernetes

# Disable some features
$ rbd feature disable -p kubernetes rbd-myvol exclusive-lock object-map fast-diff deep-flatten

# Validate with the rbd list command
$ rbd -p kubernetes ls
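To confirm that only the layering feature is left on the image (matching the imageFeatures used later in the storage class), check it with rbd info :

$ rbd info kubernetes/rbd-myvol
# The output should list "features: layering"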
After this, install the ceph-common package on the Kubernetes nodes; in my example I'll use Ansible:
# RHEL / Fedora / CentOS
$ ansible rbdclients -m yum -a "name=ceph-common state=present"

# Ubuntu
$ ansible rbdclients -m apt -a "name=ceph-common state=present"
Ansible inventory file
[rbdclients]
master1
master2
master3
worker1
worker2
worker3
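Before installing packages it is worth confirming that Ansible can reach every node in the group; a minimal check, assuming the inventory above is saved as ./hosts :

$ ansible rbdclients -i hosts -m ping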
3. RBD Kubernetes
- Requires Kubernetes 1.14+
Get the values from the previous "ceph mon dump" command and create the csi-config-map.yaml file :
- clusterID : the fsid of the Ceph cluster
- monitors : each Ceph monitor IP in the format "<monitor IP>:<monitor port>"
---
apiVersion: v1
kind: ConfigMap
data:
  config.json: |-
    [
      {
        "clusterID": "31d15197-ddf7-418f-8f07-3e6744c6af80",
        "monitors": [
          "192.168.15.140:6789",
          "192.168.15.141:6789",
          "192.168.15.142:6789"
        ]
      }
    ]
metadata:
  name: ceph-csi-config
Apply the csi config map:
$ kubectl apply -f csi-config-map.yaml
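Optionally confirm that the ConfigMap was stored with the expected JSON :

$ kubectl get configmap ceph-csi-config -o yaml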
Get the auth key generated earlier for client.kubernetes and create the csi-rbd-secret.yaml file :
---
apiVersion: v1
kind: Secret
metadata:
  name: csi-rbd-secret
  namespace: default
stringData:
  userID: kubernetes
  userKey: AQDXs9JeQXtqFRAAGSU9alHjNm+CORLwBk9qQg==
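Apply the secret as well, otherwise the provisioner and node plugin cannot authenticate against Ceph (this step is implied by the flow above):

$ kubectl apply -f csi-rbd-secret.yaml
$ kubectl get secret csi-rbd-secret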
Clone the ceph-csi repository :
$ git clone https://github.com/ceph/ceph-csi.git
Apply the RBAC rules :
$ kubectl apply -f ceph-csi/deploy/rbd/kubernetes/csi-provisioner-rbac.yaml
serviceaccount/rbd-csi-provisioner created
clusterrole.rbac.authorization.k8s.io/rbd-external-provisioner-runner created
clusterrole.rbac.authorization.k8s.io/rbd-external-provisioner-runner-rules created
clusterrolebinding.rbac.authorization.k8s.io/rbd-csi-provisioner-role created
role.rbac.authorization.k8s.io/rbd-external-provisioner-cfg created
rolebinding.rbac.authorization.k8s.io/rbd-csi-provisioner-role-cfg created

$ kubectl apply -f ceph-csi/deploy/rbd/kubernetes/csi-nodeplugin-rbac.yaml
serviceaccount/rbd-csi-nodeplugin created
clusterrole.rbac.authorization.k8s.io/rbd-csi-nodeplugin created
clusterrole.rbac.authorization.k8s.io/rbd-csi-nodeplugin-rules created
clusterrolebinding.rbac.authorization.k8s.io/rbd-csi-nodeplugin created
Apply the Vault KMS example configuration, which the provisioner manifests reference :
$ kubectl apply -f ceph-csi/examples/kms/vault/kms-config.yaml
Finally, create the ceph-csi provisioner and node plugins.
$ wget https://raw.githubusercontent.com/ceph/ceph-csi/master/deploy/rbd/kubernetes/csi-rbdplugin-provisioner.yaml
$ wget https://raw.githubusercontent.com/ceph/ceph-csi/master/deploy/rbd/kubernetes/csi-rbdplugin.yaml
$ kubectl apply -f csi-rbdplugin-provisioner.yaml
$ kubectl apply -f csi-rbdplugin.yaml
Create a storage class for RBD with the YAML file csi-rbd-sc.yaml, changing clusterID to your own Ceph fsid :
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-rbd-sc
provisioner: rbd.csi.ceph.com
parameters:
  clusterID: 31d15197-ddf7-418f-8f07-3e6744c6af80
  pool: kubernetes
  imageFeatures: layering
  csi.storage.k8s.io/provisioner-secret-name: csi-rbd-secret
  csi.storage.k8s.io/provisioner-secret-namespace: default
  csi.storage.k8s.io/controller-expand-secret-name: csi-rbd-secret
  csi.storage.k8s.io/controller-expand-secret-namespace: default
  csi.storage.k8s.io/node-stage-secret-name: csi-rbd-secret
  csi.storage.k8s.io/node-stage-secret-namespace: default
  csi.storage.k8s.io/fstype: ext4
reclaimPolicy: Delete
allowVolumeExpansion: true
mountOptions:
  - discard
Apply the storage class configuration :
$ kubectl apply -f csi-rbd-sc.yaml
Check the storage class :
$ kubectl get storageclass
NAME         PROVISIONER        RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
csi-rbd-sc   rbd.csi.ceph.com   Delete          Immediate           false                  40m
Check all pods, services and deployments :
$ kubectl get pod
NAME                                         READY   STATUS    RESTARTS   AGE
csi-rbdplugin-hvrkk                          3/3     Running   0          7m59s
csi-rbdplugin-provisioner-7b564c8c67-7thps   6/6     Running   0          7m59s
csi-rbdplugin-provisioner-7b564c8c67-mhj6r   6/6     Running   0          82m
csi-rbdplugin-provisioner-7b564c8c67-qhff5   6/6     Running   0          7m59s
csi-rbdplugin-vl2c4                          3/3     Running   0          7m59s

$ kubectl get deployments
NAME                        READY   UP-TO-DATE   AVAILABLE   AGE
csi-rbdplugin-provisioner   3/3     3            3           82m

$ kubectl get svc
NAME                        TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE
csi-metrics-rbdplugin       ClusterIP   10.233.8.205   <none>        8080/TCP   22s
csi-rbdplugin-provisioner   ClusterIP   10.233.6.202   <none>        8080/TCP   83m
kubernetes                  ClusterIP   10.233.0.1     <none>        443/TCP    114m
4. Environment Test
In the following steps we will validate two volume modes:
- Filesystem mode : the volume is mounted into Pods as a directory. If the volume is backed by a block device and the device is empty, Kubernetes creates a filesystem on the device before mounting it for the first time.
- Block mode : the volume is used as a raw block device. It is presented to a Pod as a block device, without any filesystem on it. This mode gives a Pod the fastest possible access to the volume, with no filesystem layer between them. On the other hand, the application running in the Pod must know how to handle a raw block device.
4.1 Filesystem
Create the PVC YAML filesystem.yaml
vi filesystem.yaml
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rbd-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: csi-rbd-sc
Apply the filesystem YAML :
$ kubectl apply -f filesystem.yaml
Check the persistent volume claim status :
[root@kube-master1 ~]# kubectl get pvc
NAME      STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
rbd-pvc   Bound    pvc-6816306c-5fc4-42fb-8050-0b3b50963132   1Gi        RWO            csi-rbd-sc     35s
Check the RBD Image on Ceph :
[root@ceph-mon1 ~]# rbd -p kubernetes ls
csi-vol-8d7059ff-a35d-11ea-81ae-1abea9a19b7a
Pod to validate this :
vi filesystem-pod.yaml
---
apiVersion: v1
kind: Pod
metadata:
  name: csi-rbd-demo-pod
spec:
  containers:
    - name: web-server
      image: nginx
      volumeMounts:
        - name: mypvc
          mountPath: /var/lib/www/html
  volumes:
    - name: mypvc
      persistentVolumeClaim:
        claimName: rbd-pvc
        readOnly: false

# Apply the filesystem pod
kubectl apply -f filesystem-pod.yaml
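A simple way to prove that the mount is really backed by the RBD image is to exec into the Pod, write a file through the mount and read it back (the Pod name and path are the ones defined above):

# Show the filesystem mounted from the RBD image
$ kubectl exec -it csi-rbd-demo-pod -- df -h /var/lib/www/html

# Write and read a test file through the mount
$ kubectl exec -it csi-rbd-demo-pod -- /bin/sh -c 'echo "hello from rbd" > /var/lib/www/html/index.html'
$ kubectl exec -it csi-rbd-demo-pod -- cat /var/lib/www/html/index.html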
4.2 Block Mode
Create the PVC YAML raw-block-pvc.yaml
vi raw-block-pvc.yaml
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: raw-block-pvc
spec:
  accessModes:
    - ReadWriteMany
  volumeMode: Block
  resources:
    requests:
      storage: 1Gi
  storageClassName: csi-rbd-sc
Apply the raw block YAML :
$ kubectl apply -f raw-block-pvc.yaml
Check the persistent volume claim status :
[root@kube-master1 ~]# kubectl get pvc
NAME            STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
raw-block-pvc   Bound    pvc-06f1ee78-989e-4f6f-93e3-04c6894f75bc   1Gi        RWX            csi-rbd-sc     23m
rbd-pvc         Bound    pvc-6816306c-5fc4-42fb-8050-0b3b50963132   1Gi        RWO            csi-rbd-sc     28m
Check the RBD Image on Ceph :
[root@ceph-mon1 ~]# rbd -p kubernetes ls
csi-vol-3d10f75f-a35e-11ea-81ae-1abea9a19b7a
csi-vol-8d7059ff-a35d-11ea-81ae-1abea9a19b7a
Pod to validate this :
vi raw-block-pod.yaml
---
apiVersion: v1
kind: Pod
metadata:
  name: pod-with-raw-block-volume
spec:
  containers:
    - name: fc-container
      image: fedora:26
      command: ["/bin/sh", "-c"]
      args: ["tail -f /dev/null"]
      volumeDevices:
        - name: data
          devicePath: /dev/xvda
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: raw-block-pvc

# Apply the raw-block pod
kubectl apply -f raw-block-pod.yaml
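To check that the raw device is really exposed inside the Pod (with no filesystem on it), a short sketch using tools available in the fedora image:

# The device node defined in devicePath must exist inside the container
$ kubectl exec -it pod-with-raw-block-volume -- ls -l /dev/xvda

# Optional raw write test: writes zeros directly to the block device
$ kubectl exec -it pod-with-raw-block-volume -- dd if=/dev/zero of=/dev/xvda bs=1M count=10 oflag=direct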
4.3 Deleting all resources for PVC test
To delete the persistent volume claims and pods, run :
# Delete pods
kubectl delete pod <POD NAME>

# Delete PVC
kubectl delete pvc <PVC NAME>

# Delete PV
kubectl delete pv <PV NAME>
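With the names used in this post the cleanup looks like this; since the csi-rbd-sc storage class uses reclaimPolicy Delete, the bound PVs and the backing RBD images are removed automatically when the PVCs are deleted:

kubectl delete pod csi-rbd-demo-pod pod-with-raw-block-volume
kubectl delete pvc rbd-pvc raw-block-pvc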
Check that the RBD images were removed from the Ceph cluster after deleting the persistent volumes :
[root@ceph-mon1 ~]# rbd -p kubernetes ls