Kubernetes backup with Velero and Ceph

Jun 5, 2020

Velero is a tool that backs up and restores Kubernetes cluster resources and persistent volumes. It simplifies taking and restoring backups, migrating resources to other clusters, and replicating clusters.

Stores Kubernetes resources in highly available object stores (S3, GCS, Blob Storage, etc.)
Backs up PVs and PVCs using the cloud provider's disk snapshot mechanism
Schedules backups using cron syntax
Rotates automatic backups with a TTL (time to live)
Supports community-enhanced plugins
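
As a rough illustration of the scheduling and TTL features listed above, the sketch below builds a `velero schedule create` call. The schedule name `nginx-daily`, the cron expression, and the TTL value are hypothetical, and the small `run` helper is only there so the command is printed rather than executed unless `DRY_RUN=0` is set on a machine with Velero and cluster access.

```shell
#!/bin/sh
# Hypothetical values chosen for illustration; adjust to your own needs.
SCHEDULE_NAME="nginx-daily"
CRON_EXPR="0 1 * * *"    # every day at 01:00
BACKUP_TTL="72h0m0s"     # each backup expires three days after creation

# Print the command; execute it only when DRY_RUN=0.
# Note: the printed form drops the shell quoting around the arguments.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# Create a recurring backup of everything labelled app=nginx.
run velero schedule create "$SCHEDULE_NAME" \
  --schedule="$CRON_EXPR" \
  --ttl "$BACKUP_TTL" \
  --selector app=nginx
```

Backups created by the schedule are removed once their TTL has passed, which is how the rotation works.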

The Velero client makes a call to the Kubernetes API server to create a Backup object.
The BackupController notices the new Backup object and performs validation.
The BackupController begins the backup process. It collects the data to back up by querying the API server for resources.
The BackupController makes a call to the object storage service, for example S3, to upload the backup file.
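
The Backup object from the first step is an ordinary Kubernetes custom resource, so it can also be written by hand. The sketch below is only an illustration: the name `nginx-backup-manual` is hypothetical, and it assumes Velero runs in the `velero` namespace with a storage location called `default`. The manifest is written to a temporary file and printed; it is applied only when `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Write a hand-made Backup object to a temporary file.
MANIFEST="$(mktemp)"
cat > "$MANIFEST" <<'EOF'
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: nginx-backup-manual   # hypothetical name
  namespace: velero           # assumes the default install namespace
spec:
  labelSelector:
    matchLabels:
      app: nginx
  storageLocation: default
  ttl: 720h0m0s
EOF

# Show what would be sent to the API server.
cat "$MANIFEST"

# Apply only on request; this needs kubectl and access to the cluster.
if [ "${DRY_RUN:-1}" = "0" ]; then
  kubectl apply -f "$MANIFEST"
fi
```

Once the object exists, the BackupController picks it up exactly as it does for a backup created with the CLI.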

1. Object Store environment

Velero supports many object store options. Because I needed to test it in an on-premises environment, I used Ceph with the Rados Gateway.

The Velero documentation lists all supported object stores.

Create an S3 user

[root@ceph-mon1 ~]# sudo radosgw-admin user create --subuser=velero:s3 --display-name="Velero Kubernetes Backup" --key-type=s3 --access=full
{
    "user_id": "velero",
    "display_name": "Velero Kubernetes Backup",
    "email": "",
    "suspended": 0,
    "max_buckets": 1000,
    "subusers": [
        {
            "id": "velero:s3",
            "permissions": "full-control"
        }
    ],
    "keys": [
        {
            "user": "velero:s3",
            "access_key": "AOTBA6CUYR4P2WD7Q7ZK",
            "secret_key": "d4ZY0cmAQcsmviwcpshE0bjWfyT5RDUROUE0BmJ6"
        },
        {
            "user": "velero:s3",
            "access_key": "RKF0CW7T2XA16BMJI8FW",
            "secret_key": "Z8BF56cAsuj5KFSSjMOD0At1nGfVmTjPx3sOFpWZ"
        }
    ],
    "swift_keys": [],
    "caps": [],
    "op_mask": "read, write, delete",
    "default_placement": "",
    "default_storage_class": "",
    "placement_tags": [],
    "bucket_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1,
        "max_size_kb": 0,
        "max_objects": -1
    },
    "user_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1,
        "max_size_kb": 0,
        "max_objects": -1
    },
    "temp_url_keys": [],
    "type": "rgw",
    "mfa_ids": []
}

Install s3cmd to create a bucket from the CLI

[root@ceph-mon1 ~]# yum install s3cmd

Configure s3cmd to use my Rados GW endpoint

[root@ceph-mon1 ~]# s3cmd --configure
Enter new values or accept defaults in brackets with Enter.
Refer to user manual for detailed description of all options.
Access key and Secret key are your identifiers for Amazon S3. Leave them empty for using the env variables.
Access Key [AOTBA6CUYR4P2WD7Q7ZK]: 
Secret Key [d4ZY0cmAQcsmviwcpshE0bjWfyT5RDUROUE0BmJ6]: 
Default Region [US]: 
Use "s3.amazonaws.com" for S3 Endpoint and not modify it to the target Amazon S3.
S3 Endpoint [radosgw.local.lab:80]: 
Use "%(bucket)s.s3.amazonaws.com" to the target Amazon S3. "%(bucket)s" and "%(location)s" vars can be used
if the target S3 system supports dns based buckets.
DNS-style bucket+hostname:port template for accessing a bucket [%(bucket)s.radosgw.local.lab]: radosgw.local.lab:80
Encryption password is used to protect your files from reading
by unauthorized persons while in transfer to S3
Encryption password: 
Path to GPG program [/usr/bin/gpg]: 
When using secure HTTPS protocol all communication with Amazon S3
servers is protected from 3rd party eavesdropping. This method is
slower than plain HTTP, and can only be proxied with Python 2.7 or newer
Use HTTPS protocol [No]: 
On some networks all internet access must go through a HTTP proxy.
Try setting it here if you can't connect to S3 directly
HTTP Proxy server name: 
New settings:
  Access Key: AOTBA6CUYR4P2WD7Q7ZK
  Secret Key: d4ZY0cmAQcsmviwcpshE0bjWfyT5RDUROUE0BmJ6
  Default Region: US
  S3 Endpoint: radosgw.local.lab:80
  DNS-style bucket+hostname:port template for accessing a bucket: radosgw.local.lab:80
  Encryption password: 
  Path to GPG program: /usr/bin/gpg
  Use HTTPS protocol: False
  HTTP Proxy server name: 
  HTTP Proxy server port: 0
Test access with supplied credentials? [Y/n] y
Please wait, attempting to list all buckets...
Success. Your access key and secret key worked fine :-)
Now verifying that encryption works...
Not configured. Never mind.
Save settings? [y/N] y
Configuration saved to '/root/.s3cfg'
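
If you prefer to skip the interactive dialog, the same settings can be written to a config file directly. This is a minimal sketch, not a full `.s3cfg`: it covers only the endpoint-related keys from the dialog, reads the keys from the hypothetical environment variables `RGW_ACCESS_KEY` and `RGW_SECRET_KEY`, and writes to a temporary file (or to the path in `S3CFG`) so an existing `/root/.s3cfg` is not overwritten.

```shell
#!/bin/sh
# Target file: a temporary file unless S3CFG is set by the caller.
S3CFG="${S3CFG:-$(mktemp)}"

# The file holds credentials, so keep it readable by the owner only.
umask 077

# RGW_ACCESS_KEY and RGW_SECRET_KEY are hypothetical variable names;
# export them with the keys returned by radosgw-admin.
cat > "$S3CFG" <<EOF
[default]
access_key = ${RGW_ACCESS_KEY:-REPLACE_ME}
secret_key = ${RGW_SECRET_KEY:-REPLACE_ME}
host_base = radosgw.local.lab:80
host_bucket = radosgw.local.lab:80
use_https = False
EOF
echo "wrote $S3CFG"

# List buckets with this config only on request; needs s3cmd and a reachable gateway.
if [ "${DRY_RUN:-1}" = "0" ]; then
  s3cmd -c "$S3CFG" ls
fi
```

Setting `host_bucket` to the same value as `host_base` mirrors the answer given in the dialog, where the DNS-style bucket template was replaced with the plain gateway address.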

Create a bucket for Velero

[root@ceph-mon1 ~]# s3cmd mb s3://velero
Bucket 's3://velero/' created

2. Velero Install

CLI Download

$ wget https://github.com/vmware-tanzu/velero/releases/download/v1.2.0/velero-v1.2.0-linux-amd64.tar.gz

$ tar -xzf velero-v1.2.0-linux-amd64.tar.gz

$ sudo cp velero-v1.2.0-linux-amd64/velero /usr/local/sbin

Create a file with s3 credentials

$ vi credentials-velero
[default]
aws_access_key_id = AOTBA6CUYR4P2WD7Q7ZK
aws_secret_access_key = d4ZY0cmAQcsmviwcpshE0bjWfyT5RDUROUE0BmJ6

Velero deployment with Rados Gateway

We use the S3 credentials to access the velero bucket through the AWS S3 plugin, so set s3Url to your Rados Gateway endpoint (http://radosgw.local.lab in our example).

$ velero install  --provider aws --bucket velero \
--plugins velero/velero-plugin-for-aws:v1.0.0 \
--secret-file ./credentials-velero \
--use-volume-snapshots=false \
--backup-location-config \
region=minio,s3ForcePathStyle="true",s3Url=http://radosgw.local.lab
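
After the install finishes, it is worth confirming that the Velero pod is up and that the backup location was registered before creating any backup. The checks below are a sketch that assumes the default `velero` namespace and deployment name; the `run` helper is illustrative and only prints each command unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Print each command; execute it only when DRY_RUN=0
# (requires kubectl, velero and access to the cluster).
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# The velero pod should reach the Running state.
run kubectl get pods -n velero

# The backup location should list the aws provider and the velero bucket.
run velero backup-location get

# Recent server logs help when the bucket or s3Url is wrong.
run kubectl logs deployment/velero -n velero --tail=20
```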

3. Velero Backup and Restore

Create a single Nginx deployment for the test

$ kubectl apply -f velero-v1.2.0-linux-amd64/examples/nginx-app/base.yaml

Create a backup from this Nginx app

$ velero backup create nginx-backup --selector app=nginx --snapshot-volumes=true
$ velero backup describe nginx-backup --details

List the backups

$ velero get backup
NAME           STATUS      CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
nginx-backup   Completed   2020-06-05 13:56:40 +0000 UTC   29d       default            app=nginx

Check the namespace and resources created for this example

# Get Namespaces
$ kubectl get namespaces
NAME              STATUS   AGE
default           Active   139m
kube-node-lease   Active   139m
kube-public       Active   139m
kube-system       Active   139m
metallb-system    Active   74m
nginx-example     Active   27m
velero            Active   132m
# Get Pods 
$ kubectl get pods -n nginx-example
NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-7cd5ddccc7-ctvg8   1/1     Running   0          27m
nginx-deployment-7cd5ddccc7-kpfz5   1/1     Running   0          27m

Delete the namespace

$ kubectl delete namespace nginx-example

Restore backup for nginx-example

$ velero restore create --from-backup nginx-backup
Restore request "nginx-backup-20200605135949" submitted successfully.
Run `velero restore describe nginx-backup-20200605135949` or `velero restore logs nginx-backup-20200605135949` for more details.

Check the restore

$ kubectl get namespaces | grep nginx
nginx-example     Active   3m44s
$ kubectl get pods -n nginx-example
NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-7cd5ddccc7-ctvg8   1/1     Running   0          3m52s
nginx-deployment-7cd5ddccc7-kpfz5   1/1     Running   0          3m51s

You can see the stored backup content by listing the bucket with s3cmd

$ s3cmd ls s3://velero
                          DIR  s3://velero/backups/
                          DIR  s3://velero/restores/

Written by Fajlinuxblog