Ceph : Cluster deployment

Posted on May 28, 2020 by fajlinux

In this post I will show how to deploy a Ceph environment with ceph-ansible.

An Ansible-based deployment is the most standardized and official method among the main vendors that ship Ceph, such as SUSE, Oracle and Red Hat.

The installation presented in this document will follow a similar deployment flow.

If you would like an overview of Ceph first, these documents could help you.

1) Requirements

  • Admin node: the server that has the Ansible package and the ceph-ansible playbooks.
  • 3 Monitors and Managers: the Ceph Monitor hosts will also run the Managers.
  • 3 Storage nodes: for a deployment on physical hardware it is necessary to conduct a thorough hardware study first. http://docs.ceph.com/docs/luminous/start/hardware-recommendations/

Enable the EPEL repository on all hosts:

yum install epel-release

Firewall definition for a Ceph environment (an example of opening these ports with firewalld follows the list below):

Firewall documentation: https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html/installation_guide_for_red_hat_enterprise_linux/requirements-for-installing-rhcs#configuring-a-firewall-for-red-hat-ceph-storage-install

  • Monitor rule: 6789/tcp
  • Manager and OSD node rule: from 6800/tcp to 7300/tcp
  • Metadata server rule: 6800/tcp
  • Object gateway rule: 8080/tcp, 80/tcp and 443/tcp (if you want SSL).
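
For reference, a minimal sketch of opening these ports with firewall-cmd, assuming firewalld is running and the default public zone is used; apply only the rules that match each host's role:

# Monitor nodes
firewall-cmd --zone=public --add-port=6789/tcp --permanent
# Manager and OSD nodes
firewall-cmd --zone=public --add-port=6800-7300/tcp --permanent
# Metadata servers
firewall-cmd --zone=public --add-port=6800/tcp --permanent
# Object gateways (443 only if SSL is used)
firewall-cmd --zone=public --add-port=8080/tcp --add-port=80/tcp --add-port=443/tcp --permanent
# Reload to activate the permanent rules
firewall-cmd --reload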

2) Environment

I'm considering these hosts for the Ceph cluster deployment:

  • 3 Ceph Monitors
  • 3 OSD nodes
  • 3 Ceph MGRs

A good starting point for understanding which workload and hardware are needed is the hardware recommendations document linked above.

3) Admin node and Ceph Ansible

The admin node referenced in the requirements will be the laboratory’s controller node.

Install the ansible and git packages on the controller host:

[root@host ~]# yum install ansible git -y

Let’s configure ceph-ansible, where $BRANCH is the branch that will be used (explained shortly after).

[root@host ~]# cd /usr/share
[root@host share]# git clone https://github.com/ceph/ceph-ansible.git
[root@host share]# cd ceph-ansible
[root@host ceph-ansible]# git checkout $BRANCH
[root@host ceph-ansible]# ln -s /usr/share/ceph-ansible/group_vars /etc/ansible/group_vars

According to the Ceph Ansible documentation, we have the following branches to use.

  • stable-3.0 Supports Ceph versions jewel and luminous. This branch requires Ansible version 2.4.
  • stable-3.1 Supports Ceph versions luminous and mimic. This branch requires Ansible version 2.4.
  • stable-3.2 Supports Ceph versions luminous and mimic. This branch requires Ansible version 2.6.
  • stable-4.0 Supports Ceph version nautilus. This branch requires Ansible version 2.8.
  • master Supports Ceph@master version. This branch requires Ansible version 2.8.

For more information the official documentation has all the information: http://docs.ceph.com/ceph-ansible/master/#releases

For this deployment we are using the stable-3.2 branch.
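
Concretely, the checkout for this lab and a quick version check look like this (the version check simply confirms the Ansible 2.6 requirement from the table above):

[root@host ceph-ansible]# git checkout stable-3.2
[root@host ceph-ansible]# ansible --version | head -1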

Create an admin user with passwordless sudo, on the controller and on all cluster hosts, to execute the Ansible playbooks.

useradd admin
passwd admin
echo "admin ALL = (root) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/admin
chmod 0440 /etc/sudoers.d/admin
sed -i 's/Defaults requiretty/#Defaults requiretty/g' /etc/sudoers
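
An optional sanity check, run as root on each host, confirms the rule works (sudo -n fails instead of prompting if a password would still be required):

su - admin -c 'sudo -n true && echo "passwordless sudo OK"'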

Create the key as the admin user on the controller host; it will be distributed to all hosts in a later step.

[admin@host ~]$ ssh-keygen 
Generating public/private rsa key pair.
Enter file in which to save the key (/home/admin/.ssh/id_rsa):
Created directory '/home/admin/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/admin/.ssh/id_rsa.
Your public key has been saved in /home/admin/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:NpdAJw1vFexb/PPUFPFbX18DS4K550hiRMLH9oZEc3E admin@controller
The key's randomart image is:
+---[RSA 2048]----+
...
+----[SHA256]-----+
[admin@host ~]$ vi ~/.ssh/config
...
Host controller
    Hostname ansible
    User admin

Host mon1
    Hostname mon1
    User admin

Host mon2
    Hostname mon2
    User admin

Host mon3
    Hostname mon3
    User admin

Host osd1
    Hostname osd1
    User admin

Host osd2
    Hostname osd2
    User admin

Host osd3
    Hostname osd3
    User admin

Host client
    Hostname client
    User admin
...

Create the ceph-ansible-keys directory

[admin@host ~]$ mkdir ~/ceph-ansible-keys

Create the ansible log directory

[root@host ~]# mkdir /var/log/ansible
[root@host ~]# chown admin.admin /var/log/ansible
[root@host ~]# chmod 755 /var/log/ansible

Distribute keys among the hosts involved.

[admin@host ~]$ ssh-keyscan osd1 osd2 osd3 mon1 mon2 mon3 client >> ~/.ssh/known_hosts
[admin@host ~]$ ssh-copy-id <HOST>
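
For example, the public key can be copied to every host in one pass, using the aliases defined in ~/.ssh/config above:

[admin@host ~]$ for h in mon1 mon2 mon3 osd1 osd2 osd3 client; do ssh-copy-id admin@$h; done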

Ansible inventory configuration.

vi /etc/ansible/hosts
...
[mons]
mon1
mon2
mon3
[mgrs]
mon1
mon2
mon3
[osds]
osd1
osd2
osd3
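
Before going further, it is worth checking that Ansible can reach every inventory host as the admin user; the built-in ping module does this without changing anything:

[admin@host ~]$ ansible all -m ping

Each host should answer with "pong".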

Edit the /etc/ansible/group_vars/all.yml file and set some variables for the Ceph deployment.

[root@host group_vars]$ cd /etc/ansible/group_vars/
[root@host group_vars]$ vi /etc/ansible/group_vars/all.yml
...
ceph_origin: repository
ceph_repository: community
ceph_repository_type: cdn
ceph_stable_release: luminous
monitor_interface: eth1
public_network: 192.168.0.0/24
cluster_network: 10.10.10.0/24
osd_scenario: non-collocated
osd_objectstore: bluestore
devices:
- /dev/sdb
- /dev/sdc
dedicated_devices:
- /dev/sdd
- /dev/sdd

The dedicated_devices entries have to be distributed according to the number of data devices. For example, if you have 6 disks and 2 NVMe devices, the following distribution is needed (a quick way to confirm the device names is shown right after this example):

devices:
- /dev/disk1
- /dev/disk2
- /dev/disk3
- /dev/disk4
- /dev/disk5
- /dev/disk6
dedicated_devices:
- /dev/nvme0
- /dev/nvme0
- /dev/nvme0
- /dev/nvme1
- /dev/nvme1
- /dev/nvme1
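
Because osd_scenario: non-collocated pairs each data device with a dedicated device by position, it helps to confirm the device names on the OSD nodes before deploying. A quick ad-hoc check from the admin node (assuming the inventory above):

[admin@host ~]$ ansible osds -a "lsblk -d -o NAME,SIZE,TYPE"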

We can also adjust ceph.conf through the ceph_conf_overrides variable, where settings can be specified in the global, mon, osd and mds sections:

ceph_origin: repository
ceph_repository: community
ceph_repository_type: cdn
ceph_stable_release: luminous
monitor_interface: eth1
public_network: 192.168.0.0/24
cluster_network: 10.10.10.0/24
osd_scenario: non-collocated
osd_objectstore: bluestore
devices:
- /dev/sdb
- /dev/sdc
dedicated_devices:
- /dev/sdd
- /dev/sdd
ceph_conf_overrides:
  global:
    max_open_files: 131072
    osd_pool_default_size: 3
    osd_pool_default_min_size: 2
    osd_pool_default_crush_rule: 0
    osd_pool_default_pg_num: 32
    osd_pool_default_pgp_num: 32
  mon:
    mon_osd_down_out_interval: 600
    mon_osd_min_down_reporters: 7
    mon_clock_drift_allowed: 0.15
    mon_clock_drift_warn_backoff: 30
    mon_osd_full_ratio: 0.95
    mon_osd_nearfull_ratio: 0.85
    mon_osd_report_timeout: 300
    mon_pg_warn_max_per_osd: 300
    mon_osd_allow_primary_affinity: true
  osd:
    osd_mon_heartbeat_interval: 30
    osd_recovery_max_active: 1
    osd_max_backfills: 1
    osd_recovery_sleep: 0.1
    osd_recovery_max_chunk: 1048576
    osd_recovery_threads: 1
    osd_scrub_sleep: 0.1
    osd_deep_scrub_stride: 1048576
    osd_snap_trim_sleep: 0.1
    osd_client_message_cap: 10000
    osd_client_message_size_cap: 1048576000
    osd_scrub_begin_hour: 23
    osd_scrub_end_hour: 5

4) Deploy

We will deploy with the following steps:

[root@controller ~]# su - admin
[admin@controller ~]$ cd /usr/share/ceph-ansible/
[admin@controller ceph-ansible]$ cp site.yml.sample site.yml
[admin@controller ceph-ansible]$ ansible-playbook site.yml
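
Optionally, a syntax check can be run first; it validates the playbook without touching any hosts:

[admin@controller ceph-ansible]$ ansible-playbook site.yml --syntax-check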

If the deployment is successful, the following output will be displayed at the end of the deployment:

TASK [show ceph status for cluster ceph] ****************************************************************************************************
Wednesday 19 June 2019 20:44:24 -0300 (0:00:00.612) 0:06:16.827 ********
ok: [mon1 -> mon1] => {
"msg": [
" cluster:",
" id: c9e1807a-56fd-472b-aced-9479273d18a6",
" health: HEALTH_OK",
" ",
" services:",
" mon: 3 daemons, quorum mon1,mon2,mon3",
" mgr: mon1(active), standbys: mon3, mon2",
" osd: 6 osds: 6 up, 6 in",
" ",
" data:",
" pools: 0 pools, 0 pgs",
" objects: 0 objects, 0B",
" usage: 6.02GiB used, 23.8GiB / 29.8GiB avail",
" pgs: ",
" "
]
}
PLAY RECAP **********************************************************************************************************************************
mon1 : ok=165 changed=9 unreachable=0 failed=0
mon2 : ok=151 changed=9 unreachable=0 failed=0
mon3 : ok=153 changed=9 unreachable=0 failed=0
osd1 : ok=128 changed=11 unreachable=0 failed=0
osd2 : ok=124 changed=11 unreachable=0 failed=0
osd3 : ok=124 changed=11 unreachable=0 failed=0
INSTALLER STATUS ****************************************************************************************************************************
Install Ceph Monitor : Complete (0:01:15)
Install Ceph Manager : Complete (0:00:54)
Install Ceph OSD : Complete (0:02:32)

Log in to the monitor and check the Ceph status:

[root@mon1 ~]# ceph -s
  cluster:
    id:     c9e1807a-56fd-472b-aced-9479273d18a6
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum mon1,mon2,mon3
    mgr: mon1(active), standbys: mon3, mon2
    osd: 6 osds: 6 up, 6 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0B
    usage:   6.03GiB used, 23.8GiB / 29.8GiB avail
    pgs:
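
Two more standard checks that are useful right after the deployment:

[root@mon1 ~]# ceph osd tree       # all OSDs should be up and placed under the expected hosts
[root@mon1 ~]# ceph health detail  # explains any warning if the cluster is not HEALTH_OK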

5) Client configuration

Update the /etc/ansible/hosts inventory and add the [clients] group; the playbook run that applies it is shown after the listing.

[mons]
mon1
mon2
mon3
[mgrs]
mon1
mon2
mon3
[osds]
osd1
osd2
osd3
[clients]
client
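
The client role is applied by running the playbook again. Re-running the full site.yml is the safe default; with the initial deployment already done, Ansible's --limit flag can restrict the run to the new group:

[admin@controller ceph-ansible]$ ansible-playbook site.yml
# or, to target only the clients group:
[admin@controller ceph-ansible]$ ansible-playbook site.yml --limit clients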

After the playbook run completes, confirm the installation on the client host:

[root@client ~]# rpm -qa | grep ceph 
ceph-fuse-12.2.12-0.el7.x86_64
python-cephfs-12.2.12-0.el7.x86_64
ceph-common-12.2.12-0.el7.x86_64
ceph-selinux-12.2.12-0.el7.x86_64
libcephfs2-12.2.12-0.el7.x86_64
ceph-base-12.2.12-0.el7.x86_64
[root@client ~]# cat /etc/ceph/ceph.conf
# Please do not change this file directly since it is managed by Ansible and will be overwritten
[global]
fsid = c9e1807a-56fd-472b-aced-9479273d18a6
mon host = 192.168.0.100,192.168.0.101,192.168.0.102
public network = 192.168.0.0/24
cluster network = 10.10.10.0/24
[client.libvirt]
admin socket = /var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok # must be writable by QEMU and allowed by SELinux or AppArmor
log file = /var/log/ceph/qemu-guest-$pid.log # must be writable by QEMU and allowed by SELinux or AppArmor

On the client VM we will create a 1 GB image.

[root@client]# rbd create rbd1 --size 1024 --name client.rbd

Let’s list the images in the rbd pool and get the information for the image we created:

[root@client]# rbd ls -p rbd --name client.rbd
...
rbd1
...
[root@client]# rbd --image rbd1 info --name client.rbd
...
rbd image 'rbd1':
    size 1 GiB in 256 objects
    order 22 (4 MiB objects)
    id: 5e4a6b8b4567
    block_name_prefix: rbd_data.5e4a6b8b4567
    format: 2
    features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
    op_features:
    flags:
    create_timestamp: Wed Aug 29 09:30:43 2018
...

Before mapping, let’s disable the image features that the kernel RBD client does not support, so the mapping does not fail.

[root@client]# rbd feature disable rbd1 object-map fast-diff deep-flatten

Let’s map the image:

[root@client]# rbd map --image rbd1 --name client.rbd
...
/dev/rbd0
...

Validating the image delivered to the OS with fdisk:

[root@client]# fdisk -l /dev/rbd0
Disk /dev/rbd0: 1073 MB, 1073741824 bytes, 2097152 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 4194304 bytes / 4194304 bytes
[root@client log]#

Creating a filesystem and mounting the image

mkfs.xfs /dev/rbd0
mkdir /opt/labrbd
mount /dev/rbd0 /opt/labrbd

Persistently mounting the image

[root@client]# vi  /etc/ceph/rbdmap 
...
rbd/rbd1 id=admin,keyring=/etc/ceph/ceph.client.admin.keyring

Adjust the /etc/fstab file

[root@client]# vi  /etc/fstab
...
/dev/rbd0 /opt/labrbd xfs noauto 0 0

Enable the service at boot

systemctl enable rbdmap.service

To remove the image from the OS, unmount the filesystem and unmap the device:

umount /opt/labrbd
rbd unmap /dev/rbd0
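
To confirm that no RBD device remains mapped on the client:

[root@client]# rbd showmapped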
