OpenStack Integration
Overview
We're really here to bind Ceph into OpenStack, which has the virtue of enabling live migration because the backing storage is then shared across all the compute nodes. We'll also (eventually) get copy-on-write.
For all three OpenStack components, glance, cinder and nova, we'll need to create a Ceph user with appropriate permissions on the appropriate pool(s) and update OpenStack to use an rbd backend. For cinder and nova we'll also need to perform some hijinks with virsh so that it has the secret(s) necessary to access Ceph.
We'll parameterise it in the usual way to try to keep a grip on things.
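For reference, these are the names used throughout. The OS_HOST value below is only an example, substitute the hostname of your controller:

# parameters used below; OS_HOST's value here is a placeholder
OS_HOST=os-controller            # node running the OpenStack services

GLANCE_POOL=glance-images
CINDER_POOL=cinder-volumes
NOVA_POOL=nova-vms

CEPH_GLANCE_CLIENT=client.images
CEPH_CINDER_CLIENT=client.volumes
CEPH_NOVA_CLIENT=client.nova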
You can read the Ceph notes.
OpenStack Nodes
With our "limited budget" we're cheekily using the OpenStack compute nodes as the Ceph nodes. As a side effect we already have most of the Ceph software installed. However, if you're working with a different set of circumstances you may well want to:
yum install -y python-rbd

mkdir /etc/ceph

useradd ceph
passwd ceph

cat << EOF >/etc/sudoers.d/ceph
ceph ALL = (root) NOPASSWD:ALL
Defaults:ceph !requiretty
EOF
Remember to copy the appropriate bootstrap parts of Ceph across.
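Something along these lines, say, where NEW_NODE is a stand-in for the node being added:

# sketch only: NEW_NODE is a placeholder for the new Ceph/compute node
scp /etc/ceph/ceph.conf root@${NEW_NODE}:/etc/ceph/
scp -r /var/lib/ceph/bootstrap-* root@${NEW_NODE}:/var/lib/ceph/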
Glance
Most of this will be on the OpenStack controller (where Glance is running), OS_HOST.
On a Ceph Host
Create the pool for glance and a Ceph user:
GLANCE_POOL=glance-images
CEPH_GLANCE_CLIENT=client.images

ceph osd pool create ${GLANCE_POOL} 64

ceph auth get-or-create ${CEPH_GLANCE_CLIENT} \
    mon 'allow r' \
    osd "allow class-read object_prefix rbd_children, allow rwx pool=${GLANCE_POOL}" \
    -o /etc/ceph/ceph.${CEPH_GLANCE_CLIENT}.keyring

# test the user!
ceph --id ${CEPH_GLANCE_CLIENT#client.} osd tree

scp /etc/ceph/ceph.conf /etc/ceph/ceph.${CEPH_GLANCE_CLIENT}.keyring root@${OS_HOST}:/etc/ceph
On OpenStack
Here we need to make the Ceph keyring available to the glance user:
GLANCE_POOL=glance-images
CEPH_GLANCE_CLIENT=client.images

chown glance:glance /etc/ceph/ceph.${CEPH_GLANCE_CLIENT}.keyring
chmod 0640 /etc/ceph/ceph.${CEPH_GLANCE_CLIENT}.keyring

# SELinux
chcon -t etc_t /etc/ceph/ceph.${CEPH_GLANCE_CLIENT}.keyring
chcon ... /etc/ceph/ceph.${CEPH_GLANCE_CLIENT}.keyring
Tell the local Ceph about the new keyring:
cat <<EOF >> /etc/ceph/ceph.conf

[${CEPH_GLANCE_CLIENT}]
keyring = /etc/ceph/ceph.${CEPH_GLANCE_CLIENT}.keyring
EOF
Back up and then twiddle with the OpenStack config to use rbd in preference to the existing stores:
cd /etc/glance
cp glance-api.conf glance-api.conf.0

# default stores is file,http
IFS=,
stores=( glance.store.rbd.Store $(crudini --get glance-api.conf glance_store stores) )
crudini --set glance-api.conf glance_store stores "${stores[*]}"
unset IFS

crudini --set glance-api.conf glance_store default_store rbd
crudini --set glance-api.conf glance_store rbd_store_pool ${GLANCE_POOL}
crudini --set glance-api.conf glance_store rbd_store_user ${CEPH_GLANCE_CLIENT#client.}
crudini --set glance-api.conf glance_store rbd_store_ceph_conf /etc/ceph/ceph.conf

# enable copy-on-write
crudini --set glance-api.conf DEFAULT show_image_direct_url True

systemctl restart openstack-glance-api
Done. Glance is really easy.
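As a quick sanity check you could upload a small test image and confirm it lands in the pool. The image name and file here are just examples:

glance image-create --name test-rbd --disk-format qcow2 --container-format bare --file some-image.qcow2
rbd ls ${GLANCE_POOL}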
New images will go into the rbd store. Existing images will remain where they are, which means we won't get the advantage of the Ceph copy-on-write mechanisms until we somehow migrate them. Unfortunately, I'm not sure there's an easy way to migrate images between stores; I think you simply have to download them and then re-upload them.
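If you do want to migrate an old image, something like the following ought to work, although it hasn't been tested here -- IMAGE_ID, the new name and the formats are all examples:

# sketch only: IMAGE_ID is a placeholder for the existing image's ID
glance image-download --file /tmp/old-image.qcow2 ${IMAGE_ID}
glance image-create --name "old-image (rbd)" --disk-format qcow2 --container-format bare --file /tmp/old-image.qcow2
glance image-delete ${IMAGE_ID}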
Cinder
Most of this will be on the OpenStack controller (where Cinder is running), OS_HOST.
Note
You will have to repeat the virsh part on each compute node.
On Ceph
Create the pool for cinder and a Ceph user:
CINDER_POOL=cinder-volumes
CEPH_CINDER_CLIENT=client.volumes

ceph osd pool create ${CINDER_POOL} 64

ceph auth get-or-create ${CEPH_CINDER_CLIENT} \
    mon 'allow r' \
    osd "allow class-read object_prefix rbd_children, allow rwx pool=${CINDER_POOL}, allow rx pool=${GLANCE_POOL}" \
    -o /etc/ceph/ceph.${CEPH_CINDER_CLIENT}.keyring

ceph auth get-key ${CEPH_CINDER_CLIENT} > /etc/ceph/${CEPH_CINDER_CLIENT}.key

# test user!
ceph --id ${CEPH_CINDER_CLIENT#client.} osd tree

scp /etc/ceph/ceph.conf /etc/ceph/ceph.${CEPH_CINDER_CLIENT}.keyring /etc/ceph/${CEPH_CINDER_CLIENT}.key root@${OS_HOST}:/etc/ceph
Note
Cinder needs rx access to the Glance pool in addition to rwx access to the Cinder pool.
On OpenStack
Here we need to make the Ceph keyring available to the cinder user:
CINDER_POOL=cinder-volumes
CEPH_CINDER_CLIENT=client.volumes

chown cinder:cinder /etc/ceph/ceph.${CEPH_CINDER_CLIENT}.keyring
chmod 0640 /etc/ceph/ceph.${CEPH_CINDER_CLIENT}.keyring

# SELinux
chcon -t etc_t /etc/ceph/ceph.${CEPH_CINDER_CLIENT}.keyring
chcon ... /etc/ceph/ceph.${CEPH_CINDER_CLIENT}.keyring
Tell the local Ceph about the new keyring:
cat <<EOF >> /etc/ceph/ceph.conf

[${CEPH_CINDER_CLIENT}]
keyring = /etc/ceph/ceph.${CEPH_CINDER_CLIENT}.keyring
EOF
Virsh
Now, an extra bit of magic to give virsh the secret to access Ceph. It's done through a standard virsh mechanism that uses UUIDs as keys to secrets. Having told virsh about the UUID in an XML file you then poke virsh with the secret.
Create a file with the UUID in (handy for this script, no other use):
uuidgen > /etc/ceph/cinder.uuid.txt
Create an XML file that references that UUID:
cat <<EOF > /etc/ceph/cinder.secret.xml
<secret ephemeral="no" private="no">
  <uuid>$(cat /etc/ceph/cinder.uuid.txt)</uuid>
  <usage type="ceph">
    <name>${CEPH_CINDER_CLIENT} secret</name>
  </usage>
</secret>
EOF

chmod 0640 /etc/ceph/cinder.*
chmod 0640 /etc/ceph/*.key
Note
You have to do this on each compute node so copy the cinder.* files around.
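For example, something like the following, where COMPUTE_NODES is a hypothetical list of your compute nodes -- note that virsh secret-set-value also needs the .key file:

# sketch only: COMPUTE_NODES is a placeholder for your list of compute nodes
for node in ${COMPUTE_NODES}; do
    scp /etc/ceph/cinder.* /etc/ceph/${CEPH_CINDER_CLIENT}.key root@${node}:/etc/ceph
done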
Poke virsh with the secret:
virsh secret-define --file /etc/ceph/cinder.secret.xml
virsh secret-set-value --secret $(cat /etc/ceph/cinder.uuid.txt) --base64 $(cat /etc/ceph/${CEPH_CINDER_CLIENT}.key)
Cinder
Back on the OpenStack controller (or node running Cinder) we can twiddle with the Cinder config to use rbd in preference to the existing backend(s):
cd /etc/cinder
cp cinder.conf cinder.conf.0

# default enabled_backends is lvm
IFS=,
backends=( rbd $(crudini --get cinder.conf DEFAULT enabled_backends) )
crudini --set cinder.conf DEFAULT enabled_backends "${backends[*]}"
unset IFS

crudini --set cinder.conf rbd volume_driver cinder.volume.drivers.rbd.RBDDriver
crudini --set cinder.conf rbd rbd_pool ${CINDER_POOL}
crudini --set cinder.conf rbd rbd_ceph_conf /etc/ceph/ceph.conf
crudini --set cinder.conf rbd rbd_flatten_volume_from_snapshot false
crudini --set cinder.conf rbd rbd_max_clone_depth 5
crudini --set cinder.conf rbd rbd_store_chunk_size 4
crudini --set cinder.conf rbd rados_connect_timeout -1
crudini --set cinder.conf rbd glance_api_version 2
crudini --set cinder.conf rbd rbd_user ${CEPH_CINDER_CLIENT#client.}
crudini --set cinder.conf rbd rbd_secret_uuid $(cat /etc/ceph/cinder.uuid.txt)

systemctl restart openstack-cinder-*
Note
The name, rbd, here in enabled_backends is arbitrary and is just a pointer to the block called rbd.
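For reference, the resulting cinder.conf ends up looking something like this (abridged; the values follow from the commands above), with the section name matching the entry in enabled_backends:

[DEFAULT]
enabled_backends = rbd,lvm

[rbd]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
rbd_pool = cinder-volumes
rbd_user = volumes
rbd_secret_uuid = <the UUID from cinder.uuid.txt>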
We can test that Cinder now creates volumes in Ceph:
cinder create --display-name="test" 1
cinder list
rbd ls ${CINDER_POOL}
Nova
Most of this will be on the OpenStack controller (where Nova is running), OS_HOST.
Note
You will have to repeat the virsh part on each compute node.
On Ceph
Create the pool for nova and a Ceph user:
NOVA_POOL=nova-vms
CEPH_NOVA_CLIENT=client.nova

ceph osd pool create ${NOVA_POOL} 64

ceph auth get-or-create ${CEPH_NOVA_CLIENT} \
    mon 'allow r' \
    osd "allow class-read object_prefix rbd_children, allow rwx pool=${NOVA_POOL}, allow rx pool=${GLANCE_POOL}" \
    -o /etc/ceph/ceph.${CEPH_NOVA_CLIENT}.keyring

ceph auth get-key ${CEPH_NOVA_CLIENT} > /etc/ceph/${CEPH_NOVA_CLIENT}.key

# test user!
ceph --id ${CEPH_NOVA_CLIENT#client.} osd tree

scp /etc/ceph/ceph.conf /etc/ceph/ceph.${CEPH_NOVA_CLIENT}.keyring /etc/ceph/${CEPH_NOVA_CLIENT}.key root@${OS_HOST}:/etc/ceph
Note
Nova needs rx access to the Glance pool in addition to rwx access to the Nova pool.
On OpenStack
Here we need to make the Ceph keyring available to the nova user:
NOVA_POOL=nova-vms
CEPH_NOVA_CLIENT=client.nova

chown nova:nova /etc/ceph/ceph.${CEPH_NOVA_CLIENT}.keyring
chmod 0640 /etc/ceph/ceph.${CEPH_NOVA_CLIENT}.keyring

# SELinux
chcon -t etc_t /etc/ceph/ceph.${CEPH_NOVA_CLIENT}.keyring
chcon ... /etc/ceph/ceph.${CEPH_NOVA_CLIENT}.keyring
Tell the local Ceph about the new keyring:
cat <<EOF >> /etc/ceph/ceph.conf

[${CEPH_NOVA_CLIENT}]
keyring = /etc/ceph/ceph.${CEPH_NOVA_CLIENT}.keyring
EOF
Virsh
Now, an extra bit of magic to give virsh the secret to access Ceph. It's done through a standard virsh mechanism that uses UUIDs as keys to secrets. Having told virsh about the UUID in an XML file you then poke virsh with the secret.
Create a file with the UUID in (handy for this script, no other use):
uuidgen > /etc/ceph/${CEPH_NOVA_CLIENT}.uuid.txt
Create an XML file that references that UUID:
cat <<EOF > /etc/ceph/${CEPH_NOVA_CLIENT}.secret.xml
<secret ephemeral="no" private="no">
  <uuid>$(cat /etc/ceph/${CEPH_NOVA_CLIENT}.uuid.txt)</uuid>
  <usage type="ceph">
    <name>${CEPH_NOVA_CLIENT} secret</name>
  </usage>
</secret>
EOF

chmod 0640 /etc/ceph/${CEPH_NOVA_CLIENT}.*
chmod 0640 /etc/ceph/*.key
Note
You have to do this on each compute node so copy the nova.* files around.
Poke virsh with the secret:
virsh secret-define --file /etc/ceph/${CEPH_NOVA_CLIENT}.secret.xml
virsh secret-set-value --secret $(cat /etc/ceph/${CEPH_NOVA_CLIENT}.uuid.txt) --base64 $(cat /etc/ceph/${CEPH_NOVA_CLIENT}.key)
Nova
Back on the OpenStack controller (or node running Nova) we can twiddle with the Nova config to use rbd for instance disks in preference to local storage:
cd /etc/nova
cp nova.conf nova.conf.0

crudini --set nova.conf DEFAULT force_raw_images True

# disk_cachemodes entries are <disk-type>=<cache-mode>; rbd disks count as "network" disks
crudini --set nova.conf libvirt disk_cachemodes network=writeback

crudini --set nova.conf libvirt images_type rbd
crudini --set nova.conf libvirt images_rbd_pool ${NOVA_POOL}
crudini --set nova.conf libvirt images_rbd_ceph_conf /etc/ceph/ceph.conf
crudini --set nova.conf libvirt rbd_user ${CEPH_NOVA_CLIENT#client.}
crudini --set nova.conf libvirt rbd_secret_uuid $(cat /etc/ceph/${CEPH_NOVA_CLIENT}.uuid.txt)

systemctl restart openstack-nova-compute
Warning
There may be some problem here with OpenStack and Cinder boot volumes in that our choice of rbd_user, here, nova, will not have access to Cinder volumes in Ceph. Needs some more thought!
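One possible workaround, not tested here, would be to widen the nova client's capabilities so it can reach the Cinder pool as well (run on a Ceph host):

# untested: let the nova client at the Cinder pool too
ceph auth caps ${CEPH_NOVA_CLIENT} \
    mon 'allow r' \
    osd "allow class-read object_prefix rbd_children, allow rwx pool=${NOVA_POOL}, allow rwx pool=${CINDER_POOL}, allow rx pool=${GLANCE_POOL}"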
We can test that Nova now creates instance disks in Ceph:

# the flavor and image names here are just examples
nova boot --flavor m1.tiny --image cirros test
nova list
rbd ls ${NOVA_POOL}
Troubleshooting
If your Ceph setup crashes and burns your VMs may not start even though everything appears to have recovered.
In particular you may find your VM in a dracut shell where xfs_repair -L gets IO errors.
Here the problem is that Nova's use of the disk image left an exclusive lock on it which, because your Ceph cluster crashed, was never released, so you need to remove it manually.
Firstly, get the instance ID and then look at the features on the disk for exclusive-lock:
# rbd info nova-vms/a72cdcce-31bb-4f21-a960-2b47e5ec5fd6_disk
rbd image 'a72cdcce-31bb-4f21-a960-2b47e5ec5fd6_disk':
        features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
        op_features:
Next, look at the locks:
# rbd lock ls nova-vms/a72cdcce-31bb-4f21-a960-2b47e5ec5fd6_disk
There is 1 exclusive lock on this image.
Locker            ID                    Address
client.13411674   auto 94615282305536   192.168.8.11:0/197030791
Finally, remove the lock -- note the ID has whitespace:
# rbd lock rm nova-vms/a72cdcce-31bb-4f21-a960-2b47e5ec5fd6_disk 'auto 94615282305536' client.13411674
Note
rbd lock ls ... reports the two fields that rbd lock rm requires in the opposite order, and the ID field itself contains a SPACE character, hence the quoting!
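If you have several images to unlock, a sketch like the following might save some typing. It assumes the ID is always the two words reported by rbd lock ls, as above, and IMAGE is a placeholder for the image in question:

# sketch only: IMAGE is a placeholder for <pool>/<instance-uuid>_disk
IMAGE=nova-vms/a72cdcce-31bb-4f21-a960-2b47e5ec5fd6_disk

# rbd lock ls prints "Locker ID Address"; rbd lock rm wants the (two-word) ID first, then the locker
rbd lock ls ${IMAGE} | awk '/^client\./ { print $1, $2, $3 }' | \
while read locker id1 id2; do
    rbd lock rm ${IMAGE} "${id1} ${id2}" ${locker}
done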