OpenStack Integration
Overview
We're really here to bind Ceph into OpenStack, which has the virtue of enabling live migration because the backing storage is then shared across all the compute nodes. We'll also (eventually) get copy-on-write.
For all three OpenStack components, glance, cinder and nova, we'll need to create a Ceph user with appropriate permissions on the appropriate pool(s) and update OpenStack to use an rbd backend. For cinder and nova we'll also need to perform some hijinks with virsh so that it has the secret(s) necessary to access Ceph.
We'll parameterise it in the usual way to try to keep a grip on things.
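For reference, these are the names used throughout. The OS_HOST value below is only an example, substitute the hostname of your controller:

# parameters used below; OS_HOST's value here is a placeholder
OS_HOST=os-controller            # node running the OpenStack services

GLANCE_POOL=glance-images
CINDER_POOL=cinder-volumes
NOVA_POOL=nova-vms

CEPH_GLANCE_CLIENT=client.images
CEPH_CINDER_CLIENT=client.volumes
CEPH_NOVA_CLIENT=client.nova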
You can read the Ceph notes.
OpenStack Nodes
With our "limited budget" we're cheekily using the OpenStack compute nodes as the Ceph nodes. As a side effect we already have most of the Ceph software installed. However, if you're working with a different set of circumstances you may well want to:
yum install -y python-rbd

mkdir /etc/ceph

useradd ceph
passwd ceph

cat << EOF >/etc/sudoers.d/ceph
ceph ALL = (root) NOPASSWD:ALL
Defaults:ceph !requiretty
EOF
Remember to copy the appropriate bootstrap parts of Ceph across.
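Something along these lines, say, where NEW_NODE is a stand-in for the node being added:

# sketch only: NEW_NODE is a placeholder for the new Ceph/compute node
scp /etc/ceph/ceph.conf root@${NEW_NODE}:/etc/ceph/
scp -r /var/lib/ceph/bootstrap-* root@${NEW_NODE}:/var/lib/ceph/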
Glance
Most of this will be on the OpenStack controller (where Glance is running), OS_HOST.
On a Ceph Host
Create the pool for glance and a Ceph user:
GLANCE_POOL=glance-images
CEPH_GLANCE_CLIENT=client.images

ceph osd pool create ${GLANCE_POOL} 64

ceph auth get-or-create ${CEPH_GLANCE_CLIENT} \
    mon 'allow r' \
    osd "allow class-read object_prefix rbd_children, allow rwx pool=${GLANCE_POOL}" \
    -o /etc/ceph/ceph.${CEPH_GLANCE_CLIENT}.keyring

# test the user!
ceph --id ${CEPH_GLANCE_CLIENT#client.} osd tree

scp /etc/ceph/ceph.conf /etc/ceph/ceph.${CEPH_GLANCE_CLIENT}.keyring root@${OS_HOST}:/etc/ceph
On OpenStack
Here we need to make the Ceph keyring available to the glance user:
GLANCE_POOL=glance-images
CEPH_GLANCE_CLIENT=client.images

chown glance:glance /etc/ceph/ceph.${CEPH_GLANCE_CLIENT}.keyring
chmod 0640 /etc/ceph/ceph.${CEPH_GLANCE_CLIENT}.keyring

# SELinux
chcon -t etc_t /etc/ceph/ceph.${CEPH_GLANCE_CLIENT}.keyring
chcon ... /etc/ceph/ceph.${CEPH_GLANCE_CLIENT}.keyring
Tell the local Ceph about the new keyring:
cat <<EOF >> /etc/ceph/ceph.conf

[${CEPH_GLANCE_CLIENT}]
keyring = /etc/ceph/ceph.${CEPH_GLANCE_CLIENT}.keyring
EOF
Back up and then twiddle with the OpenStack config to use rbd in preference to the existing stores:
cd /etc/glance
cp glance-api.conf glance-api.conf.0

# default stores is file,http
IFS=,
stores=( glance.store.rbd.Store $(crudini --get glance-api.conf glance_store stores) )
crudini --set glance-api.conf glance_store stores "${stores[*]}"
unset IFS

crudini --set glance-api.conf glance_store default_store rbd
crudini --set glance-api.conf glance_store rbd_store_pool ${GLANCE_POOL}
crudini --set glance-api.conf glance_store rbd_store_user ${CEPH_GLANCE_CLIENT#client.}
crudini --set glance-api.conf glance_store rbd_store_ceph_conf /etc/ceph/ceph.conf

# enable copy-on-write
crudini --set glance-api.conf DEFAULT show_image_direct_url True

systemctl restart openstack-glance-api
Done. Glance is really easy.
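As a quick sanity check you could upload a small test image and confirm it lands in the pool. The image name and file here are just examples:

glance image-create --name test-rbd --disk-format qcow2 --container-format bare --file some-image.qcow2
rbd ls ${GLANCE_POOL}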
New images will go into the rbd store. Existing images will remain where they are, which means we won't get the advantage of the Ceph copy-on-write mechanisms until we somehow migrate them. Unfortunately, I'm not sure there's an easy way to migrate images between stores; I think you simply have to download them and then re-upload them.
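If you do want to migrate an old image, something like the following ought to work, although it hasn't been tested here -- IMAGE_ID, the new name and the formats are all examples:

# sketch only: IMAGE_ID is a placeholder for the existing image's ID
glance image-download --file /tmp/old-image.qcow2 ${IMAGE_ID}
glance image-create --name "old-image (rbd)" --disk-format qcow2 --container-format bare --file /tmp/old-image.qcow2
glance image-delete ${IMAGE_ID}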
Cinder
Most of this will be on the OpenStack controller (where Cinder is running), OS_HOST.
Note
You will have to repeat the virsh part on each compute node.
On Ceph
Create the pool for cinder and a Ceph user:
CINDER_POOL=cinder-volumes
CEPH_CINDER_CLIENT=client.volumes

ceph osd pool create ${CINDER_POOL} 64

ceph auth get-or-create ${CEPH_CINDER_CLIENT} \
    mon 'allow r' \
    osd "allow class-read object_prefix rbd_children, allow rwx pool=${CINDER_POOL}, allow rx pool=${GLANCE_POOL}" \
    -o /etc/ceph/ceph.${CEPH_CINDER_CLIENT}.keyring

ceph auth get-key ${CEPH_CINDER_CLIENT} > /etc/ceph/${CEPH_CINDER_CLIENT}.key

# test user!
ceph --id ${CEPH_CINDER_CLIENT#client.} osd tree

scp /etc/ceph/ceph.conf /etc/ceph/ceph.${CEPH_CINDER_CLIENT}.keyring /etc/ceph/${CEPH_CINDER_CLIENT}.key root@${OS_HOST}:/etc/ceph
Note
Cinder needs rx access to the Glance pool in addition to rwx access to the Cinder pool.
On OpenStack
Here we need to make the Ceph keyring available to the cinder user:
CINDER_POOL=cinder-volumes
CEPH_CINDER_CLIENT=client.volumes

chown cinder:cinder /etc/ceph/ceph.${CEPH_CINDER_CLIENT}.keyring
chmod 0640 /etc/ceph/ceph.${CEPH_CINDER_CLIENT}.keyring

# SELinux
chcon -t etc_t /etc/ceph/ceph.${CEPH_CINDER_CLIENT}.keyring
chcon ... /etc/ceph/ceph.${CEPH_CINDER_CLIENT}.keyring
Tell the local Ceph about the new keyring:
cat <<EOF >> /etc/ceph/ceph.conf

[${CEPH_CINDER_CLIENT}]
keyring = /etc/ceph/ceph.${CEPH_CINDER_CLIENT}.keyring
EOF
Virsh
Now, an extra bit of magic to give virsh the secret to access Ceph. It's done through a standard virsh mechanism that uses UUIDs as keys to secrets. Having told virsh about the UUID in an XML file you then poke virsh with the secret.
Create a file with the UUID in (handy for this script, no other use):
uuidgen > /etc/ceph/cinder.uuid.txt
Create an XML file that references that UUID:
cat <<EOF > /etc/ceph/cinder.secret.xml
<secret ephemeral="no" private="no">
  <uuid>$(cat /etc/ceph/cinder.uuid.txt)</uuid>
  <usage type="ceph">
    <name>${CEPH_CINDER_CLIENT} secret</name>
  </usage>
</secret>
EOF

chmod 0640 /etc/ceph/cinder.*
chmod 0640 /etc/ceph/*.key
Note
You have to do this on each compute node so copy the cinder.* files around.
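For example, something like the following, where COMPUTE_NODES is a hypothetical list of your compute nodes -- note that virsh secret-set-value also needs the .key file:

# sketch only: COMPUTE_NODES is a placeholder for your list of compute nodes
for node in ${COMPUTE_NODES}; do
    scp /etc/ceph/cinder.* /etc/ceph/${CEPH_CINDER_CLIENT}.key root@${node}:/etc/ceph
done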
Poke virsh with the secret:
virsh secret-define --file /etc/ceph/cinder.secret.xml
virsh secret-set-value --secret $(cat /etc/ceph/cinder.uuid.txt) --base64 $(cat /etc/ceph/${CEPH_CINDER_CLIENT}.key)
Cinder
Back on the OpenStack controller (or node running Cinder) we can twiddle with the Cinder config to use rbd in preference to the existing backend(s):
cd /etc/cinder
cp cinder.conf cinder.conf.0

# default enabled_backends is lvm
IFS=,
backends=( rbd $(crudini --get cinder.conf DEFAULT enabled_backends) )
crudini --set cinder.conf DEFAULT enabled_backends "${backends[*]}"
unset IFS

crudini --set cinder.conf rbd volume_driver cinder.volume.drivers.rbd.RBDDriver
crudini --set cinder.conf rbd rbd_pool ${CINDER_POOL}
crudini --set cinder.conf rbd rbd_ceph_conf /etc/ceph/ceph.conf
crudini --set cinder.conf rbd rbd_flatten_volume_from_snapshot false
crudini --set cinder.conf rbd rbd_max_clone_depth 5
crudini --set cinder.conf rbd rbd_store_chunk_size 4
crudini --set cinder.conf rbd rados_connect_timeout -1
crudini --set cinder.conf rbd glance_api_version 2
crudini --set cinder.conf rbd rbd_user ${CEPH_CINDER_CLIENT#client.}
crudini --set cinder.conf rbd rbd_secret_uuid $(cat /etc/ceph/cinder.uuid.txt)

systemctl restart openstack-cinder-*
Note
The name, rbd, here in enabled_backends is arbitrary and is just a pointer to the block called rbd.
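For reference, the resulting cinder.conf ends up looking something like this (abridged; the values follow from the commands above), with the section name matching the entry in enabled_backends:

[DEFAULT]
enabled_backends = rbd,lvm

[rbd]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
rbd_pool = cinder-volumes
rbd_user = volumes
rbd_secret_uuid = <the UUID from cinder.uuid.txt>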
We can test that Cinder now creates volumes in Ceph:
cinder create --display-name="test" 1
cinder list
rbd ls ${CINDER_POOL}
Nova
Most of this will be on the OpenStack controller (where Nova is running), OS_HOST.
Note
You will have to repeat the virsh part on each compute node.
On Ceph
Create the pool for nova and a Ceph user:
NOVA_POOL=nova-vms
CEPH_NOVA_CLIENT=client.nova

ceph osd pool create ${NOVA_POOL} 64

ceph auth get-or-create ${CEPH_NOVA_CLIENT} \
    mon 'allow r' \
    osd "allow class-read object_prefix rbd_children, allow rwx pool=${NOVA_POOL}, allow rx pool=${GLANCE_POOL}" \
    -o /etc/ceph/ceph.${CEPH_NOVA_CLIENT}.keyring

ceph auth get-key ${CEPH_NOVA_CLIENT} > /etc/ceph/${CEPH_NOVA_CLIENT}.key

# test user!
ceph --id ${CEPH_NOVA_CLIENT#client.} osd tree

scp /etc/ceph/ceph.conf /etc/ceph/ceph.${CEPH_NOVA_CLIENT}.keyring /etc/ceph/${CEPH_NOVA_CLIENT}.key root@${OS_HOST}:/etc/ceph
Note
Nova needs rx access to the Glance pool in addition to rwx access to the Nova pool.
On OpenStack
Here we need to make the Ceph keyring available to the nova user:
NOVA_POOL=nova-vms
CEPH_NOVA_CLIENT=client.nova

chown nova:nova /etc/ceph/ceph.${CEPH_NOVA_CLIENT}.keyring
chmod 0640 /etc/ceph/ceph.${CEPH_NOVA_CLIENT}.keyring

# SELinux
chcon -t etc_t /etc/ceph/ceph.${CEPH_NOVA_CLIENT}.keyring
chcon ... /etc/ceph/ceph.${CEPH_NOVA_CLIENT}.keyring
Tell the local Ceph about the new keyring:
cat <<EOF >> /etc/ceph/ceph.conf

[${CEPH_NOVA_CLIENT}]
keyring = /etc/ceph/ceph.${CEPH_NOVA_CLIENT}.keyring
EOF
Virsh
Now, an extra bit of magic to give virsh the secret to access Ceph. It's done through a standard virsh mechanism that uses UUIDs as keys to secrets. Having told virsh about the UUID in an XML file you then poke virsh with the secret.
Create a file with the UUID in (handy for this script, no other use):
uuidgen > /etc/ceph/${CEPH_NOVA_CLIENT}.uuid.txt
Create an XML file that references that UUID:
cat <<EOF > /etc/ceph/${CEPH_NOVA_CLIENT}.secret.xml
<secret ephemeral="no" private="no">
  <uuid>$(cat /etc/ceph/${CEPH_NOVA_CLIENT}.uuid.txt)</uuid>
  <usage type="ceph">
    <name>${CEPH_NOVA_CLIENT} secret</name>
  </usage>
</secret>
EOF

chmod 0640 /etc/ceph/${CEPH_NOVA_CLIENT}.*
chmod 0640 /etc/ceph/*.key
Note
You have to do this on each compute node so copy the nova.* files around.
Poke virsh with the secret:
virsh secret-define --file /etc/ceph/${CEPH_NOVA_CLIENT}.secret.xml
virsh secret-set-value --secret $(cat /etc/ceph/${CEPH_NOVA_CLIENT}.uuid.txt) --base64 $(cat /etc/ceph/${CEPH_NOVA_CLIENT}.key)
Nova
Back on the OpenStack controller (or node running Nova) we can twiddle with the Nova config to use rbd for instance disks in preference to local storage:
cd /etc/nova
cp nova.conf nova.conf.0

crudini --set nova.conf DEFAULT force_raw_images True

# disk_cachemodes entries are <disk-type>=<cache-mode>; rbd disks count as "network" disks
crudini --set nova.conf libvirt disk_cachemodes network=writeback

crudini --set nova.conf libvirt images_type rbd
crudini --set nova.conf libvirt images_rbd_pool ${NOVA_POOL}
crudini --set nova.conf libvirt images_rbd_ceph_conf /etc/ceph/ceph.conf
crudini --set nova.conf libvirt rbd_user ${CEPH_NOVA_CLIENT#client.}
crudini --set nova.conf libvirt rbd_secret_uuid $(cat /etc/ceph/${CEPH_NOVA_CLIENT}.uuid.txt)

systemctl restart openstack-nova-compute
Warning
There may be some problem here with OpenStack and Cinder boot volumes in that our choice of rbd_user, here, nova, will not have access to Cinder volumes in Ceph. Needs some more thought!
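One possible workaround, not tested here, would be to widen the nova client's capabilities so it can reach the Cinder pool as well (run on a Ceph host):

# untested: let the nova client at the Cinder pool too
ceph auth caps ${CEPH_NOVA_CLIENT} \
    mon 'allow r' \
    osd "allow class-read object_prefix rbd_children, allow rwx pool=${NOVA_POOL}, allow rwx pool=${CINDER_POOL}, allow rx pool=${GLANCE_POOL}"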
We can test that Nova now creates instance disks in Ceph:

# the flavor and image names here are just examples
nova boot --flavor m1.tiny --image cirros test
nova list
rbd ls ${NOVA_POOL}
Troubleshooting
If your Ceph setup crashes and burns your VMs may not start even though everything appears to have recovered.
In particular you may find your VM in a dracut shell where xfs_repair -L gets IO errors.
Here the problem is that Nova's use of the disk image left an exclusive lock on it which, because your Ceph cluster crashed, was never released, so you need to remove it manually.
Firstly, get the instance ID and then look at the features on the disk for exclusive-lock:
# rbd info nova-vms/a72cdcce-31bb-4f21-a960-2b47e5ec5fd6_disk
rbd image 'a72cdcce-31bb-4f21-a960-2b47e5ec5fd6_disk':
        features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
        op_features:
Next, look at the locks:
# rbd lock ls nova-vms/a72cdcce-31bb-4f21-a960-2b47e5ec5fd6_disk
There is 1 exclusive lock on this image.
Locker            ID                    Address
client.13411674   auto 94615282305536   192.168.8.11:0/197030791
Finally, remove the lock -- note the ID has whitespace:
# rbd lock rm nova-vms/a72cdcce-31bb-4f21-a960-2b47e5ec5fd6_disk 'auto 94615282305536' client.13411674
Note
rbd lock ls ... reports the two fields that rbd lock rm requires in the opposite order, and the ID field itself contains a SPACE character, hence the quoting!
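If you have several images to unlock, a sketch like the following might save some typing. It assumes the ID is always the two words reported by rbd lock ls, as above, and IMAGE is a placeholder for the image in question:

# sketch only: IMAGE is a placeholder for <pool>/<instance-uuid>_disk
IMAGE=nova-vms/a72cdcce-31bb-4f21-a960-2b47e5ec5fd6_disk

# rbd lock ls prints "Locker ID Address"; rbd lock rm wants the (two-word) ID first, then the locker
rbd lock ls ${IMAGE} | awk '/^client\./ { print $1, $2, $3 }' | \
while read locker id1 id2; do
    rbd lock rm ${IMAGE} "${id1} ${id2}" ${locker}
done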