OpenStack

Scheduling by Architecture

This is for OpenStack Train.

Overview

Doing nothing means OpenStack will instruct KVM to pass the host's CPU flags onto the instances running on that host. Which is fine in a homogenised environment. What if you have an eclectic mix of machines?

Host Passthrough

My teeniest box has an Intel i3-6100T which Intel's ark tells us is a Products formerly Skylake (whatever that means).

My other teeny box has an Intel i7-6700T which Intel's ark tells us is also a Products formerly Skylake. Except ever so slightly different.

And I have a third box, an Intel i7-8700, which, at the very least, does claim to be Products formerly Coffee Lake.

You know what's coming. If I launch an instance on one box then there's a decent chance I can't (live-)migrate it to one of the others because the CPU flags are incompatible.

I need to be able to claim to have a smaller, common, set of CPU flags.
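To get a feel for what that common set looks like, you can intersect the flags lines from /proc/cpuinfo on each box. A minimal sketch, assuming you have copied each host's flags into one file per host; the common_flags helper and the file names are made up for illustration:

```shell
# common_flags FILE...
# Each FILE holds one host's CPU flags, whitespace separated, e.g. from:
#   grep '^flags' /proc/cpuinfo | head -n 1 | cut -d: -f2 > box1.flags
# Prints the flags present in every FILE, one per line, sorted.
common_flags() {
    n=$#
    for f in "$@"; do
        # One flag per line, de-duplicated within a single host.
        tr -s ' \t' '\n\n' < "$f" | grep -v '^$' | sort -u
    done | sort | uniq -c | awk -v n="$n" '$1 == n { print $2 }'
}
```

Something like `common_flags box1.flags box2.flags box3.flags` then lists only the flags all three boxes share.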

cpu_map.xml

Originally /usr/share/libvirt/cpu_map.xml, but possibly /usr/share/libvirt/cpu_map/*.xml on newer systems, this contains a lengthy list of CPU models which collect together various CPU features under easy-to-remember names.

There are two desktop class Skylake entries in cpu_map.xml, Skylake-Client and Skylake-Client-IBRS. The IBRS part stands for Indirect Branch Restricted Speculation relating to the ongoing efforts to mitigate Spectre and Meltdown amongst others.

So there's a bunch of CPU flags. Great.

In my particular case Skylake-Client requires the hle and rtm flags which the i3-6100T does not have.

Keeping us on our toes, though, Broadwell-noTSX-IBRS (the oldest without the above) requires spec-ctrl which the i7-6700T does not have. Duh!

The oldest without either of those is Nehalem. I'm not doing anything remotely exciting with these instances. I can live with some lo-fi CPU flags.

So, in principle, then, I can claim all three processors are Nehalem and get the benefits of migrating instances between hosts.
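Checking which features a named model requires is easy enough to script. This is a rough sketch, assuming the file uses `<model name='...'>` blocks containing `<feature name='...'/>` lines; model_features is a hypothetical helper, it does not follow nested model references in older single-file maps, and the names libvirt uses do not always match those in /proc/cpuinfo:

```shell
# model_features FILE MODEL
# Prints the feature names listed directly inside MODEL in a cpu_map file.
model_features() {
    awk -v want="$2" -F"[\"']" '
        # Self-closing <model .../> is a reference to another model: skip it.
        /<model name=[^>]*\/>/      { next }
        # Opening tag: remember whether this is the model we want.
        /<model name=/              { inside = ($2 == want) }
        /<\/model>/                 { inside = 0 }
        inside && /<feature name=/  { print $2 }
    ' "$1" | sort
}
```

On a newer system that might be `model_features /usr/share/libvirt/cpu_map/x86_Nehalem.xml Nehalem`, where the per-model file name is an assumption about the layout.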

To use it in OpenStack you need to twiddle with a couple of options in the [libvirt] section of nova.conf:

cpu_mode which defaults to host-model for KVM (near enough the host's own flags) but can be host-passthrough, custom or none
cpu_model (or cpu_models for a fallback list) which is used when you set cpu_mode to custom and is one of the model names in cpu_map.xml.

In addition there's cpu_model_extra_flags, which you can set to a list of extra CPU flags to enable on top of the chosen model when you have special purposes in mind.

So, in my case I'll use the following on all three hosts:

cpu_mode=custom
cpu_model=Nehalem

Top!

virsh capabilities

While we're passing through...

virsh capabilities reports, in amongst its other data about the host, a CPU model and some CPU flags. Those CPU flags are not the set listed in /proc/cpuinfo.

It reports:

the Skylake i3-6100T as Broadwell-noTSX-IBRS
the Skylake i7-6700T as Skylake-Client-IBRS
and the Coffee Lake i7-8700 as also Skylake-Client-IBRS

To be fair, cpu_map.xml does note that it can't usefully distinguish between Skylake and its Kaby Lake and Coffee Lake successors.
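If you want just the model name out of that XML, a small filter will do. A minimal sketch, assuming the first `<model>` element in the output is the host CPU's (later ones can belong to other sections); host_cpu_model is a made-up name:

```shell
# host_cpu_model: read `virsh capabilities` XML on stdin and print the
# contents of the first <model> element, assumed to be the host CPU model.
host_cpu_model() {
    sed -n 's:.*<model>\([^<]*\)</model>.*:\1:p' | head -n 1
}
```

Run it on each hypervisor as `virsh capabilities | host_cpu_model`.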

AMD EPYC

(buys a big box)

That crashes and burns rather ignominiously when you introduce an AMD 3950X into the mix. It transpires that all Ryzen chips (all post-Opteron chips?) are classed as EPYC in cpu_map.xml and Nehalem is a subset of EPYC. So what I can't do is use the CPU model to direct instances to AMD or Intel.

Note

FWIW I want to be able to use the AMD/Intel difference to schedule anti-affinity pairs of instances so this is a useful thing to do.

Flavors (I)

I've never gotten this to work so your mileage may vary.

In principle, you can group hypervisors into a host aggregate and tag that aggregate with some arbitrary key/value pairs in its metadata. So an obvious pair might be cpu_model=AMD-EPYC and cpu_model=Intel-Nehalem.

You can then tag a flavor with the same metadata.

In principle, now, when you use your tagged flavor you should get your instance placed onto a hypervisor with the same metadata.

Doesn't work for me. I've missed a trick.
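For reference, here is a sketch of the usual aggregate recipe as I understand it from the Nova documentation, untested on this setup. The metadata is set on the aggregate itself, the flavor property carries the aggregate_instance_extra_specs: scope, and the scheduler needs AggregateInstanceExtraSpecsFilter in enabled_filters under [filter_scheduler] in nova.conf. The function name and every argument are placeholders:

```shell
# pin_flavor_to_aggregate AGGREGATE KEY VALUE FLAVOR HOST...
# Creates the aggregate, adds the hosts, tags it, and tags the flavor to match.
pin_flavor_to_aggregate() {
    agg=$1 key=$2 value=$3 flavor=$4
    shift 4
    openstack aggregate create "$agg"
    for host in "$@"; do
        openstack aggregate add host "$agg" "$host"
    done
    # The metadata lives on the aggregate.
    openstack aggregate set --property "$key=$value" "$agg"
    # The flavor side uses the scoped form of the same key.
    openstack flavor set \
        --property "aggregate_instance_extra_specs:$key=$value" "$flavor"
}
```

For example, `pin_flavor_to_aggregate amd cpu_model AMD-EPYC m1.amd bigbox`, where the aggregate, flavor and host names are all hypothetical.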

Traits

OpenStack Train has traits. Traits are just a bunch of tags set by the system which you can extend yourself. Of interest to us is a small subset of the CPU flags, which show up as traits. It's not clear why this is (such a) small subset.

You can scramble through a couple of hoops to see the traits in your setup:

openstack resource provider list

will get you some IDs, one per hypervisor, which you can then use in:

openstack --os-placement-api-version 1.6 resource provider trait list $ID --sort-column name

(note the required --os-placement-api-version 1.6 flags!)

to get the hypervisor-specific traits.
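The two steps can be rolled into one loop. A minimal sketch, assuming the standard `-f value -c COLUMN` output options and that provider names contain no spaces; list_all_traits is a made-up name:

```shell
# list_all_traits: print "PROVIDER_NAME TRAIT" for every trait on every
# resource provider, sorted by trait within each provider.
list_all_traits() {
    openstack resource provider list -f value -c uuid -c name |
    while read -r uuid name; do
        # </dev/null stops the inner command eating the loop's input.
        openstack --os-placement-api-version 1.6 \
            resource provider trait list "$uuid" -f value -c name < /dev/null |
        sort | awk -v host="$name" '{ print host, $0 }'
    done
}
```

Piping the result through grep, say `list_all_traits | grep HW_CPU_X86_`, narrows it down to the CPU flags.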

My Nehalem hypervisor has:

...
HW_CPU_X86_MMX
HW_CPU_X86_SSE
HW_CPU_X86_SSE2
HW_CPU_X86_SSE41
HW_CPU_X86_SSE42
HW_CPU_X86_SSSE3

and the AMD has:

...
HW_CPU_X86_ABM
HW_CPU_X86_AESNI
HW_CPU_X86_AVX
HW_CPU_X86_AVX2
HW_CPU_X86_BMI
HW_CPU_X86_BMI2
HW_CPU_X86_CLMUL
HW_CPU_X86_F16C
HW_CPU_X86_FMA3
HW_CPU_X86_MMX
HW_CPU_X86_SHA
HW_CPU_X86_SSE
HW_CPU_X86_SSE2
HW_CPU_X86_SSE41
HW_CPU_X86_SSE42
HW_CPU_X86_SSE4A
HW_CPU_X86_SSSE3
HW_CPU_X86_SVM

Again, the Nehalem is giving me a subset of the EPYC set.
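Rather than eyeballing the two lists, you can have the shell work out which traits one provider has that the other lacks. A rough sketch; provider_traits and extra_traits are hypothetical helpers and the arguments are resource provider IDs:

```shell
# provider_traits ID: print the trait names for one resource provider.
provider_traits() {
    openstack --os-placement-api-version 1.6 \
        resource provider trait list "$1" -f value -c name
}

# extra_traits BASE OTHER: print traits OTHER has that BASE does not.
extra_traits() {
    base=$(mktemp)
    provider_traits "$1" > "$base"
    # -F fixed strings, -x whole-line match, -v invert: keep the non-matches.
    provider_traits "$2" | grep -Fxv -f "$base" | sort
    rm -f "$base"
}
```

Called as `extra_traits $NEHALEM_ID $AMD_ID` (both variables are placeholders), it prints the traits that could tell the two apart.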

But wait, the EPYC has extra flags so can we toggle on the presence or not-presence of one of those? Why, yes, yes we can.

Flavors (II)

This time, when we modify the flavor for our AMD/Intel difference, the property, now prefixed with trait:, can be marked as required or forbidden:

openstack flavor set --property trait:HW_CPU_X86_SVM=required "$AMD_FLAVOR"
openstack flavor set --property trait:HW_CPU_X86_SVM=forbidden "$INTEL_FLAVOR"

and using $INTEL_FLAVOR or $AMD_FLAVOR does the right thing.
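To check in advance which hypervisors each flavor can land on, you can ask placement directly. This is a sketch resting on an assumption: that your osc-placement plugin supports --required (placement API 1.18 or later) and --forbidden (1.22 or later) on resource provider list. The function names are made up:

```shell
# providers_with_trait TRAIT: names of resource providers that have TRAIT.
providers_with_trait() {
    openstack --os-placement-api-version 1.18 \
        resource provider list --required "$1" -f value -c name
}

# providers_without_trait TRAIT: names of resource providers lacking TRAIT.
providers_without_trait() {
    openstack --os-placement-api-version 1.22 \
        resource provider list --forbidden "$1" -f value -c name
}
```

`providers_with_trait HW_CPU_X86_SVM` should then list only the AMD box, and `providers_without_trait HW_CPU_X86_SVM` only the Intel ones.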
