OpenStack
Scheduling by Architecture
This is for OpenStack Train.
Overview
If you do nothing, OpenStack will instruct KVM to pass the host's CPU flags on to the instances running on that host, which is fine in a homogeneous environment. But what if you have an eclectic mix of machines?
Host Passthrough
My teeniest box has an Intel i3-6100T which Intel's ark tells us is a Products formerly Skylake (whatever that means).
My other teeny box has an Intel i7-6700T which Intel's ark tells us is also a Products formerly Skylake. Except ever so slightly different.
And I have a third box, an Intel i7-8700, which, at the very least, does claim to be Products formerly Coffee Lake.
You know what's coming. If I launch an instance on one box then there's a decent chance I can't (live-)migrate it to one of the others because the CPU flags are incompatible.
I need to be able to claim to have a smaller, common, set of CPU flags.
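As a rough way of seeing what that common set looks like, here is a minimal sketch that intersects the flags lines from several hosts. It assumes you have copied each host's /proc/cpuinfo into a local file; the function name common_flags and the file names in the usage comment are made up for illustration.

```shell
# Print the CPU flags that appear in every file given on the command line.
# Each file is assumed to be a copy of one host's /proc/cpuinfo.
common_flags() {
    [ "$#" -gt 0 ] || return 1
    for f in "$@"; do
        # Take the first "flags" line, split it into one flag per line and
        # de-duplicate, so each host contributes a given flag at most once.
        grep -m1 '^flags' "$f" | cut -d: -f2 | tr -s ' ' '\n' | grep -v '^$' | sort -u
    done | sort | uniq -c | awk -v n="$#" '$1 == n { print $2 }'
}

# Usage (hypothetical file names):
#   common_flags cpuinfo-i3-6100T cpuinfo-i7-6700T cpuinfo-i7-8700
```

A flag is kept only when its count equals the number of files, so adding a fourth host simply means adding a fourth file.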
cpu_map.xml
cpu_map.xml, originally /usr/share/libvirt/cpu_map.xml but possibly split into /usr/share/libvirt/cpu_map/*.xml on newer systems, contains a lengthy list of CPU models, each of which collects various CPU features under an easy-to-remember name.
There are two desktop-class Skylake entries in cpu_map.xml: Skylake-Client and Skylake-Client-IBRS. The IBRS part stands for Indirect Branch Restricted Speculation, which relates to the ongoing efforts to mitigate Spectre and Meltdown amongst others.
So there's a bunch of CPU flags. Great.
In my particular case Skylake-Client requires the hle and rtm flags which the i3-6100T does not have.
Keeping us on our toes, though, Broadwell-noTSX-IBRS (the oldest without the above) requires spec-ctrl which the i7-6700T does not have. Duh!
The oldest without either of those is Nehalem. I'm not doing anything remotely exciting with these instances. I can live with some lo-fi CPU flags.
So, in principle, then, I can claim all three processors are Nehalem and get the benefits of migrating instances between hosts.
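The "does this box have what the model requires?" check can be sketched along these lines. It assumes a local copy of the host's /proc/cpuinfo and a feature list you have read out of cpu_map.xml by hand; missing_flags and the file name are hypothetical. Be aware that libvirt's feature names do not always match the kernel's spelling in /proc/cpuinfo, so treat the output as a hint rather than a verdict.

```shell
# Print every flag in the space-separated list $1 that is absent from the
# /proc/cpuinfo copy in file $2. No output means the host has them all.
missing_flags() {
    have=$(grep -m1 '^flags' "$2" | cut -d: -f2)
    for flag in $1; do
        case " $have " in
            *" $flag "*) ;;        # present on this host
            *) echo "$flag" ;;     # required but missing
        esac
    done
}

# Usage (hypothetical file name):
#   missing_flags "hle rtm" cpuinfo-i3-6100T
```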
To use it in OpenStack you need to set a couple of options in the [libvirt] section of nova.conf on each compute host:
- cpu_mode, which for KVM defaults to host-model but can also be host-passthrough, custom or none
- cpu_model (or cpu_models for a fallback list), which is used when you set cpu_mode to custom and must be one of the model names in cpu_map.xml.
In addition there's cpu_model_extra_flags, which you can set to a list of CPU flags when you are in Special Purposes mode.
So, in my case I'll use the following on all three hosts:
cpu_mode=custom
cpu_model=Nehalem
Top!
virsh capabilities
While we're passing through...
In amongst its other data about the host, virsh capabilities reports a CPU model and some CPU flags. Those CPU flags are not the same set as the one listed in /proc/cpuinfo.
For the CPU model it reports:
- the Skylake i3-6100T as Broadwell-noTSX-IBRS
- the Skylake i7-6700T as Skylake-Client-IBRS
- and the Coffee Lake i7-8700 as also Skylake-Client-IBRS
To be fair, cpu_map.xml does note that it can't usefully distinguish between Skylake and its Kaby Lake and Coffee Lake successors.
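If you want to pull just the model (or the vendor) out of that output on each box, a small filter will do. This is a sketch that assumes the host's cpu block comes before any other element of the same name in the XML, which is how the output looked on the systems described here; first_element is a made-up helper name.

```shell
# Read `virsh capabilities` XML on stdin and print the text of the first
# <NAME>...</NAME> element, where NAME is given as $1.
# Assumes the host CPU's element is the first one with that name.
first_element() {
    sed -n "s:.*<$1>\([^<]*\)</$1>.*:\1:p" | head -n 1
}

# Usage on a hypervisor (not run here):
#   virsh capabilities | first_element model
#   virsh capabilities | first_element vendor
```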
AMD EPYC
buys a big box
That crashes and burns rather ignominiously when you introduce an AMD 3950X into the mix. It transpires that all Ryzen chips (all post-Opteron chips?) are classed as EPYC in cpu_map.xml and Nehalem is a subset of EPYC. So what I can't do is direct instances to AMD or Intel.
Note
FWIW I want to be able to use the AMD/Intel difference to schedule anti-affinity pairs of instances so this is a useful thing to do.
Flavors (I)
I've never gotten this to work so your mileage may vary.
In principle, you can tag a hypervisor with some arbitrary key/value pairs in its metadata and construct a host aggregate from those tags. So an obvious pair might be cpu_model=AMD-EPYC and cpu_model=Intel-Nehalem.
You can then tag a flavor with the same metadata.
In principle, now, when you use your tagged flavor you should get your instance placed onto a hypervisor with the same metadata.
Doesn't work for me. I've missed a trick.
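For reference, here is a sketch of how the aggregate route is usually wired up, using the cpu_model key from above. The helper name and its arguments are hypothetical. It also assumes that AggregateInstanceExtraSpecsFilter is listed in enabled_filters in the [filter_scheduler] section of nova.conf on the scheduler; without that filter the metadata is ignored. Whether that is the trick missed here, I can't say.

```shell
# Hypothetical helper: group hosts into an aggregate carrying cpu_model=$1
# and point a flavor at it. Arguments: value, aggregate, flavor, hosts...
pin_flavor_to_aggregate() {
    value=$1 aggregate=$2 flavor=$3
    shift 3
    openstack aggregate create "$aggregate"
    for host in "$@"; do
        openstack aggregate add host "$aggregate" "$host"
    done
    # The metadata lives on the aggregate, not on the individual hypervisor.
    openstack aggregate set --property "cpu_model=$value" "$aggregate"
    # The scoped key is what AggregateInstanceExtraSpecsFilter matches on.
    openstack flavor set \
        --property "aggregate_instance_extra_specs:cpu_model=$value" "$flavor"
}

# Usage (aggregate, flavor and host names are made up):
#   pin_flavor_to_aggregate AMD-EPYC amd-hosts amd.small bigbox
```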
Traits
OpenStack Train has traits. Traits are just a bunch of tags set by the system, which you can extend yourself. Of interest to us is a small subset of the CPU flags. It's not clear why this is (such a) small subset.
You can scramble through a couple of hoops to see the traits in your setup:
openstack resource provider list
will get you some IDs, one per hypervisor, which you can then use in:
openstack --os-placement-api-version 1.6 resource provider trait list $ID --sort-column name
(note the required --os-placement-api-version 1.6 flags!)
to get the hypervisor-specific traits.
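The two commands above can be chained so you get the traits for every hypervisor in one go. A minimal sketch, assuming the standard -f value and -c output options of the openstack client and that the provider list has uuid and name columns; list_all_traits is a made-up name.

```shell
# Print "<provider name>: <trait>" for every trait on every resource provider.
list_all_traits() {
    openstack resource provider list -f value -c uuid -c name |
    while read -r uuid name; do
        # Redirect stdin so the inner command cannot swallow the provider list.
        openstack --os-placement-api-version 1.6 \
            resource provider trait list "$uuid" -f value -c name < /dev/null |
        sort | sed "s|^|$name: |"
    done
}

# Usage (not run here): list_all_traits | grep HW_CPU_X86
```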
My Nehalem hypervisor has:
... HW_CPU_X86_MMX HW_CPU_X86_SSE HW_CPU_X86_SSE2 HW_CPU_X86_SSE41 HW_CPU_X86_SSE42 HW_CPU_X86_SSSE3
and the AMD has:
... HW_CPU_X86_ABM HW_CPU_X86_AESNI HW_CPU_X86_AVX HW_CPU_X86_AVX2 HW_CPU_X86_BMI HW_CPU_X86_BMI2 HW_CPU_X86_CLMUL HW_CPU_X86_F16C HW_CPU_X86_FMA3 HW_CPU_X86_MMX HW_CPU_X86_SHA HW_CPU_X86_SSE HW_CPU_X86_SSE2 HW_CPU_X86_SSE41 HW_CPU_X86_SSE42 HW_CPU_X86_SSE4A HW_CPU_X86_SSSE3 HW_CPU_X86_SVM
Again, the Nehalem is giving me a subset of the EPYC set.
But wait, the EPYC has extra flags so can we toggle on the presence or not-presence of one of those? Why, yes, yes we can.
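To find candidate traits without eyeballing the two lists, you can diff them. This sketch assumes each host's traits have been saved to a file, one trait per line; the function and file names are hypothetical.

```shell
# Print the traits listed in file $1 that do not appear in file $2.
# Both files hold one trait name per line.
traits_only_in() {
    # -x whole-line match, -F fixed strings, -f patterns from file, -v invert
    grep -vxFf "$2" "$1" | sort
}

# Usage (hypothetical file names):
#   traits_only_in traits-amd.txt traits-nehalem.txt
```

Anything it prints is present on the first host and absent on the second, so it is a candidate for a required/forbidden pair.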
Flavors (II)
This time, when we modify the flavor for our AMD/Intel difference, the property, now prefixed with trait:, can be marked as required or forbidden:
openstack flavor set --property trait:HW_CPU_X86_SVM=required $AMD_FLAVOR
openstack flavor set --property trait:HW_CPU_X86_SVM=forbidden $INTEL_FLAVOR
and using $INTEL_FLAVOR or $AMD_FLAVOR does the right thing.