Cluster Concepts and Overview
A vSphere cluster is a set of ESXi hosts that are intended to work together as a unit. When you add a host to a cluster, the host’s resources become part of the cluster’s resources. vCenter Server manages the resources of all hosts in a cluster as one unit. In addition to creating a cluster, assigning a name, and adding ESXi objects, you can enable and configure features on a cluster, such as vSphere Distributed Resource Scheduler (DRS), VMware Enhanced vMotion Compatibility (EVC), Distributed Power Management (DPM), vSphere High Availability (HA), and vSAN.
In the vSphere Client, you can manage and monitor the resources in a cluster as a single object, including the hosts and virtual machines that the cluster contains.
If you enable VMware EVC on a cluster, you can ensure that migrations with vMotion do not fail due to CPU compatibility errors. If you enable vSphere DRS on a cluster, you can allow automatic resource balancing using the pooled host resources in the cluster. If you enable vSphere HA on a cluster, you can allow rapid virtual machine recovery from host hardware failures, using the cluster’s available host resource capacity. If you enable DPM on a cluster, you can provide automated power management in the cluster. If you enable vSAN on a cluster, you use a logical SAN that is built on a pool of drives attached locally to the ESXi hosts in the cluster.
You can use the Quickstart workflow in the vSphere Client to create and configure a cluster. The Quickstart page provides three cards: Cluster Basics, Add Hosts, and Configure Cluster. For an existing cluster, you can use Cluster Basics to change the cluster name and enable cluster services, such as DRS and vSphere HA. You can use the Add Hosts card to add hosts to the cluster. You can use the Configure Cluster card to configure networking and other settings on the hosts in the cluster.
In addition, in vSphere 7.0 and later, you can configure a few general settings for a cluster. For example, when you create a cluster, even if you do not enable DRS, vSphere HA, or vSAN, you can choose to manage all hosts in the cluster with a single image. With this option, all hosts in a cluster inherit the same image, which reduces variability between hosts, improves your ability to ensure hardware compatibility, and simplifies upgrades. This feature requires that all hosts already run ESXi 7.0 or later. It replaces baselines; after it is enabled, baselines cannot be used in the cluster.
vSphere Cluster Services (vCLS)
vCLS, which is implemented by default in all vSphere clusters, ensures that cluster services remain available even if vCenter Server becomes unavailable. When you deploy a new cluster in vCenter Server 7.0 Update 3 or upgrade a vCenter Server to Version 7.0 Update 3, vCLS virtual appliances are automatically deployed to the cluster. In clusters with three or more hosts, three vCLS appliances are automatically deployed with anti-affinity rules to separate the appliances. In smaller clusters, the number of vCLS VMs matches the number of hosts.
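The sizing rule above can be expressed as a short calculation. The following is a minimal sketch only; the function name expected_vcls_vm_count is hypothetical and is not part of any VMware API.

```python
def expected_vcls_vm_count(host_count: int) -> int:
    """Return how many vCLS VMs the sizing rule above implies.

    Clusters with three or more hosts get three appliances;
    smaller clusters get one vCLS VM per host.
    """
    if host_count < 0:
        raise ValueError("host_count must be non-negative")
    return min(host_count, 3)


if __name__ == "__main__":
    for hosts in (1, 2, 3, 8):
        print(f"{hosts} host(s): {expected_vcls_vm_count(hosts)} vCLS VM(s)")
```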
In vSphere 8.0, each vCLS VM is configured with one vCPU, 128 MB memory, and no vNIC. The datastore for each vCLS VM is automatically selected based on the rank of the datastores connected to the cluster’s hosts, with preference given to shared datastores. You can control the datastore choice by using the vSphere Client to select the cluster, navigating to Configure > vSphere Cluster Service > Datastores, and clicking the Add button. vCLS VMs are always powered on and should be treated as system VMs, where only administrators perform selective operations on the vCLS VMs. vCenter Server manages the health of vCLS VMs. You should not back up or take snapshots of these VMs. You can use the Summary tab for a cluster to examine the vCLS health, which is either Healthy, Degraded, or Unhealthy.
If you want to place a datastore hosting a vCLS VM into Maintenance Mode, you must either manually migrate the vCLS VM with Storage vMotion to a new location or put the cluster in Retreat Mode. In Retreat Mode, the health of vCLS is degraded, DRS stops functioning, and vSphere HA does not perform optimal placement when responding to host failure events. To put a cluster in Retreat Mode, you need to obtain its cluster domain ID from the URL of the browser after selecting the cluster in the vSphere Client. Then you apply the cluster domain ID, which is in the form domain-c(number), to create a new vCenter Server advanced setting with the entry config.vcls.clusters.domain-c(number).enabled that is set to False.
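To illustrate how the advanced setting name is derived from the cluster domain ID, here is a minimal sketch. It only builds the key and value as strings; it does not connect to vCenter Server. The helper names and the sample URL are hypothetical, and the exact layout of a real vSphere Client URL is an assumption, so the sketch simply searches the text for the domain-c(number) pattern.

```python
import re
from typing import Tuple

# Matches a cluster domain ID of the form domain-c(number), for example domain-c8.
_DOMAIN_ID = re.compile(r"domain-c\d+")


def extract_cluster_domain_id(text: str) -> str:
    """Find the first domain-c(number) token in a string such as a browser URL."""
    match = _DOMAIN_ID.search(text)
    if match is None:
        raise ValueError("no cluster domain ID of the form domain-c(number) found")
    return match.group(0)


def retreat_mode_setting(domain_id: str) -> Tuple[str, str]:
    """Build the advanced setting name and value that put a cluster in Retreat Mode."""
    if _DOMAIN_ID.fullmatch(domain_id) is None:
        raise ValueError(f"unexpected cluster domain ID: {domain_id!r}")
    return (f"config.vcls.clusters.{domain_id}.enabled", "False")


if __name__ == "__main__":
    # Made-up URL for illustration only; a real URL will look different.
    sample_url = "https://vcenter.example.com/ui/cluster/domain-c8/summary"
    key, value = retreat_mode_setting(extract_cluster_domain_id(sample_url))
    print(key, "=", value)
```

The resulting name and value are what you would enter as a new vCenter Server advanced setting, as described above.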
Enhanced vMotion Compatibility (EVC)
EVC is a cluster setting that can improve CPU compatibility between hosts for supporting vMotion. vMotion migrations are live migrations that require compatible instruction sets for source and target processors used by the virtual machine. The source and target processors must come from the same vendor class (AMD or Intel) to be vMotion compatible. The clock speed, cache size, and number of cores can differ between source and target processors. When you start a vMotion migration or a migration of a suspended virtual machine, the wizard checks the destination host for compatibility; it displays an error message if problems exist. By using EVC, you can allow vMotion between some processors that would normally be incompatible.
The CPU instruction set that is available to a virtual machine guest OS is determined when the virtual machine is powered on. This CPU feature set is based on the following items:
The host CPU family and model
Settings in the BIOS that might disable CPU features
The ESX/ESXi version running on the host
The virtual machine’s compatibility setting
The virtual machine’s guest operating system
EVC ensures that all hosts in a cluster present the same CPU feature set to virtual machines, even if the actual CPUs on the hosts differ. If you enable the EVC cluster setting, you can configure the EVC Mode with a baseline CPU feature set. EVC ensures that hosts in a cluster use the baseline feature set when presenting an instruction set to a guest OS. EVC uses AMD-V Extended Migration technology for AMD hosts and Intel FlexMigration technology for Intel hosts to mask processor features; this allows hosts to present the feature set of an earlier processor generation. You should configure EVC Mode to accommodate the host with the smallest feature set in the cluster.
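The guidance to accommodate the host with the smallest feature set amounts to taking the minimum over an ordered list of modes. The sketch below assumes you already know the highest Intel EVC mode that each host supports (a hypothetical input that you would gather yourself) and uses the Intel mode ordering from Table 4-2. The function name is illustrative, not a VMware API.

```python
from typing import Iterable

# Intel EVC modes in ascending order of feature set (levels L0 through L11 in Table 4-2).
INTEL_EVC_MODES = [
    "Intel Merom",
    "Intel Penryn",
    "Intel Nehalem",
    "Intel Westmere",
    "Intel Sandy Bridge",
    "Intel Ivy Bridge",
    "Intel Haswell",
    "Intel Broadwell",
    "Intel Skylake",
    "Intel Cascade Lake",
    "Intel Ice Lake",
    "Intel Sapphire Rapids",
]


def baseline_evc_mode(host_max_modes: Iterable[str]) -> str:
    """Return the highest EVC mode that every host in the cluster can support.

    host_max_modes holds, for each host, the highest mode that host supports.
    The cluster baseline is the lowest of those values.
    """
    # list.index raises ValueError for a mode name that is not in the table.
    levels = [INTEL_EVC_MODES.index(mode) for mode in host_max_modes]
    if not levels:
        raise ValueError("at least one host is required")
    return INTEL_EVC_MODES[min(levels)]


if __name__ == "__main__":
    cluster = ["Intel Skylake", "Intel Haswell", "Intel Ice Lake"]
    print(baseline_evc_mode(cluster))
```

A mixed cluster like the one in the example would be limited to the feature set of its oldest processor generation.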
The EVC requirements for hosts include the following:
ESXi 6.7 or later is required.
Hosts must be attached to a vCenter Server.
CPUs must be from a single vendor (either Intel or AMD).
If the AMD-V, Intel-VT, AMD NX, or Intel XD features are available in the BIOS, they need to be enabled.
Check the VMware Compatibility Guide to ensure that CPUs are supported for EVC Mode.
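As a rough illustration, the host requirements above can be checked against an inventory that you have collected yourself. This is a sketch under stated assumptions: the HostInfo record and its fields are hypothetical, the data is supplied by hand rather than read from vCenter Server, and the VMware Compatibility Guide check is left out because it cannot be derived from these fields.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class HostInfo:
    """Hypothetical, hand-collected facts about one ESXi host."""
    name: str
    esxi_version: Tuple[int, int]      # (major, minor), for example (7, 0)
    cpu_vendor: str                    # "Intel" or "AMD"
    attached_to_vcenter: bool
    bios_features_enabled: bool        # available AMD-V/Intel-VT and NX/XD features are enabled


def evc_requirement_problems(hosts: List[HostInfo]) -> List[str]:
    """Return a list of human-readable problems; an empty list means no problem was found."""
    problems = []
    for host in hosts:
        if host.esxi_version < (6, 7):
            problems.append(f"{host.name}: ESXi 6.7 or later is required")
        if not host.attached_to_vcenter:
            problems.append(f"{host.name}: host is not attached to a vCenter Server")
        if not host.bios_features_enabled:
            problems.append(f"{host.name}: enable available virtualization and NX/XD features in the BIOS")
    # All CPUs in the cluster must come from a single vendor.
    vendors = {host.cpu_vendor for host in hosts}
    if len(vendors) > 1:
        problems.append("cluster mixes CPU vendors: " + ", ".join(sorted(vendors)))
    return problems


if __name__ == "__main__":
    inventory = [
        HostInfo("esx01", (8, 0), "Intel", True, True),
        HostInfo("esx02", (6, 5), "AMD", True, False),
    ]
    for problem in evc_requirement_problems(inventory):
        print(problem)
```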
You can configure the EVC settings by using the Quickstart > Configure Cluster workflow in the vSphere Client. You can also configure EVC directly in the cluster settings. The options for VMware EVC are Disable EVC, Enable EVC for AMD Hosts, and Enable EVC for Intel Hosts. You can also configure per-VM EVC, as described in Chapter 5, “vCenter Server Features and Virtual Machines.”
If you choose Enable EVC for Intel Hosts, you can set the EVC Mode setting to one of the options described in Table 4-2.
Table 4-2 EVC Modes for Intel
Level | EVC Mode | Description |
---|---|---|
L0 | Intel Merom | Smallest Intel feature set for EVC mode. |
L1 | Intel Penryn | Includes the Intel Merom feature set and exposes additional CPU features, including SSE4.1. |
L2 | Intel Nehalem | Includes the Intel Penryn feature set and exposes additional CPU features, including SSE4.2 and POPCOUNT. |
L3 | Intel Westmere | Includes the Intel Nehalem feature set and exposes additional CPU features, including AES and PCLMULQDQ. |
L4 | Intel Sandy Bridge | Includes the Intel Westmere feature set and exposes additional CPU features, including AVX and XSAVE. |
L5 | Intel Ivy Bridge | Includes the Intel Sandy Bridge feature set and exposes additional CPU features, including RDRAND, ENFSTRG, FSGSBASE, SMEP, and F16C. |
L6 | Intel Haswell | Includes the Intel Ivy Bridge feature set and exposes additional CPU features, including ABMX2, AVX2, MOVBE, FMA, PERMD, RORX/MULX, INVPCID, and VMFUNC. |
L7 | Intel Broadwell | Includes the Intel Haswell feature set and exposes additional CPU features, including Transactional Synchronization Extensions, Supervisor Mode Access Prevention, Multi-Precision Add-Carry Instruction Extensions, PREFETCHW, and RDSEED. |
L8 | Intel Skylake | Includes the Intel Broadwell feature set and exposes additional CPU features, including Advanced Vector Extensions 512, Persistent Memory Support Instructions, Protection Key Rights, Save Processor Extended States with Compaction, and Save Processor Extended States Supervisor. |
L9 | Intel Cascade Lake | Includes the Intel Skylake feature set and exposes additional CPU features, including VNNI and XGETBV with ECX = 1. |
L10 | Intel Ice Lake | Includes the Intel Cascade Lake feature set and exposes additional CPU features, including SHA extensions, Vectorized AES, User Mode Instruction Prevention, Read Processor ID, Fast Short REP MOV, WBNOINVD, Galois Field New Instructions, and AVX512 Integer Fused Multiply Add, Vectorized Bit Manipulation, and Bit Algorithms Instructions. |
L11 | Intel Sapphire Rapids | Includes the Intel Ice Lake feature set and exposes additional CPU features, including Control-Flow Enforcement Technology, Advanced Matrix Extensions, Supervisor Protection Keys, AVX-VNNI, AVX512 FP16, AVX512 BF16, CLDEMOTE, SERIALIZE, WBNOINVD, and MOVDIRI instructions. |
If you choose Enable EVC for AMD Hosts, you can set the EVC Mode setting to one of the options described in Table 4-3.
Table 4-3 EVC Modes for AMD
Level | EVC Mode | Description |
---|---|---|
A0 | AMD Opteron Generation 1 | Smallest AMD feature set for EVC mode. |
A1 | AMD Opteron Generation 2 | Includes the AMD Generation 1 feature set and exposes additional CPU features, including CMPXCHG16B and RDTSCP. |
A3 | AMD Opteron Generation 3 | Includes the AMD Generation 2 feature set and exposes additional CPU features, including SSE4A, MisAlignSSE, POPCOUNT, and ABM (LZCNT). |
A2, B0 | AMD Opteron Generation 3 (without 3DNow!) | Includes the AMD Generation 3 feature set without 3DNow! support. |
B1 | AMD Opteron Generation 4 | Includes the AMD Generation 3 no3DNow feature set and exposes additional CPU features, including SSSE3, SSE4.1, AES, AVX, XSAVE, XOP, and FMA4. |
B2 | AMD Opteron Piledriver | Includes the AMD Generation 4 feature set and exposes additional CPU features, including FMA, TBM, BMI1, and F16C. |
B3 | AMD Opteron Steamroller | Includes the AMD Piledriver feature set and exposes additional CPU features, including XSAVEOPT, RDFSBASE, RDGSBASE, WRFSBASE, WRGSBASE, and FSGSBASE. |
B4 | AMD Zen | Includes the AMD Steamroller feature set and exposes additional CPU features, including RDRAND, SMEP, AVX2, BMI2, MOVBE, ADX, RDSEED, SMAP, CLFLUSHOPT, XSAVES, XSAVEC, SHA, and CLZERO. |
B5 | AMD Zen 2 | Includes the AMD Zen feature set and exposes additional CPU features, including CLWB, UMIP, RDPID, XGETBV with ECX = 1, WBNOINVD, and GMET. |
B6 | AMD Zen 3 | Includes the AMD Zen 2 feature set and exposes additional CPU features, including always serializing LFENCE, INVPCID, PSFD, SSBD, PCID, PKU, VAES, VPCLMULQDQ, and shadow stacks. |
B7 | AMD Zen 4 | Includes the AMD Zen 3 feature set and exposes additional CPU features, including Fast Short CMPSB and STOSB, Automatic IBRS, AVX512BF16, AVX512BITALG, AVX512BW, AVX512CD, AVX512DQ, AVX512F, AVX512IFMA, AVX512VBMI, AVX512VBMI2, AVX512VL, AVX512VNNI, AVX512VPOPCNTDQ, GFNI, IBRS, and Upper Address Ignore. |
Starting with vSphere 7.0 Update 1, EVC provides a feature for Virtual Shared Graphics Acceleration (vSGA), allowing multiple virtual machines to share GPUs and leverage the 3D graphics acceleration capabilities.
vSAN Services
You can enable DRS, vSphere HA, and vSAN at the cluster level. The following sections provide details on DRS and vSphere HA. For details on vSAN, see Chapter 2.