One of the key performance counters in a vSphere environment is CPU ready (%RDY in esxtop).
CPU ready is the time a virtual CPU is ready to run but is not being scheduled on a physical CPU. Under normal circumstances this indicates that there are not enough physical CPU resources on an ESX/ESXi host. It is the first go-to counter when your users complain about bad performance.
It is generally normal for VMs to accumulate small values of CPU Ready Time even if the hypervisor is not oversubscribed or under heavy activity. That is just the nature of shared scheduling in virtualization. For SMP VMs with multiple vCPUs, the amount of ready time will generally be higher than for VMs with fewer vCPUs, since it takes more resources to schedule/co-schedule the VM when necessary and each of the vCPUs accumulates the time separately.
At what point does
CPU Ready Time start to affect performance?
VMware had a recommendation that for an SMP VM anything over 5% per vCPU is typically a warning level and anything over 10% per vCPU is critical. The reason this specifically says per vCPU is that each vCPU contributes 100% to the VM’s scheduling total, so a 4 vCPU VM has a scheduling total of 400%. A 10% CPU Ready on a 4 vCPU VM therefore equates to only 2.5% per vCPU.
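The per-vCPU arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names `ready_per_vcpu` and `classify` are made up for this example, and the thresholds are simply the 5% and 10% guideline values quoted above.

```python
def ready_per_vcpu(total_ready_pct: float, num_vcpus: int) -> float:
    """Spread a VM-level CPU ready percentage across the VM's vCPUs."""
    if num_vcpus < 1:
        raise ValueError("num_vcpus must be at least 1")
    return total_ready_pct / num_vcpus


def classify(per_vcpu_pct: float) -> str:
    """Apply the 5% warning / 10% critical guideline (per vCPU)."""
    if per_vcpu_pct > 10.0:
        return "critical"
    if per_vcpu_pct > 5.0:
        return "warning"
    return "ok"


# The example from the text: 10% CPU Ready on a 4 vCPU VM.
per_vcpu = ready_per_vcpu(10.0, 4)
print(per_vcpu)            # 2.5
print(classify(per_vcpu))  # ok
```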
Beware that if the
VM has a CPU Limit placed on it, whenever the VM exceeds its allocated limit it
will accumulate CPU Ready time while it waits to be allowed to execute
again.
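To convert the CPU ready summation value (in milliseconds) from a performance chart into a percentage, divide it by the chart's sample interval:
Real time (20 sec): xxx ms x 100 / 20,000 ms = ready-time %
Day (5 min): xxx ms x 100 / 300,000 ms = ready-time %
Week (30 min): xxx ms x 100 / 1,800,000 ms = ready-time %
Month (2 hours): xxx ms x 100 / 7,200,000 ms = ready-time %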
Take the result and divide it by the number of vCPUs.
Lower than 5% is good.
Higher than 10% is problematic.
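As a rough sketch of the conversion described above, the following Python turns a summation value in milliseconds into a per-vCPU ready percentage. The dictionary `INTERVAL_MS` and the function `ready_percent` are hypothetical names for this example. The interval lengths are the ones listed above.

```python
# Chart sample intervals in milliseconds, as listed in the text.
INTERVAL_MS = {
    "realtime": 20_000,     # 20 seconds
    "day": 300_000,         # 5 minutes
    "week": 1_800_000,      # 30 minutes
    "month": 7_200_000,     # 2 hours
}


def ready_percent(summation_ms: float, chart: str, num_vcpus: int = 1) -> float:
    """Convert a CPU ready summation (ms) to a per-vCPU percentage."""
    if num_vcpus < 1:
        raise ValueError("num_vcpus must be at least 1")
    interval_ms = INTERVAL_MS[chart]
    vm_total_pct = summation_ms * 100.0 / interval_ms
    return vm_total_pct / num_vcpus


# 4,000 ms of ready time in one real-time sample on a 4 vCPU VM:
# 20% for the whole VM, 5% per vCPU.
print(ready_percent(4000, "realtime", num_vcpus=4))  # 5.0
```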
As a shortcut, you can use the following formulas for the default chart update intervals to get the CPU ready %:
•Realtime: CPU summation value / 200
•Past Day: CPU summation value / 3000
•Past Week: CPU summation value / 18000
•Past Month: CPU summation value / 72000
•Past Year: CPU summation value / 864000
Example: A realtime CPU summation value of 1000 is divided by 200 to give a CPU ready % of 5.
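The shortcut divisors are the sample intervals in another form: dividing milliseconds by (interval in ms / 100) is the same as dividing by the interval in seconds times 10. The short Python check below recomputes them. It assumes a one-day sample interval for the Past Year chart, which is what the 864000 divisor implies. The names are made up for this example.

```python
# Sample interval per chart, in seconds. The one-day value for "year"
# is inferred from the 864000 divisor, not stated in the text.
INTERVAL_SECONDS = {
    "realtime": 20,
    "day": 5 * 60,
    "week": 30 * 60,
    "month": 2 * 60 * 60,
    "year": 24 * 60 * 60,
}

# ready % = ms * 100 / (seconds * 1000) = ms / (seconds * 10)
DIVISORS = {chart: seconds * 10 for chart, seconds in INTERVAL_SECONDS.items()}

print(DIVISORS)
# {'realtime': 200, 'day': 3000, 'week': 18000, 'month': 72000, 'year': 864000}

# The worked example from the text.
print(1000 / DIVISORS["realtime"])  # 5.0
```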
BTW, for the real-time graph you can probably make life easier on yourself by using the "latency" CPU counter in vSphere, which reports CPU Ready Time as a percentage.
What causes high CPU Ready times?
The most common reason tends to be host oversubscription, where too many vCPUs have been allocated per physical CPU. While ESXi 5 supports a maximum of 25 vCPUs per physical core, this is definitely a case where just because you can doesn’t mean it’s good to do. Typically, problems start when a host is in the range of 2-2.5X oversubscribed for server workloads.
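A quick way to see where a host stands against that 2-2.5X range is to add up the vCPUs of its VMs and divide by the physical core count. The Python below is a minimal sketch with a made-up inventory. The function name and the sample numbers are assumptions for illustration, not output from any VMware tool.

```python
def oversubscription_ratio(vm_vcpus: list[int], physical_cores: int) -> float:
    """Return allocated vCPUs divided by physical cores on one host."""
    if physical_cores < 1:
        raise ValueError("physical_cores must be at least 1")
    return sum(vm_vcpus) / physical_cores


# Hypothetical host: 16 physical cores, ten VMs with these vCPU counts.
vm_vcpus = [4, 4, 2, 2, 2, 1, 1, 8, 8, 4]
ratio = oversubscription_ratio(vm_vcpus, physical_cores=16)

print(f"{ratio:.2f}X")  # 2.25X
if ratio >= 2.0:
    print("In the range where ready-time problems typically start")
```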
The second common scenario where CPU Ready times are high is when a larger SMP VM, for example one with 4-8 vCPUs, is running on a host that has a lot of smaller VMs with 1-2 vCPUs for application servers. The larger resource allocation for the SMP VM results in it having to wait longer for the hypervisor to supply the necessary physical CPUs to schedule/co-schedule the workload. Often in cases where this occurs, after asking some questions I find that the number of vCPUs for the server was increased from 4 to 8 because of performance problems with the VM. Unfortunately, if CPU Ready time was the original problem, increasing the vCPUs doesn’t improve performance. It generally makes things worse.
What do I do if this is actually a problem?
When CPU Ready is a problem for your VMs, there are a couple of different things that can be done. The correct one depends on your virtual infrastructure. If the problem is purely host oversubscription in terms of the vCPU to pCPU ratio, start by evaluating whether the VMs really need their configured number of vCPUs, to determine if any of them can be reduced to lower the ratio. If this can’t be done, the only real answer is to add hosts so the load can be balanced better and the oversubscription rates reduced. Also evaluate whether you can consolidate the larger VMs onto one or more hosts and move the smaller VMs to the other hosts, separating the VMs by size.