One question keeps landing in my inbox: can VMware DPM lead to memory ballooning on ESXi hosts? If yes, is there any way to fine-tune DPM to avoid it? I have explained this whenever possible, but since the question keeps coming up again and again, I thought it better to write a post that gives an overview of DPM, its memory demand metric, and how memory ballooning relates to DPM.
I have divided this post into 3 parts, as follows:
Part 1: DPM basic overview & its memory demand metric calculations.
Part 2: DPM vs Memory ballooning & memory best practices.
Part 3: What are the ways we can fine tune DPM?
Today I am covering Part 1: “DPM basic overview & its memory demand metric calculations”.
DPM basic overview:
As we already know, consolidating physical servers into virtual machines significantly reduces power consumption. VMware DPM (Distributed Power Management) takes this reduction in power consumption to the next level.
DPM is a feature of VMware DRS (Distributed Resource Scheduler); once we enable DRS on a cluster from the vSphere Client or Web Client, enabling DPM is just a click away. DRS performs dynamic CPU and memory load balancing across all ESXi hosts in the cluster, while DPM evaluates each ESXi host so that it can either put one or more hosts into standby mode (power off) to save power, or bring one or more hosts back from standby mode to meet the resource (CPU, memory) demand of the virtual machines in the cluster. You might be wondering how DPM evaluates an ESXi host. It is the Target Resource Utilization Range that plays the crucial role. DPM calculates the Target Resource Utilization Range as follows.
Target Resource Utilization Range = DemandCapacityRatioTarget ± DemandCapacityRatioToleranceHost
DemandCapacityRatioTarget is the target utilization of each ESXi host in the cluster. By default this is set at 63%.
DemandCapacityRatioToleranceHost sets the tolerance around the target utilization of each ESXi host. By default this is set at 18%.
Hence, by default, Target Resource Utilization Range = 63% ± 18%, i.e. the range is 45% to 81%.
This means that DPM tries its best to keep the resource utilization of each ESXi host between 45 and 81 percent. If the CPU or memory utilization of an ESXi host is below 45%, DPM evaluates that host for standby mode (power off). If the utilization of either CPU or memory exceeds 81%, DPM evaluates bringing hosts back from standby mode.
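To make the range arithmetic concrete, here is a minimal Python sketch. The constant and function names are my own for illustration (they are not a VMware API); only the default values 63 and 18 come from the settings described above. It classifies a single utilization figure (CPU or memory) against the range.

```python
# Defaults quoted in this post; names are illustrative, not a VMware API.
DEMAND_CAPACITY_RATIO_TARGET = 63          # percent
DEMAND_CAPACITY_RATIO_TOLERANCE_HOST = 18  # percent


def target_utilization_range(target=DEMAND_CAPACITY_RATIO_TARGET,
                             tolerance=DEMAND_CAPACITY_RATIO_TOLERANCE_HOST):
    """Return (low, high) bounds of the target utilization range, in percent."""
    return target - tolerance, target + tolerance


def classify_utilization(utilization_pct,
                         target=DEMAND_CAPACITY_RATIO_TARGET,
                         tolerance=DEMAND_CAPACITY_RATIO_TOLERANCE_HOST):
    """Classify one resource's utilization as 'below', 'within' or 'above' the range."""
    low, high = target_utilization_range(target, tolerance)
    if utilization_pct < low:
        return "below"   # host becomes a candidate for standby evaluation
    if utilization_pct > high:
        return "above"   # DPM evaluates powering on a standby host
    return "within"


print(target_utilization_range())   # (45, 81)
print(classify_utilization(30))     # below
print(classify_utilization(90))     # above
```

This only mirrors the threshold check; the actual decision also involves the DRS simulations described later in this post.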
Note: DPM considers both CPU and memory as resources for evaluation; however, in this blog post we will focus only on the memory resource.
Active memory: Memory that is being actively used by the VM at any given moment. This keeps changing as the VM load increases or decreases.
Consumed memory: The host memory consumed by the VM since it was booted. Note that consumed memory is not the same as the memory allocated to the VM. Consumed memory equals allocated (configured) memory if and only if the VM consumes all of the memory allocated to it. The ESXi host never allocates memory to a VM until that VM touches/requests host memory. When the VM is powered off, consumed memory is zero. It is also good to note that a VM can get at most min(configured memory, specified limit). If no limit is set on the VM, the configured memory itself is the default limit.
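The min(configured memory, specified limit) rule can be sketched in a few lines of Python. The function name is hypothetical, and passing None for the limit stands in for "no limit set on the VM".

```python
from typing import Optional


def effective_memory_limit_mb(configured_mb: float,
                              limit_mb: Optional[float] = None) -> float:
    """Upper bound on the memory a VM can get: min(configured, limit).

    limit_mb=None models a VM with no limit set, in which case the
    configured memory acts as the default limit.
    """
    if limit_mb is None:
        return configured_mb
    return min(configured_mb, limit_mb)


print(effective_memory_limit_mb(8192))         # no limit set -> 8192
print(effective_memory_limit_mb(8192, 4096))   # limit below configured -> 4096
```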
DPM memory demand metric.
In earlier releases, DPM considered only active memory as the memory demand of each VM on the host, i.e. “DPM memory demand metric = active memory”, which is aggressive. To control DPM’s aggressiveness, from vCenter 5.1 U2c onwards and in all versions of vCenter 5.5, DPM can be tuned to also consider idle consumed memory in the DPM memory demand metric, i.e. DPM memory demand metric = active memory + X% of idle consumed memory.
1. The default value of X is 25. X can be modified by using the DRS advanced option “PercentIdleMBInMemDemand” at the cluster level. We can set this value in the range from 0 to 100. Setting it to 0 means that DPM will be as aggressive as it was in earlier releases, and as we increase X, DPM becomes progressively less aggressive. If X is 100, DPM considers the entire consumed memory as memory demand. (Consumed memory = active memory + idle consumed memory.)
2. Example: Say we have one VM with 8192 MB (8 GB) of configured memory. Suppose that since booting, the VM has consumed 6144 MB (6 GB) of host memory but only 20% of that is actively used, so active memory is 20% of 6144 MB = 1228.8 MB. Idle consumed memory = 6144 - 1228.8 = 4915.2 MB. If X is 25, the DPM memory demand is 1228.8 + 25% of 4915.2 = 1228.8 + 1228.8 = 2457.6 MB + overhead. Setting X to 25 means that DPM counts 25% of idle consumed memory as demand from the VM, to avoid a performance impact. As we increase X, DPM becomes more conservative. Hence the user needs to set X according to their environment and requirements.
3. DPM power-off recommendations: Based on the Target Resource Utilization Range, DPM identifies candidate hosts to put into standby mode (i.e. when utilization is under 45%), and then asks DRS to run simulations in which the candidate hosts are treated as powered off. These DRS simulations internally use the DPM memory demand metric (active memory + X% of idle consumed memory) to calculate the memory demand of each VM in the cluster. DPM uses these simulations to see whether utilization relative to the Target Resource Utilization Range improves when the candidate host(s) are powered off. If the resource utilization of all non-candidate hosts is within the target range (i.e. 45%-81%), DPM puts the candidate hosts into standby mode and saves power.
4. DPM power-on recommendations: DPM evaluates each standby host when the resource utilization of a powered-on host is above 81%, and then asks DRS to run simulations in which the standby host(s) are treated as powered on. These DRS simulations internally use the DPM memory demand metric (active memory + X% of idle consumed memory) to calculate the memory demand of each VM in the cluster and to distribute the VMs across all hosts. DPM uses these simulations to see whether utilization relative to the Target Resource Utilization Range improves when the standby host(s) are powered on. If the resource utilization of all hosts is within the target range, DPM generates host power-on recommendations.
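The memory demand metric used in points 1 to 4 can be reproduced with a short Python sketch. The function name and its parameters are my own for illustration; only the formula (active memory + X% of idle consumed memory) and the default X = 25 come from this post. Memory overhead is deliberately left out, so the figures exclude the "+ overhead" term from the example.

```python
def dpm_memory_demand_mb(active_mb, consumed_mb, percent_idle=25.0):
    """DPM memory demand = active + X% of idle consumed memory (overhead excluded).

    percent_idle plays the role of X (the PercentIdleMBInMemDemand option).
    """
    if not 0 <= percent_idle <= 100:
        raise ValueError("percent_idle must be between 0 and 100")
    # Idle consumed memory is whatever the VM has consumed but is not actively using.
    idle_consumed_mb = max(consumed_mb - active_mb, 0.0)
    return active_mb + idle_consumed_mb * percent_idle / 100.0


# Worked example from point 2: 6144 MB consumed, 20% of it active.
consumed = 6144.0
active = 0.20 * consumed  # about 1228.8 MB

for x in (0, 25, 100):
    demand = dpm_memory_demand_mb(active, consumed, percent_idle=x)
    print(f"X={x:>3}: demand = {demand:.1f} MB")
# X=  0: demand = 1228.8 MB
# X= 25: demand = 2457.6 MB
# X=100: demand = 6144.0 MB
```

The three rows show the two extremes and the default: X = 0 reduces to the older active-memory-only behaviour, while X = 100 treats all consumed memory as demand.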
I hope you enjoyed learning how DPM works in general. Please leave a comment if you need any clarification, and stay tuned for Part 2: “DPM vs Memory ballooning & memory best practices.”