Tag Archives: Proactive HA using pyVmomi

Tutorial: Part-2: How to simulate Proactive HA using pyVmomi?

In part-1, we learned what Proactive HA is and how to configure it using pyVmomi. To follow part-2, I highly recommend reading part-1 first. In this post, we will learn how to simulate a failure to see Proactive HA in action.

Before going further, let us review what we already configured in the last post.
– We registered a health provider.
– We added the entities to be monitored, i.e. the hosts inside the cluster.
– We verified the above 2 steps.
– We enabled Proactive HA on the cluster in automated mode and configured remediation.

After performing the above steps, you might have observed that all of the hosts show a red icon. This icon indicates that the initial health status of the hosts is unknown, i.e. gray, which is absolutely expected since the Proactive HA provider must initialize the health status of all ESXi hosts the first time. To initialize the health status of the ESXi hosts, we first need to set their status to “green”. This is where the Proactive HA API PostHealthUpdates() comes to the rescue. In fact, the same API will help us simulate a “red” health status, which will make Proactive HA take the configured action. Let us take a look at the pyVmomi script below, which initializes an ESXi host to the “green” health status.

This script is also available on my GitHub repo as push_health_updates.py.

from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim
import atexit
import ssl
import sys
import time
#Script to push meaningful health updates to Proactive HA 
#Lab only: negotiate the TLS version and skip certificate verification
s=ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
s.check_hostname=False
s.verify_mode=ssl.CERT_NONE
si= SmartConnect(host="10.161.20.30", user="Administrator@vsphere.local", pwd="VMware#123",sslContext=s)
content=si.content

# Below method helps us to get MOR of the object (vim type) that we passed.
def get_obj(content, vimtype, name):
        obj = None
        container = content.viewManager.CreateContainerView(content.rootFolder, vimtype, True)
        for c in container.view:
                if name:
                        if c.name == name:
                                obj = c
                                break

        return obj

#Cluster where ProactiveHA is enabled, this parameter is NOT required for this script
cluster_name="ClusterProHA"
# initialize health status of below host, we need to repeat for all the hosts inside the cluster. 
host_name="10.160.20.15" 
host = get_obj(content,[vim.HostSystem],host_name)
# Managed object that exposes all the methods to configure & manage proactiveHA
heath_update_mgr=content.healthUpdateManager

#Provider ID for the registered provider
providerid="52 b5 17 d2 2f 46 7e 9b-5f 4e 1a 25 a3 db 49 85"
health_update=vim.HealthUpdate()
health_update.entity=host
health_update.healthUpdateInfoId="1000"
health_update.id="vThinkBeyondVM"
health_update.remediation =""
health_update.status="green"
#Below is the array of health updates
updates = [health_update]

#Method to post health updates for Proactive HA to consume
heath_update_mgr.PostHealthUpdates(providerid,updates)

#Disconnecting vCenter session
atexit.register(Disconnect, si)

We must understand the part of the script that starts at providerid and ends with the PostHealthUpdates() call.
providerid: This is the provider id of the health provider we registered in part 1. RegisterHealthUpdateProvider() returns it, and it can be looked up later using the QueryProviderList() API.
health_update=vim.HealthUpdate(): HealthUpdate is the data object to be passed to the Proactive HA API we are talking about, i.e. PostHealthUpdates().
health_update.entity: This is the host object for which we want to set the status to “green”. For the sake of simplicity, this script changes the health status of one host. You can either modify this script to cover all the hosts or run it multiple times.
health_update.healthUpdateInfoId: This is the id of the HealthUpdateInfo we passed while registering the health provider, i.e. in register_provider.py.
health_update.id: This id can be anything of our choice, but it should be unique per health update being pushed.
health_update.remediation: Note that the remediation set here is blank. This is required when we set a health update to “green”.
health_update.status: Here we set the health status to “green”. Other valid values are “yellow” and “red”.
updates: We can have multiple health updates in a single API call.
PostHealthUpdates(): This is the API we are interested in calling.
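As noted above, the script can be modified to cover every host instead of being run once per host. Below is a minimal sketch of that idea, not the author's script: the helper name build_health_updates and its make_update parameter are hypothetical, and the uuid-based ids are just one way to keep each update id unique. It defaults to SimpleNamespace only so that it runs without pyVmomi installed.

```python
import uuid
from types import SimpleNamespace


def build_health_updates(hosts, info_id, status, remediation="",
                         make_update=SimpleNamespace):
    """Return one health update object per host.

    make_update is the factory for the update object. Against a real
    vCenter it would be vim.HealthUpdate; SimpleNamespace is only a
    stand-in (assumption) so the sketch runs on its own.
    """
    # The post states that remediation must be blank for "green" updates.
    if status == "green" and remediation:
        raise ValueError("remediation must be blank for a green update")
    updates = []
    for host in hosts:
        update = make_update()
        update.entity = host
        update.healthUpdateInfoId = info_id
        update.id = str(uuid.uuid4())  # unique per pushed update
        update.remediation = remediation
        update.status = status
        updates.append(update)
    return updates
```

In the lab, one would pass the cluster's hosts and make_update=vim.HealthUpdate, then hand the returned list to PostHealthUpdates() together with the provider id.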

Since we have initialized the health status of all ESXi hosts to “green”, now is the time to simulate a fake health update so that Proactive HA will trigger “quarantine mode” on a host. We can use exactly the same script with small changes to the id, remediation and status of the health update. Note that simulating a fake health update is only for educational purposes. Take a look at the code snippet for pushing a “red” (fake) health status to a host.

#Provider ID for the registered provider
providerid="52 b5 17 d2 2f 46 7e 9b-5f 4e 1a 25 a3 db 49 85"
health_update=vim.HealthUpdate()
health_update.entity=host
health_update.healthUpdateInfoId="1000"
health_update.id="vThink-BVM"
health_update.remediation ="Please replace power unit"
health_update.status="red"
#Below is the array of health updates
updates = [health_update]

#Method to post health updates for Proactive HA to consume
heath_update_mgr.PostHealthUpdates(providerid,updates)

Once I executed the above script, I refreshed the web client after 3-4 seconds, and below is what I saw. How cool is that?
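Instead of refreshing the web client, the host state could also be checked from Python. The sketch below is an illustration under assumptions: wait_for_quarantine is a hypothetical helper, and it assumes the host object exposes runtime.inQuarantineMode (a property I believe was added to HostRuntimeInfo in vSphere 6.5; please verify it in your environment).

```python
import time


def wait_for_quarantine(host, expected=True, timeout=60, interval=2,
                        sleep=time.sleep):
    """Poll host.runtime.inQuarantineMode until it equals expected.

    Returns True if the state was reached within timeout seconds,
    False otherwise. host is assumed to be a vim.HostSystem or any
    object exposing runtime.inQuarantineMode.
    """
    waited = 0
    while True:
        # An unset (None) value is treated as "not in quarantine".
        if bool(host.runtime.inQuarantineMode) == expected:
            return True
        if waited >= timeout:
            return False
        sleep(interval)
        waited += interval
```

Calling it with expected=False after pushing a “green” update would wait for the host to leave quarantine mode.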

Now you can set the host health status back to “green” the way we did above, which will make the host exit quarantine mode. Finally, you can leverage my Part-1 scripts to disable Proactive HA on the cluster and un-register the fake health provider we registered initially. In addition to the APIs we explored in part-1 & 2, Proactive HA exposes a few more utility APIs. I will leave them as an exercise for you. Let me know if you have any doubts/comments.
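For the un-register step mentioned above, here is a rough sketch of what the cleanup might look like. It relies on two HealthUpdateManager methods not shown in this tutorial, RemoveMonitoredEntities() and UnregisterHealthUpdateProvider(). The helper name cleanup_provider is hypothetical, and the ordering (detach entities first, then unregister) is my assumption about the safest sequence; Proactive HA should already be disabled on the cluster before running it.

```python
def cleanup_provider(health_update_mgr, provider_id):
    """Detach all monitored entities from a provider, then unregister it.

    Returns the list of entities that were detached.
    """
    entities = health_update_mgr.QueryMonitoredEntities(provider_id)
    if entities:
        # Pass the array back exactly as vCenter returned it.
        health_update_mgr.RemoveMonitoredEntities(provider_id, entities)
    health_update_mgr.UnregisterHealthUpdateProvider(provider_id)
    return list(entities or [])
```

The health_update_mgr argument would be content.healthUpdateManager, and provider_id the id returned at registration time.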

Further learning:

1. If you haven’t started with the vSphere Python SDK, i.e. pyVmomi, here is my tutorial post on the same.

2. For the sake of simplicity, I usually write individual API scripts & hard-code some parameters, but I plan to consolidate the above scripts into a good Proactive HA pyVmomi module. If you would like to contribute to the module, please let me know and we can work together.

Tutorial: Part-1: How to configure & manage Proactive HA using pyVmomi?

Recently I got an opportunity to work on one of the cool vSphere 6.5 features, i.e. Proactive HA. As usual, I explored the vSphere APIs introduced to configure and manage Proactive HA, and I thought it would be good to share my learning with you. Since there is a lot to share, I have divided this tutorial into 2 parts as follows.

Part-1: What Proactive HA is in brief and how to configure Proactive HA using pyVmomi.
Part-2: How to simulate a failure so that we can see Proactive HA in action.

What is Proactive HA in brief?
As we already know, whenever there is a host failure, vSphere HA restarts the VMs on alternate hosts inside the cluster. vSphere HA works just fine for most vSphere environments, but there is some downtime because vSphere HA reacts only after the failure. Proactive HA, on the other hand, triggers meaningful actions before a host actually fails. One of the interesting facts about Proactive HA is that, though it can be enabled from the vSphere Availability tab, it is in fact a DRS feature (more on this later). Let's explore the Proactive HA APIs to learn further.

Configure Proactive HA
The managed object HealthUpdateManager enables the user to configure and manage Proactive HA. In fact, this set of APIs is primarily exposed to server vendors, who are supposed to develop a web client plugin using these APIs in order to feed server health updates to DRS. Once DRS receives these server health updates, it reacts to them before the hardware/host fails (i.e. proactively).
Disclaimer: This tutorial is for lab/educational purposes only.

1. The first important thing is to register a health update provider. The “RegisterHealthUpdateProvider()” API allows us to register it. Ideally, the registered health provider should come from a server vendor. Since this is a tutorial for educational purposes, we will simulate a fake health provider for our understanding, which is not supported by VMware for production use.

2. Once we have a health update provider registered, this provider needs to monitor entities for any matching health issues from the server. In our case, the entities to be monitored by the provider are the hosts inside the cluster. “AddMonitoredEntities()” is the API that enables the user to bind a provider to managed entities such as hosts.

3. So far we have talked about registering a provider and adding entities, i.e. hosts, to be monitored. Now let us go ahead and put this into action using the pyVmomi script below.

Note: I have added documentation inside the script itself. Please take a look.

This script, i.e. register_provider.py, is available on my GitHub repo as well.


from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim
import atexit
import ssl
import sys
import time
#Script to register a health update provider and adding monitored entities
#Lab only: negotiate the TLS version and skip certificate verification
s=ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
s.check_hostname=False
s.verify_mode=ssl.CERT_NONE
si= SmartConnect(host="10.161.20.30", user="Administrator@vsphere.local", pwd="VMware#123",sslContext=s)
content=si.content

#Cluster where ProactiveHA will be enabled
cluster_name="ClusterProHA"

# Managed object that exposes all the methods to configure & manage proactiveHA
heath_update_mgr=content.healthUpdateManager

name="ProHAProvider1"  #Name of the provider to be registered

# Building an object with fake/simulated health update
health_update_info=vim.HealthUpdateInfo()
health_update_info.componentType="Power" # Can also be Storage, Memory, Network
health_update_info.description="Power failure Detected"
health_update_info.id="1000"

# Array of health updates, we can have multiple health updates per component type
health_updates = [health_update_info]

#Register a provider with health_updates, it returns a providerid
providerId = heath_update_mgr.RegisterHealthUpdateProvider(name,health_updates)

# Below method helps us to get MOR of the object (vim type) that we passed.
def get_obj(content, vimtype, name):
        obj = None
        container = content.viewManager.CreateContainerView(content.rootFolder, vimtype, True)
        for c in container.view:
                if name:
                        if c.name == name:
                                obj = c
                                break
                        else:
                                obj = None
        return obj

#Cluster object
cluster = get_obj(content,[vim.ClusterComputeResource],cluster_name)
# Hosts inside the cluster
hosts = cluster.host
#Converting array of HostSystem objects to array of managed entities
cluster_entities=vim.ManagedEntity.Array(hosts)

# Adding entities i.e hosts to be monitored
heath_update_mgr.AddMonitoredEntities(providerId,cluster_entities)

#Disconnecting vCenter session
atexit.register(Disconnect, si)

Now that we have registered a provider and added the entities to be monitored by this provider, let us verify whether this really worked. Let’s take a look at the script below. The script check_provider_and_entities.py is available on my GitHub repo.

from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim
import atexit
import ssl
import sys
import time
#Script to verify a health update provider and monitored entities
#Lab only: negotiate the TLS version and skip certificate verification
s=ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
s.check_hostname=False
s.verify_mode=ssl.CERT_NONE
si= SmartConnect(host="10.161.20.30", user="Administrator@vsphere.local", pwd="VMware#123",sslContext=s)
content=si.content

# Managed object that exposes all the methods to configure & manage proactiveHA
health_update_mgr=content.healthUpdateManager

# Listing provider list
provider_list =health_update_mgr.QueryProviderList()

print("Provider Id:"+provider_list[0])

#Querying specific provider
provider_name = health_update_mgr.QueryProviderName(provider_list[0])
print("Provider Name:"+provider_name)

#Getting monitored entities by a provider
monitored_entities=health_update_mgr.QueryMonitoredEntities(provider_list[0])
print(monitored_entities)

#Disconnecting vCenter session
atexit.register(Disconnect, si)

If you take a look, we have used 3 new Proactive HA APIs, i.e. QueryProviderList(), QueryProviderName() & QueryMonitoredEntities(). The output below confirms that the provider is registered and the entities are monitored by that provider as expected.

Output:
vmware@localhost:~$ python providerList.py
Provider Id:52 5b d3 2c 0a aa c9 2c-46 4d b8 c2 79 e1 94 c1
Provider Name:ProHAProvider1
(vim.ManagedEntity) [
'vim.HostSystem:host-21',
'vim.HostSystem:host-27',
'vim.HostSystem:host-15'
]
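The verification script only inspects the first entry of the provider list. When more than one provider is registered, a loop over all of them may be more useful. The sketch below uses only the three query APIs shown above; the helper names summarize_providers and find_provider_id are hypothetical, not part of pyVmomi.

```python
def summarize_providers(health_update_mgr):
    """Map each registered provider id to its name and monitored entities."""
    summary = {}
    for provider_id in health_update_mgr.QueryProviderList() or []:
        summary[provider_id] = {
            "name": health_update_mgr.QueryProviderName(provider_id),
            "entities": list(
                health_update_mgr.QueryMonitoredEntities(provider_id) or []),
        }
    return summary


def find_provider_id(health_update_mgr, name):
    """Return the id of the provider registered under name, or None."""
    for provider_id, details in summarize_providers(health_update_mgr).items():
        if details["name"] == name:
            return provider_id
    return None
```

Looking the id up by name (for example "ProHAProvider1") avoids hard-coding the provider id in later scripts.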

Now we are all set to enable Proactive HA on the cluster. The good news is that we can enable it not only using the API but also from the web client. Below is how we can do it using the web client.

You can see that Proactive HA can be enabled from the “vSphere Availability” tab. Note that since this is a DRS feature, the user needs to enable DRS before enabling Proactive HA. Let us now take a look at the “Proactive HA Failures and Responses” configuration.

I recommend reading the documentation shown in the above screenshot. Below is how it finally looks.

We enabled Proactive HA using the web client. Now let us take a look at how to enable the same using the API.
This script, i.e. enable_proactive_ha.py, is available on my GitHub repo.

from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim
import atexit
import ssl
import sys
import time

#Script to enable Proactive HA on cluster.

#Lab only: negotiate the TLS version and skip certificate verification
s=ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
s.check_hostname=False
s.verify_mode=ssl.CERT_NONE
si= SmartConnect(host="10.161.20.30", user="Administrator@vsphere.local", pwd="VMware#123",sslContext=s)
content=si.content

#Cluster where ProactiveHA will be enabled
cluster_name="ClusterProHA"

# Managed object that exposes all the methods to configure & manage proactiveHA
health_update_mgr=content.healthUpdateManager

# Below method helps us to get MOR of the object (vim type) that we passed.
def get_obj(content, vimtype, name):
        obj = None
        container = content.viewManager.CreateContainerView(content.rootFolder, vimtype, True)
        for c in container.view:
                if name:
                        if c.name == name:
                                obj = c
                                break
                        else:
                                obj = None

        return obj

#Cluster object
cluster = get_obj(content,[vim.ClusterComputeResource],cluster_name)
if not cluster:
        print("Cluster is NOT found, please enter correct cluster name")
        sys.exit()
cluster_spec=vim.cluster.ConfigSpecEx()
drs_enabled = cluster.configuration.drsConfig.enabled

#Proactive HA is a DRS feature, so DRS must be enabled on the cluster
if (not drs_enabled):
        drs_info=vim.cluster.DrsConfigInfo()
        drs_info.enabled = True
        cluster_spec.drsConfig=drs_info
else:
        print("DRS is already enabled, cool")

pro_ha_spec=vim.cluster.InfraUpdateHaConfigInfo()
pro_ha_spec.behavior="Automated"
pro_ha_spec.enabled=True
pro_ha_spec.moderateRemediation="QuarantineMode"
provider_list =health_update_mgr.QueryProviderList()
if(provider_list):
        pro_ha_spec.providers = provider_list  #In our case, it was just one but there can be multiple
else:
        print("Provider is not registered, please do it first before enabling Proactive HA")
        sys.exit()

pro_ha_spec.severeRemediation="MaintenanceMode"
cluster_spec.infraUpdateHaConfig=pro_ha_spec

cluster.ReconfigureComputeResource_Task(cluster_spec, True)

print("Cluster reconfiguration is triggered, please check out, you can track Task object for result")

#Disconnecting vCenter session
atexit.register(Disconnect, si)

You can see that an existing API, i.e. ReconfigureComputeResource_Task(), helps us enable Proactive HA, since the same API is responsible for configuring vSphere cluster-level features. You might also have noticed that, in the above script, moderateRemediation was set to “QuarantineMode” and severeRemediation was set to “MaintenanceMode”; hence, if you take a look at the web client, it will show the remediation mode as “mixed”.

That is all for Part-1. I hope you enjoyed this educational tutorial. Please stay tuned for Part-2, where we will simulate a failure to see Proactive HA in action.
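The script only triggers the reconfiguration and leaves tracking of the returned Task object to the reader. A minimal polling sketch is shown below. The helper name wait_for_task is hypothetical, and it assumes the task exposes info.state, info.result and info.error as vim.Task does. pyVim.task also ships a WaitForTask helper that could be used instead.

```python
import time


def wait_for_task(task, timeout=300, interval=2, sleep=time.sleep):
    """Poll task.info.state until the task succeeds, fails, or times out.

    Returns task.info.result on success; raises RuntimeError on error
    or timeout.
    """
    waited = 0
    while True:
        info = task.info
        state = str(info.state)  # queued, running, success or error
        if state == "success":
            return info.result
        if state == "error":
            raise RuntimeError("Task failed: %s" % info.error)
        if waited >= timeout:
            raise RuntimeError(
                "Task still %s after %s seconds" % (state, timeout))
        sleep(interval)
        waited += interval
```

To use it, the return value of ReconfigureComputeResource_Task() would be stored in a variable and passed to wait_for_task().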

Further learning:

1. If you haven’t started with the vSphere Python SDK, i.e. pyVmomi, here is my tutorial post on the same.

2. For the sake of simplicity, I usually write individual API scripts, but I plan to consolidate the above scripts into a good Proactive HA module. If you would like to contribute to it, please let me know.

3. Very good article by Brian Graf on Proactive HA.

4. If you prefer not to use Python, William has, as usual, written a nice PowerCLI module.