License groups in vROps 6.6

License groups in vROps 6.6

How to Remove “License Is Invalid” Watermark in vRealize Operations Manager

I have on good authority, although not publicly documented, that a code change happened in vROps 6.6 to resolve a bug where the “License is invalid” watermark was not showing up even when environments were not licensed. That is to say, no matter whether previous versions had valid license keys or not, the watermark never appeared. As a result of fixing this bug, some environments may now be seeing the watermark due to the incorrect construction of License Groups.

This is only applicable to using valid license keys in a supported manner. Nothing in this article is discussing bypassing valid product licensing for vROps. We carefully reviewed our VMware ELA with our AE and TAM to ensure we had valid entitlements.

The vROps environment I admin uses a combination of per-VM Advanced licenses and per-CPU vCloud Suite Advanced licenses. This is supported, as evidenced by the vRealize Operations 6.6.1 Release Notes:

You can deploy single choice and suite licenses together. License counting for individual license keys is handled through licensing groups. You can mix editions or licensing models …

However, the vROps documentation is extremely vague on how these license groups should or could be constructed.

We worked with VMware Support some time ago to establish license groups for each respective set of keys. However, upon upgrade to 6.6, we noted the “License is invalid” watermark appearing on many objects in the environment. License Keys were no longer associated with the License Groups. The dynamic criteria we used previously to create license groups was no longer valid in vROps 6.6, for reasons that are still not clear. After extensive work with VMware Support through BCS again, we have arrived at a solution that removes the watermark from all objects within our environment. I will try to elaborate on the general process here, as every environment will have unique requirements. Credit for this solution ultimately goes to Kyle P. at VMware Premier Support, and I hope a VMware KB eventually appears.

Criteria:
– License Group 1 “vCloud Suite” Host CPU-based licenses applied to Object type “Host System” for a given set of clusters.
– License Group 2 “VM Licenses” VM-based licenses applied to Object type “Virtual Machine” for all other VMs EXCEPT those on Hosts licensed with License Group 1.
– Use “dynamic” membership criteria, not static “Always include/exclude” lists to avoid manual maintenance of License Groups. This is an important distinction.

Important notes & considerations:
– When you apply a CPU license to a group containing Hosts, the VMs on the Hosts will still show “License is invalid” watermark. When you apply a VM license key to Virtual Machines, the Hosts on which those VMs run will still show the “License is invalid” watermark. I am convinced this is a bug in vROps, but I’m not getting much traction with the PR associated with my SR since a “workaround” is in place. So, the workaround is this: while the license is applied to the respective Object type of each License key, the related objects (parent or children) are also going to have to be included in membership for the License Group.

Let’s look at the example License Group 1 “vCloud Suite” first:

(I’m assuming if this article is relevant to you, that you know where to find License Groups under Administration – Management – Licensing & how to create/edit them.)


Object Type = Host System
Relationship – Descendant of – contains – [CLUSTER NAME] (a partial string can be used for wildcard matching) (If multiple clusters are needed that can’t be added with a wildcard, click “Add another criteria set” to create an OR construct and replicate the same criteria.)

This should make sense so far. We’re creating a License Group for CPU License Keys with membership of Hosts (which are descendants of the named cluster(s)).

Now here’s the part that is not so obvious: We also need to add additional OR criteria to select the Object Type “Virtual Machine” for all the VMs that live on these clusters. The license will not apply to the VMs for license utilization count, but the watermark will persist unless the VMs are added via this criteria.


Object Type = Virtual Machine
Relationship – Descendant of – contains – [CLUSTER NAME] (a partial string can be used for wildcard matching) – use the same criteria to select clusters as above.

Now, License Group 2 “VM Licenses”

Remember from our Criteria above, this license groups will include all VMs that do not live on the Host Systems licensed with License Group 1.


Object Type = Virtual Machine
Relationship – Descendant of – not contains – [CLUSTER NAME] (same string used in License Group 1 above)
For additional clusters, you’ll need to “Add” to use AND logic.

This selects all VMs in the environment that do NOT live on these clusters. Hopefully this makes sense. Now here again is the part that is does not follow logic: the Host Systems that host these VMs will still display “License is invalid” watermark until we add Host Systems to this license group. Again, I believe this is a flaw in vROps licensing, as the license group applying to object type Virtual Machine should not have to contain objects of type Host System. Despite that, in extensive testing, this is the workaround we’ve found to remove the watermark.

“Add another criteria set” (making an OR statement), Object Type = Host System.

That’s it. Note it can take a few minutes for the licensing to update. You can potentially speed this up by clicking Refresh under License Keys:

This was the solution we developed to meet our criteria as listed above. It should be understood that almost every environment will be different, and your criteria may be different. Hopefully the concepts behind this article can be extended to additional use cases to resolve the “License is invalid” watermark in vROps. If there’s anything I can clarify or any way I can help, contact me via Twitter @vMiklm.

VMworld 2017 – VMware Design Studio

VMworld 2017 – VMware Design Studio

One thing you should definitely make time for at VMworld is the VMware Design Studio. I have worked closely with the vRealize Operations PM & UX teams over the past year providing customer feedback, and they have extended the following invite to VMworld attendees:

Would you like to help define the next generation of VMware products and improve the ones you use today? We invite you to participate in the VMware Design Studio at VMworld 2017 US.

In the VMware Design Studio, we will show a few key product directions and design prototypes. This is a unique opportunity to get your voice heard and interact in a small group or 1:1 session with the product teams.

As a token of our appreciation, you will receive unique VMware Design Partner swag upon completion of a session!

Using PowerShell for VM Snapshot Creation & Cleanup

Using PowerShell for VM Snapshot Creation & Cleanup

There is often a use case for creating snapshots of a list of VMs in a VMware virtual environment. For example, when upgrading vRealize Operations Manager, the documentation recommends taking a snapshot of the VMs in the cluster before upgrading. With a growing number of nodes in our vROps cluster, I finally wrote some PowerShell to automate the snapshot creation and removal process. When shared with my peers in another IT group, they had another use case where they wished to skip any VMs larger than a given size (1 TB in this instance).


<# .SYNOPSIS Read a list of VMs from a text file; check disk usage and take snapshot if less than 1 TB .DESCRIPTION @vMiklm - miklm.com 2017-08-08 #>

# vCenter Server name (comma separate for multiple)
Connect-VIServer VCENTER_SERVER_1, VCENTER_SERVER_2

# Input :: Change this to the local path to your text list of VMs to check & snapshot
$serverList = Get-Content "e:\temp\vmInputList.txt"

# Output :: Filename for output list of successful snapshot VMs
$outputFile = 'e:\temp\vmOutputList-20170808.txt'

# snapshotName = Descriptive name of Snapshot (includes system date)
$snapshotDate=Get-date
$snapshotName = "VM Snapshot - " + $snapshotDate

# snapshotDesc = Description of snapshot including responsible user & delete by date (using system date + 7 days)
$snapshotDate=(Get-date).AddDays(7)
$snapshotDesc = "USERNAME - Delete after " + $snapshotDate

# Threshold Size is the maximum size, in GB, of the VMs to snapshot. 1024 GB = 1 TB
$thresholdSize = 1024

$snapshotServerList = @()

foreach ($server in $serverList)
{
# initialize the sum variable to zero
$diskSizeTotal = 0

$Disks = Get-HardDisk -VM $server

foreach($disk in $Disks)
{
$DiskName = $Disk.Name
$DisksizeinGB = ($Disk.CapacityKB * 1024)/1GB

# use this sum variable to keep the running total of disk size in GB
$diskSizeTotal=$diskSizeTotal+$DisksizeinGB
}

if ($diskSizeTotal -gt $thresholdSize)
{
echo $server" : "$diskSizeTotal" GB is greater than 1 TB. Snapshot skipped"
}
else
{
echo $server" : "$diskSizeTotal" GB is less than 1 TB."
# Take Snapshot
New-Snapshot -VM $server -Name $snapshotName -Description $snapshotDesc
# Add server to list array of servers with Snapshot taken; this list will be output to file to allow cleanup of snapshots
$snapshotServerList+=$server
}
}

# Output list of servers that had Snapshot taken; this file should then be used as input for the snapshot cleanup script
foreach ($serverinList in $snapshotServerList)
{
echo $serverinList | Out-File -Append $outputFile
}

And here’s the accompanying script to remove snapshots when no longer required. Assumption is that the $outputFile from the above “take snapshot” script will be used as input ($serverList) for the “remove snapshot” script below.


<# .SYNOPSIS Read a list of VMs from a text file and remove ALL snapshots for listed VMs .DESCRIPTION @vMiklm - miklm.com - 2017-08-08 #>

Connect-VIServer VCENTER_SERVER_1, VCENTER_SERVER_2

# Change this to the local path to your list of VMs to remove snapshots
# If used with the above snapshots script, this is same file as $outputFile
$serverList = Get-Content "e:\temp\vmOutputList-20170808.txt"

foreach ($server in $serverList)
{
$snapsList = Get-VM $server | Get-Snapshot
foreach ($snapshot in $snapsList)
{
# If the number of snapshots is large, consider removing -RunAsync to avoid too much load on the storage backend
Remove-Snapshot $snapshot -Confirm:$false -RunAsync
}
}

Code is presented for example, reference, and training. You may have to modify extensively to make any of it work in your environment. I make no claims of being a developer. I have learned what little I know by seeing how others have solved problems in PowerShell (and PHP, C#, etc.), so this is a small effort to contribute back to the community. There are certainly other ways to do this, and likely BETTER ways, but this is what I have made. Feedback & suggestions are welcome in the comments. No support, warranty, guarantee, or anything else is given with any code found on this site. Use at your own risk.

VMworld 2017 – vROps & vSAN

VMworld 2017 – vROps & vSAN

If you’re attending VMworld 2017 in Las Vegas (August 28-31), be sure to catch session STO1895BU. I’ll be accompanying John Dias (VMware Technical Marketing, CMBU) & Pete Koehler (VMware Technical Marketing, SABU) on “Optimizing vSAN with vRealize Operations“. We’ll highlight new features of vROps 6.6 including the vSAN Dashboards, and how we’re using these features in the real world in our enterprise customer SDDC environment.

vRealize Operations Certificate Installation Error

vRealize Operations Certificate Installation Error

There is a good bit of information available for installing certificates for vROps 6.x. I encountered an issue that required a VMware Support case and uncovered some changes in vRealize Operations Manager 6.3 and 6.4.

My environment was using an internally signed SHA1 (128-bit) certificate. This was being updated to a 256-bit certificate.

The process to generate a certificate for vROps is more complex than typical, and is poorly documented by VMware. There are some articles with better information on how to create and install vROps certificates:
http://www.kanap.net/2015/02/install-a-custom-certificate-for-vrealize-operations-manager/
http://virtuallyeuc.com/vrealize-operations-manager-6-0-certificate-creation/

After carefully following the process to create the CSR, having the certificate signed, and building the .pem file, vROps refused to accept the new cert:

Operation failed. If the error persists, contact VMware support.

Review casa.log [/var/log/casa_logs/casa.log on your master vROps node] and look for something similar to:

2016-12-06 17:45:44,102 [x2000UIE] [ajp-nio-127.0.0.1-8011-exec-7] INFO support.subprocess.GeneralCommand:255 - Command '/usr/lib/vmware-vcopssuite/python/bin/python /usr/lib/vmware-casa/bin/vropsCertificateTool.py -i /storage/db/tmp/uploaded_cert.tmp --no_describe --json --level NONE' threw exception: CommandLineExitException: key=general.failure; args=1,; cause=
2016-12-06 17:45:44,103 [x2000UIE] [ajp-nio-127.0.0.1-8011-exec-7] INFO casa.security.SecurityService:946 - validateCertificateFile script's STDOUT:
{"errors": [], "exitCode": 0, "warnings": [{"warning_args": "/C=US/ST=State/L=City/O=Org/OU=Domain/CN=CA", "warning_code": "WARN-2", "warning_message": "Issuer and subject names match (/C=US/ST=State/L=City/O=Org/OU=Domain/CN=CA), but the issuer key does not. This is likely the wrong issuer certificate."}]}

So here’s what happened:
Starting in vROps 6.3, a change was made to how vROps handles the upload of a certificate.  The service used for the handling of the chain file has a file limit of 8192 bytes.
– My certificate chain (.pem file) was 10.7 KB, due to the fact that we have 2 intermediate certificates (Root Cert->Intermediate 1->Intermediate 2->Server Cert)
– vROps was essentially truncating the file, causing it to interpret the first intermediate certificate as the root, causing the “likely the wrong issuer certificate” error.
– I confirmed the issue exists on 6.3 and 6.4; the TSE was not able to replicate on 6.2.1
The vRealize Operations Manager supports custom security certificates with key length up to 8192 bits. An error is displayed when you try to upload a security certificate generated with a stronger key length beyond 8192 bits.

Solutions:
1) Use a certificate with only one intermediate, which results in a .pem file size of ~7 KB
– I had the option to request a GlobalSign certificate at minimal cost; we were in the process of requesting this anyway to rule out our internal CA as the problem. It is of course understood that this will not be an option for many users.
2) “The workaround is to edit the ‘maxInMemorySize’ value for the spring service which handles the upload of the cert. This should prevent the use of the temp file which in turn will bypass the part of the code that reads the temp file with the 10k limit.” – Provided by VMware TSE. I did not pursue this workaround as we already had the new GlobalSign certificate ready to test on the same day this workaround was proposed. I have asked for additional information on this workaround and will update if it is provided.

Hopefully this information helps diagnose failures in certificate installation in vRealize Operations 6.3 or 6.4. I’m not sure why 8 KB was the file size limit chosen, but it is too small for many environments with multiple intermediate CAs. Hopefully this will be resolved in the next release.

Update:
1) To edit ‘maxInMemorySize’ value for the spring service to allow a larger certificate upload, the following information was provided by VMware support:
a. Stop casa.
service vmware-casa stop

b. Edit /usr/lib/vmware-casa/casa-webapp/webapps/casa/WEB-INF/casa-servlet.xml
Locate this text around line 256:
<bean id="multipartResolver" class="org.springframework.web.multipart.commons.CommonsMultipartResolver">
<property name="uploadTempDir" value="file:///${application.data?/storage}/db/tmp"/>
</bean>

Add the following within the <bean> definition:
<property name="maxInMemorySize" value="30720"/>

c. Restart casa.
service vmware-casa start

2) I have been advised that a fix for this issue will be included in the next release, although I cannot confirm that and have no information on the vROps release schedule.

I want to publicly acknowledge VMware Global Support Services (GSS) Business Critical Support (BCS) and the excellent experience working with them to diagnose and resolve this issue, as well as VMware Technical Account Manager (TAM) Services. While not cheap, these service offerings provide high value.

* Update 3/3/2017
vRealize Operations 6.5 was released on 3/2/2017. The 8192 bit certificate size limit is still reflected in the documentation .

VCSA – Moving Behind A Firewall

VCSA – Moving Behind A Firewall

We have a situation where we have to deploy VMware vCenter Server Appliance (VCSA), then move it to a secured network behind a firewall.

I have not (yet) had success changing the hostname, including the FQDN, so make sure you can create proper DNS entries on both sides of the firewall and can do the initial deployment with the intended final FQDN. Our VCSA’s FQDN is of the format “vcenterserver.securednet.companynet.org” so this was achievable (the segmented domain is a subdomain of the company domain)

Deploy the appliance, then access the VCSA console and press F2, then login as root. At this point you may also need to Edit the VM Settings and change it to the new network (behind the firewall, in my case).
– Configure Management Network
– Set IP Configuration, IPv6 Configuration, DNS Configuration, Custom DNS Suffixes as required.
– (Esc) Exit, (Y) Yes Apply changes and restart management network

At this point, I got the following 2 screens. IPv6 is understandable; we disabled it. The DNS error appears even though we have validated the DNS entries/servers.

Now, to access the shell console. Login as root.

Command> shell.set --enabled True
Command> shell
# vi /etc/sysconfig/networking/devices/ifcfg-eth0

Quick vi reminder: hit (Insert) to edit the line, then (Esc), :wq! to save and exit.

Make sure all the IP information is correct in this file. I had to update the broadcast & gateway for sure. Then,

# vi /etc/hosts

Fix the entry here to reflect the new IP.

Validate settings:

# ifconfig eth0
# route -n

At this point, we were missing the default gateway in the route table, even thought it was defined in the ifcfg-eth0 file and in the IP Configuration screen of the console. Also the Broadcast address still reflects the old network, but this is less of a problem.

To temporarily fix the default gateway, do # route add default gateway XXX.YYY.ZZZ.1, using the correct gateway IP obviously.

To permanently resolve the gateway issue, log into the web console (https://vcenterserver:443/vsphere-client/) as Administrator@vsphere.local or whatever you defined as your SSO domain.
– Go to System Configuration under Administration in the main panel
– Nodes (in left column, under System Configuration)
– Select your vCenter server under Nodes
– “Edit” in the upper right corner of the main panel
– Expand nic0 & define Default gateway
– Ok
– Reboot to validate configuration persists

VMware: Multi-NIC vMotion Configuration with PowerShell

VMware: Multi-NIC vMotion Configuration with PowerShell

Changed configuration of numerous VMware clusters to utilize multi-NIC vMotion. As with all my PS, it’s quick & dirty and cobbled from a couple of example scripts. Assumes existing vmkernel port group named “vMotion”


<# .SYNOPSIS Configure multi-NIC vMotion .DESCRIPTION Miklm.com 08/19/2015 Credit to examples from http://www.yellow-bricks.com/2011/09/17/multiple-nic-vmotion-in-vsphere-5/#comment-28153 & http://hostilecoding.blogspot.com/2014/01/vmware-virtual-standard-switch.html #>

$viserver = "vCenter.local"
$cluster = "CLUSTERNAME"
$vswitchName = "vSwitch1"
$VMotionVLan = "908"
$vSwitch2Nics = "vmnic1","vmnic3"
$VMotionPortGroupName1 = "vMotion-01"
$VMotionPortGroupName2 = "vMotion-02"

Connect-VIServer $viserver

Foreach ($esxhost in (Get-Cluster -name $cluster | Get-VMHost))
{
$vswitch = Get-VirtualSwitch -VMHost $esxhost -Name $vswitchName

# Remove the original "vMotion" VMkernel Port group
Remove-VMHostNetworkAdapter -Nic (Get-VMHostNetworkAdapter -VMHost $esxhost | where {$_.PortGroupName -eq "vMotion"}) -Confirm:$false
Remove-VirtualPortGroup -VirtualPortGroup (Get-VirtualPortGroup -Name "vMotion" -vmhost $esxhost) -Confirm:$false

# Configure virtual switch for Multi-NIC vMotion
$vportgroup1 = New-VirtualPortGroup -VirtualSwitch $vswitch -Name $VMotionPortGroupName1 -VLanId $VMotionVLan
$vportgroup2 = New-VirtualPortGroup -VirtualSwitch $vswitch -Name $VMotionPortGroupName2 -VLanId $VMotionVLan

$vNic1 = New-VMHostNetworkAdapter -VMHost $esxhost -VirtualSwitch $vswitch -PortGroup $vportgroup1 -VMotionEnabled:$true
$vNic2 = New-VMHostNetworkAdapter -VMHost $esxhost -VirtualSwitch $vswitch -PortGroup $vportgroup2 -VMotionEnabled:$true

$vportgroup1 | Get-NicTeamingPolicy | Set-NicTeamingPolicy -MakeNicActive $vSwitch2Nics[0] -MakeNicStandBy $vSwitch2Nics[1]
$vportgroup2 | Get-NicTeamingPolicy | Set-NicTeamingPolicy -MakeNicActive $vSwitch2Nics[1] -MakeNicStandBy $vSwitch2Nics[0]
}

Cleaning up ARP tables – VMware ESXi 5.1

Cleaning up ARP tables – VMware ESXi 5.1

We had a recent situation where a virtual appliance (Avamar proxy) was mistakenly deployed configured with the IP address of the gateway. This caused the hosts to be unreachable from vCenter. After finding the offending VM and shutting it down from the console, the ARP tables either had to be cleaned up (you could also allow them to time out but that could take up to 20 minutes)

ESXi 5.1
esxcli network ip neighbor list
vsish -e set /net/tcpip/v4/neighbor del IPADDRESS

New functionality added in 5.5 to remove ARP entries through esxcli

Find a VM by MAC in vCenter with PowerCLI
Get-vm | Select Name, @{N=“Network“;E={$_ | Get-networkAdapter | ? {$_.macaddress -eq“00:50:56:A4:22:F4“}}} |Where {$_.Network-ne “”}

Also you may need to clean up your Windows vCenter server from the command prompt
arp -a
arp -d IPADDRESS

Or,
netsh interface ip delete arpcache
Also a good idea to:
ipconfig /flushdns