vSphere with Tanzu on VMware Cloud Foundation – Enable Embedded Harbor Registry

Blog Date: December 2, 2021
VMware Cloud Foundation 4.3.1 used during deployment.

In my last post, I went over the process of Configuring Your First Kubernetes Namespace. In this post I will go over the simple steps to enabling the embedded harbor registry. There are basically two methods to deploying a Harbor registry; the embedded harbor which I will be showing in this post, and Deploying Harbor Registry as a Shared Service.

Note: To use the embedded Harbor Registry, you must deploy the Supervisor Cluster with NSX-T Data Center as the networking solution.

Procedure:

In the vSphere client/workload domain that has Tanzu deployed, select the compute cluster, click on the configure tab, and then in the center menu, scroll down until you find Namespaces. Under Namespaces, select “Image Registry”.

Click Enable Harbor.

In the Select Storage Policies window, select K8S Storage Policy and click OK

The embedded Harbor Registry begins deploying.

The deployment can take up to 20 minutes depending on how large the cluster is, but I have seen the deployment take less than 10 minutes for small clusters of four.

You’ll find the IP address and link to the Harbor UI on this page once the deployment completes. We’ll come back to this in a later post. If you’d like, you can log into the harbor registry UI with the user and or group account that was defined in the namespace permissions section.

In my next post, I’ll go over the steps of getting the Ubuntu VM ready, which I will either refer to as the developer box or developer workstation. We’ll get docker installed, the vsphere plugin which has the Kubernetes cli, and the docker credential helper to start with. Later on we’ll also install some TKG extensions like helm, kapp, and ytt.

vSphere with Tanzu on VMware Cloud Foundation – Configuring Your First Kubernetes Namespace.

Blog Date: December 2, 2021
VMware Cloud Foundation 4.3.1 Used During Deployment.

In my previous post, I described how to deploy vSphere with Tanzu on a VMware Cloud Foundation 4.3.1 instance. In this post I will describe how to configure your first namespace.

Procedure:

Access the vSphere client. Select Menu > Workload Management > Namespaces.

Click Create Namespace.

Expand the inventory tree and select the compute cluster.

As am example, you can enter namespace-01 as your namespace name. (The name must be in a DNS-compliant format (a-z, 0-9, -)).

Click Create. ( The namespace is created and shows a Config Status of Running and a Kubernetes Status of Active.)

Select the Don’t show for future workloads check box.

Click Got It.

Now we can move on to the next section and apply permissions, storage and VM class.

  1. Click Add Permissions
    1. Identity source: <make selection>
    2. User/Group Search: <customer specific>. In this example, I have created a vsphere.local account. You can easily use an active directory account or group here.
    3. Role: <customer specific>. In this example, I have chosen “can edit” that way I can create and destroy things inside the namespace.
    4. Click Ok
    5. (Rinse-wash-repeat as necessary)

Click Add Storage and add the storage policy. 

The namespace is configured with a storage policy and user permissions.  The assigned storage policy is translated to a Kubernetes storage class.

Under VM Service, click Add VM Class. Here we need to associate a VM class with the namespace, that will allow developers to self-service VMs in the namespace. This gives vSphere administrators flexibility with resources available in the cluster. In this example, best-effort-xsmall was chosen because this is a nested lab environment. You should work with your developers to determine the best sizing strategy for the containerized workloads.

Now that the Namespace, Storage, and VM Class policies have all been defined, your window should look something like:

That’s it. Technically we can start deploying workloads to the new namespace. However, because I am already logged into vSphere, I like to enable the embedded Harbor registry. In my next post, I’ll go over the simple process of how to enable the embedded harbor registry.

vSphere with Tanzu on VMware Cloud Foundation – Deployment.

Blog Date: December 1, 2021

VMware Cloud Foundation 4.3.1 Used During Deployment:

Assumptions:

In my previous post vSphere with Tanzu on VMware Cloud Foundation/vSphere with NSX-T Requirements, I went over the requirements I pass along to customers, along with the supporting VMware documentation, and this post assumes those requirements and those in the VMware documentation have been met.

Procedure:

  1. Validate/Create a Storage Policy for vSphere with Tanzu, if default vSAN policy wont be used.
  2. Validate Deployed NSX Edge Cluster.
  3. Validate/Add NSX-T Network Segments
  4. Validate/Configure NSX-T IP Prefixes on the Tier-0 Gateway
  5. Validate/Configure NSX-T Route Maps on the Tier-0 Gateway
  6. Validate MTU greater than or equal to 1600 on all networks that will carry Tanzu traffic i.e. management network, NSX Tunnel (Host TEP, Edge TEP) networks, and the external network.
  7. Create a Subscribed Content Library for vSphere with Kubernetes.
  8. Deploy vSphere with Tanzu:

Deployment Steps:

After you have configured VM (storage) policies in vSphere and added segments in NSX-T Data Center, you can deploy vSphere with Tanzu.  SDDC Manager first validates your environment then redirects you to the vSphere Client where you complete the deployment. From the SDDC manager UI, navigate to Solutions and select Deploy.

Select All to have SDDC manager run a deployment prerequisites check.  Click Begin.

Select the desired cluster for the Tanzu workload.  Click Next

The SDDC manager will begin running the Validation. All Statuses should succeed .  Else troubleshoot/Retry.  Click Next.

After successful validation, SDDC will switch you over to the vSphere client to complete the deployment.

In the vSphere client, select the desired cluster to enable Tanzu.  Click Next.

Next, select the size of the control plane. Click Next.

Fill in the Management Network details.

Scroll down, and fill in the Workload Network details.  As mentioned in a previous post, I will argue that the API Server endpoint FQDN entry is mandatory when applying a certificate. NOTE: The Pod and Service CIDRs are non-routable. The UI provides default values that can be used, otherwise you specify your own. The Ingress and Egress CIDRs will be routable networks defined by the network team.  Click Next.

Select the storage policy for Control Plane Nodes, Ephemeral Disks, Image cache. vSAN Default Storage Policy can be used if only storage/cluster provided. Click Next.

That’s it.  Click Finish.  The Tanzu deployment will now proceed (The entire process can take up to 45 minutes to complete).


One the process is complete, we will see a config status of running with a green check. The cluster will have a yellow triangle because we have not assigned a license yet.

The Control Plane Node IP address is the same API Server Endpoint  we referred to earlier in this post. This will be the end point where you can download and install the vSphere plugin and the vSphere docker credential helper. To validate connectivity, simply open a web browser and go to the IP address http://<ip-address&gt;

From here, you can download the CLI plugin for windows.

If you are not able to reach the Control Plane Node IP address/API Server Endpoint, it is possible that you might have invalid MTU settings in your environment that will require further troubleshooting. I did come across this at a customer site, and documented the MTU troubleshooting process here. Good luck.

In my next post, I will cover how to configure your first namespace.

vSphere with Tanzu on VMware Cloud Foundation/vSphere with NSX-T Requirements

Blog Date: December 1, 2021

VMware Cloud Foundation 4.3.1 Used During Deployment:

Summary requirements:

– For VMware Cloud Foundation 4.3.1 software requirements, please refer to the Cloud Foundation Bill of Materials (BOM)
– For vSphere with Tanzu, it is recommended to have at least three hosts with 16 CPU Cores per host (Intel), 128GB per host, 2x 10GBe Nics per host, and a shared datastore of at least 2.5 TB.
– This post assumes VCF 4.3.1 has already been deployed following recommended practices.

System Requirements for Setting Up vSphere with Tanzu with NSX-T Data Center

The following link to VMware’s documentation should be reviewed prior to installation of Tanzu. This link includes networking requirements.
Tanzu with NSX-T Data Center Requirements

Specific Tanzu Call-outs:

IP Network

POD CIDR

Services CIDR

Ingress CIDR

Egress CIDR

Management Network

IPv4 CIDR Prefix

Greater than or equal to /20

Greater than or equal to /22

Greater than or equal to /24

Greater than or equal to /24

5 consecutive IPs

Justification:

POD CIDR: Dedicate a /20 subnet for pod networking. Private IP space behind a NAT that you can use in multiple Supervisor Clusters.Note: when creating TKG clusters, the IP addresses used for the K8s nodes are also allocated from this IP block.
Services CIDR: Dedicate a /22 subnet for services. Private IP space behind a NAT that you can use in multiple Supervisor Clusters.
Ingress CIDR: TKGS sources the ingress CIDR for allocating routable addresses to the Kubernetes clusters’ API VIP, ingress controller VIP, and service-type Load-Balancer VIP(s). Note: This subnet must be routable to the rest of the corporate network.
Egress CIDR: TKGS sources the egress CIDR to enable outbound communication from namespace pods. NSX-T automatically creates a source network address translation (SNAT) entry for each namespace, mapping the pod network to the routable IP address, respectively. Note: This subnet must be routable to the rest of the corporate network
Management Network: Five consecutive IP addresses on the Management network are required to accommodate the Supervisor VMs.
MTU: Greater than or equal to 1600 on all networks that will carry Tanzu traffic i.e. management network, NSX Tunnel (Host TEP, Edge TEP) networks, and the external network.


Note:  Kubernetes cluster requires identifying private IPv4 CIDR blocks for internal pod networks and service IP addresses. The Pods CIDR and Service CIDR blocks cannot overlap with IPs of Workload Management components (vCenter, ESXi hosts, NSX-T components, DNS, and NTP) and other data center IP networks communicating with Kubernetes pods. The minimum Pods CIDR prefix length is /23, and the minimum Service CIDR prefix length is /24.

Also: If using Tanzu Mission Control, TKG cluster components use TCP exclusively (gRPC over HTTP to be specific) to communicate back to Tanzu Mission Control with no specific MTU outbound requirements (TCP supports packet segmentation and reassembly).

Deploy an NSX-T Edge Cluster

The following link to VMware’s documentation should be reviewed, and can be used to deploy the necessary NSX Edge Cluster prior to the Tanzu deployment.
Deploy an NSX-T Edge Cluster

Enterprise Service Requirements

  • DNS: System components require unique resource records and access to domain name servers for forward and reverse resolution.
  • NTP: System management components require access to a stable, common network time source; time skew < 10 seconds.
  • DRS and HA: Need to be enabled in the vSphere cluster
  • (Useful but Optional): Ubuntu developer VM with docker installed for use while interacting with Tanzu.

In my next post, I’ll go over deploying vSphere with Tanzu on VCF with NSX-T.

vSphere with Tanzu: Project Contour TLS Delegation Invalid – Secret Not Found

Blog Date: June 25, 2021
Tested on vSphere 7.0.1 Build 17327586
vSphere with Tanzu Standard

On a recent customer engagement, we ran into an issue where after we deployed project Contour, and created a TLS delegation “contour-tls”, but we ran into an issue where Contour did not like the public wildcard certificate we provided. We were getting an error message “TLS Secret “projectcontour/contour-tls” is invalid: Secret not found.”

After an intensive investigation to make sure everything in the environment was sound, we came to the conclusion that the “is invalid” part of the error message suggested that there was something wrong with the certificate. After working with the customer we discovered that they included the Root, the certificate, and the intermediate authorities in the PEM file. The root doesn’t need to be in the pem.  Just the certificate, and the intermediate authorities in descending order. Apparently that root being in the pem file made Contour barf. Who knew?

You could possibly see that the certificate is the issue by checking the pem data for both the <PrivateKeyName>.key and the <CertificateName>.crt by running the following commands, and comparing the pem output. IF it doesn’t match this could be your issue as well. The “<>” should be updated with your values, and don’t include these “<” “>”.

openssl pkey -in <PrivateKeyName>.key -pubout -outform pem | sha256sum
openssl x509 -in <CertificateName>.crt -pubkey -noout -outform pem | sha256sum

Below are the troubleshooting steps we took, and what we did to resolve the issue. We were using Linux, and had been logged into vSphere with Tanzu already. Did I mention that I hate certificates? But I digress….

The Issue:

You had just deployed a TKG cluster, and then deployed Project Contour as the ingress controller that uses a load balancer to be the single point of entry for all external users. This connection terminates SSL connections, and you have applied a public wildcard certificate to it. You created the TLS secret, and have created the TLS delegation, so that new apps deployed to this TKG cluster can delegate TLS connection terminations to contour. However, after you deployed your test app to verify the TLS delegation is working, you see a status of “Invalid. At least one error present, see errors for details.”, after running the following command:


kubectl get httpproxies

Troubleshooting:

  1. You run the following command to gather more information, and see in the error message: “Secret not found” Reason: “SecretNotValid
kubectl describe httpproxies.projectcontour.io

2. You check to make sure the TLS Secret was created in the right namespace with the following command, and you see that it is apart of the desired namespace. In this example, our namespace was called projectcontour, and the TLS secret was called contour-tls.


kubectl get secrets -A

3. You check the TLS delegation to make sure it was created with the following command. In this example ours was called contour-delegation, and our namespace is projectcontour.

kubectl get tlscertificatedelegations -A

4. You look at the contents of the tlscertificatedelegations with the following command, and nothing looks out of the ordinary.

kubectl describe tlscertificatedelegations -A

5. You check to see the secrets of the namespace with the following command. In this example our namespace is called projectcontour and we can see our TLS delegation contour-tls.


kubectl get secrets --namespace projectcontour

6. You validate contour-tls has data in it with the following command. In this example, our namespace is projectcontour and our TLS is contour-tls.

kubectl get secrets --namespace projectcontour contour-tls -o yaml

In the yaml output, up at the top you should see tls.crt: with data after

Down towards the bottom of the yaml output, you should see tls.key with data after

Conclusion: Everything looks proper on the Tanzu side. Based on the error message we saw “TLS Secret “projectcontour/contour-tls” is invalid: Secret not found.” The “is invalid” part could suggest that there is something wrong with the contents of the TLS secret.  At this point, the only other possibility would be that the public certificate has something wrong and needs to be re-generated.   The root doesn’t need to be in the pem.  Just the certificate for the site, and intermediate authorities in descending order.

The Resolution:

  1. Create and upload the new public certificate for contour to vSphere with Tanzu.
  2. We’ll need to delete the secret and re-create it.  Our secret was called “contour-tls”, and the namespace was called “projectcontour”.
kubectl delete secrets contour-tls -n projectcontour

3. We needed to clean our room, and delete the httpproxies we created in our test called test-tls.yml, and an app that was using the TLS delegation. In this example it was called tls-delegation.yml

kubectl delete -f test-tls.yml
kubectl delete -f tls-delegation.yml

4. Now we created a new secret called contour-tls with the new cert. The <> indicates you need to replace that value with your specific information. The “<>” does not belong in the command.

kubectl create secret tls contour-tls -n projectcontour --cert=<wildcard.tanzu>.pem --key=<wildcard.tanzu>.key

5. Validate the pem values for .key and .crt match. The <> indicates you need to replace that value with your specific information. The “<>” does not belong in the command.


openssl pkey -in <PrivateKeyName>.key -pubout -outform pem | sha256sum

openssl x509 -in <CertificateName>.crt -pubkey -noout -outform pem | sha256sum

6. If the pem numbers match, the certificate is valid. Lets go ahead an re-create the tls-delegation.  Example command:

kubectl apply -f tls-delegation.yml

7. Now you should be good to go. After you deploy your app, you should be able to check the httpproxies again for Project Contour, and see that it has a valid status

kubectl get httpproxies.projectcontour.io

If all else fails, you can open a ticket with VMware GSS to troubleshoot further.

vSphere with Tanzu Validation and Testing of Network MTU

Blog Date: June 18, 2021
Tested on vSphere 7.0.1 Build 17327586
vSphere with Tanzu Standard


On a recent customer engagement, we ran into an issue where vSphere with Tanzu wasn’t successfully deploying. We had intermittent connectivity to the internal Tanzu landing page IP. What we were fighting ended up being inconsistent MTU values configured both on the VMware infrastructure side, and also in the customers network. One of the many prerequisites to a successful installation of vSphere with Tanzu, is having a consistent MTU value of 1600.


The Issue:


Tanzu was just deployed to an NSX-T backed cluster, however you are unable to connect to the vSphere with Tanzu landing page address to download Kubernetes CLI Package via wget.  Troubleshooting in NSX-T interface shows that the load balancer is up that has the control plane VMs connected to it.


Symptoms:

  • You can ping the site address IP of the vSphere with Tanzu landing page
  • You can also telnet to it over 443
  • Intermittent connectivity to the vSphere with Tanzu landing page
  • Intermittent TLS handshake errors
  • vmkping tests between host vteps is successful.
  • vmkping tests from hosts with large 1600+ packet to nsx edge node TEPs is unsuccessful.


The Cause:


Improper/inconsistent MTU settings in the network data path.  vSphere with Tanzu requires minimum MTU of 1600.   The MTU size must be 1600 or greater on any network that carries overlay traffic.  See VMware documentation here:   System Requirements for Setting Up vSphere with Tanzu with vSphere Networking and NSX Advanced Load Balancer (vmware.com)


vSphere with Tanzu Network MTU Validations:


These validations should have been completed prior to the deployment. However, in this case we were finding inconsistent MTU settings. So to simplify, these are what you need to look for.

  • In NSX-T, validate that the MTU on the tier-0 gateway is set to a minimum of 1600.
  • In NSX-T, validate that the MTU on the edge transport node profile is set to a minimum of 1600.
  • In NSX-T, validate that the MTU on the host uplink profile is set to a minimum of 1600.
  • In vSphere, validate that the MTU on the vSphere Distributed Switch (vDS) is set to a minimum of 1600.
  • In vSphere, validate that the MTU on the ESXi management interface (vmk0) is set to a minimum of 1600.
  • In vSphere, validate that the MTU on the vxlan interfaces on the hosts is set to a minimum of 1600.


Troubleshooting:

In the Tanzu enabled vSphere compute cluster, SSH into an ESXi host, and ping from the host’s vxlan interface to the edge TEP interface.  This can be found in NSX-T via: System, Fabric and select Nodes, edge transport nodes, and find the edges for Tanzu. The TEP interface IPs will be to the right.  In this lab, I only have the one edge. Production environments will have more.

  • In this example, vxlan was configured on vmk10 and vmk11 on the hosts. Your mileage may vary.
  • We are disabling fragmentation with (-d) so the packet will stay whole. We are using a packet size of 1600
# vmkping -I vmk10 -S vxlan -s 1600 -d <edge_TEP_IP>
# vmkping -I vmk11 -S vxlan -s 1600 -d <edge_TEP_IP>
  • If the ping is unsuccessful, we need to identify the size of the packet that can get through.  Try a packet size of 1572. If unsuccessful try 1500.  If unsuccessful try 1476. If unsuccessful try 1472, etc.

To test farther up the network stack, we can perform a ping something that has a different VLAN, subnet, and is on a routable network.  In this example, the vMotion network is on a different network that is routable. It has a different VLAN, subnet, and gateway.  We can use two ESXi hosts from the Tanzu enabled cluster.

  1. Open SSH sessions to ESXi-01 and ESXi-02.
  2. On ESXi-02, get the PortNum for the vMotion vmk. On the far left you will see the PortNum for the vMotion enabled vmk. Run the following command:
# net-stats -l

3. Run a packet capture on ESXi-02 like so:

# pktcap-uw --switchport <vMotion_vmk_PortNum> --proto 0x01 --dir 2 -o - | tcpdump-uw -enr -

4. On the ESXi-01 session, use the vmkping command to ping the vMotion interface of ESXi-02.  In this example we use a packet size of 1472 because that was the packet size the could get through, and option -d to prevent fragmentation.

# vmkping -I vmk0 -s 1472 -d <ESXi-02_vMotion_IP>

5. On the ESXi-02 session, we should now see six or more entries. Do a CTRL+C to cancel the packet capture.

6. Looking at the packet capture output on ESXi-02, We can see on the request line that ESXi-01 MAC address made a request to ESXi-02 MAC address.

  • On the next line for reply, we might see a new MAC address that is not ESXi-01 or ESXi-02.  If that’s the case, then give this MAC address to the Network team to troubleshoot further.



Testing:

Using the ESXi hosts in the Tanzu enabled vSphere compute cluster, we can ping from the host’s vxlan interface to the edge TEP interface.

  • The edge TEP interface can be found in NSX-T via: System, Fabric and select Nodes, edge transport nodes, and find the edges for Tanzu. The TEP interface IPs will be to the far right.
  • You will need to know what host vmks the vxlan is enabled. In this example we are using vmk10 and vmk11 again.

In this example we are using vmk10 and vmk11 again. We are disabling fragmentation with (-d) so the packet will stay whole. We are using a packet size of 1600. These should now be successful.

The commands will look something like:

# vmkping -I vmk10 -S vxlan -s 1600 -d <edge_TEP_IP>
# vmkping -I vmk11 -S vxlan -s 1600 -d <edge_TEP_IP>

On the ESXi-01 session, use the vmkping command to ping something on a different routable network, so that we can force traffic out of the vSphere environment and be routed. In this example just like before, we will be using the vMotion interface of ESXi-02. Packet size of 1600 should now work. We still use option -d to prevent fragmentation.

# vmkping -I vmk0 -s 1600 -d <ESXi-02_vMotion_IP>

On the developer VM, you should now be able to download the vsphere-plugin.zip from the vSphere with Tanzu landing page with the wget command.

# wget https://<cluster_ip>/wcp/plugin/linux-amd64/vsphere-plugin.zip

With those validations out of the way, you should now be able to carry on with the vSphere with Tanzu deployment.

Kubernetes API Server Endpoint FQDN Missing from Certificate SAN in vSphere with Tanzu Deployment.

Blog Date: June 17, 2021
Tested on vSphere 7.0.1 Build 17327586
vSphere with Tanzu Standard

The Issue:

On a recent customer engagement we ran into an issue when we applied the CA signed certificate to the vSphere with Tanzu enabled cluster. The customer could reach the Tanzu Landing Page (internal Kubernetes site address) with the assigned IP, and they received the secure lock on the site. However, they received an invalid certificate warning when trying to connect to the internal Kubernetes site with the FQDN.  Upon closer inspection we realized that the FQDN is not apart of the certificate Subject Alternative Name (SAN). We had also found that this customer had MTU inconsistencies in their environment, and we ended up redeploying vSphere with Tanzu a couple of times. There will be another blog post for that, but in regards to this blog, on the last deployment we missed adding the FQDN during the setup.

The Cause:

When enabling vSphere with Tanzu on a compute cluster, during the deployment wizard, you are asked for the API Server endpoint FQDN (fully-qualified domain name).  You will notice this says “Optional”. 

However, because this value was not filled out during the deployment, it will not be in the SAN when you create a CSR to apply a certificate to the Tanzu supervisor cluster. 

Resolution:

Currently the only “easy” fix for this would be to redeploy vSphere with Tanzu on the cluster assuming you are early on in the deployment. 

However, if you have already deployed workloads this will be destructive.  Your only other option is to open a ticket with VMware GSS, and they will need to add the missing entry to the database on the vCenter.

I wouldn’t expect there to be a public KB article on this as we do not want customers editing the vCenter database without GSS guidance.  There is currently no way to add the missing API Server endpoint FQDN in the UI.  As of 6/16/2021, I heard an unconfirmed rumor that a feature request has been added for this, and there will be an option to edit this in the UI. However, there’s currently no ETA on when it will be added. 

A big shout-out to Nicholas M. in GSS for helping me to resolve. Even though I cannot share the full resolution here, I can at least help others troubleshoot.

Advanced Troubleshooting:

  1. If you really need to confirm that this is your issue, we can open a putty session to the vCenter.

2. Next we can check the database to to find what MasterDNSName was entered during the time of deployment.  In my test, I only have a single compute cluster that has vSphere with Tanzu enabled.  Your mileage may vary if you have more than one cluster enabled.  Enter the following command to view the table.  We are not making changes here. 

# PGPASSFILE=/etc/vmware/wcp/.pgpass psql -U wcpuser -d VCDB -h localhost -x -c "select desired_config from cluster_db_configs where cluster like 'domain-c%';" | less

3. Initially you will see a bunch of lines on your console. If you hit the “page down” key once or twice to get past these lines (if needed, lowercase g to go back to the top).  Look for MasterDNSNames. This would be the API Server endpoint FQDN.  If the value = null, the API Server endpoint FQDN was left blank during the setup. 

You cannot edit this. This however will confirm that the api server endpoint FQDN was not entered during the initial deployment.

5. hit q to quit

As stated previously, if this is a fresh deployment, the easiest path forward would be to re-deploy vSphere with Tanzu on the compute cluster in vSphere.

If you already have running workloads, your only other option at this point would be to open a support request with VMware GSS.

Collect Windows/Linux Virtual Server System Logs Using vRealize Log Insight

Blog Date: 06/16/2021
vRealize Log Insight 8.3

I recently had a customer engagement with Log Insight, and not only did they want to use it as the main log collector for their infrastructure, but they also wanted it to collect logs from their Windows virtual servers. Good news! There is a content pack called “Microsoft – Windows” that should be installed.  This contains a configuration template for windows servers.  This is used to create a group, so that every time a windows box has the agent installed, the agent picks up the settings from the server and forwards its logs.  Otherwise, we must configure each agent ini file manually which is not ideal.    There is also a Linux content pack on the market place that can be setup as well. This blog will focus on the Windows content pack, but the steps for Linux is very similar.

The ” Microsoft – Windows” content pack can be found in the Marketplace in Log Insight located on the Content Packs tab.

Once this is installed, go back to the Administration tab, and click on agents in the left column.  Click the down carrot next to All Agents, and find “Microsoft – Windows” in the list.  To the right of it, click the double box icon to copy the template.

Name the group, and click copy.

Now you configure the filter to find the windows server.  In my example, I chose “OS” “Matches” “Microsoft Windows Server 2016 Datacenter”. Click Save New Group button below.

Now that we have a Windows group defined for the agents, go ahead and install the agent on the Windows Server, and now it will have a proper configuration and begin forwarding its logs.  If the box already has the agent installed, you might need to restart the agent, or reinstall it. 

Likewise, there is also a Linux content pack on the market place that can be setup as well.  For either, we don’t have to create one group to rule them all.  You can get creative with your group filters, and have specific groups for specific server functions. 

VMware

Microsoft CA: Installing Custom vSphere 7 Certificates, and Enabling VMCA To Act As The Intermediate CA.

Blog Date: 3/03/2021
Versions Tested:
vCSA: 7.0.1
ESXi: 7.0.1
Microsoft CA Server: 2016

Recently I had a customer that wanted to install their custom certificates on a new vCenter, and have it act as an Intermediate CA to install approved certificates on their hosts. This is common, but is something I have never done myself, so I ended up testing this out in my lab. In this long blog post, I will walk through:

  • Generating a Certificate Signing Request (CSR) for the vCenter.
  • Signing the request, creating the certificate using a standalone Microsoft CA.
  • Creating a trusted root chain certificate.
  • Installing the custom signed machine SSL certificate.
  • Installing the custom signed VMCA root certificate.
  • Renew host certificates and test.

Before we get started, it is worthwhile to note if you were unaware that there are different certificate modes in vSphere 7. Bob Plankers does a great job explaining these modes in the following VMware Blog: vSphere 7 – Certificate Management

I have done my best to focus the steps involved, and provided those below while testing this out in my home lab. Your mileage may vary.
_________________________________________________________________________________________________________

Snapshot the vCenter

We will be applying Custom Machine and Trusted Root certificates to the vCenter, and it is good practice to take a snapshot of the vCenter appliance beforehand.

_________________________________________________________________________________________________________

Testing vCenter Connection with WinSCP

We will be using WinSCP to transfer the files to and from the vCenter. You should validate that you can connect to yours first. If you are having problems connecting to the vCenter win WinSCP, follow this KB2107727.

_________________________________________________________________________________________________________

GENERATE CERTIFICATE SIGNING REQUEST (CSR)

Start an SSH session with the vCenter appliance, and enter “/usr/lib/vmware-vmca/bin/certificate-manager” to launch the utility. We want option #2

Enter “Y” at the prompt: Do you wish to generate all certificates using configuration file : Option[Y/N] ? :

Enter privileged vCenter credentials. The default administrator@vsphere.local will do the trick. Hit Enter key, then input the password and hit the enter key again.

If there is a valid certtool.cfg file that exists with your organizations information you can select N here.

Else if

IF there is no certool.cfg file or you would like to look at the file, use option “Y” here.

In the next series of prompts, you will be required to enter information specific to your environment. The inputs in purple below are examples I used for my lab, and yours will be different:
value for ‘Country’ [Default value : US] : US
value for ‘Name’ [Default value : CA]: cb01-m-vcsa01.cptvops-lab.net
value for ‘Organization’ [Default value : VMware]: CaptainvOPS
value for ‘OrgUnit’ [Default value : VMware Engineering]: CaptainvOPS Lab
value for ‘State’ [Default value : California]: Colorado
value for ‘Locality’ [Default value : Palo Alto] Fort Collins
value for ‘IPAddress’ (Provide comma separated values for multiple IP addresses) [optional]: 10.0.1.28
value for ‘Email’ [Default value : email@acme.com] : info@cptvops-lab.net
value for ‘Hostname’ (Provide comma separated values for multiple Hostname entries) [Enter valid Fully Qualified Domain Name(FQDN), For Example : example.domain.com] : cb01-m-vcsa01.cptvops-lab.net
Enter proper value for VMCA ‘Name’ : cb01-m-vcsa01.cptvops-lab.net

This image has an empty alt attribute; its file name is image-5.png

Enter option “1” to Generate Certificate signing requests

It will prompt for an export directory. In this example I used /tmp/

Now using WinSCP, copy the files off to the workstation you will be using to connect the the Microsoft CA.

_________________________________________________________________________________________________________

SIGNING THE REQUEST, CREATING THE CERTIFICATE USING STANDALONE MICROSOFT CA

I am using a standalone Microsoft CA I configured on my lab’s jump server, and don’t have the web portal. Launch the Certification Authority app.

Right-click the CA, select all tasks, and select Submit new request.

Once you browse to the directory where you stored the “vmca_issued_csr.csr” and “vmca_issued_key.key” files, you will need to change the view to “All Files (*.*)” in the lower right.

Now select the “vmca_issued_csr.csr”

Now go to “Pending Requests”. Right-click on pending cert request, go to “all tasks”, and select “Issue”.

Now go to “Issued Certificates”, double-click to open the certificate, go to the “details” tab and click the Copy to File… button.

When the wizard starts, click Next.

Select “Base-64-encoded X.509 (.CER), and Click Next

Click Browse

type in the file name and click save. In this example, I just use “vmca”.

Click Next, and then click Finish. Now click OK to close the wizard.


_________________________________________________________________________________________________________

CREATING A TRUSTED ROOT CHAIN CERTIFICATE

For this next part we will need the root certificate from the Microsoft CA. Right-click on the CA and select “Properties”

On the General tab, click the “View Certificate” button. Then on the new Certificate window that opens, click the details tab and then the “Copy to File…” button

When the wizard starts, click Next.

Select “Base-64-encoded X.509 (.CER), and Click Next

Just like before, browse to a locate to save the cert, and give it a meaningful name. In this example, I just use “root-ca”

Click Next and then click Finish. Close the windows.

In order to create a trusted root, we need to create a new file that has the root chain. This file will include the vmca.cer first, and then the root-ca.cer beneath it. I just open the vmca.cer file and root-ca.cer file with Notepad ++

Save it as a new file. In this example I use: vmca_rootchain.cer.

Using WinSCP, I create a new directory in the vCenter’s root directory called CertStore, and then copy the “vmca.cer”, “vmca_rootchain.cer”, and the “vmca_issued_key.key” there.

_________________________________________________________________________________________________________

INSTALLING THE CUSTOM MACHINE SSL CERTIFICATE

Open an SSH session to the vCenter, launch the certificate-manager: “/usr/lib/vmware-vmca/bin/certificate-manager”. First we will replace the Machine SSL certificate, so select option 1

Again we are prompted for vCenter authoritative credentials, and just like before we’ll use the administrator@vsphere.local account and password.

Now we reach the menu where we will use Option 2 to import custom certificates. Hit enter.

It will prompt for the directory containing the cert file for the Machine SSL. In this example I used: /CertStore/vmca.cer

Now it prompts for the custom key file. In this example I used: /CertStore/vmca_issued_key.key

Now it will ask for the signing certificate of the Machine SSL certificate. Here we will use the custom vmca_rootchain.cer file we created earlier: /CertStore/vmca_rootchain.cer

Respond with “Y” to the prompt: You are going to replace Machine SSL cert using custom cert
Continue operation : Option[Y/N] ? :

Assuming all of the information in the files was correct, the utility will begin replacing the machine SSL certificate. Watch the console for any errors and troupleshoot accordingly. 9 out of 10 times fat fingers are the cause for errors. Or dumb keyboards..

In my case it finished successfully.


_________________________________________________________________________________________________________

INSTALLING THE CUSTOM SIGNED VMCA ROOT CERTIFICATE

Relaunch the certificate-manager: “/usr/lib/vmware-vmca/bin/certificate-manager”. Use Option #2 now.

Enter Option: Y

Enter the authoritative vCenter account and password. Just like before I’m using the administrator@vsphere.local

We already have a known good certool.cfg file. We’ll skip ahead to the import, so use option: N

Use option #2 to import the custom certificates.

We are prompted for the custom certificate for root. Here we will use the custom vmca_rootchain.cer file we created earlier: /CertStore/vmca_rootchain.cer

Here we are prompted for the valid custom key for root. Just like last time we will use: /CertStore/vmca_issued_key.key

Use option “Y” to replace Root Certificates at the next prompt

Now the utility will replace the root certificate.


_________________________________________________________________________________________________________

TESTING THE VSPHERE CERTIFICATE

After clearing the browser cache, we can see the secure padlock on the login page for vSphere 7.

Now log into vSphere, go to Administration, and Certificate Management. Depending on how quickly you logged in after applying the certs, you may see an error on this page. Give it up to 5 minutes, and you’ll see a prompt at the top of the screen asking you to reload. Do it. Now you will see a clean Certificates Management screen. You should see a _MACHINE_CERT with your organization’s details. You will also see three Trusted Root Certificates. One of which will have your Organizations details.


_________________________________________________________________________________________________________

RENEW HOST CERTIFICATES AND TEST

If you have hosts already attached to the vCenter, and you would like to “Renew” the certificate, by default you will need to wait 24 hours before the certificate can be updated. Otherwise you will receive a similar error in vCenter:

If you need to update the certificate right away, follow KB2123386 to change the setting from 1440 to 10. This will allow for updating the ESXi certificate right away. I’d personally change it back after.

Either 24 hours has passed, or you followed the KB2123386 mentioned above. Now you can “Renew” the certificate on the ESXi host(s). You’ll see the Issuer of the certificate has changed, and should reflect your information. This operation doesn’t appear to work while the host is in maintenance mode.

If you browse to the ESXi host, it too will now have a padlock on its login page signifying a valid certificate.

A career milestone has been acquired. Joining VMware!

My time with IntegraTouch LLC has been nothing short of amazing. Through this role as a Senior consultant, I have been able to work along side customers in different public and private industries. I have been challenged professionally, mentally, and technically. I have earned the respect from my customers as a trusted advisor, an educator, and trainer. I have taken the time to enjoy the little things in life, like the rush of getting to board a plane and jetting off to a customer site in a different city/town, instead of the normal workday at the same office day after day. I have nothing but nice things to say about my time working for David, Kevin, and the IntegraTouch LLC family. For the first time in a long time, I felt like I could actually breathe in this role and make my own decisions, and I really feel it helped me grow professionally. While working for IntegraTouch, I found my voice to go speak at many different VMUG UserCons, which is saying a lot considering the social anxiety I feel that comes with being an introvert. But, continuously meeting new customers and starting new assignments, is a lot like that nervous feeling we all get starting a new job. I think consulting, and being a public speaker has helped me in many ways.

I’m still honestly in disbelief where I have found myself these past couple of months. I remember growing up, my parents always encouraged me to do better. I grew up in rural New York, where farm tractors, horses and cows, were more common than traffic jams. This life I’ve chosen has been an uphill battle, but I’ve been able to surround myself with friends and family who support me, encouraged me, forever driving me forward. I can look back on all that I have accomplished, and how far I have come today. My video game hobby fostered a relationship with technology that I still hold close today. It has been a rough winding road, but it is with great excitement that I can finally announce that I will be joining VMware as a full time Senior Consultant, SDDC – on January 11th! Working for VMware as a sub contractor has been fun these past two years, but working for the company as an employee has been a career goal of mine since 2015. A career milestone has just been acquired, and I am looking forward to what the future will bring.