Blog Date: August 16, 2022 vSphere version 7.0.3 Build 20150588
Storage Policies for vSphere with Tanzu
Part of the prep work prior to a Tanzu Kubernetes deployment in your environment is to configure a storage policy or policies for Tanzu workloads. For more information, consult VMware's Storage Policies for vSphere with Tanzu and vSphere with Tanzu Storage documentation. Here I will walk through the basic configuration we set up for customers to get them started.
Create the Storage Tag
1 – In the vSphere inventory, select the desired storage to house the Tanzu Kubernetes workloads. 2 – Under Tags, click Assign.
3 – Click Add Tag
4 – Name the tag. In this example we are using tanzu.
5 – Click Create New Category, and give it a name. In this example we used: k8s. Leave all other defaults, and click CREATE.
6 – Click CREATE again to complete the tag setup.
7 – Assign the newly created tanzu tag to the datastore.
Create and Assign a Storage Policy
1 – On the vSphere menu, select “Policies and Profiles”. 2 – Under VM Storage Policies, click CREATE.
3 – Name the storage profile. Make sure to use a DNS-compliant name (lowercase, no spaces). In this example, we used: tanzu-storage. Click NEXT.
4 – On the Policy structure screen, enable tag based placement rules. Click NEXT.
5 – On the Tag based placement screen: 5a – Choose the Tag category: k8s 5b – Usage option: Use storage tagged with Tags: tanzu 5c – Click NEXT
6 – On the Storage compatibility screen, the tagged compatible datastores will be listed. Click NEXT
7 – Review and click FINISH.
At this point, we just completed the configuration and assignment of storage tags and policies. Let’s create the subscribed content library for Tanzu.
Create Subscribed Content Library for Tanzu Kubernetes
For more information on creating a subscribed content library for Tanzu Kubernetes, see VMware documentation here.
1 – vSphere Menu, Select Content Libraries, and click CREATE.
2 – Name the new content library (example: tanzu-content-library), select the desired vCenter and click NEXT.
3 – Configure the content library. 3a – Choose "Subscribed content library" 3b – Enter the Subscription URL: https://wp-content.vmware.com/v2/latest/lib.json 3c – Download content frequency depends on customer needs and bandwidth requirements. There are roughly 27 OVAs available to download.
4 – Click NEXT
5 – Respond yes to the "tanzu – unable to verify authenticity" message. This is expected.
6 – Select a security policy if needed, otherwise click NEXT.
7 – Select the storage for the content library.
8 – Review the configuration, and click FINISH.
Depending on the chosen sync frequency, you may start to see that a sync has started in the vSphere recent tasks window. If you click on the content library, you can see the available OVAs.
This completes the vSphere environment prep for the Tanzu deployment.
Blog Date: August 12, 2022 NSX-ALB Controller version: 22.1.1 vSphere version 7.0.3 Build 20150588
In my previous blog, vSphere with Tanzu: Deployment of NSX-ALB Controller, I went over the basic NSX-ALB controller deployment and activation. In this blog, I will go over the configuration of the controller in preparation for deploying Tanzu to the target workload compute cluster.
Picking up where I left off previously, I just assigned the Essentials license to the controller.
Next, we need to configure our default cloud.
Configure the Default Cloud
Click on the Infrastructure tab, then select Clouds in the left menu. We see the controller comes with a default cloud already configured, and we can edit this for our needs. To the right of the green dot, you’ll see several icons. Click the gear to the right of the pencil.
On the Convert Cloud Type window, select “VMware vCenter/vSphere ESX” in the cloud type drop menu. Click YES, CONTINUE.
For the Default-Cloud, select “Prefer Static Routes vs Directly Connected Network” under Default Network IP Address Management.
Then under vCenter/vSphere section, click add credentials.
Here you will need to add the FQDN of the vCenter, along with the service account the controller will use to access the vCenter. We can use the example spreadsheet we filled out earlier. Click CONNECT.
This will kick you back to the setup wizard, where we now see a little blue information bar: "VMware vCenter/vSphere ESX cloud needs to be created before proceeding. Please 'Save & Relaunch' the modal to complete setup." However, the SAVE & RELAUNCH button in the lower right corner is grayed out. We first need to deselect the "Use Content Library" option. Now we can click SAVE & RELAUNCH.
Make sure the Data Center drop-down menu has the desired data center selected; if not, select it.
Now we can configure the management network information. Select the management network, add its CIDR and gateway. Under the Static IP Address Pool, we need to click the ADD button. This will need 5 consecutive IP addresses.
Click the SAVE button in the lower left. We will come back and edit this section later on.
Configure the NSX-ALB Controller to use a certificate
Now we need to update the NSX-ALB SSL certificate. We can either use a self-signed certificate, or we can create a CSR and sign the certificate with a CA. In my lab, I have applied the CA-signed certificate.
Check out my blog where I go over both options and how to create them here: vSphere with Tanzu: Replacing NSX-ALB Controller Certificates. Applying a certificate to the controller has to be done before proceeding to the next step!
Configure the Service Engines
For that, we click the Infrastructure tab, and then on the left we expand Cloud Resources, and select Service Engine Group.
1 – Click the pencil on the Default-Group. 2 – In the default configuration, Legacy HA is already set to Active/Standby. This is the only mode available with the Essentials license. 3 – The number of Virtual Services per Service Engine defaults to 10. This is the number of load balancing services the SE will support. Each Tanzu cluster you create will consume one of these load balancing services, and every service you expose from the Tanzu cluster will consume one as well. This can be turned up to 1000 depending on your needs. 4 – The maximum number of Service Engines is limited by the Essentials license used.
Click on the Advanced tab.
Under cluster we need to select the workload cluster that will run Tanzu, and we need to specify all hosts.
Click Save in the lower right.
Configure The NSX-ALB Controller Networks
Now that we have configured the Service Engine group in the previous section, we need to configure our networks. On the left menu, select Networks under Cloud Resources. We can see that it has detected all of our available networks in the vCenter.
You’ll notice that it does not detect the network settings because we are using static IPs instead of DHCP, so first we will edit the Data network. Click the pencil on the right.
Click the ‘+ Add Subnet’ button. Refer to the spreadsheet again, copy the ‘Data Network CIDR address’, and paste it into the ‘IP subnet field’. Also click the ‘+ Add Static IP Address Pool’ button, and copy the pool for the Data Network off the spreadsheet. End result should look similar to this.
Click Save in the lower right. Click SAVE again on the next screen. Now the Data network is configured. Next we need to configure the routing.
On the left hand side, select VRF Context to configure the routing. To the right of ‘global’ select the edit button.
We need to add a default gateway route set to 0.0.0.0/0. In the Next Hop field, we can add the gateway for the data network from the spreadsheet.
Click Save in the lower right.
Now the Data network has been set up.
Configure the IPAM profile
Next, we need to make sure that the NSX-ALB knows which IPs it should use, so we need to set up an IPAM profile as well.
1 – Click on the Templates tab, and then under Profiles, select IPAM/DNS Profiles. 2 – Click the CREATE button, and select ‘IPAM Profile’ from the drop menu. With the essentials license, we can only create an IPAM Profile.
1 – Name the profile. In this example we use: tanzu-ip. 2 – Under Cloud, select Default-Cloud in the drop menu. 3 – Under Usable Networks, click the ADD button, and in the lower menu, select the data network.
Click SAVE in the lower right.
Now the IPAM profile is configured.
Assign the IPAM profile to the Default-Cloud
Next we need to assign the IPAM to the default cloud. Click the Infrastructure tab, select Clouds, and then to the right of the default-cloud, click the edit button.
Now we can update the default-cloud IPAM Profile, with the IPAM profile just created.
Click SAVE in the lower right. Next, wait for the status to turn green if it hasn’t already.
Congrats! We have finished the setup for the NSX-ALB Controller, and are now ready to deploy Tanzu. I’ll cover that in my next blog. Stay tuned.
Blog Date: August 12, 2022 NSX-ALB Controller version: 22.1.1 vSphere version 7.0.3 Build 20150588
In this post, I'll be going over the steps to replace the controller's SSL/TLS certificate. You can either replace the controller certificate with one from a Certificate Authority, or you can create a self-signed certificate that includes the FQDN and IP in the SAN. This information is missing from the default certificate, and its absence will cause the deployment of Tanzu to barf. In this blog I will cover both methods.
(OPTION A) Replacing the NSX-ALB Controller SSL/TLS Certificate using a CA.
The following process is what I used in my lab to replace the NSX-ALB controller certificate using my Microsoft CA. Your mileage may vary.
Remember to take a snapshot of the NSX-ALB controller before proceeding.
1- First we will need to generate a CSR. In the NSX-ALB interface, go to: Templates -> Security -> SSL/TLS Certificates. 2 – Click the blue “Create” button in the upper right, and select Controller Certificate from the drop-down menu.
Enter the name of the certificate, and then under Type, select CSR.
Add the certificate information, and then add the FQDN and IP address to the Subject Alternative Name (SAN). I am doing this example in my home lab, so I only have a single controller. Large deployments might have clustered controllers, so your configuration will vary if you have multiple SANs to add.
Click the green “Save” button in the lower left.
Now you will see the certificate in a grey status in the list. Click the pencil icon on its right side to edit it.
Now we need to copy the Certificate signing request to take over to our CA. Click the “Copy to Clipboard” button.
Now we need to connect to the Microsoft CA web portal. Click “Request a Certificate”.
Click on "advanced certificate request". Paste the certificate request into the box. *Caution* -> Make sure there is no empty line after -----END CERTIFICATE REQUEST-----.
For this certificate, you will need to use the “Web Server” template on the Microsoft CA. I have created a VMware version of it following VMware’s KB article 2112009 for Creating a Microsoft Certificate Authority Template for SSL certificate creation in vSphere 6.x/7.x. I called my web server template VMware, so that is what I have selected here.
Click “Submit”.
Select “Base 64 encoded”, and then click the “Download certificate” link.
We also need to obtain a copy of the root ca certificate from the Microsoft CA to complete the chain.
In order to complete the cert chain, in a text editor like Notepad++, paste the NSX-ALB SSL certificate you created first, and then paste the root ca certificate from the Microsoft CA. It should look like this:
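As a rough sketch, the combined PEM file is simply the two certificates back to back; the placeholder lines stand in for the Base64 blocks from your own files:
-----BEGIN CERTIFICATE-----
<Base64 data of the NSX-ALB controller certificate>
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
<Base64 data of the Microsoft root CA certificate>
-----END CERTIFICATE-----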
Now you have the full chain, so copy it. Go back to the pending controller certificate in the NSX-ALB interface, and paste the full chain into the window.
Click the “Save” button in the lower left corner.
Now back to the main SSL/TLS Certificates window, we should see that the controller SSL certificate has a green status. You might need to refresh the page a couple of times. If the controller SSL certificate is invalid, a little exclamation symbol will appear next to the certificate, and you will need to troubleshoot further.
Assuming the SSL certificate is proper, we now need to apply it to the NSX-ALB controller. Go to Administration tab, In the left menu expand Settings, and then select Access Settings.
Click the pencil in the upper right, to edit the system access settings.
On the Wizard that opens, on the left hand side you’ll see ‘SSL/TLS Certificate’ and it will have the default certificates in it. Click the ‘X’ on those to remove them.
Select the new NSX-ALB controller certificate you imported. Check the "Allow Basic Authentication" box as well.
Click the “Save” button in the lower right. After 5 to 10 seconds, the certificate will be updated on the controller. Close and reopen your browser to see the newly installed certificate.
That’s it for replacing the NSX-ALB controller certificate. Not terrible once you understand the process. Don’t forget to clean your room, and delete the snapshot you took prior.
(OPTION B) Replacing the NSX-ALB Controller SSL/TLS Certificate using the self-signed method.
1 – In the NSX-ALB interface, go to: Templates -> Security -> SSL/TLS Certificates. 2 – Click the blue “Create” button in the upper right, and select Controller Certificate from the drop-down menu.
Type: Self Signed will be selected by default. Add the certificate information, and then add the FQDN and IP address to the Subject Alternative Name (SAN). I am doing this example in my home lab, so I only have a single controller. Large deployments might have clustered controllers, so your configuration will vary if you have multiple SANs to add. Fill in the details to complete the self-signed certificate. Click Save.
We now need to apply it to the NSX-ALB controller. Go to Administration tab, In the left menu expand Settings, and then select Access Settings.
Click the pencil in the upper right, to edit the system access settings.
On the Wizard that opens, on the left hand side you’ll see ‘SSL/TLS Certificate’ and it will have the default certificates in it. Click the ‘X’ on those to remove them.
Select the new NSX-ALB controller self-signed certificate in the SSL/TLS Certificate drop-down menu. Check the "Allow Basic Authentication" box as well.
Click the “Save” button in the lower right. After 5 to 10 seconds, the certificate will be updated on the controller. Refresh the page a couple of times, and you might get the “Potential Security Risk Ahead” warning message in the browser. Else you might need to close and reopen your browser to see the newly installed certificate.
That's it. Using a self-signed certificate is a straightforward process. Most production deployments will use a CA-signed certificate. Don't forget to clean your room, and delete the snapshot you took prior.
In my next blog: vSphere with Tanzu: Configuring the NSX-ALB Controller, I continue on and configure the NSX-ALB controller to use with Tanzu.
Blog Date: August 11, 2022 NSX-ALB Controller version: 22.1.1 vSphere version 7.0.3 Build 20150588
In my previous blog, vSphere with Tanzu: NSX-ALB Controller Requirements and Deployment Prep, I went over the basic requirements and prep work for the NSX-ALB controller to use with Tanzu. In this blog, I'll demonstrate a basic deployment of the NSX-ALB controller into my home lab, without NSX-T.
NOTE: Production deployments should use an NSX-ALB controller cluster. In my lab however, I am only deploying a single controller for this example.
NSX-ALB Controller Deployment
Obtain the NSX-ALB controller by following the VMware KB Article 82049. In this example I am using version 22.1.1
Import the Controller OVA into the vCenter.
OVA Import Step 3: The controller needs to be deployed on a compute cluster that has access to the networks configured for this deployment. Click NEXT.
OVA Import Step 5: Select storage that the desired Compute Cluster has access to.
OVA Import Step 6: Select the management network distributed port group noted in the spreadsheet we filled in earlier, and click NEXT.
OVA Import Step 7: Fill out the networking information for the NSX-ALB controller. Here we just need to add the controller IP, subnet, and gateway. Click NEXT.
OVA Import Step 8: Check for accuracy, and then click FINISH to start the deployment.
Once the NSX ALB OVA is deployed, start the VM. Wait for the VM to fully boot, and then access the web interface.
On the first login, you will need to create the admin password. Click CREATE ACCOUNT
After the admin account is created, you’ll need to add the DNS resolver(s) and DNS Search Domain. Click NEXT.
Add the “from” email address of your choosing. For this example, I am just using Local Host. Click NEXT.
For multi-tenancy, the defaults can be used unless otherwise specified. Check the "Setup Cloud After" box in the lower right, because we want to configure the cloud component later, and click SAVE.
Now we are logged into the admin interface and immediately receive a controller faults error message stating that the controller doesn't have license/subscription information, so we need to add it.
Click the Administration tab, and then on the left, expand Settings, and click Licensing. Click the gear.
Select the Essentials Tier for Tanzu license. Click SAVE.
You can see the interface has changed and it wants us to add a license key; however, we are in Essentials mode and can only use Essentials features, so we do not need to change anything.
That covers the basic deployment for the NSX-ALB controller. In my next blog, I will walk through the process of assigning a signed certificate from my Microsoft CA to the controller. I will also show how to create and assign the self-signed certificate to the controller. Stay tuned.
Blog Date: August 11, 2022 NSX-ALB Controller version: 22.1.1 vSphere version 7.0.3 Build 20150588
VMware customers can find additional details on system design options and preparation tasks in the vSphere 7 with Tanzu Prerequisites and Preparations Guide. This blog is a summary focused on requirements when using vSphere networking and the NSX-ALB load balancing controller, and the deployment of the controller. This example is for the deployment of the controller without NSX (NSX-T).
Hardware Requirements
vSphere Cluster | No. of Hosts | CPU Cores Per Host | Memory Per Host | NICs Per Host | Shared Datastore
Minimum Recommended | 3 | 16 (Intel CPU Only) | 128 GB | 2x 10GbE | 2.5 TB
Note: Increasing the number of hosts eases the per-host resource requirements and expands the resource pool for deploying additional or larger Kubernetes clusters, applications, other integrations, etc.
To mitigate deployment delays from troubleshooting infrastructure-related problems, customers need to preconfigure both NICs with the appropriate network access, as detailed in the table below, and test for connectivity in advance of any on-site work.
VLAN Description | Host vmnic(s) | Virtual Switch | MTU | IPv4 CIDR Prefix | Routable
Management Network* | NIC 1 & 2 | vDS | ≥ 1500 | ≥ /27 | Yes
vMotion Network** | NIC 1 & 2 | vDS | ≥ 1500 | ≥ /29 | No
Storage / vSAN Network | NIC 1 & 2 | vDS | ≥ 1500 | ≥ /29 | No
Workload Network*** | NIC 1 & 2 | vDS | ≥ 1500 | ≥ /24 | Yes
Data Network | NIC 1 & 2 | vDS | ≥ 1500 | ≥ /24 | Yes
* If the ESXi hosts’ mgmt vmkNIC and other core components such as vCenter operate on separate networks, the two networks must be routable.
** As opposed to a separate network, vMotion can operate on a shared network with the ESXi hosts' mgmt vmkNIC.
*** The workload network hosts the K8s control plane and worker nodes.
When choosing the vSphere Networking model, all network segments and routed connectivity must be provided by the underlying network infrastructure. The Management network can be the same network used for your standard vCenter and ESXi VMkernel port functions, or a separate network with fully routed connectivity. Five consecutive IP addresses on the Management network are required to accommodate the Supervisor VMs, and one additional IP is required for the NSX-ALB controller. The Workload CIDRs in the table above account for the typical number of IP addresses required to interface with the physical infrastructure and provide IP addresses to Kubernetes clusters for ingress and egress communications. If the CIDR ranges for Workload and Frontend functions are consolidated onto a single segment, they must be different, non-overlapping ranges.
Additionally, the Workload Management enablement will default the IP address range for Kubernetes pods and internal services to 10.96.0.0/23. This range is used inside the cluster and will be masked behind the load balancer from the system administrators, developers, and app users. This range can be overridden if needed but should remain a minimum of a /24.
Tanzu Mission Control (if available):
TKG cluster components use TCP exclusively (gRPC over HTTP to be specific) to communicate back to Tanzu Mission Control with no specific MTU outbound requirements (TCP supports packet segmentation and reassembly).
Firewall Requirements
VMware HIGHLY RECOMMENDS unfiltered traffic between networks for the system. Reference the VMware vSphere 7 with Kubernetes Prerequisites and Preparations Guide for the summary firewall requirements.
If Tanzu Mission Control (TMC) is available, the platform needs internet access connectivity.
Storage
You will need to use a vSphere-supported shared storage solution; typically this is vSAN, NFS, iSCSI, or Fibre Channel. Shared storage is required. Presenting storage volumes directly is not.
Enterprise Service Requirements
DNS: System components require unique resource records and access to domain name servers for forward and reverse resolution
NTP: System management components require access to a stable, common network time source; time skew < 10 seconds
AD/LDAP (Optional): Service bind account, User/Group DNs, Server(s) FQDN, and port required for authentication with external LDAP identity providers
DRS and HA need to be enabled in the vSphere cluster.
With the above-mentioned requirements in mind, after the environment prep has been completed, I like to fill out the following spreadsheet with all of the information needed for the deployment and configuration of the NSX-ALB controller.
That covers the basic requirements and prep work for the NSX-ALB controller deployment. In my next blog: vSphere with Tanzu: Deployment of NSX-ALB Controller, I will walk through the basic deployment of the controller.
On engagements with customers, I’ll have them deploy a developer VM where we can work and I can get them started on their Tanzu and Kubernetes journey. This one VM will have docker, docker credential helper, and the Tanzu Kubernetes CLI installed. For the purpose of this blog series, I’ll do the same. For this blog, I’ll walk through the process using Ubuntu 22.04. In my lab, I have already configured my Ubuntu VM with a static IP and host name.
Getting Started with Ubuntu and Installing Docker
Docker is available for download and installation on Ubuntu 22.04. To get started, let’s update the package repo first, and then we can install docker.
$ sudo apt update
$ sudo apt install docker.io
After the Docker installation has completed successfully, let's start the Docker service and enable it to run on boot.
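On Ubuntu 22.04 this is handled by systemd:
$ sudo systemctl start docker
$ sudo systemctl enable docker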
Let’s run the following command to see the Docker path. We should get a return /usr/bin/docker
$ which docker
Let’s make sure Docker is running with the following command.
$ sudo systemctl status docker
We should see an active (running) status.
You can verify the current Docker version installed, and see additional details with the following command.
$ sudo docker version
At this point, my customer would have completed the prerequisite of having an Ubuntu VM deployed for me. In this blog, however, I will continue on assuming vSphere with Tanzu has already been enabled on a workload cluster in the environment.
Downloading the Kubernetes CLI vsphere-plugin.zip
The official VMware guide to Download and Install the Kubernetes CLI Tools for vSphere can be found here. In the vSphere Client, click the 3 hash marks in the upper left and select Workload Management. Select the Supervisor Clusters tab. Make note of the Control Plane Node Address (IP address). It will be needed for the next command to download the vsphere-plugin.zip.
In the example command, I am using the optional --no-check-certificate flag. Replace the IP address with your Control Plane Node IP address.
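A sketch of that download command, assuming the default /wp/vsphere-plugin.zip path served by the Supervisor control plane (the address is a placeholder for your Control Plane Node IP):
$ wget https://<control-plane-node-ip>/wp/vsphere-plugin.zip --no-check-certificate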
We are up to step 2, and we need to put the contents of the unzipped vsphere-plugin into our executable search path. Let's execute the following commands.
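A minimal sketch, assuming the zip sits in the current directory, that the archive contains a bin directory with kubectl and kubectl-vsphere, and that /usr/local/bin is already in your PATH; the last line enables kubectl bash completion for future shell sessions:
$ unzip vsphere-plugin.zip -d vsphere-plugin
$ sudo cp vsphere-plugin/bin/* /usr/local/bin/
$ echo 'source <(kubectl completion bash)' >> ~/.bashrc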
Next, we'll exit our current SSH session on the Ubuntu dev VM.
$ exit
Now re-establish an SSH session to the Ubuntu dev VM using the same account as before. Once logged in, let's test that kubectl bash completion is working. We do this by entering 'kubectl' followed by a space and then hitting Tab twice.
$ kubectl
Test Connection to Tanzu Control plane (optional)
From the Ubuntu dev VM, use the kubectl CLI to connect to the vSphere with Tanzu control plane as the authenticated user. Here I am using the optional --insecure-skip-tls-verify parameter.
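A sketch of the login command; the server address and username are placeholders for your own values:
$ kubectl vsphere login --server=<control-plane-node-ip> --vsphere-username <user@domain> --insecure-skip-tls-verify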
It will prompt for a username and password. Once successfully logged in and if the account you chose has permissions to access the Tanzu control plane node, you should see a message stating you have access to the following contexts.
Now you are ready to begin working with Tanzu. More content to come. Stay tuned.
VMware vCenter Server 7.0 Update 2d used. VMware NSX-T Data Center 3.1.3.1 used.
Assumptions:
In a previous post titled vSphere with Tanzu on VMware Cloud Foundation/vSphere with NSX-T requirements, I went over the requirements I pass along to customers, along with the supporting VMware documentation, and this post assumes those requirements and those in the VMware documentation have been met. The same networking requirements exist here for standard vSphere 7 deployments with NSX-T.
Validate/Configure NSX-T IP Prefixes on the Tier-0 Gateway
Validate/Configure NSX-T Route Maps on the Tier-0 Gateway
Validate MTU greater than or equal to 1600 on all networks that will carry Tanzu traffic, i.e. the management network, NSX tunnel (host TEP, edge TEP) networks, and the external network.
In the vSphere Client, select Menu > Workload Management.
Click Get Started. (The Enable Workload Management wizard opens.)
On the vCenter Server and Network section, select NSX-T. Click Next.
On the Select a Cluster section, select the ESXi cluster to support vSphere with Tanzu.
Next, select the size of the control plane. Click Next.
Fill in the Management Network details.
Scroll down, and fill in the Workload Network details. As mentioned in a previous post, I will argue that the API Server endpoint FQDN entry is mandatory when applying a certificate. NOTE: The Pod and Service CIDRs are non-routable. The UI provides default values that can be used, otherwise you specify your own. The Ingress and Egress CIDRs will be routable networks defined by the network team. Click Next.
Select the storage policy for Control Plane Nodes, Ephemeral Disks, and Image Cache. The vSAN Default Storage Policy can be used if it is the only storage provided to the cluster. Click Next.
That’s it. Click Finish. The Tanzu deployment will now proceed (The entire process can take up to 45 minutes to complete).
The Control Plane Node IP address is the same API Server Endpoint we referred to earlier in this post. This will be the endpoint where you can download and install the vSphere plugin and the vSphere Docker credential helper. To validate connectivity, simply open a web browser and go to the IP address http://<ip-address>.
If you are not able to reach the Control Plane Node IP address/API Server Endpoint, it is possible that you might have invalid MTU settings in your environment that will require further troubleshooting. I did come across this at a customer site, and documented the MTU troubleshooting process here. Good luck.
Blog Date: December 5, 2021 Updated: August 8, 2022 VMware Cloud Foundation 4.3.1 used. VMware vCenter Server 7.0 Update 2d used. VMware NSX-T Data Center 3.1.3.1 used. VMware Photon OS 3.0
On engagements with customers, I’ll have them deploy a developer VM where we can work and I can get them started on their Tanzu and Kubernetes journey. This one VM will have docker, docker credential helper, and the Tanzu Kubernetes CLI installed. For the purpose of this blog series, I’ll do the same.
Getting Started with Photon OS and Installing Docker
The first step was to deploy the Photon OS OVA: https://github.com/vmware/photon/wiki. This URL has all of the instructions on getting started as well as running Docker, which only requires two commands:
The Docker service needs to be set up to run at startup. To do so, input the following commands:
$ sudo systemctl start docker
$ sudo systemctl enable docker
(Optional) Once that completes, run the following commands to allow docker to run as non-root:
$ sudo groupadd docker
$ sudo usermod -aG docker $USER
$ newgrp docker
The following command will start Docker if it is not already running. Likewise, you can run status instead of start:
$ systemctl start docker
Downloading The Kubernetes CLI
First, if this is going to be a shared box, it will be a good idea to create a directory where we can place the files:
$ mkdir -p /opt/vsphere-plugin
If needed you can locate the control plane node IP address from the workload management section in vSphere.
The Kubernetes CLI can be downloaded from the control plane node address over https via wget.
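A sketch of that download, assuming the same default /wp/vsphere-plugin.zip path used in the Ubuntu post, extracting into the directory we just created:
$ cd /opt/vsphere-plugin
$ wget https://<control-plane-node-ip>/wp/vsphere-plugin.zip --no-check-certificate
$ unzip vsphere-plugin.zip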
– Select the vCenter cluster where Workload Management and the embedded Harbor Registry are enabled.
– Select Configure > Namespaces > Image Registry.
– In the Root certificate field, click the Download SSL Root Certificate link.
– Save the root-certificate.txt, and rename it to something like ca.crt.
Copy the embedded Harbor Registry ca.crt file that you downloaded into the /etc/docker/certs.d/IP-address-of-harbor/ directory on the developer VM (create the directory if it does not already exist).
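A minimal sketch of creating that directory and copying the certificate, assuming ca.crt is in the current working directory and using the same <harbor_ip> placeholder as the login command further down:
$ sudo mkdir -p /etc/docker/certs.d/<harbor_ip>/
$ sudo cp ca.crt /etc/docker/certs.d/<harbor_ip>/ca.crt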
That directory should now look something like:
/etc/docker/certs.d/IP-address-of-harbor/ca.crt
Restart the docker service so that the new certificate is used:
$ systemctl restart docker
To test that the docker credential helper is working, you can log into the embedded harbor registry using your fully qualified domain credentials. As long as you don’t get a certificate trust error, you are good to go.
$ docker-credential-vsphere login <harbor_ip>
This blog should have prepped the Developer VM (Photon OS) that we will be using going forward. There will be a future blog post on pushing a docker image to the embedded harbor registry, but I am not going to cover this here. In my next post, I’ll walk through the steps of installing a Tanzu Kubernetes Cluster inside the namespace we deployed using this VM. Stay tuned.
Blog Date: June 25, 2021 Tested on vSphere 7.0.1 Build 17327586 vSphere with Tanzu Standard
On a recent customer engagement, after we deployed Project Contour and created a TLS delegation "contour-tls", we ran into an issue where Contour did not like the public wildcard certificate we provided. We were getting the error message "TLS Secret "projectcontour/contour-tls" is invalid: Secret not found."
After an intensive investigation to make sure everything in the environment was sound, we came to the conclusion that the "is invalid" part of the error message suggested that there was something wrong with the certificate. After working with the customer, we discovered that they had included the root, the certificate, and the intermediate authorities in the PEM file. The root doesn't need to be in the PEM; just the certificate and the intermediate authorities, in descending order. Apparently that root being in the PEM file made Contour barf. Who knew?
You can check whether the certificate is the issue by checking the PEM data for both the <PrivateKeyName>.key and the <CertificateName>.crt with the following commands and comparing the output. If it doesn't match, this could be your issue as well. The "<>" should be replaced with your values; don't include the "<" ">" characters.
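One common way to compare them is to hash the modulus of the certificate and the private key with openssl (this assumes an RSA key); if the two hashes differ, the certificate and key do not belong together:
$ openssl x509 -noout -modulus -in <CertificateName>.crt | openssl md5
$ openssl rsa -noout -modulus -in <PrivateKeyName>.key | openssl md5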
Below are the troubleshooting steps we took, and what we did to resolve the issue. We were using Linux, and had been logged into vSphere with Tanzu already. Did I mention that I hate certificates? But I digress….
The Issue:
You have just deployed a TKG cluster, and then deployed Project Contour as the ingress controller that uses a load balancer as the single point of entry for all external users. This load balancer terminates SSL connections, and you have applied a public wildcard certificate to it. You created the TLS secret and the TLS delegation so that new apps deployed to this TKG cluster can delegate TLS termination to Contour. However, after you deployed your test app to verify the TLS delegation is working, you see a status of "Invalid. At least one error present, see errors for details." after running the following command:
kubectl get httpproxies
Troubleshooting:
1. You run the following command to gather more information, and in the error message you see: "Secret not found", Reason: "SecretNotValid"
kubectl describe httpproxies.projectcontour.io
2. You check to make sure the TLS Secret was created in the right namespace with the following command, and you see that it is part of the desired namespace. In this example, our namespace was called projectcontour, and the TLS secret was called contour-tls.
kubectl get secrets -A
3. You check the TLS delegation to make sure it was created with the following command. In this example ours was called contour-delegation, and our namespace is projectcontour.
kubectl get tlscertificatedelegations -A
4. You look at the contents of the tlscertificatedelegations with the following command, and nothing looks out of the ordinary.
kubectl describe tlscertificatedelegations -A
5. You check to see the secrets of the namespace with the following command. In this example our namespace is called projectcontour and we can see our TLS delegation contour-tls.
kubectl get secrets --namespace projectcontour
6. You validate contour-tls has data in it with the following command. In this example, our namespace is projectcontour and our TLS is contour-tls.
kubectl get secrets --namespace projectcontour contour-tls -o yaml
In the YAML output, near the top you should see tls.crt: with data after it.
Down towards the bottom of the YAML output, you should see tls.key: with data after it.
Conclusion: Everything looks proper on the Tanzu side. Based on the error message we saw, "TLS Secret "projectcontour/contour-tls" is invalid: Secret not found.", the "is invalid" part could suggest that there is something wrong with the contents of the TLS secret. At this point, the only other possibility would be that the public certificate has something wrong with it and needs to be regenerated. The root doesn't need to be in the PEM; just the certificate for the site and the intermediate authorities, in descending order.
The Resolution:
1. Create and upload the new public certificate for Contour to vSphere with Tanzu.
2. We'll need to delete the secret and re-create it. Our secret was called "contour-tls", and the namespace was called "projectcontour".
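A sketch of that delete for this example:
$ kubectl delete secret contour-tls --namespace projectcontour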
3. We needed to clean our room and delete the httpproxies we created in our test, called test-tls.yml, along with the app that was using the TLS delegation; in this example it was called tls-delegation.yml.
kubectl delete -f test-tls.yml
kubectl delete -f tls-delegation.yml
4. Now we created a new secret called contour-tls with the new cert. The <> indicates you need to replace that value with your specific information. The “<>” does not belong in the command.
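A sketch of the create command, assuming the regenerated certificate and key files are in the current directory:
$ kubectl create secret tls contour-tls --namespace projectcontour --cert=<CertificateName>.crt --key=<PrivateKeyName>.key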
5. Validate that the PEM values for the .key and .crt match, using the same commands shown in the troubleshooting section above. The <> indicates you need to replace that value with your specific information. The "<>" does not belong in the command.
6. If the PEM values match, the certificate is valid. Let's go ahead and re-create the tls-delegation. Example command:
kubectl apply -f tls-delegation.yml
7. Now you should be good to go. After you deploy your app, you should be able to check the httpproxies again for Project Contour, and see that it has a valid status:
kubectl get httpproxies.projectcontour.io
If all else fails, you can open a ticket with VMware GSS to troubleshoot further.
Blog Date: June 18, 2021 Tested on vSphere 7.0.1 Build 17327586 vSphere with Tanzu Standard
On a recent customer engagement, we ran into an issue where vSphere with Tanzu wasn't deploying successfully. We had intermittent connectivity to the internal Tanzu landing page IP. What we were fighting ended up being inconsistent MTU values configured both on the VMware infrastructure side and in the customer's network. One of the many prerequisites to a successful installation of vSphere with Tanzu is having a consistent MTU value of 1600.
The Issue:
Tanzu was just deployed to an NSX-T backed cluster; however, you are unable to connect to the vSphere with Tanzu landing page address to download the Kubernetes CLI package via wget. Troubleshooting in the NSX-T interface shows that the load balancer that has the control plane VMs connected to it is up.
Symptoms:
You can ping the site address IP of the vSphere with Tanzu landing page
You can also telnet to it over 443
Intermittent connectivity to the vSphere with Tanzu landing page
Intermittent TLS handshake errors
vmkping tests between host VTEPs are successful.
vmkping tests from hosts with a large (1600+ byte) packet to NSX edge node TEPs are unsuccessful.
These validations should have been completed prior to the deployment. However, in this case we were finding inconsistent MTU settings. So to simplify, these are what you need to look for.
In NSX-T, validate that the MTU on the tier-0 gateway is set to a minimum of 1600.
In NSX-T, validate that the MTU on the edge transport node profile is set to a minimum of 1600.
In NSX-T, validate that the MTU on the host uplink profile is set to a minimum of 1600.
In vSphere, validate that the MTU on the vSphere Distributed Switch (vDS) is set to a minimum of 1600.
In vSphere, validate that the MTU on the ESXi management interface (vmk0) is set to a minimum of 1600.
In vSphere, validate that the MTU on the vxlan interfaces on the hosts is set to a minimum of 1600.
Troubleshooting:
In the Tanzu-enabled vSphere compute cluster, SSH into an ESXi host, and ping from the host's vxlan interface to the edge TEP interface. The edge TEPs can be found in NSX-T via System > Fabric > Nodes > Edge Transport Nodes; find the edges for Tanzu, and the TEP interface IPs will be to the right. In this lab, I only have the one edge. Production environments will have more.
In this example, vxlan was configured on vmk10 and vmk11 on the hosts. Your mileage may vary.
We are disabling fragmentation with (-d) so the packet will stay whole, and we are using a packet size of 1600.
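A sketch of that ping from the host's vmk10, assuming the vxlan vmks live in the vxlan netstack (typical for NSX-prepared hosts); <edge_TEP_IP> is a placeholder for the edge TEP address noted above:
# vmkping ++netstack=vxlan -I vmk10 -s 1600 -d <edge_TEP_IP>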
If the ping is unsuccessful, we need to identify the size of the packet that can get through. Try a packet size of 1572. If unsuccessful try 1500. If unsuccessful try 1476. If unsuccessful try 1472, etc.
To test farther up the network stack, we can ping something that is on a different VLAN and subnet and is on a routable network. In this example, the vMotion network is on a different routable network: it has a different VLAN, subnet, and gateway. We can use two ESXi hosts from the Tanzu-enabled cluster.
Open SSH sessions to ESXi-01 and ESXi-02.
On ESXi-02, get the PortNum for the vMotion-enabled vmk; it appears on the far left of the output of the following command:
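A sketch of finding the PortNum and then starting the packet capture that steps 5 and 6 below refer to; the pktcap-uw filter on protocol 0x01 captures only ICMP, and <vMotion_PortNum> is a placeholder for the value from your own net-stats output:
# net-stats -l
# pktcap-uw --switchport <vMotion_PortNum> --proto 0x01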
4. On the ESXi-01 session, use the vmkping command to ping the vMotion interface of ESXi-02. In this example we use a packet size of 1472, because that was the packet size that could get through, and option -d to prevent fragmentation.
# vmkping -I vmk0 -s 1472 -d <ESXi-02_vMotion_IP>
5. On the ESXi-02 session, we should now see six or more entries. Do a CTRL+C to cancel the packet capture.
6. Looking at the packet capture output on ESXi-02, we can see on the request line that the ESXi-01 MAC address made a request to the ESXi-02 MAC address.
On the next line for reply, we might see a new MAC address that is not ESXi-01 or ESXi-02. If that’s the case, then give this MAC address to the Network team to troubleshoot further.
Testing:
Using the ESXi hosts in the Tanzu enabled vSphere compute cluster, we can ping from the host’s vxlan interface to the edge TEP interface.
The edge TEP interface can be found in NSX-T via System > Fabric > Nodes > Edge Transport Nodes; find the edges for Tanzu, and the TEP interface IPs will be to the far right.
You will need to know which host vmks have vxlan enabled; in this example we are using vmk10 and vmk11 again.
We are disabling fragmentation with (-d) so the packet will stay whole, and we are using a packet size of 1600. These pings should now be successful.
On the ESXi-01 session, use the vmkping command to ping something on a different routable network, so that we can force traffic out of the vSphere environment and be routed. In this example just like before, we will be using the vMotion interface of ESXi-02. Packet size of 1600 should now work. We still use option -d to prevent fragmentation.
# vmkping -I vmk0 -s 1600 -d <ESXi-02_vMotion_IP>
On the developer VM, you should now be able to download the vsphere-plugin.zip from the vSphere with Tanzu landing page with the wget command.