Posts from category "2021"

NSX-T T0 BGP routing considerations regarding MTU size

Recently I had a serious NSX-T production issue involving BGP and a T0 routing instance on an edge VM cluster: routes that were supposed to be received from the ToR L3 device were missing from the T0 routing table.

An NSX-T environment has several options for connecting the fabric to the outside world. These connections go over a T0 instance, which can be configured to use static or dynamic routing (BGP, OSPF) for this purpose. MTU is an important consideration inside an NSX environment because of GENEVE encapsulation in the overlay (it should be at least 1600 bytes - 9k ideally). Routing protocols are also sensitive to MTU (OSPF is out of scope for this article, but you know that it checks MTU during neighborship establishment).

Different networking vendors use various options for path MTU discovery - by default this mechanism should be enabled for BGP (but of course this should be checked and confirmed). The problem arises when you configure the ToR as, e.g., a 9k MTU capable device, but BGP then has trouble exchanging routes. Let's observe a scenario where there is an MTU mismatch with the Edge and T0 instance inside NSX-T (e.g. 1600 bytes by default):

- BGP session is established and UP

- routes are showing correctly inside ToR BGP/routing table

- THERE ARE NO ROUTES inside the T0 routing table (they will probably show up if there are only a few of them - e.g. only the DEFAULT route)

- BGP session flaps UP/DOWN after 3 min (default BGP HOLD timer = 180s)

Different troubleshooting techniques can be used inside NSX to confirm this:

get logical-router - followed by choosing the correct vrf # of the T0 SR

get bgp neighbors summary

get route bgp - if we expect something here and see 0, the problem is present

capture packets (the expression "port 179" can be useful to catch only BGP traffic)
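When looking at a capture taken on port 179, it helps to know which BGP message types actually arrive. Below is a minimal Python sketch, not an NSX tool, that splits a TCP payload into BGP messages using the standard 19-byte BGP header (16-byte marker, 2-byte length, 1-byte type). The sample payload is built by hand for illustration and is not taken from a real capture.

```python
import struct

BGP_MARKER = b"\xff" * 16  # every BGP message starts with 16 bytes of 0xFF
BGP_TYPES = {1: "OPEN", 2: "UPDATE", 3: "NOTIFICATION", 4: "KEEPALIVE"}


def split_bgp_messages(payload: bytes):
    """Yield (type_name, length) for each complete BGP message in a TCP payload."""
    offset = 0
    while offset + 19 <= len(payload):
        marker, length, msg_type = struct.unpack_from("!16sHB", payload, offset)
        if marker != BGP_MARKER or length < 19:
            break  # not aligned on a BGP header, stop parsing
        if offset + length > len(payload):
            break  # message continues in the next TCP segment
        yield BGP_TYPES.get(msg_type, f"UNKNOWN({msg_type})"), length
        offset += length


# Hand-built sample: one KEEPALIVE followed by an empty UPDATE (illustrative only)
keepalive = BGP_MARKER + struct.pack("!HB", 19, 4)
empty_update = BGP_MARKER + struct.pack("!HB", 23, 2) + b"\x00\x00\x00\x00"
summary = list(split_bgp_messages(keepalive + empty_update))
print(summary)
```

If a capture on the Edge side shows only OPEN and KEEPALIVE messages and no UPDATE messages from the ToR, that is consistent with the MTU problem described in this post.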

What's happening behind the scenes:

- BGP session is UP because the initial packet exchange (OPEN and KEEPALIVE messages) is small and does not exceed the smaller MTU on the Edge

- ToR has the routes because it is configured for the larger MTU, so the packets sent by T0 always fit

- T0 is NOT HAVING ROUTES because the BGP UPDATE packet sent by the ToR is larger than the MTU configured on the Edge (except when there is only 1 route or a small number of them, as I mentioned)
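The effect of the number of routes can be illustrated with a rough size estimate. This is a simplified Python sketch under several assumptions that are mine, not measurements from the incident: IPv4 and TCP headers without options (40 bytes), 40 bytes of path attributes, all prefixes being /24, and a sender that fills packets up to its own MTU because nothing along the path clamps the segment size. It also ignores the splitting of large advertisements into several BGP messages.

```python
import math

IP_TCP_OVERHEAD = 40      # 20-byte IPv4 header + 20-byte TCP header, no options
BGP_HEADER = 19           # marker + length + type
UPDATE_FIXED = 4          # withdrawn-routes length + path-attribute length fields
ASSUMED_ATTR_BYTES = 40   # assumption: ORIGIN, AS_PATH, NEXT_HOP and similar


def nlri_bytes(prefix_len: int) -> int:
    """One NLRI entry: 1 length byte plus the significant prefix bytes."""
    return 1 + math.ceil(prefix_len / 8)


def largest_packet(prefix_count: int, sender_mtu: int, prefix_len: int = 24) -> int:
    """Estimate the biggest IP packet the sender emits for this advertisement."""
    bgp_bytes = (BGP_HEADER + UPDATE_FIXED + ASSUMED_ATTR_BYTES
                 + prefix_count * nlri_bytes(prefix_len))
    return min(bgp_bytes + IP_TCP_OVERHEAD, sender_mtu)


def fits(prefix_count: int, sender_mtu: int, receiver_mtu: int) -> bool:
    """True if the estimated packet can be received without exceeding the MTU."""
    return largest_packet(prefix_count, sender_mtu) <= receiver_mtu


print(fits(1, sender_mtu=9000, receiver_mtu=1600))     # single default route
print(fits(1000, sender_mtu=9000, receiver_mtu=1600))  # full table from the ToR
print(fits(1000, sender_mtu=1600, receiver_mtu=9000))  # same table, T0 toward ToR
```

With these assumed numbers a single route is a packet of roughly 100 bytes, while a thousand prefixes need more than 4000 bytes, which only fits in the direction where the receiver has the larger MTU.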

Summary - MTU should be checked and confirmed:

- using the vmkping command inside the overlay, between TEPs - transport and edge nodes

- the ToR-T0 BGP session over the Edge uplink can use a large MTU - but it must be configured the same on both sides. A LARGE MTU IS NOT A REQUIREMENT on this BGP uplink, because traffic here is not GENEVE traffic and, most probably, you will not change the MTU to 9k throughout your whole networking infrastructure

- if the ToR BGP neighbor is configured with a system MTU of 9k - configure the physical interface (or the SVI, if using a VLAN based one) facing the NSX T0 with the same MTU as inside NSX

- changing the MTU inside NSX is of course also an option here
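For the vmkping check, the ICMP payload size has to leave room for the 20-byte IPv4 header and the 8-byte ICMP header, and the don't-fragment bit must be set, otherwise the test proves nothing. Here is a small Python helper that builds the command line for a given MTU. The target address is a placeholder from the documentation range, and the `vxlan` netstack name is an assumption about where the TEP vmkernel interface lives on the ESXi host.

```python
ICMP_OVERHEAD = 28  # 20-byte IPv4 header + 8-byte ICMP header


def vmkping_command(mtu: int, target_ip: str, netstack: str = "vxlan") -> str:
    """Build a vmkping command that tests the full MTU with the DF bit set."""
    payload = mtu - ICMP_OVERHEAD
    # -d sets the don't-fragment bit, -s sets the ICMP payload size
    return f"vmkping ++netstack={netstack} -d -s {payload} {target_ip}"


# 192.0.2.10 is a placeholder for the remote TEP address
print(vmkping_command(1600, "192.0.2.10"))
print(vmkping_command(9000, "192.0.2.10"))
```

If the ping with the computed size fails while a smaller one succeeds, something on the path between the TEPs is using a lower MTU than expected.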

Windows 10 legacy BIOS boot change to UEFI - without OS reinstall

If you are, like me, running Win10 with the legacy BIOS boot option and would like to start playing with the newest Windows 11 version, the first problem will probably be the TPM (Trusted Platform Module) chip... which can easily be added to a VM - if it runs UEFI based boot firmware. Changing the firmware without properly preparing the existing disk could go wrong and leave you with a non-bootable machine.

Summary - it can be done without an OS reinstall, and here are the steps that worked for me:

- of course - some kind of backup/snapshot should be there, just in case ;)

- run the command "Get-Disk", which should show that the Partition style is MBR;

- before the actual conversion you can run a validation with the command "MBR2GPT.exe /validate /allowFullOS" - the "/allowFullOS" option is needed because the tool runs inside the full OS of the running VM;

- after successful validation, the actual conversion is done with the command "MBR2GPT.exe /convert /allowFullOS";

- shut down the VM after the conversion is done and change it to the UEFI boot option - Fusion example below:

VMware Fusion change VM UEFI settings

- after the change and powering the VM up, "Get-Disk" should show GPT as the Partition style, like this:

Disk Partition style as GPT

- at the end you can easily add a new device - the TPM module - to the VM, again with the Win10 environment shut down :)
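The order of the commands above can be summarized in a small decision helper. This is only a Python sketch of the sequence described in this post: it does not run anything, and the partition style is expected to be typed in from the "Get-Disk" output.

```python
def conversion_steps(partition_style: str) -> list[str]:
    """Return the MBR2GPT commands to run for the given Get-Disk partition style."""
    style = partition_style.strip().upper()
    if style == "GPT":
        return []  # already converted, nothing to do
    if style != "MBR":
        raise ValueError(f"unexpected partition style: {partition_style!r}")
    return [
        "MBR2GPT.exe /validate /allowFullOS",  # dry run, changes nothing
        "MBR2GPT.exe /convert /allowFullOS",   # actual conversion
    ]


for step in conversion_steps("MBR"):
    print(step)
```

The firmware change to UEFI in the hypervisor settings still has to be done by hand after the conversion, with the VM shut down.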

NSX Advanced Load Balancer (ex Avi Networks) Let's Encrypt script integration

I would like to share a very useful setup for VMware NSX ALB (ex Avi solution) that uses the freely available Let's Encrypt certificate management solution. Basically, the provided script gives you automation inside the NSX ALB environment, without the need for external tools or systems. In summary, these are the required steps:

- create the appropriate virtual service (VS) which you will use for the SSL setup with the Let's Encrypt cert - this can be a standalone service or SNI (Server Name Indication) based (Parent/Child) if needed. Initially you can select the "System-Default" SSL cert during the VS setup;

- create the appropriate DNS records for the new service - most of the time this is out of scope of NSX ALB. NSX ALB Controllers should have access to the Let's Encrypt public servers for successful ACME based HTTP-01 certificate generation/renewal;

- Download required script from HERE

- Follow the rest of the required configuration steps in this NSX-ALB-Lets-Encrypt-SETUP - in terms of user creation/script adding/CSR...

- I would like to draw special attention to the case where you have a split DNS scenario from the VS and Controller perspective. The last step of the certificate generation/renewal process with the script is verification of the received token using the ACME HTTP-01 check, which will FAIL with this type of DNS scenario (e.g. "Error from certificate management service: Wrote file, but Avi couldn't verify token at http://<URL>/.well-known/acme-challenge/<token-code>..."). The image below shows how to resolve this type of setup by using a special variable integrated into the script:

NSX ALB Lets Encrypt split DNS verification bypass variable

After successful certificate generation you just need to assign it to the appropriate virtual service (the one created at the beginning or an existing one). After that, the renewal process is automated per the default NSX ALB (Avi) policy at 30, 7 and 1 day before expiration.
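To make the HTTP-01 check and the renewal schedule more concrete, here is a short Python sketch. It is not part of the script mentioned in this post: it only builds the well-known challenge URL that has to be reachable for the token check, and computes the dates on which the 30/7/1 day policy would trigger. The hostname, token and expiration date are made-up examples.

```python
from datetime import date, timedelta

RENEWAL_OFFSETS_DAYS = (30, 7, 1)  # renewal policy mentioned in this post


def challenge_url(fqdn: str, token: str) -> str:
    """URL where the ACME HTTP-01 token must be served over plain HTTP."""
    return f"http://{fqdn}/.well-known/acme-challenge/{token}"


def renewal_attempts(expires_on: date) -> list[date]:
    """Dates on which renewal would be attempted before the cert expires."""
    return [expires_on - timedelta(days=d) for d in RENEWAL_OFFSETS_DAYS]


# Hypothetical values for illustration only
print(challenge_url("app.example.com", "example-token"))
for day in renewal_attempts(date(2021, 12, 31)):
    print(day.isoformat())
```

In a split DNS setup the Controller may resolve the name in that URL to a different address than Let's Encrypt does, which is why the local token verification can fail even when the public check would succeed.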

NSX-T - North/South Edge uplink connection options and scenarios

In this slightly longer post, I'm going to explain a couple of typical use case scenarios for Edge connectivity inside an NSX-T environment, covering TEP and North/South traffic options. Every environment is a special use case, but I hope you will find here a summary of options that you can use for successful deployment and design planning. First, a couple of assumptions that I made here:

- vSphere environment is v7.x

- a vDS distributed switch is in place - this can dramatically simplify NSX-T design and implementation because of NSX support inside vSphere 7. vSphere ESXi is a prerequisite for this, and if you have this kind of infrastructure, N-VDS is not necessary at all unless you have some special reason

- Basically, we are speaking here about the post NSX-T v2.5 era where, similar to bare metal Edge nodes, the Edge VM form factor also supports the same vDS/N-VDS for overlay and external traffic - definitely no more "three N-VDS per Edge VM design", unless you really, really need it

- NICs per physical server - minimum 2 - ideally they should be at least 10Gbps, but for demo and labbing there is no problem going with 1Gbps only. Also, depending on the desired design, you can use an even higher number of NIC ports, which of course gives you plenty of options for dedicating ports to different types of traffic (e.g. management, vMotion, vSAN, iSCSI etc.)

- It's worth mentioning that, regarding stateful and stateless services on edge nodes - T0 A/A or A/S design - you can choose where to deploy them in a multi-tier setup (T0 and T1s). For example, you can configure T0s in A/A (this article assumes A/A in its scenarios) and T1s in A/S because you need some of the stateful services (e.g. NAT, VPN etc.). Also, for highly scalable environments, different Edge clusters hosting only T0 or only T1 routing instances are possible.


SCENARIO 1 - Single upstream router / redundant UPLINKS

The next picture shows this type of scenario - in this case with dual uplinks, which can be Active/Active using ECMP or Active/Standby, using the appropriate BGP config on the T0 routing instance inside the Edge cluster:

NSX-T - Single ToR

A relevant config overview, from the NSX-T and Cisco IOS perspective, is also shown, with BFD in place as an advanced mechanism for link failure detection.


SCENARIO 2 - Dual upstream router / single UPLINK per Edge

This setup adds redundancy at the ToR level. In an A/A setup, T0 offers the option of inter-SR BGP routing. You can prefer routing everything through just one ToR instance, or use both as A/A for a subset of networks - all scenarios are possible using the appropriate T0 BGP config. Inter-SR routing exchange helps in the scenario where the ToR routing boxes have the same information about the rest of the network infrastructure inside their routing tables.

NSX-T - Dual ToR

The relevant config, with a short scenario description, is available.


SCENARIO 3 - Dual upstream router / dual UPLINK per Edge

If possible, there is also the option to go with a total of 4 uplinks toward the physical ToRs. This gives all the nice redundancy options from Scenario 2, plus high throughput by using ECMP and all active paths toward the ToR boxes.

NSX-T - Dual ToR with dual Uplinks