TCP/IP Stack Configuration issue in vSphere 6

Recently, I got the chance to upgrade and migrate my existing environment from vSphere 5 and 5.5 to version 6. During the migration I ran into a very strange issue with the VMware TCP/IP stack configuration. I worked on it for two days before I finally figured out where the problem lies.

Here is my scenario: I have been trying to configure vMotion, NFS, and management traffic on a single 10 GbE uplink. Management is on a layer 3 network, while NFS and vMotion are on a layer 2 network.

I had read about the TCP/IP stack feature in vSphere and figured I would give it a try, using it to separate my vMotion and NFS traffic onto their own stacks, away from the default TCP/IP stack. The default stack is used for management traffic from the moment ESXi is installed and configured.

Those who have used vSphere 6 will know that TCP/IP stacks already exist for vMotion and Provisioning traffic. You can see them in the web client under the ESXi host's networking settings.

tcpipstackissue-01

You also have the ability to create your own custom stack, but it can only be used for IP-based storage such as iSCSI and NFS.

NOTE: Unfortunately, custom TCP/IP stacks aren’t supported for use with fault tolerance logging, management traffic, Virtual SAN traffic, vSphere Replication traffic, or vSphere Replication NFC traffic.

As you can see, I have created one custom TCP/IP stack for NFS traffic.

tcpipstackissue-02

NOTE: At the time of this post, the only way to create a custom stack is with the esxcli command. Here is the syntax:

# esxcli network ip netstack add -N "NFS"

If you want to see which VMkernel adapters are associated with the existing stacks, look at the screen below. Right now there are no associations at all, so let's put the stacks to use.
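The same check can also be done over SSH; a quick sketch using the standard esxcli commands (nothing here is specific to my environment):

```shell
# List every TCP/IP stack instance on the host
esxcli network ip netstack list

# List the VMkernel adapters; the netstack column shows which
# stack each vmk interface is bound to (empty/default if none)
esxcli network ip interface list
```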

tcpipstackissue-03

There is only one way to assign a stack to an adapter, and that is at the time the VMkernel adapter is created. Let's create one VMkernel adapter each for NFS and vMotion.

To add the VMkernel adapter on the ESXi server:

From the web client -> select the ESXi host -> Manage -> Networking -> VMkernel adapters -> click the globe icon to add a new VMkernel adapter.

tcpipstackissue-03-1

When the wizard starts for the vMotion VMkernel adapter, choose the "vMotion" TCP/IP stack from the drop-down, as highlighted.

tcpipstackissue-04

As you can see, the vMotion TCP/IP stack is selected for the VMkernel adapter, and the services section is grayed out, meaning you cannot select any other service.

Follow the rest of the "Add Networking" wizard and create the VMkernel adapter.

tcpipstackissue-05

Once it is created, you will see that the vMotion VMkernel adapter has been added using the vMotion TCP/IP stack.

tcpipstackissue-06

Similarly, add a VMkernel adapter for NFS, and this time select the custom NFS stack I created for this purpose. Here is what the final configuration looks like.

tcpipstackissue-07

NOTE: If you have already created the VMkernel adapters and want to move them to a different TCP/IP stack, you have to delete them first and recreate them, choosing the appropriate TCP/IP stack as in the steps above.
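From the CLI, that delete-and-recreate dance looks roughly like the sketch below. The interface name vmk1, the port group name "NFS-PG", and the IP details are placeholders for whatever your environment actually uses:

```shell
# An existing VMkernel adapter cannot be moved between stacks,
# so remove it first
esxcli network ip interface remove --interface-name=vmk1

# Recreate it, binding it to the custom "NFS" netstack at creation time
esxcli network ip interface add --interface-name=vmk1 \
    --portgroup-name="NFS-PG" --netstack=NFS

# Assign a static IPv4 address to the new adapter
esxcli network ip interface ipv4 set --interface-name=vmk1 \
    --ipv4=10.0.0.10 --netmask=255.255.255.0 --type=static
```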

At that point I was thinking it all looked pretty good and everything should work, so I tried to add an NFS share.

In the "New Datastore" wizard, when I reached this screen I got the error below, and its description is not very helpful. Can anyone guess what this error means? Anyone?

It means a datastore with the same name I had entered already exists on one of the hosts. Of course, this has nothing to do with the TCP/IP stack configuration, but since I ran into it during the setup I thought I would share it as well.

tcpipstackissue-08

Let's try to add the storage using the vSphere Classic Client instead.

tcpipstackissue-09

While adding it, I got the following error, which tells me it was "unable to access or connect to the NFS server".

tcpipstackissue-10

As per the configuration above, this is supposed to work, but it doesn't. Let's troubleshoot it.

After hours of digging I reached this point: I SSHed into the ESXi host and checked its routes. As you can see below, there are no routes for vmk1 and vmk2, which should be there if you are using a different network for each type of traffic.

tcpipstackissue-11
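For reference, here is a sketch of the route check from that SSH session. One subtlety worth knowing: plain `esxcfg-route -l` only shows the default stack's table, while each custom stack keeps its own routing table that you query with `-N`:

```shell
# Routing table of the default TCP/IP stack only
esxcfg-route -l

# Routing table of a specific netstack, e.g. the custom "NFS" stack
esxcfg-route -N NFS -l

# vmkping can also be pointed at a specific netstack and interface
# when testing reachability (vmk1 and 10.0.0.1 are placeholders)
vmkping -S NFS -I vmk1 10.0.0.1
```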

Let’s see how to solve it.

The only solution that worked for me was to create the VMkernel adapters on the default TCP/IP stack, meaning you do not use a separate TCP/IP stack for each type of traffic, even though that is supposed to work. I don't know whether it's a bug or I am using it wrong. Anyway, let's delete these VMkernel adapters and create new ones on the default TCP/IP stack.

Now run the "Add Networking" wizard again, as I did previously, to add the VMkernel adapters for vMotion and NFS. Make sure you delete the old ones first.

In the wizard, this time select the "Default" TCP/IP stack for vMotion and enable the "vMotion traffic" service. Follow the wizard and complete the steps.

tcpipstackissue-12

Once the VMkernel adapter for vMotion is created, you can verify, as shown below, that it is on the default TCP/IP stack.

tcpipstackissue-13

Similarly, add a VMkernel adapter for NFS traffic on the default TCP/IP stack, as shown below.

tcpipstackissue-14

Once done, let's check the routes on the ESXi host again. As you can see below, all the relevant routes are now there as expected. Let's try to add the NFS storage now.

tcpipstackissue-15

Now add the storage from the Classic Client or the web client; I find the Classic Client easier to use here.

tcpipstackissue-09

Now you can see it has been added successfully.

tcpipstackissue-16

Final thoughts

I find it very strange that the TCP/IP stack configuration does not work the way I expected. Logically it should work if you use a separate TCP/IP stack for each type of traffic, but it did not. It might be a bug that gets fixed in a later release, or I may be doing it wrong; if anyone finds it working, please let me know. For now, you have the workaround for the problem, and if you hit this in your environment it should save you tons of time.


9 responses to "TCP/IP Stack Configuration issue in vSphere 6"

  1. The same thing happened to me. I thought I did something wrong, but after recreating the VMkernel adapter using the default TCP/IP stack I was able to communicate with the NAS and the other devices on my NFS network.

    Has anyone found a solution to this?

  2. I’m not sure I’m following. The point of the custom stack is to be able to send certain traffic like NFS down a different gateway. Your workaround looks like all traffic has to go over the default gateway which is tied to vmk0. So, all traffic will go over vmk0. Am I seeing that wrong? Just sounds like the stack feature doesn’t work. The workaround is just the normal thing I’d do when faced with a single default gateway.

    • Yes, true, but if you are using a layer 2 configuration for NFS then it will be fine. I checked this configuration on the vSphere 6 GA release; I hope it will be fixed in a later release. Please give the latest release, vSphere 6 Update 1, a try.

  3. Hi,
    I've solved the issue in the same release, vSphere 6 U1, as follows (using the vSphere 6 web client):
    1. Add the vmk2 kernel adapter, selecting the vMotion TCP/IP stack (not the default stack) and checking only the vMotion service checkbox; choose a new vSwitch1 and an available vmnic2.
    2. Check the list of the current TCP/IP stacks using the command:
    esxcli network ip netstack list
    with output:
    defaultTcpipStack
       Key: defaultTcpipStack
       Name: defaultTcpipStack
       State: 4660

    vmotion
       Key: vmotion
       Name: vmotion
       State: 4660
    (Before adding vmk2, the command listed only "defaultTcpipStack".)
    3. Using the command:
    esxcfg-route -l
    we see only the routing table of the default TCP/IP stack:
    Network      Netmask        Gateway        Interface
    10.0.0.0     255.255.255.0  Local Subnet   vmk1
    192.168.0.0  255.255.255.0  Local Subnet   vmk0
    default      0.0.0.0        192.168.0.230  vmk0
    When I issued:
    esxcfg-route -N vmotion -l
    (using the -N netstack_name parameter), bingo! The output is:
    VMkernel Routes:
    Network      Netmask        Gateway       Interface
    172.0.0.0    255.255.255.0  Local Subnet  vmk2

    where 172.0.0.0/24 is the network used for vMotion traffic.

    4. To test the connection between two ESXi hosts (esxi1 = 172.0.0.2, esxi2 = 172.0.0.4) I issued the command:
    vmkping -S vmotion -I vmk2 172.0.0.4
    from the host with 172.0.0.2, and it worked (note the -S netstack_name parameter).
    5. To monitor the vMotion traffic I used, on both ESXi hosts:
    pktcap-uw --vmk vmk2
    and all traffic was generated only on the vmk2 kernel adapters.
    (tcpdump-uw only saw the vmk0 and vmk1 kernel adapters on the default TCP/IP stack.)

      • 2 – If the vMotion service is already enabled on the default stack and you then add another vmk adapter selecting the custom vMotion stack (not the default stack), a warning message is displayed saying that vMotion is already enabled on the default stack; if you agree to add the vMotion service to the vMotion stack, the service will be removed from the default stack and used only on the custom vMotion stack.
        Once you have added the custom vMotion stack (with the vMotion service checked), the vMotion service becomes disabled on the default stack. So if you find the vMotion service disabled, you probably already created the custom vMotion stack somewhere with the vMotion service enabled.

  4. Then you can also add a default gateway or a static route to the "vmotion" TCP/IP stack:
    esxcfg-route -N vmotion -a default 172.0.0.1
