Hi,
Before I describe the problem, here is a simplified overview of our environment:
We have a Windows Server 2012 R2 Hyper-V cluster with 8 nodes (HP servers). Each node is connected to two HP switches (5406zl) via 2x 10GbE connections (dual-port NC523SFP 10GbE NICs). The NICs are in a switch-independent team, on top of which we created a virtual switch. Connected to the virtual switch, we have several virtual NICs (Management, Live Migration, Storage, Cluster).
The virtual machines running on the cluster use SMB3 to connect to a two-node Scale-Out File Server. Each of those nodes also has 2x 10GbE NICs (NC523SFP), which are in an LACP team. The virtual machine configuration files and virtual disks are located on the Scale-Out File Server disks and are accessed via two shares.
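For reference, the host networking on each Hyper-V node was built roughly like the following PowerShell (the adapter, team, and switch names here are just placeholders, not our actual names):

    # Switch-independent team across the two 10GbE ports
    New-NetLbfoTeam -Name "HostTeam" -TeamMembers "10GbE-1","10GbE-2" -TeamingMode SwitchIndependent

    # Virtual switch on top of the team, plus the host virtual NICs
    New-VMSwitch -Name "ConvergedSwitch" -NetAdapterName "HostTeam" -AllowManagementOS $false
    Add-VMNetworkAdapter -ManagementOS -Name "Management"    -SwitchName "ConvergedSwitch"
    Add-VMNetworkAdapter -ManagementOS -Name "LiveMigration" -SwitchName "ConvergedSwitch"
    Add-VMNetworkAdapter -ManagementOS -Name "Storage"       -SwitchName "ConvergedSwitch"
    Add-VMNetworkAdapter -ManagementOS -Name "Cluster"       -SwitchName "ConvergedSwitch"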
Here is the problem:
Looking at disk performance in the virtual machines, we see very poor response times (between 200 ms and 500 ms). After many days of troubleshooting, I found that unbinding the "QoS Packet Scheduler" from the virtual NICs (unticking the checkbox in the adapter properties) brings performance back to normal, with disk response times of 2-5 ms.
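In case anyone wants to check or reproduce the workaround, unticking the box is equivalent to disabling the ms_pacer binding (QoS Packet Scheduler) on the host virtual NICs; the adapter name below is only an example:

    # Check whether the QoS Packet Scheduler is bound to the vNIC
    Get-NetAdapterBinding -Name "vEthernet (Storage)" -ComponentID ms_pacer

    # Unbind it (same effect as unticking the checkbox); Enable-NetAdapterBinding reverts it
    Disable-NetAdapterBinding -Name "vEthernet (Storage)" -ComponentID ms_pacer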
I've upgraded the NIC drivers and firmware to the latest versions, and I've applied all Windows updates on the Hyper-V hosts and the file server cluster nodes.
I've also tried removing the NIC teaming on a Hyper-V host and using just one of the 10GbE cards for a simple Hyper-V switch (without virtual NICs), but the same problem occurs.
I realize this is probably an HP NIC driver issue (I have a case open with HP support), but while I wait for HP to get back to me, I thought I'd post the question here in case someone has experienced the same problem.
Thank you,
Stephane