Hello everybody,
In my environment I have an issue with guest failover clusters (Exchange, file server): when I live migrate one of the virtual cluster nodes, the cluster group goes offline.
The environment is the following:
2x Hyper-V Clusters: Hyper-V-Cluster1 and Hyper-V-Cluster2 (Windows Server 2012 R2) with 5 Nodes per Cluster
1x Scale-Out File Server (Windows Server 2012 R2) with 2 Nodes
1x Exchange Cluster (Windows Server 2012 R2) with EX01 VM running on Hyper-V-Cluster1 and EX02 VM running on Hyper-V-Cluster2
1x Fileserver Failover Cluster (Windows Server 2012 R2) with FS01 VM running on Hyper-V-Cluster1 and FS02 VM running on Hyper-V-Cluster2
The physical networks on the Hyper-V nodes are redundant: the VM traffic uses 2x 10Gb/s uplinks to 2x physical switches, combined in an LBFO team:
New-NetLbfoTeam -Name 10Gbit_TEAM -TeamMembers 10Gbit_01,10Gbit_02 -TeamingMode SwitchIndependent -LoadBalancingAlgorithm HyperVPort
SMB 3 traffic runs over 2x 10Gb/s NICs without NIC teaming (SMB Multichannel).
SMB is also used for live migrations.
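For reference, the live migration transport on a Hyper-V host can be checked and set with the Hyper-V PowerShell module. This is only a minimal sketch, not a copy of my exact configuration, and it assumes it is run locally on each Hyper-V node:

```powershell
# Show whether live migration is enabled and which transport this host uses
Get-VMHost | Select-Object Name, VirtualMachineMigrationEnabled, VirtualMachineMigrationPerformanceOption

# Use SMB as the live migration transport, so SMB Multichannel can spread
# the traffic across both non-teamed 10Gb/s NICs
Set-VMHost -VirtualMachineMigrationPerformanceOption SMB
```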
The VMs for clustering were installed according to the TechNet guideline:
http://technet.microsoft.com/en-us/library/dn265980.aspx
Because my Hyper-V uplinks are already redundant, I am using only one NIC inside each VM.
As I understand it, there is no advantage to using two NICs inside the VM as long as they are connected to the same vSwitch.
Now, when I want to perform hardware maintenance, I have to live migrate the EX01 VM from Hyper-V-Cluster1-Node-1 to Hyper-V-Cluster1-Node-2.
EX02 VM still runs untouched on Hyper-V-Cluster2-Node-1.
At the end of the live migration I see event ID 1135 (source: FailoverClustering) on the EX01 VM, which says that EX02 was removed from the failover cluster and that I should check my network.
The Exchange cluster group is offline after that event, and I have to bring it online again manually.
Any ideas what could cause this behavior?
Thanks.
Greetings,
torsten