Hi
I have a 4-node Windows Server 2008 R2 failover cluster. The cluster is configured like so:
2 x nodes in Primary DC (1 is the active node)
2 x nodes in Secondary DC
1 x file share witness in third site
We had an issue last night whereby the 2 nodes in the secondary DC lost network communication due to a network event. The logs stated:
File share witness resource 'File Share Witness' failed a periodic health check on file share '\\fsw-01\Clus01'. Please ensure that file share '\\fsw-01\Clus01' exists and is accessible by the cluster.
The net effect was that the entire cluster stopped:
Cluster service was halted due to incomplete connectivity with other cluster nodes.
And:
Cluster node 'DC2-SQL1' was removed from the active failover cluster membership. The Cluster service on this node may have stopped. This could also be due to the node having lost communication with other active nodes in the failover cluster. Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapters on this node. Also check for failures in any other network components to which the node is connected such as hubs, switches, or bridges.
Why did the entire cluster fail because of a networking issue that only affected the 2 nodes in the secondary site? The primary site nodes could still see the FSW.
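To show why I expected the primary site to stay up, here is a rough sketch of the vote arithmetic as I understand it. It assumes the cluster is running in Node and File Share Majority mode, with one vote per node and one for the witness. The function name and the vote counts are my own illustration, not output from the cluster.

```python
# Simplified quorum model. Assumption: Node and File Share Majority,
# where each node and the file share witness holds exactly one vote.

def has_quorum(total_votes: int, votes_reachable: int) -> bool:
    """A partition keeps quorum when it holds more than half of all votes."""
    return votes_reachable > total_votes // 2

TOTAL_VOTES = 4 + 1  # 4 nodes + 1 file share witness

# Primary DC partition: 2 nodes plus the witness they could still reach.
primary_votes = 2 + 1

# Secondary DC partition: 2 isolated nodes, no witness.
secondary_votes = 2

print(has_quorum(TOTAL_VOTES, primary_votes))    # True
print(has_quorum(TOTAL_VOTES, secondary_votes))  # False
```

By that reasoning the primary site should have held 3 of 5 votes and kept quorum, which is why the full outage surprised me.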
Any insight would be great!
Thanks!