Channel: High Availability (Clustering) forum

VRTX External Network Connectivity Loss Causes Windows 2012 R2 Cluster Failure


We have a VRTX server with 2 server modules, the internal Gb switch module, and a single PERC8 card. Both server modules are running Windows Server 2012 R2 as a host OS (running off the modules' HDs). They are configured as 2 nodes (1A and 1B) of a Failover Cluster (with cluster services typically residing on 1A). The Failover Cluster consists of multiple virtual OS nodes, all using shared storage built into the VRTX (two virtual drives: a 7.2TB data drive and a 10GB quorum drive) as Cluster Disks.

The problem I am seeing is that when the VRTX loses connectivity to the network outside the VRTX, it seems to trigger a cluster failure event, which brings the virtual nodes down uncleanly. The sequence of events seems to be:

1. External Network Connection Goes Down

2. There is a Critical Event 1135 on node 1B: "Cluster node '1A' was removed from the active failover cluster membership. The Cluster service on this node may have stopped. This could also be due to the node having lost communication with other active nodes in the failover cluster."

3. There is an Informational Event 1650 on node 1A: "Cluster has lost the UDP connection from local endpoint [IP address of 1A]:3343 connected to remote endpoint [IP address of 1B]:3343."

4. Roles move from 1A to 1B and go from offline to online.

There are also some errors relating to the cluster disk. Eventually, everything is up and operating on 1B (with some exceptions caused by the roles coming back online in an order that isn't well supported).

The thing that makes this all odd is that the server modules/nodes never lose connectivity to each other (because of the internal VRTX switch). They really only lose connectivity to the outside world and I don't think anything in the cluster is dependent on the outside world.

Does anyone have any idea why the cluster is failing because of an external network connection? And how can I prevent it in the future?

Thanks,

indyvql

