I have a two-node active-active SQL Cluster.
Three database instances live on one node, a fourth on the other node.
The node hosting the single instance experienced an outage on the fabric that connects it to the iSCSI VLAN. On that node, this meant an outage for all shared disks: the SQL volumes and the quorum.
The outage was not complete, however; it was more a case of extreme packet loss. I received numerous events, beginning with 1230 and followed by 1069s and 1146s.
The other node in the cluster did not lose any disks and was 100% operational the entire time. I am trying to understand why, over 45 minutes of this behavior, the SQL instance never failed over to the operational node.
Is there a configuration setting I have missed or set up incorrectly?
The issue was eventually resolved by a manual reboot of the malfunctioning node.