Hi,
We have had major issues updating our hyperconverged S2D cluster with the May 17th 2018 Update Rollup.
The issue occurred on each cluster node in turn, while the node was shutting down to reboot.
The cluster pool appeared to partially fail, and some virtual machines crashed, failed over and restarted each time a node was rebooted to apply the update rollup.
Firstly, some background. This is a 4 node cluster with fully validated Dell R730XD servers. All cluster validation tests pass, including 'Verify Node & Disk Configuration' for SES supported config. We have also verified and validated our network configuration and switches with Dell.
We ensured no storage jobs were running and that all virtual and physical disks were healthy. The file share witness was online and available during the patching.
We paused one node, applied the update rollup and, after it installed successfully, clicked to reboot the node. As the node was shutting down, we got the following events:
Event ID: 1289: Source: Microsoft-Windows-FailoverClustering.
The Cluster Service was unable to access network adapter "Microsoft Failover Cluster Virtual Miniport". Verify that other network adapters are functioning properly and check the device manager for errors associated with adapter "Microsoft Failover Cluster Virtual Miniport". If the configuration for adapter "Microsoft Virtual Miniport" has been changed, it may become necessary to reinstall the failover clustering feature on this computer.
*******************************************************
Event ID: 5395: Source: Microsoft-Windows-FailoverClustering.
Cluster is moving the group for storage pool 'Cluster Pool 1' because current node 'HYPER2' does not have optimal connectivity to the storage pool physical disks.
***************************
I noted that event ID 5395 never referred to the node that was being patched or rebooted; it always named another node in the cluster.
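For anyone who wants to check for the same pattern in their own logs, here is a minimal Python sketch. It assumes the cluster events have already been exported as (event ID, message) pairs, and that the 5395 message always follows the wording quoted above. The function names and the input format are my own, for illustration only, and are not part of any Microsoft tooling.

```python
import re
from collections import Counter

# Pattern taken from the event 5395 text quoted above. That the wording is
# identical in every 5395 event is an assumption.
NODE_RE = re.compile(r"current node '([^']+)' does not have optimal connectivity")


def nodes_in_5395(events):
    """Count how often each node is named in event 5395 messages.

    `events` is a hypothetical list of (event_id, message) tuples,
    for example built from an exported event log.
    """
    counts = Counter()
    for event_id, message in events:
        if event_id != 5395:
            continue  # only the storage pool move events are of interest
        match = NODE_RE.search(message)
        if match:
            counts[match.group(1)] += 1
    return counts


def names_rebooting_node(events, rebooting_node):
    """Return True if any 5395 event names the node that was being rebooted."""
    named = {node.upper() for node in nodes_in_5395(events)}
    return rebooting_node.upper() in named
```

With the single 5395 message quoted above, `nodes_in_5395` counts 'HYPER2' once, and `names_rebooting_node` returns False for any other node name, which matches what we saw.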
After the reboot, once the node had rejoined the cluster, the repair jobs ran and completed successfully. When we carried out the same procedure on the other nodes, the same issue occurred.
Has anyone else experienced these issues? We are tearing our hair out, as Dell cannot find any issues and our customer has completely lost confidence in Storage Spaces Direct due to its constant instability.
Thanks,
Microsoft Partner