Channel: High Availability (Clustering) forum

W2008 R2 SP1 - Add node validate cluster - losing disk through event 1568


We had a problem adding a third node to our existing cluster: the operation failed with a communication timeout. We therefore chose to update the servers and retry at the latest fix levels.

When validating the cluster in order to add a third node, we saw in the validation log:

List disks visible to two or more nodes that will be validated for cluster compatibility. Online clustered disks will be excluded.

Disk with identifier 6390744f has a Persistent Reservation on it. The disk might be part of some other cluster. Removing the disk from validation set

Disk with identifier ca0db766 has a Persistent Reservation on it. The disk might be part of some other cluster. Removing the disk from validation set

And:

Cluster disk 8 from node SVR01.domain.local has 8 usable path(s) to storage target
Cluster disk 8 is not managed by Microsoft MPIO from node SVR02.domain.local

Cluster disk 8 is not managed by Microsoft MPIO from node svr03.domain.local

There are 11 disks, so 2 were excluded from validation and 1 disk failed the MPIO check. That is strange, because MPIO is definitely in use on SVR02, an existing cluster node.
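For anyone who wants to pull these lines out of a long validation report, here is a minimal sketch in Python. It assumes the report text has been saved as plain text, and the function name `summarize` is just an illustration; the two message patterns are copied from the log lines quoted above.

```python
import re
from collections import defaultdict

# Message patterns copied from the validation log lines quoted above.
PR_RE = re.compile(r"Disk with identifier (\w+) has a Persistent Reservation on it")
MPIO_RE = re.compile(r"Cluster disk (\d+) is not managed by Microsoft MPIO from node (\S+)")


def summarize(report_text):
    """Return (reserved_ids, not_mpio) found in the report text.

    reserved_ids: disk identifiers excluded because of a Persistent Reservation.
    not_mpio: maps a cluster disk number to the nodes reporting it as not MPIO-managed.
    """
    reserved_ids = PR_RE.findall(report_text)
    not_mpio = defaultdict(list)
    for disk, node in MPIO_RE.findall(report_text):
        not_mpio[int(disk)].append(node)
    return reserved_ids, dict(not_mpio)


# Sample input: the log lines from this post (assumed to be one message per line).
sample = """\
Disk with identifier 6390744f has a Persistent Reservation on it. The disk might be part of some other cluster. Removing the disk from validation set
Disk with identifier ca0db766 has a Persistent Reservation on it. The disk might be part of some other cluster. Removing the disk from validation set
Cluster disk 8 from node SVR01.domain.local has 8 usable path(s) to storage target
Cluster disk 8 is not managed by Microsoft MPIO from node SVR02.domain.local
Cluster disk 8 is not managed by Microsoft MPIO from node svr03.domain.local
"""

reserved, not_mpio = summarize(sample)
print(reserved)
print(not_mpio)
```

This only summarizes what the report already says; it does not query the cluster or the storage in any way.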

And on every SVR node:

Getting SCSI page 83h VPD descriptors for cluster disk 8 from node SVR01.domain.local

SCSI page 83h VPD descriptors for cluster disk 8 and 9 match

SCSI page 83h VPD descriptors for cluster disk 8 and 10 match

At the end of this test:

An error occurred while executing the test.
Specified argument was out of the range of valid values.
Parameter name: percentage

So the validation test failed. We checked the cluster event log and saw no errors, only some warnings, and everything was online. We then logged in to the VMs to check their event logs, and on one server we were greeted by a screen saying that a disk needed its MBR to be set.

When we checked the disk in Disk Management on the node, it showed as unallocated with status Reserved. Under the Storage resource of the cluster, the disk shows as online, but the volume path is missing.

When looking at the cluster event log we can see:

Event 1568 - Cluster disk resource 'SQLProd_Log' found the disk identifier to be stale. This may be expected if a restore operation was just performed or if this cluster uses replicated storage. The DiskSignature or DiskUniqueIds property for the disk resource has been corrected.

This is a pass-through disk, and it is the same disk the VM wanted to set the MBR on.

We removed the storage resource, the disk, the MPIO configuration and the SAN volume. We then exposed a new SAN volume, set up MPIO and the disk, added the new storage resource and restored the data.

What can cause validating a cluster to create such a potentially disastrous problem?

TIA,

Fred

 



