Channel: High Availability (Clustering) forum

SAN HPE SV3200 iScsiPrt errors crashing VMs and, after a cascade, also failover server nodes?


We have a three-node W2012 R2 failover cluster that has been running spotlessly for years with the HPE P4300 SAN, but after adding the HPE StoreVirtual SV3200 as a new SAN we are getting iScsiPrt errors that HPE Support cannot fix. They crash VMs and have also crashed two of the three failover nodes.

At first everything seemed to work, but after we added additional disks to the SAN, a SAN controller crashed. It was replaced under warranty, but now that we are moving our servers, especially the SQL 2008 servers, to the SAN, problems have started to occur. The VHDX volumes of the SQL servers are thin provisioned.

Live storage migration worked fine for non-SQL servers. Some SQL servers froze and operation halted, so we had to perform an offline move. Then, during high disk I/O and especially during backups, the W2012 R2 failover cluster started to behave erratically, eventually crashing VMs and in one instance rebooting two failover nodes, as a result of a flood of iScsiPrt errors in the event log:

System iScsiPrt event ID 27 error Initiator could not find a match for the initiator task tag in the received PDU. Dump data contains the entire iSCSI header.
System iScsiPrt event 129 warning The description for Event ID 129 from source iScsiPrt cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.

If the event originated on another computer, the display information had to be saved with the event.

The following information was included with the event:

\Device\RaidPort4

the message resource is present but the message is not found in the string/message table

System iScsiPrt event ID 39 error Initiator sent a task management command to reset the target. The target name is given in the dump data.
System iScsiPrt event ID 9 error Target did not respond in time for a SCSI request. The CDB is given in the dump data.
System iScsiPrt event 129 warning The description for Event ID 129 from source iScsiPrt cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.

If the event originated on another computer, the display information had to be saved with the event.

The following information was included with the event:

\Device\RaidPort4

the message resource is present but the message is not found in the string/message table
System iScsiPrt event ID 27 error Initiator could not find a match for the initiator task tag in the received PDU. Dump data contains the entire iSCSI header.
System FailOverClustering event id 5121 Information Cluster Shared Volume 'Volume4' ('NEMCL01_CSV04') is no longer directly accessible from this cluster node. I/O access will be redirected to the storage device over the network to the node that owns the volume. If this results in degraded performance, please troubleshoot this node's connectivity to the storage device and I/O will resume to a healthy state once connectivity to the storage device is reestablished.
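To check whether the flood lines up with backup windows or other high-I/O periods, the events can be tallied per hour. The following is a minimal Python sketch, not a vendor tool. It assumes the System log has already been exported to (timestamp, event ID) pairs. The timestamps in `SAMPLE` are made up for illustration, and the names `ISCSIPRT_EVENTS`, `count_by_hour` and `SAMPLE` are hypothetical. The descriptions are shortened from the event texts quoted above.

```python
from collections import Counter
from datetime import datetime

# Short descriptions, taken from the iScsiPrt event texts quoted in this post.
ISCSIPRT_EVENTS = {
    9: "Target did not respond in time for a SCSI request",
    27: "No match for the initiator task tag in the received PDU",
    39: "Initiator sent a task management command to reset the target",
    129: "Description not found on this node (data: \\Device\\RaidPort4)",
}


def count_by_hour(records):
    """Count events per (hour, event ID).

    records: iterable of (timestamp, event_id) pairs, where timestamp is a
    string in the assumed export format 'YYYY-MM-DD HH:MM:SS'.
    """
    counts = Counter()
    for stamp, event_id in records:
        parsed = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        hour = parsed.strftime("%Y-%m-%d %H:00")
        counts[(hour, event_id)] += 1
    return counts


# Hypothetical sample data; real input would come from an event log export.
SAMPLE = [
    ("2018-01-01 02:05:10", 27),
    ("2018-01-01 02:05:11", 129),
    ("2018-01-01 02:40:00", 27),
    ("2018-01-01 03:01:00", 9),
]

if __name__ == "__main__":
    for (hour, event_id), n in sorted(count_by_hour(SAMPLE).items()):
        label = ISCSIPRT_EVENTS.get(event_id, "unknown event ID")
        print(f"{hour}  ID {event_id:>3}  x{n}  {label}")
```

Comparing the busiest hours against the backup schedule and against the SV3200 controller logs may help show whether the resets follow load on the SAN.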

After a two-hour period of these events, the Failover Cluster services started to report errors, VMs failed, and finally two of the three nodes in our failover cluster crashed and rebooted.

So far HPE has not been able to fix this. The SV3200 logs show occasional iSCSI controller errors, but the error logging in the SVMC is minimal.

HPE Support first blamed our use of a VIP and of Sites (a label). Both are supported according to the HPE product documentation. We removed them anyway and pointed the iSCSI initiator directly at the Eth0 bond IP addresses. As the problems persisted, they then blamed our use of the LeftHand DSM MPIO driver on the initiator connections to the SV3200, which is not the case: those use the standard MS DSM. Yes, the LeftHand driver is on the system for our old SAN, but it is not configured for the SV3200 initiator sessions, which use round robin with subset.

We are currently facing a legal warranty standoff.

Any pointers or other comparable experiences with the HPE StoreVirtual SV3200 SAN?

TIA,

Fred

