Channel: High Availability (Clustering) forum

2012R2 SOFS with high Disk Response Times and poor Hyper-V VM performance


We are running a 2012R2 cluster with SOFS serving Hyper-V, on Dell hardware. In the event log under Application and Services Logs | Microsoft | Windows | SMBServer we're seeing hundreds of repeated warning events, SMBServer Event ID 1020:

File system operation has taken longer than expected.

Client Name: \\[fe80::d199:8860:d21:d7d7]
Client Address: [fe80::d199:8860:d21:d7d7%26]:49353
User Name: XXXXX\CLIUSR
Session ID: 0x80C0400000061
Share Name: \\*\b03a302b-1fdc-4c75-8c79-25d058749253-135266304$
File Name: SHARES\SW01DATAVOL2\ts15075a59.NNN.XXXXXX.com\Virtual Machines\C92B52CC-5739-4747-B6AE-CF4725B0505E\C92B52CC-5739-4747-B6AE-CF4725B0505E.vsv
Command: 11
Duration (in milliseconds): 208159633
Warning Threshold (in milliseconds): 120000

Guidance:

The underlying file system has taken too long to respond to an operation. This typically indicates a problem with the storage and not SMB.
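To put the numbers in that event in perspective, here is a minimal Python sketch that pulls the duration and threshold out of the message text and converts them to friendlier units. The function names are made up for illustration, and the text is pasted by hand from the event above rather than read from the event log.

```python
# Fields copied verbatim from the Event ID 1020 message above.
EVENT_TEXT = """\
Command: 11
Duration (in milliseconds): 208159633
Warning Threshold (in milliseconds): 120000
"""


def parse_event_fields(text):
    """Return {field name: value} for every 'Name: value' line."""
    fields = {}
    for line in text.splitlines():
        name, sep, value = line.partition(": ")
        if sep:  # skip lines that have no 'Name: value' shape
            fields[name.strip()] = value.strip()
    return fields


def summarize_duration(fields):
    """Convert the reported duration and threshold to readable units."""
    duration_ms = int(fields["Duration (in milliseconds)"])
    threshold_ms = int(fields["Warning Threshold (in milliseconds)"])
    return {
        "duration_hours": duration_ms / 3_600_000,
        "threshold_minutes": threshold_ms / 60_000,
        "times_over_threshold": duration_ms / threshold_ms,
    }


summary = summarize_duration(parse_event_fields(EVENT_TEXT))
for key, value in summary.items():
    print(f"{key}: {value:.1f}")
```

Read literally, the reported duration is a little under 58 hours against a 2 minute warning threshold, so this is far more than a marginal overrun. Whether a single operation really took that long, or the value is an artifact of how the event is logged, is not something the event text alone settles.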

The Disk Response Time (ms) values shown in Resource Monitor (opened from Task Manager) are very high, currently in the 300-1000 ms range. This is occurring on standalone 2012R2 Storage Spaces servers as well as on the clustered SOFS. VM performance is very bad, if we are able to log on at all. Most of the time the servers become inaccessible and kick the current users off the system. We previously saw this on 2012R2 clustered Storage Spaces in 2014. Is anyone else aware of this issue?

We had a MS ticket on it back in 2014, but we dropped the case when it became too time-consuming and returned the hardware. We could never get past the MS Tier 1 and Tier 2 engineers to get the ticket escalated. If I remember correctly, the issue had to do with disk cache flushing. My understanding is that MS created a patch to resolve the issue for another company, but since our ticket wasn't escalated, MS was unaware of our issue until later.

Thanks

Update to this: back in 2014, when we went to a MS meeting, the term used was "excess disk cache flushes".

This blog post covers the relevant performance counters: https://blogs.msdn.microsoft.com/clustering/2014/06/05/cluster-shared-volume-performance-counters/

For the Cluster CSV File System Flushes perf counter, the values for our 4 volumes are:

401,161     272,914  115,836   778,944    

These seem very high, but I don't have a baseline to judge them against.


Dave Kreitel


