
Performance degradation after adding Storage Spaces pool into Failover Cluster


Good day!

We have run into a performance degradation problem after adding a Storage Spaces pool to a Failover Cluster. Outside the Failover Cluster, the Storage Spaces pool works just fine.

We have two servers running Windows Server 2012 R2 Standard with a JBOD connected by SAS. On one of them we created a storage pool made of 72 SAS drives (12 SAS SSD 800 GB and 60 SAS HDD 1.2 TB). The pool contains four Virtual Disks (Spaces) with the same configuration: 2-way mirror with tiering, 1 GB write-back cache, 4 columns, 64 KB interleave. The pool also contains a quorum Virtual Disk (witness disk) with the following configuration: 3-way mirror without tiering or write-back cache, 4 columns, 64 KB interleave.
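For reference, a rough PowerShell sketch of how a pool and Spaces with this layout can be built; the friendly names, subsystem lookup and sizes below are illustrative assumptions, not our exact commands:

# Pool from all poolable SAS disks on the JBOD (names are placeholders)
$subsys = Get-StorageSubSystem -FriendlyName "*Storage Spaces*"
$disks  = Get-PhysicalDisk -CanPool $true
New-StoragePool -FriendlyName "Pool01" -StorageSubSystemFriendlyName $subsys.FriendlyName -PhysicalDisks $disks

# Tier definitions
$ssdTier = New-StorageTier -StoragePoolFriendlyName "Pool01" -FriendlyName "SSDTier" -MediaType SSD
$hddTier = New-StorageTier -StoragePoolFriendlyName "Pool01" -FriendlyName "HDDTier" -MediaType HDD

# One of the four identical Spaces: 2-way mirror, 4 columns, 64 KB interleave, 1 GB write-back cache
New-VirtualDisk -StoragePoolFriendlyName "Pool01" -FriendlyName "VD01" `
    -StorageTiers $ssdTier,$hddTier -StorageTierSizes 1TB,8TB `
    -ResiliencySettingName Mirror -NumberOfDataCopies 2 `
    -NumberOfColumns 4 -Interleave 64KB -WriteCacheSize 1GB

# Witness Space: 3-way mirror, no tiering, no write-back cache (size is an assumption)
New-VirtualDisk -StoragePoolFriendlyName "Pool01" -FriendlyName "Witness" -Size 2GB `
    -ResiliencySettingName Mirror -NumberOfDataCopies 3 `
    -NumberOfColumns 4 -Interleave 64KB -WriteCacheSize 0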

We tested the performance of both Virtual Disk tiers (SSDTier and HDDTier) with Iometer. The results were great, just as expected: high IOPS and low latencies. For the tests we pinned the test files to a tier with Set-FileStorageTier and then ran Optimize-Volume -TierOptimize.
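The pinning was done per test file, roughly like this (the path, drive letter and tier name are placeholders for illustration):

# Pick the SSD tier that belongs to the volume under test
$ssdTier = Get-StorageTier | Where-Object FriendlyName -like "*SSDTier*" | Select-Object -First 1

# Pin the Iometer test file to that tier, then move it there immediately
Set-FileStorageTier -FilePath "D:\iometer\test_ssd.dat" -DesiredStorageTier $ssdTier
Optimize-Volume -DriveLetter D -TierOptimize

# Confirm the placement
Get-FileStorageTier -VolumeDriveLetter D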

We then created a Failover Cluster from the two servers. The Cluster Validation Tests reported no problems. After that we added all Virtual Disks to the cluster and assigned the witness disk.
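In PowerShell terms the cluster part looked roughly like this (node and cluster names, the IP address and the witness resource name are placeholders; the pool itself was brought into the cluster through Failover Cluster Manager):

Test-Cluster -Node "NODE1","NODE2"
New-Cluster -Name "FSCLUSTER" -Node "NODE1","NODE2" -StaticAddress "10.0.0.50"

# Once the storage pool is in the cluster, the Spaces show up as available disks
Get-ClusterAvailableDisk | Add-ClusterDisk

# Two-node cluster, so node-and-disk-majority quorum with the 3-way mirror Space as witness
Set-ClusterQuorum -NodeAndDiskMajority "Cluster Virtual Disk (Witness)"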

With Failover Cluster Manager we added four "File Server for general use" roles (not Scale-Out File Server) and assigned a separate Virtual Disk to each of them.
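The same step can also be done from PowerShell; a sketch for one of the four roles (the role name, disk resource name and IP address are placeholders):

Add-ClusterFileServerRole -Name "FS01" -Storage "Cluster Virtual Disk (VD01)" -StaticAddress "10.0.0.51"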

Running the same Iometer performance tests again, we saw a noticeable performance degradation on all Virtual Disks. Analysis of the results showed that the root cause of the regression is a sharp increase in HDDTier latency (2 to 5 times higher, starting from queue depth = 1) for both read and write operations.

We decided to destroy the cluster and completely clear the Storage Spaces pool configuration. Then we recreated the pool and Virtual Disks with the same configuration. The new Iometer results were fine. After that we recreated the Failover Cluster and added the disks to it, this time without adding any File Server roles. Once again the performance tests showed increased latencies (the same 2 to 5 times).

We repeated the experiment several times with the same result: performance degraded right after the pool and Virtual Disks were added to the Failover Cluster. It became obvious that the cluster is the cause of the degradation.

We ran a full hardware test with the ValidateStorageHardware.ps1 PowerShell script (https://gallery.technet.microsoft.com/scriptcenter/Storage-Spaces-Physical-7ca9f304) and it didn't find any problems.

We switched the testing tool to DiskSpd. Its results were slightly better than Iometer's, so we concluded that Iometer doesn't work correctly with clustered drives.
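For reference, the DiskSpd runs were of this general form (the parameters below are illustrative, not our exact test profile): 8 KB random I/O, 30% writes, 4 threads, queue depth 8, software and hardware caching disabled, latency statistics enabled.

.\diskspd.exe -c20G -d60 -b8K -o8 -t4 -r -w30 -Sh -L E:\testfile.dat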

We decided to perform high-load testing of the cluster in the production environment. As soon as the SSDTier filled up and the HDDTier started being used, we began to receive complaints from our clients. Perfmon showed high latencies (20 ms and more) even though the workload was not very high.
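The latencies come from the standard PhysicalDisk counters; the same numbers can be sampled from PowerShell instead of the Perfmon GUI, for example:

Get-Counter -Counter "\PhysicalDisk(*)\Avg. Disk sec/Read",
                     "\PhysicalDisk(*)\Avg. Disk sec/Write" -SampleInterval 5 -MaxSamples 12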

In our production environment we have another Storage Spaces File Server (not part of a cluster). It is based on a 30-drive pool (10 SATA SSD plus 20 SAS HDD) and works just fine: HDDTier latencies never rise above 6-8 ms even though the workload is much higher than on the new server. The workload pattern is the same.

Our new pool and Virtual Disks were created according to best-practice advice and recommendations:

- drive count not more than 80 (we have 72)

- Virtual Disk capacity not more than 10 TB (ours is about 9 TB: 1 TB SSDTier + 8 TB HDDTier)

The Virtual Disks were created with Fast Rebuild compatibility in mind (1 SSD and 2 HDDs were reserved as free space).

The write-back cache is 1 GB. The disks' own caching option is disabled.
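In case someone wants to see the exact layout, the settings above can be pulled with something like the following (output trimmed to the relevant properties):

Get-VirtualDisk | Select-Object FriendlyName, ResiliencySettingName, NumberOfDataCopies,
    NumberOfColumns, Interleave, WriteCacheSize

Get-PhysicalDisk | Select-Object FriendlyName, MediaType, Usage, HealthStatus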

 

Can anybody help us with this cluster situation?

Things we have already tried:

- checked the influence of the write-back cache size: it had no effect

- changed the SAS HDD MPIO policy (RR by default; tried LB and FOO, see the sketch after this list): it had no effect

- checked the disks' own write-cache policy settings (it is now turned off on all our disks): also no effect
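The MPIO policy changes from the second item were made with the MSDSM cmdlets, roughly like this (run on each node; requires the MPIO feature):

Get-MSDSMGlobalDefaultLoadBalancePolicy               # RR by default
Set-MSDSMGlobalDefaultLoadBalancePolicy -Policy LB    # FOO was tried the same way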

The cluster and pool configurations were cleared each time with Clear-SdsConfig.ps1 (https://gallery.technet.microsoft.com/scriptcenter/Completely-Clearing-an-ab745947).


