Failover Cluster Manager - Ghost machine can't delete
Hi all,
I have a two-node Hyper-V cluster configured.
All virtual machines are on the first node and everything was working fine.
Recently I'm facing issues with live migration: (virtual machine is using processor-specific features not supported on physical computer host; to migrate this VM to a physical computer with different processors, modify the virtual machine settings to limit the processor features used by the virtual machine).
However, the processors are identical and no changes have been made. All VMs fail to live migrate from host1 to host2.
The validation report shows no errors at all.
I created a VM on host2 and live migrated it to host1, and that worked.
But all VMs fail to migrate from host1 to host2.
I'm facing a serious issue: if host1 goes down, my VMs won't fail over.
Your expertise is highly appreciated.
Thank you.
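If the hosts really do have identical CPUs, the usual workaround for this error is still to enable processor compatibility mode on the affected VMs, which restricts each VM to the baseline feature set common to the hosts (the mismatch can also come from differing BIOS/microcode versions even on identical hardware). A minimal sketch; the VM name is a placeholder, and the setting can only be changed while the VM is off:

```powershell
# Enable processor compatibility mode on one VM (VM must be off)
Set-VMProcessor -VMName "MyVM" -CompatibilityForMigrationEnabled $true

# Or apply it to every VM on the host that is currently powered off
Get-VM | Where-Object State -eq 'Off' |
    Set-VMProcessor -CompatibilityForMigrationEnabled $true
```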
All VMs pause when certain nodes own the CSV
Hi.
So I've added 2 nodes to a 6-node Server 2016 Hyper-V cluster. Hardware-wise they are the same servers (Dell 730s). At first all looked fine: VMs run on those nodes and can live migrate to and from them with no issues. But when one of these two nodes gets ownership of the CSV volume on which the VHDs of the VMs reside, all VMs on the entire cluster stop. Cluster validation returns only minor warnings about updates.
I had pending updates on the cluster when I added these nodes. I updated the two additional nodes while they weren't part of the cluster yet, and the plan was to do a CAU run once they had joined the cluster. But that fell flat when one node went into maintenance and CSV ownership switched over to one of the new nodes. Since then I have tested this on the other new node as well (on a weekend) and the same happens there.
Can these updates actually be the problem, or is there any other place I need to look?
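One way to narrow this down without waiting for another incident is to check and move CSV ownership deliberately during a maintenance window and watch what happens. A sketch with placeholder names:

```powershell
# Show which node currently owns each Cluster Shared Volume
Get-ClusterSharedVolume | Select-Object Name, OwnerNode, State

# Move ownership of one CSV to a suspect node to reproduce the issue
Get-ClusterSharedVolume -Name "Cluster Disk 1" |
    Move-ClusterSharedVolume -Node "NewNode1"
```

If moving ownership to either new node reliably reproduces the hang, that points at those nodes' storage/network stack (drivers, firmware, missing patches) rather than the VMs themselves.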
S2D cloud witness connection failure - cluster down
Hi everyone.
After 2 hours without an internet connection in a 2-node + witness S2D cluster, all virtual machines on one node entered an Unmonitored state, followed by an isolated-node condition and live migration of all the virtual machines it could move.
Is that normal behaviour? I've seen no documentation on that.
Regards
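It may help to look at the configured witness and the 2016-era VM resiliency settings, which govern how long a node may sit isolated before the cluster acts on its VMs. A quick check (read-only):

```powershell
# Which witness/quorum model is in use
Get-ClusterQuorum

# Node health at a glance
Get-ClusterNode | Format-Table Name, State

# Resiliency and quarantine settings that control isolated-node behaviour
Get-Cluster | Format-List ResiliencyDefaultPeriod, QuarantineDuration, QuarantineThreshold
```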
Error applying Replication Configuration Windows Server 2019 Hyper-V Replica Broker
Hello,
Recently we started replacing our Windows Server 2016 Hyper-V clusters with Server 2019. On each cluster we have a Hyper-V Replica Broker that allows replication from any authenticated server and stores the replica files in a default location on one of the Cluster Shared Volumes.
With WS2019 we run into the issue where we get an error applying the Replication Configuration settings. The error is as follows:
Error applying Replication Configuration changes. Unable to open specified location for replication storage. Failed to add authorization entry. Unable to open specified location to store Replica files 'C:\ClusterStorage\volume1\'. Error: 0x80070057 (One or more arguments are invalid).
When we set the default location to a CSV whose owner node is the same as the owner node of the Broker role, we don't get this error. However, I don't expect this to hold up in production (roles move to other nodes).
Has anyone run into the same issue, and what might be a solution? Did anything change between WS2016 and WS2019 that might cause this?
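It may be worth applying the same configuration from PowerShell on the node that owns the Broker role, which sometimes surfaces a more specific error than the GUI; since 0x80070057 complains about an invalid argument, trying the CSV path without the trailing backslash is also worth a shot. A sketch, with placeholder values:

```powershell
# Run on the node currently owning the Replica Broker role.
# Allow replication from any authenticated server, storing replicas on a CSV
# (note: path given without a trailing backslash)
Set-VMReplicationServer -ReplicationEnabled $true `
    -AllowedAuthenticationType Kerberos `
    -ReplicationAllowedFromAnyServer $true `
    -DefaultStorageLocation "C:\ClusterStorage\volume1"
```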
Kind regards,
Malcolm
Drive on all nodes in SQL Availability Group "Formatted" at the same time (Cluster on Windows 2016 standard)
We have a 2 node SQL Availability Group on a Windows 2016 Std Cluster.
SQL Server reported the databases suspect after the data drives on both servers appeared to have been formatted.
On one of the servers we found the following events:
Event ID 7036 on 7/26/2019 at 9:37:55AM
Event ID 98 on 7/26/2019 at 9:38:12AM
Event ID 98 on 7/26/2019 at 9:38:13AM
These appear to indicate that the drive was formatted.
We have tested and found that running the PowerShell "Format-Volume" command (locally or remotely) against one server causes the same drive on both nodes in the cluster/AG to be formatted.
One possible cause is that a server build script was run with incorrect server details, and we are investigating this possibility.
My questions are:
Has anyone experienced drives being "Formatted" simultaneously across nodes in a Clustered SQL AG?
Is formatting a drive in an Availability Group supposed to affect all nodes? I've not found documentation explaining this.
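While the build-script theory is being investigated, it may help to make any scripted format defensive, refusing to run unless the target volume looks like the expected empty scratch disk. A hypothetical guard (drive letter and label are placeholders, not from the original post):

```powershell
# Only format if the volume carries the expected label and holds no files
$vol = Get-Volume -DriveLetter D
$fileCount = (Get-ChildItem "D:\" -Force -ErrorAction SilentlyContinue |
              Measure-Object).Count
if ($vol.FileSystemLabel -eq 'SCRATCH' -and $fileCount -eq 0) {
    Format-Volume -DriveLetter D -FileSystem NTFS -Confirm:$false
} else {
    Write-Warning "D: is not the expected empty scratch disk; skipping format."
}
```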
Storage Spaces Direct: Number of volumes per cluster?
In Planning volumes in Storage Spaces Direct it says:
We recommend making the number of volumes a multiple of the number of servers in your cluster. For example, if you have 4 servers, you will experience more consistent performance with 4 total volumes than with 3 or 5. This allows the cluster to distribute volume "ownership" (one server handles metadata orchestration for each volume) evenly among servers.
How seriously should I take this?
Can someone quantify the actual real world performance benefit of adhering to this recommendation? Or is it (as I suspect) a more theoretical benefit?
We have used S2D for a while now, and I am tending more and more towards creating as few volumes as possible, simply to avoid allocating too much space to one volume and then later needing that space in another volume because growth did not go as expected.
Now: If it was easy/possible to shrink a volume, then it would be another matter. But I am not aware of any option for that.
We currently have 3 volumes: a 3-way mirror volume and two single-parity volumes. We have just added a 4th server/node, and I want to change everything to dual parity (with mirror acceleration). I am very tempted to create just one single volume, or maybe two. Not four.
Thoughts?
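The recommendation is about spreading CSV metadata ownership evenly, and you can observe the current distribution directly; with fewer volumes than nodes, some nodes simply never coordinate any CSV metadata. A quick read-only check:

```powershell
# Count how many CSVs each node currently owns
Get-ClusterSharedVolume |
    Group-Object -Property OwnerNode |
    Select-Object Name, Count
```

With one volume on four nodes, all metadata orchestration funnels through a single owner node; whether that matters in practice depends on how metadata-heavy your workload is, which is why the documented benefit is about consistency rather than a fixed percentage.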
Will it cause any problems if two cluster nodes run with two different OS patch versions?
Hi experts,
If my memory serves me right, Microsoft's best practice has always been that all nodes in a cluster should run the same OS patch version. We have decided to patch one node first and patch the second a week later. Would this cause any malfunction in a Windows Server cluster?
-------------------------------------
TL;DR (too long; didn't read)
"Since Microsoft has decided to deliver cumulative updates only (I don't know, maybe in the last two years), I expect Windows updates to take a very long time. I downloaded the Nov 2019 cumulative update and its size is an incredible 1.2 GB. Wow! I guess 80-90% are feature updates and maybe 10% or less are security updates. Why doesn't Windows split feature updates from security updates? Installing feature updates on Windows Server is really unnecessary.
Also, Microsoft recommends that all nodes in a Windows failover cluster have the same patch level, which causes a serious problem: your scheduled/planned downtime increases significantly, maybe from 10 minutes to 30 minutes or more, and if you add high-end hardware reboot time (cold start), that's another 15 minutes.
All this is not acceptable. We have decided to patch one node first and patch the second a week later. Would this cause any malfunction in a Windows Server cluster? Sorry, this question is off topic; it is not a Windows Update issue any more, it is a cluster issue. I will post this question in Windows Server Failover Cluster. Sorry for any inconvenience."
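Running mixed patch levels briefly during a rolling patch cycle is generally tolerated, and Cluster-Aware Updating exists precisely to automate it: it drains one node at a time, patches and reboots it, then moves on, so roles stay online throughout. A sketch (cluster name is a placeholder, and the CAU feature must be installed):

```powershell
# Patch nodes one at a time, draining roles before each reboot
Invoke-CauRun -ClusterName "MyCluster" `
    -CauPluginName "Microsoft.WindowsUpdatePlugin" `
    -MaxFailedNodes 0 -RequireAllNodesOnline -Force
```

The week-long gap is the more questionable part: the shorter the window in which nodes differ, the smaller the exposure if a role has to fail over mid-cycle.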
SOFS with Storage Spaces cluster
Guys,
I have a Windows 2012 R2 SOFS cluster on top of a Storage Spaces cluster with tiered storage and all. I am planning to upgrade it to Server 2016, but there is little to no info on upgrading a Storage Spaces cluster, so I thought someone who might have done it could help?
Cheers
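Cluster OS Rolling Upgrade is supported from 2012 R2 to 2016: nodes are evicted, reinstalled on 2016, and re-added one at a time while the cluster keeps running in mixed mode, and the change is only committed at the end. A sketch of the final steps, assuming all nodes are already on 2016:

```powershell
# Confirm the cluster is still at the downlevel functional level
Get-Cluster | Format-List ClusterFunctionalLevel

# Commit the upgrade (irreversible once run)
Update-ClusterFunctionalLevel

# For Storage Spaces, also bring the pool(s) up to the new version afterwards
Get-StoragePool | Where-Object IsPrimordial -eq $false | Update-StoragePool
```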
Install downlevel "FailoverClusters" module on Server 2016
Quick version: I wrote a bunch of monitoring scripts that run on our VMM server; they all start with "Import-Module FailoverClusters" on the first line.
The server was in-place upgraded to Server 2016, and now all the scripts fail with:
- get-vm : The Hyper-V module used in this Windows PowerShell session cannot be used for remote management of the server 'HVSQLTCUCS04'. Load a compatible version of the Hyper-V module
Some internet articles suggest I should be able to do this:
- Import-Module FailoverClusters -RequiredVersion 1.1
But the response is not ideal:
- Import-Module : The specified module 'FailoverClusters' with version '1.1' was not loaded because no valid module file was found in any module directory.
also this:
- Get-Module -Name FailoverClusters -listavailable
Directory: C:\WINDOWS\system32\WindowsPowerShell\v1.0\Modules
ModuleType Version Name ExportedCommands
---------- ------- ---- ----------------
Manifest 2.0.0.0 FailoverClusters {Add-ClusterCheckpoint, Add-ClusterDisk, Add-ClusterFileSe...
I tried "Find-Module" and similar; they all fail with an internet error:
- WARNING: Unable to download from URI 'https://go.microsoft.com/fwlink/?LinkID=627338&clcid=0x409' to ''.
WARNING: Unable to download the list of available providers. Check your internet connection.
PackageManagement\Install-PackageProvider : No match was found for the specified search criteria for the provider 'NuGet'. The package provider requires 'PackageManagement' and 'Provider' tags. Please
check if the specified package has the tags.
So, how can I get a 2012 R2-compatible version of the "FailoverClusters" module loaded on my 2016 server, preferably via a manual copy rather than an automatic internet download (which may be blocked by policy or firewall)?
Thanks in advance!
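One internet-free approach, assuming a 2012 R2 machine is still reachable, is implicit remoting: import that machine's FailoverClusters (or Hyper-V, which is what the quoted `get-vm` error actually comes from) module over a PSSession, so the downlevel cmdlets execute remotely while appearing local. A sketch with a placeholder hostname:

```powershell
# Create a session to a remaining 2012 R2 box and proxy its module locally
$s = New-PSSession -ComputerName "OLD2012R2"
Import-Module FailoverClusters -PSSession $s -Prefix Old

# The proxied cmdlets are available with the chosen prefix, running remotely
Get-OldClusterNode -Cluster "MyCluster"
```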
2-node (Hyper-V) Failover Cluster dependency on Domain Controllers, DNS Servers, File Share Witness server
We have recently configured a two-node Failover Cluster for a client (a large multi-campus university). It is a Storage Spaces Direct cluster and it runs HyperV virtual machines. It is Windows Server 2016.
The two servers sit in the same rack and are connected through two switches, which also sit in that rack. We were hoping that with this configuration the virtual machines would have fairly decent availability.
Yesterday the client had a mishap during an attempt to update firmware in a router. This router connects the cluster with the rest of their infrastructure, including:
- DNS Servers
- Domain Controllers
- The file server, which we use as a file share witness in the cluster.
The result of losing that connection for 5-10 minutes was that all the virtual machines in the cluster stopped abruptly (no proper shutdown). They were automatically started again once the network connection was re-established, but obviously it was not a nice experience.
A few questions:
Is this expected behavior?
To what degree are failover clusters dependent on access to domain controllers, DNS Servers and witnesses (in the two-node configuration) for their continued operations?
Could the stoppage of all virtual machines have been avoided if the file server that acts as the cluster witness had been sitting inside the same rack, directly connected to the same two switches as the cluster servers? I am thinking it would not help, because the witness was added by its share name, so reaching it likely depends on a DNS lookup.
Would it even help to add more nodes to the cluster? I realise that many frown upon the 2-node setup, but I suspect that having more nodes would not help in this case.
Are there recommendations for how to do something like this? Should we add a domain controller as a virtual machine in the cluster? Would that have avoided the vm stoppage? The folks that manage the AD at the client are apparently very restrictive about locations of domain controllers, so this is not something that we can easily do.
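To establish what the cluster itself decided during the outage (quorum loss vs. isolated nodes vs. witness arbitration), the quorum configuration and the cluster debug log for that window are the places to start. A read-only sketch:

```powershell
# Current witness/quorum configuration
Get-ClusterQuorum

# Dump the cluster debug log covering the last 15 minutes to a folder
Get-ClusterLog -TimeSpan 15 -Destination "C:\Temp"
```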
How to Evict and Re-add node to 2016 Two Node Cluster
Hi All,
Can anyone let me know the process for evicting a node from the cluster and re-adding it again?
This is 2 node 2016 Cluster.
Paramesh KA
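Assuming the node's roles can be drained first, the sequence below suspends it, evicts it, and later re-adds it; node and cluster names are placeholders:

```powershell
# Drain all roles off the node first
Suspend-ClusterNode -Name "Node2" -Drain

# Evict it from the cluster
Remove-ClusterNode -Name "Node2" -Force

# Later, re-add it (ideally after re-running cluster validation)
Add-ClusterNode -Cluster "MyCluster" -Name "Node2"
```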
diskspd \ Mass Storage
Is anyone familiar with the diskspd utility? I have had success using it to test server storage, but how would I test a mass-storage device like a Compellent storage cluster?
Can I create an iSCSI connection to the storage from the server?
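Yes: diskspd doesn't care where the disk comes from. As long as a Compellent LUN is presented to the server (iSCSI or FC) and mounted with a drive letter, you can point diskspd at a test file on it. A sample invocation; the parameters and drive letter are illustrative and should be tuned to your workload:

```powershell
# 60 s run, 64 KiB random IO, 70/30 read/write, 8 threads x 32 outstanding,
# caching disabled (-Sh), latency stats (-L), against a 100 GiB test file
.\diskspd.exe -c100G -b64K -d60 -t8 -o32 -r -w30 -Sh -L T:\disktest.dat
```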
Hyper-V - Migrating guest VMs between two different clusters (Cluster 1 to Cluster 2)
For instance, Servers A and B are part of the same failover cluster 1, Servers C and D are part of failover cluster 2.
Cluster 1 runs Windows 2012 R2 (same SAN, with a separate LUN configured as the quorum disk witness).
Cluster 2 runs Windows 2016 (same SAN, with a separate LUN configured as the quorum disk witness).
We want to migrate the guest VMs from Cluster 1 to Cluster 2. Can anyone advise on the steps? Or is there any option to merge the clusters and do the equivalent of a vMotion?
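Clusters can't be merged, but shared-nothing live migration works from 2012 R2 hosts up to 2016 hosts (not the reverse), provided live migration and the appropriate authentication/delegation are configured between the hosts. The rough per-VM shape, with placeholder names:

```powershell
# 1. On the old cluster: stop clustering the VM (the VM keeps running)
Remove-ClusterVirtualMachineRole -Name "MyVM"

# 2. Shared-nothing live migration to a node of the new cluster,
#    landing the files on its CSV
Move-VM -Name "MyVM" -DestinationHost "ServerC" `
    -IncludeStorage -DestinationStoragePath "C:\ClusterStorage\Volume1\MyVM"

# 3. On the new cluster: make the VM highly available again
Add-ClusterVirtualMachineRole -VMName "MyVM"
```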
NLB (Load Balancing) for a cross-domain environment
Hi,
I want to set up load balancing on Windows 2016 across different domains.
e.g.
Cluster domain: example.com
Host1: first.example.com
Host2: second.example.com
Is it possible? And if so, how?
Regards
JH
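NLB operates at the network layer and does not strictly require that the hosts share a domain; the cmdlets just need management access to both hosts, and the hosts need working name resolution and a shared subnet for the cluster IP. A sketch with placeholder names and addresses, assuming the NLB feature is installed on both hosts and credentials work across the domains:

```powershell
# On first.example.com: create the NLB cluster
New-NlbCluster -InterfaceName "Ethernet" -ClusterName "web.example.com" `
    -ClusterPrimaryIP 192.168.1.50 -OperationMode Multicast

# Join the second host from the other domain
Add-NlbClusterNode -InterfaceName "Ethernet" `
    -NewNodeName "second.example.com" -NewNodeInterface "Ethernet"
```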
Resources in Failover clustering randomly go offline
We are using a 4 node Windows 2012 Microsoft cluster for our file servers in our environment.
For the third time this month we have noticed that some disks went offline in the Cluster Admin console.
Failover did not happen, and the resources were not available until we manually brought them back online.
I shall send a screenshot in the next message as I can't do it here.
Note: the storage is HPE 3PAR. We are using thin provisioning on the disks, and the disks keep filling up, resulting in auto-extend. I would also like to know if there is a Windows configuration option to tell Windows not to take the LUN offline when the 3PAR NAKs the I/O.
The cluster event logs show the following sequence for the disks going offline:
Warning event ID 51 was reported last night: An error was detected on device \Device\Harddisk18\DR18 during a paging operation.
Error event ID 150 was triggered: Disk 18 has reached a logical block provisioning permanent resource exhaustion condition.
Eventually it resulted in error events 1038 and 1069: Ownership of cluster disk 'XXXPrd7' has been unexpectedly lost by this node. Run the Validate a Configuration wizard to check your storage configuration.
Inputs appreciated. Thanks.
- Shailesh
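To correlate the thin-provisioning exhaustion events with the cluster failures across all four nodes, something like the following pulls the relevant event IDs from every node into one timeline:

```powershell
# Collect disk warning (51), thin-provisioning exhaustion (150)
# and cluster disk failure (1038, 1069) events from every node
$ids = 51, 150, 1038, 1069
Get-ClusterNode | ForEach-Object {
    Get-WinEvent -ComputerName $_.Name -FilterHashtable @{
        LogName = 'System'; Id = $ids
    } -ErrorAction SilentlyContinue
} | Sort-Object TimeCreated |
    Format-Table TimeCreated, MachineName, Id -AutoSize
```

Event 150 is a SCSI "permanent resource exhaustion" condition reported by the array, so if it consistently precedes 1038/1069, the fix likely lies on the 3PAR side (pool headroom, auto-extend thresholds) rather than in a Windows setting.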
How much can performance be lowered running on Redirected Mode?
Recently we received some complaints about general VM performance on a 2-node Hyper-V cluster.
We're using a file system driver, visible via FLTMC as Legacy, and this particular file system driver (ArcServe) is putting the CSV into redirected mode.
How much performance could we gain by putting the CSV back into direct access mode?
Are we talking about 10%? 100%?
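The impact varies too much with workload and interconnect to quote a single number, but you can at least confirm which CSVs are redirected, and why, before and after removing the filter driver:

```powershell
# StateInfo shows Direct vs FileSystemRedirected/BlockRedirected;
# the reason fields explain what forced redirection
Get-ClusterSharedVolumeState |
    Format-List Name, Node, StateInfo, FileSystemRedirectedIOReason
```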
Issue in cluster while loading web-based portal
I am getting the issue below while loading a web-based portal on one of the nodes; please guide:
Cluster network name resource 'Cluster Name' encountered an error enabling the network name on this node. The reason for the failure was:
unable to login computer object "Cluster Name"
Event ID 1789
Ensure the domain controller is accessible to this node within the configured domain
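Event 1789-type failures usually mean the node cannot reach a domain controller, or the cluster's computer object (CNO) in AD is disabled or broken. Worth checking from the failing node; the domain name is a placeholder, and "Cluster Name" is the default resource name:

```powershell
# Verify the node's secure channel to the domain
nltest /sc_query:yourdomain.com

# Check and retry the cluster network name resource
Get-ClusterResource -Name "Cluster Name" | Format-List State, OwnerNode
Start-ClusterResource -Name "Cluster Name"
```

If the secure channel test fails, fix DC reachability first; if it passes, check in AD that the CNO account exists and is enabled before retrying the resource.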
Windows Server 2012 cluster
Hello everyone, I need some help with a cluster. Currently I have a cluster set up with 2 hosts, in the abc.com domain. Now the company has asked me to change the domain to xyz.com. I already have a domain controller for xyz.com; it is a virtual machine.
My question is: how do I move the two cluster hosts from abc.com to xyz.com and keep the cluster configuration? Or do I have to join these two hosts to xyz.com and reconfigure the cluster from scratch? The xyz.com domain controller is a VM sitting on the same cluster.
Any help is appreciated.
Problems on deleting big files from CSV volumes
I have a 2-node HCI cluster; the volumes are CSVFS_ReFS. I'm using DiskSpd to test storage performance. The results of the tests are OK, but when I delete the file created by the test (around 500 GB) the volume hangs, and it takes 15 minutes for the volume to become responsive again. The problem does not happen with small files.
Any idea what is causing this issue?