Channel: High Availability (Clustering) forum
Viewing all 5654 articles

CNO missing from ADUC and live migrations failing

New to failover clustering on Windows, so this may be a fairly basic question.

Two nodes in a failover cluster running Windows Server 2016, and for some reason the CNO is not in ADUC (I don't know whether it was ever there or whether it was deleted, as I did not build this cluster). Live migrations were failing, so I built a test lab, replicated the exact problem, and tried two different things:

1) Created an entry in ADUC in the Computers container, then within FCM took the cluster name resource offline, ran a repair, and brought it back online. This allowed live migrations to work, but Server Manager -> All Servers still complains about not being able to find the CNO (a Kerberos authentication error is reported), and if I try to add it manually using "Add Servers" it cannot find the CNO in AD.

I suspect this is because when the entry is created automatically during cluster setup, more happens than what I did by simply adding an entry manually.
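For anyone comparing notes, the documented pre-staging procedure does indeed involve more than just creating a computer object. A rough sketch (the OU path and cluster name here are hypothetical placeholders, not from my environment):

```powershell
# Sketch of CNO pre-staging (hypothetical names). Unlike a plain ADUC
# entry, the object should be created *disabled*, and the nodes'
# computer accounts (or the account creating the cluster) need explicit
# permissions on it before running Repair on the Name resource in FCM.
New-ADComputer -Name "CLUSTER1" -Path "OU=Clusters,DC=contoso,DC=com" -Enabled $false
```

After granting permissions on the object, take the cluster Name resource offline in FCM and use More Actions -> Repair to re-sync it. I'm not presenting this as the fix, just the documented shape of pre-staging.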

2) Destroyed the cluster and recreated it, which resulted in everything working correctly: not just live migrations, but the CNO is now displayed as online in Server Manager -> All Servers, and the CNO entry was created automatically in the ADUC Computers container.

So my question is: do I have to rebuild the cluster to avoid future problems? With option 1) I can do live migrations, but the fact that I am still getting errors in Server Manager and the CNO is not online makes me think I may face other issues further down the line.

Is there a way to go with option 1 and resolve all the issues?

If not, and rebuilding the cluster is the best thing to do, are there any gotchas I should look out for? (My test lab rebuild went fine, but the production setup is a lot more complicated.)

Many thanks for any pointers. 

Failover cluster very slow


Hi everyone,

I took over admin tasks after one of our admins left. He was responsible for the failover cluster, and I heard that users complained a lot. The main issue is that it is very, very slow. I logged in to check how it looks.

It is a 2-node cluster with a witness disk.

Under Networks it has two networks, both allowing cluster and client traffic. There is no dedicated heartbeat network for cluster communication. Both NICs are in NIC Teaming mode.

Storage: two CSVs are configured, and each node owns one. The storage type is SAN, directly attached to the servers.

How do I troubleshoot slow performance? What should I think about, and where should I start?
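Not an answer, but a hedged sketch of the first data I would collect on such a cluster (standard FailoverClusters cmdlets; the output path is a placeholder):

```powershell
# How the cluster classifies its networks (Role 1 = cluster only,
# Role 3 = cluster and client) and which it prefers (lower Metric)
Get-ClusterNetwork | Format-Table Name, Role, Metric, AutoMetric

# CSV redirected access is a common cause of slow storage I/O
Get-ClusterSharedVolumeState

# Generate a cluster log covering the last hour for offline analysis
Get-ClusterLog -TimeSpan 60 -Destination C:\Temp
```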

If you need more info, just ask. Thank you all.


S2D 2 node cluster


Hello,

We have a 2-node S2D cluster with Windows Server 2019. Between the two nodes we have a directly connected RDMA storage network (Cluster Only) and a client-facing network based on LACP teaming on each node (Cluster and Client). We have done a failover test and it works: when we power off one node, the virtual machines migrate to the other host as expected.

But when we unplug the client-facing adapters (two adapters in LACP) on the node where the VMs reside, VM migration fails, and after some time the Cluster Network Name and Cluster IP Address resources also fail. When we plug the client-facing adapters back in on the failed node, the cluster IP address recovers and the VM client network works again.

So the problem: cluster migration fails after an unexpected shutdown of the client-facing network on the node where the VMs reside. The nodes can still communicate with each other through the storage network, and all nodes show as Up in Failover Cluster Manager. When the client network is down, the VMs should migrate to the other node with a working client-facing network, but instead the cluster fails and the VMs do not migrate. Where can we fix this behaviour? Has anyone seen this before?

2-Node Stretch Cluster with Storage Replica?

I want to create a 2-node stretch cluster with Storage Replica. I was able to set up the replica, but once I create the cluster it breaks the Storage Replica; I'm not sure if that is by design or not. Is there something I am missing that prevents this type of scenario? If I create the cluster first, it does not allow me to create the replica: I get an error that it cannot find the volume on the source server, and that it must be a CSV on the cluster or added to a role on the cluster. Neither of those is possible without the replica running between the cluster nodes.
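For reference, the documented order of operations is the opposite of replica-first: build the cluster, add the data disk to a role or CSV, then create the partnership. A sketch with hypothetical server, volume, and replication-group names:

```powershell
# Hypothetical names throughout. Validate the topology first:
Test-SRTopology -SourceComputerName SR-SRV01 -SourceVolumeName D: -SourceLogVolumeName E: `
    -DestinationComputerName SR-SRV02 -DestinationVolumeName D: -DestinationLogVolumeName E: `
    -DurationInMinutes 5 -ResultPath C:\Temp

# With the cluster built and the source disk in a role/CSV,
# create the replication partnership between the two nodes:
New-SRPartnership -SourceComputerName SR-SRV01 -SourceRGName rg01 `
    -SourceVolumeName D: -SourceLogVolumeName E: `
    -DestinationComputerName SR-SRV02 -DestinationRGName rg02 `
    -DestinationVolumeName D: -DestinationLogVolumeName E:
```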

Un-clustering Hyper-V nodes into standalone hosts


As the title suggests, I have a pair of Server 2016 hosts which are currently operating as a 2-node failover cluster.

Due to relocation of resources, I'm looking to break up the nodes into two standalone servers. One of the servers is to be decommissioned from Hyper-V use, the other to remain as an operational stand-alone Hyper-V host.

Is there a known process for un-clustering nodes back into stand-alone Hyper-V hosts?

I have a number of sizable VMs (storage-wise) which currently sit under C:\ClusterStorage\. I'm trying to figure out whether I need to create a new LUN on my SAN and migrate the VMs onto it, or whether I can reuse the existing LUN currently used by the CSV volume.

Is there scope for me to power down the VMs, uncluster the node, convert the CSV volume back to a normal SAN-based disk, and power the VMs back up on the now-standalone host?
*EDIT*
I know there's the option to 'Remove from Cluster Shared Volumes', but I'm not sure what the impact of doing this is.
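For what it's worth, a hedged sketch of the usual order (disk and node names are placeholders; the VMs should be powered down first, and this deserves a lab run before production):

```powershell
# Turn the CSV back into a plain clustered disk; the data stays in
# place, and the volume moves from C:\ClusterStorage\VolumeN back to
# a regular mount point / drive letter.
Get-ClusterSharedVolume -Name "Cluster Disk 1" | Remove-ClusterSharedVolume

# Evict the node being decommissioned, then destroy the cluster from
# the remaining node (cleaning up the AD objects it created).
Remove-ClusterNode -Name NODE2
Remove-Cluster -CleanupAD
```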






Update-ClusterFunctionalLevel : You do not have administrative privileges on the cluster.


I have just finished updating our cluster nodes from Windows Server 2016 to 2019.

On 2 of our clusters there were no issues, but on a 3rd I am having the following issue.

When I go to update the functional level, I get this:

Update-ClusterFunctionalLevel : You do not have administrative privileges on the cluster.

I am a domain admin, in the local admins group on each node, and have full cluster access.

I can run any other administrative PowerShell cluster command without issue and can fully administer via the GUI, but running Update-ClusterFunctionalLevel gives the no-privileges error.

If I ask my other admin to run it, he gets the same.

If I create a brand-new AD account, assign it local admin on each node, and grant it full cluster access with Grant-ClusterAccess, that account also gets the error.

I opened an MS support ticket, but after 10 days I have still not received a call back. I even called in again (Sev B = 4-hour response...) and they said yes, I am still in the queue.

Anyway, I am assuming it is likely a bad registry entry, or something messed up on an AD object, but I'm not sure where to look.
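In case it helps someone spot a difference between the working clusters and the broken one, a hedged sketch of the comparisons I'd run (nothing here is a known fix):

```powershell
# Current functional level, and who actually has cluster access
Get-Cluster | Format-List Name, ClusterFunctionalLevel
Get-ClusterAccess

# The upgrade only succeeds when every node is already on 2019;
# verify the per-node versions before retrying
Get-ClusterNode | Format-Table Name, State, MajorVersion, MinorVersion, BuildNumber
```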



S2D StoragePool and Virtual disk size


Hi, All.

I'm testing a failover cluster with S2D enabled on WS2019. I have 3 VMs with two 5 GB HDDs each. I've created the failover cluster, enabled S2D, and created a storage pool.

PS C:\Windows\system32> Get-ClusterS2D

CacheMetadataReserveBytes : 34359738368
CacheModeHDD              : ReadWrite
CacheModeSSD              : WriteOnly
CachePageSizeKBytes       : 16
CacheState                : Disabled
Name                      : s2d-cluster2
ScmUse                    : Cache
State                     : Enabled

PS C:\Windows\system32> Get-StorageSubsystem *cluster* | Get-PhysicalDisk

DeviceId FriendlyName        SerialNumber                     MediaType CanPool OperationalStatus HealthStatus Usage       Size
-------- ------------        ------------                     --------- ------- ----------------- ------------ -----       ----
3002     VMware Virtual disk 6000c29503bcbdb2cf84ec2867ea371b HDD       True    OK                Healthy      Auto-Select 5 GB
1002     VMware Virtual disk 6000c29a55adf0ba2fb870ad3a9dfd32 HDD       True    OK                Healthy      Auto-Select 5 GB
3001     VMware Virtual disk 6000c291b4372413c4f7feaa10fe9beb HDD       True    OK                Healthy      Auto-Select 5 GB
2002     VMware Virtual disk 6000c2993762bd6617b3cd5eef1ff9d0 HDD       True    OK                Healthy      Auto-Select 5 GB
2001     VMware Virtual disk 6000c292a3946d8f1b33bb0f716ecd44 HDD       True    OK                Healthy      Auto-Select 5 GB
1001     VMware Virtual disk 6000c294f797902a11829554da27c6ae HDD       True    OK                Healthy      Auto-Select 5 GB


Questions about space allocation:
1. After creating the pool I see only 26.9 GB of free space. What is AllocatedSize? And where has the other 1.6 GB gone?

PS C:\Windows\system32> Get-StoragePool

FriendlyName OperationalStatus HealthStatus IsPrimordial IsReadOnly    Size AllocatedSize
------------ ----------------- ------------ ------------ ----------    ---- -------------
Primordial   OK                Healthy      True         False        70 GB       29.9 GB
S2D_4        OK                Healthy      False        False      26.9 GB        1.5 GB
Primordial   OK                Healthy      True         False        70 GB       29.9 GB

2. I try to create a 3 GB virtual disk with mirror resiliency. I expected the pool to decrease by 6 GB, but the actual footprint is 8 GB.

PS C:\Windows\system32> New-VirtualDisk -StoragePoolFriendlyName "S2D_4" -FriendlyName disk1 -ResiliencySettingName Mirror -NumberOfDataCopies 2 -ProvisioningType Fixed -Size 3GB

FriendlyName ResiliencySettingName FaultDomainRedundancy OperationalStatus HealthStatus Size FootprintOnPool StorageEfficiency
------------ --------------------- --------------------- ----------------- ------------ ---- --------------- -----------------
disk1        Mirror                1                     OK                Healthy      3 GB            8 GB            37.50%


PS C:\Windows\system32> Get-StoragePool

FriendlyName OperationalStatus HealthStatus IsPrimordial IsReadOnly    Size AllocatedSize
------------ ----------------- ------------ ------------ ----------    ---- -------------
Primordial   OK                Healthy      True         False        70 GB       29.9 GB
S2D_4        OK                Healthy      False        False      26.9 GB        9.5 GB
Primordial   OK                Healthy      True         False        70 GB       29.9 GB


3. Next I try to create a 500 MB virtual disk with mirror resiliency. I expected the pool to decrease by 1000 MB, but the footprint is again 8 GB.

PS C:\Windows\system32> New-VirtualDisk -StoragePoolFriendlyName "S2D_4" -FriendlyName disk2 -ResiliencySettingName Mirror -NumberOfDataCopies 2 -ProvisioningType Fixed -Size 500MB

FriendlyName ResiliencySettingName FaultDomainRedundancy OperationalStatus HealthStatus Size FootprintOnPool StorageEfficiency
------------ --------------------- --------------------- ----------------- ------------ ---- --------------- -----------------
disk2        Mirror                1                     OK                Healthy      3 GB            8 GB            37.50%


PS C:\Windows\system32> Get-StoragePool

FriendlyName OperationalStatus HealthStatus IsPrimordial IsReadOnly    Size AllocatedSize
------------ ----------------- ------------ ------------ ----------    ---- -------------
Primordial   OK                Healthy      True         False        70 GB       29.9 GB
S2D_4        OK                Healthy      False        False      26.9 GB       17.5 GB
Primordial   OK                Healthy      True         False        70 GB       29.9 GB

Another question: if I try to create a virtual disk from the GUI, most of the options are absent and I can only set the name and size of the new virtual disk. For example, I can't set two-way mirror resiliency.

Please help: what did I do wrong?
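On the GUI question: the common workaround is to create the volume from PowerShell, where resiliency can be set explicitly. A sketch using the pool name from the output above (the volume name is a placeholder):

```powershell
# Two-way mirror volume on the S2D pool; New-Volume wraps the
# virtual disk, partition, and format steps in one call.
New-Volume -StoragePoolFriendlyName "S2D_4" -FriendlyName "vol1" `
    -FileSystem CSVFS_ReFS -ResiliencySettingName Mirror `
    -PhysicalDiskRedundancy 1 -Size 3GB
```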

Up But Isolated Cluster Node


I'm running Server 2016, fully patched, in a 5-node cluster: Hyper-V and S2D for a hyper-converged solution running a few hundred VMs. Two days ago one of my nodes decided that it wanted to be cranky. This caused the roles to rearrange on the systems and ended up putting one of my healthy nodes in the "Isolated" state. The root cause for the other node that went out to lunch is still unknown and is being researched separately. However, this healthy node has been stuck in an online-but-isolated state. See screenshot.

I've seen plenty of examples where the node is offline and isolated, typically a network problem (the network looks fine here; I have three separate NICs with separate switches/VLANs/IP space). I can live migrate VMs, and my S2D storage is fully healthy on the cluster. There are no issues using this node, but I don't like the "Isolated" state. I ran the cluster validation test for networking and it returns healthy, with no warnings or errors. The event logs show when the node went isolated, but in the same second there is a follow-up event that it's no longer isolated. These events exist on all nodes in the cluster, so there is no reason why it should be isolated.

I'm sure that if I rebooted this node (or even restarted the cluster service) it would come back online as healthy, but another node in the cluster is having hardware issues, so that's not an option at the moment. Any thoughts would be appreciated on how to remove the isolated state. The end of the PowerShell output shows it all:

State is Up. StatusInformation is Isolated...


Cannot create checkpoint when shared vhdset (.vhds) is used by VM - 'not part of a checkpoint collection' error


We are trying to deploy a 'guest cluster' scenario over Hyper-V with a shared disk set over SOFS. By design, the .vhds format should fully support the backup feature.

All machines (Hyper-V, guest, SOFS) are installed with Windows Server 2016 Datacenter. Two Hyper-V virtual machines are configured to use a shared disk in .vhds format, located on a SOFS cluster formed of two nodes. The SOFS cluster has a share configured for applications, and Hyper-V uses the \\sofs_server\share_name\disk.vhds path to the SOFS remote storage. The guests are configured with the 'File Server' role and the 'Failover Clustering' feature to form a guest cluster. There are two disks configured on each of the guest cluster nodes: 1) a private system disk in .vhdx format (OS), and 2) the shared .vhds disk on SOFS.

While trying to take a checkpoint of a guest machine, I get the following error:

Cannot take checkpoint for 'guest-cluster-node0' because one or more sharable VHDX are attached and this is not part of a checkpoint collection.

Production checkpoints are enabled for VM + 'Create standard checkpoint if it's not possible to create a production checkpoint' option is set. All integration services (including backup) are enabled for VM.

When I delete .vhds disk of shared drive from SCSI controller of VM, checkpoints are created normally (for private OS disk).

It is not clear what a 'checkpoint collection' is, or how to add a shared .vhds disk to this collection. Please advise.

Thanks.

VMs Unable to Live Migrate


I have a Failover Cluster running on two Server 2012 R2 Datacenter nodes hosting our Hyper-V environment.  Recently, we have run into an issue where the VMs won’t migrate to the opposite node unless the VM is rebooted or the Saved State data is deleted.  The VMs are stored either on an SOFS volume on a separate FO Cluster or a CSV volume both nodes are connected to.  The problem occurs to VMs in either storage location.

Testing I’ve done is below.  Note that I only list one direction, but the behavior is the same moving in the opposite direction, as well:

- Live Migration: if a VM is on Node1 and I tell it to Live Migrate to Node2, it begins the process in the console and for a split second shows Node2.  It immediately flips back to Node1.  If the VM has rebooted since the last migration, it will go ahead and migrate to Node2.  It will not migrate back until the VM has been rebooted again.  The Event Log shows IDs 1205 and 1069.  1069 states “Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it.  Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet.”  All resources show Online in Powershell.

- Quick Migration: I initiate a Quick Migration and the VM will move from Node1 to Node2, but will fail to start on Node2.  Checking the Event Log I see Event IDs 1205 and 1069.  1069 states “Cluster resource 'Virtual Machine IDF' of type 'Virtual Machine' in clustered role 'IDF' failed. The error code was '0xc0370027' ('Cannot restore this virtual machine because the saved state data cannot be read. Delete the saved state data and then try to start the virtual machine.').”  After deleting the Saved State Data, the VM will start right up and can be Live or Quick Migrated once.

- Shutdown VM and Quick Migration: I have not had an occasion of this method fail so far.

- Rebooting the nodes has had no discernible effect on the situation.

- I’ve shut down a VM and moved its storage from SOFS to the CSV and still have the same issues as above.  I moved the VHDX, the config file, and saved state data (which was empty while the VM was powered down) to the CSV.
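For the saved-state angle, a hedged sketch of what I'd cross-check on each node (standard Hyper-V cmdlets; nothing here is specific to this cluster):

```powershell
# List VM states, and check whether processor compatibility for
# migration is enabled; mismatched CPU features between hosts are a
# classic cause of "migrates once, then refuses to move back"
Get-VM | Format-Table Name, State, Status
Get-VM | Get-VMProcessor |
    Format-Table VMName, CompatibilityForMigrationEnabled
```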

Items from the FO Cluster Validation Report:
1. The following virtual machines have referenced paths that do not appear accessible to all nodes of the cluster. Ensure all storage paths in use by virtual machines are accessible by all nodes of the cluster.
Virtual Machines Storage Paths That Cannot Be Accessed By All Nodes 
Virtual Machine       Storage Path      Nodes That Cannot Access the Storage Path 
VM1                       \\sofs\vms         Node1

I’m not sure what to make of this error as most of the VMs live on this SOFS share and are running on Nodes1 and 2.  If Node1 really couldn’t access the share, none of the VMs would run on Node1.

2. Validating cluster resource File Share Witness (2) (\\sofs\HVQuorum).
This resource is configured to run in a separate monitor. By default, resources are configured to run in a shared monitor. This setting can be changed manually to keep it from affecting or being affected by other resources. It can also be set automatically by the failover cluster. If a resource fails it will be restarted in a separate monitor to try to reduce the impact on other resources if it fails again. This value can be changed by opening the resource properties and selecting the 'Advanced Policies' tab. There is a check-box 'run this resource in a separate Resource Monitor'.

I checked on this and the check-box is indeed unchecked and both Nodes report the same setting (or lack thereof).

3. Validating cluster resource Virtual Machine VM2.
This resource is configured to run in a separate monitor. By default, resources are configured to run in a shared monitor. This setting can be changed manually to keep it from affecting or being affected by other resources. It can also be set automatically by the failover cluster. If a resource fails it will be restarted in a separate monitor to try to reduce the impact on other resources if it fails again. This value can be changed by opening the resource properties and selecting the 'Advanced Policies' tab. There is a check-box 'run this resource in a separate Resource Monitor'.

Validating cluster resource Virtual Machine VM3.
This resource is configured to run in a separate monitor. By default, resources are configured to run in a shared monitor. This setting can be changed manually to keep it from affecting or being affected by other resources. It can also be set automatically by the failover cluster. If a resource fails it will be restarted in a separate monitor to try to reduce the impact on other resources if it fails again. This value can be changed by opening the resource properties and selecting the 'Advanced Policies' tab. There is a check-box 'run this resource in a separate Resource Monitor'.

I can’t find a place to see this check-box for the VMs.  The properties on the roles don’t contain the ‘Advanced Policies’ tab.

All other portions of the Validation Report are clean.

So far, I haven’t found any answers in several days of Google searching and trying different tactics.  I’m hoping someone here has run into a similar situation and can help steer me in the right direction to get this resolved.  The goal is to be able to Live Migrate freely so I can reboot the Nodes one at a time for Microsoft Updates without having to bring down all the VMs in the process.




Data Migration


Hi,

A server with locally attached hard disks holds about 5 TB of data. There is no SAN attached.

We need to move the data to a different location. We can't ship the server.

We don't want to do it via ROBOCOPY.

Please let me know about other possibilities.

Can't add new node to existing failover cluster


Hi,

I have a problem adding a new node to an existing failover cluster. The existing cluster is a two-node cluster with Node and File Share Majority, which I'm using for a SQL AlwaysOn Availability Group. There are no shared volumes.

When I use the Failover Cluster Manager console I get this error:

The server 'N3.local' could not be added to the cluster.
An error occurred while adding node 'N3.local' to cluster 'Cluster1'.

The parameter is incorrect

I also see this error in Applications and Services Logs/Microsoft/FailoverClustering-Manager/Diagnostic:

Exception occurred in background operation - System.ApplicationException: An error occurred while adding nodes to the cluster 'Cluster1'. ---> System.ApplicationException: An error occurred while adding node 'N3.local' to cluster 'Cluster1'. ---> System.ComponentModel.Win32Exception: The parameter is incorrect
   --- End of inner exception stack trace ---
   at MS.Internal.ServerClusters.ClusApiExceptionFactory.CreateAndThrow(Cluster cluster, Int32 sc, String format, Object arg0, Object arg1)
   at MS.Internal.ServerClusters.Cluster.AddNode(String nodeName, ClusterActionCallback callback)
   at MS.Internal.ServerClusters.Configuration.AddNodeManagement.AddNodes(ActionArgs actionArgs, ActionUpdateHelper updateHelper)
   --- End of inner exception stack trace ---
   at MS.Internal.ServerClusters.Configuration.AddNodeManagement.AddNodes(ActionArgs actionArgs, ActionUpdateHelper updateHelper)
   at MS.Internal.ServerClusters.Configuration.AddNodeManagement.PerformAddNodes(ActionArgs actionArgs)
   at MS.Internal.ServerClusters.Configuration.ConfigurationBase.PerformActionWrapper(BackgroundOperationStatus backgroundOperationStatus, BackgroundOperationArgs parameter)
   at MS.Internal.ServerClusters.BackgroundOperation`2.BackgroundOperationProc(Object state)


I have tried to add the node from PowerShell, with the same error (the parameter is incorrect). I have tried to remove the Failover Clustering feature and add it again, but I'm still getting the same error.
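A hedged diagnostic sketch that sometimes turns a bare Win32 error into something actionable (N3.local and Cluster1 are from the post; the other node names are hypothetical):

```powershell
# Run validation against the existing nodes plus the candidate; the
# report is usually more specific than "The parameter is incorrect"
Test-Cluster -Node N1.local, N2.local, N3.local

# Then retry the add with verbose output; -NoStorage matches an
# AlwaysOn cluster with no shared volumes
Add-ClusterNode -Name N3.local -NoStorage -Verbose
```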

Please advise,

Thank you

High Availability Cluster without Shared Storage


Hi Experts, 

I've been doing some research on how to achieve this goal and what the best practice is.

We are planning to do a high availability cluster for our server running the following services.

  1. Active Directory
  2. DNS and DHCP
  3. File Server

Currently we have one fully operational Windows Server 2016 machine running on a Dell R530. Since we have two Dell servers, we want to configure an HA cluster for downtime protection. We want to set it up so that a main server does all the workload and a backup server replaces the main one, without downtime, if it fails.

But most of the references I found related to our goal involve shared storage. What I want to know is:

  • Why is it recommended to have shared storage?
  • Is it possible to configure HA without shared storage?
  • If possible, what are the risks of not having shared storage?

Thank you in advance experts.


File Server Clustering between Two Domain Controllers


Hi all,

Is it possible to cluster a file server between two Active Directory domain controllers?

As of now our server is still standalone. We will soon add another domain controller to our domain for fault tolerance in case the server fails.

Our current server runs the following services, to which we want to add redundancy; that's why we want to add a new server.

  • Active Directory
  • DNS
  • DHCP
  • File Server

In my research, Active Directory and DNS high availability will be achieved once we add another domain controller to our current domain, and for DHCP there's a built-in redundancy feature (DHCP failover).

But regarding the file server, I haven't found any clear ideas on how to achieve this.

Thanks in advance for your advice.

Error with cluster-aware updating

I currently have one cluster with two nodes. I manually apply updates to the nodes with Cluster-Aware Updating. The last several update runs have gone fine on node 1, but on node 2 I get a "partially failed" error with the description "Node 'xxx' failed to leave maintenance mode". The node is up and running, and all VMs hosted on it are fine as well. Does anyone have any idea why I'm getting this error and how to resolve it? Nothing has changed on the cluster or the nodes that I'm aware of.

Setting up generic service as Active-Active cluster


Hi All,

I am going to configure MySQL, as a Generic Service, in an Active-Active cluster in WSFC.

Ref: http://www.clusterdb.com/mysql/mysql-with-windows-server-2008-r2-failover-clustering

Data on both nodes (master and slave) will be replicated by MySQL Replication, similar to an MS Availability Group.

However, the service will only run Active-Passive: it is Running on the owner node but Stopped on the passive node, and in this case the slave node cannot replicate because the service is stopped.

One idea is to keep the service running on both nodes with a script that monitors the service every 10 seconds and auto-starts it if stopped. Is there any solution to keep the service Running even after a switchover?
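The script idea described above could be sketched like this (the service name "MySQL" is an assumption; note that if the service is a cluster-managed Generic Service resource, the cluster will keep stopping it on the passive node, so a watchdog like this only makes sense for a service the cluster does not own on that node):

```powershell
# Watchdog sketch: restart the (assumed) "MySQL" service whenever it
# is found stopped, polling every 10 seconds.
while ($true) {
    $svc = Get-Service -Name "MySQL" -ErrorAction SilentlyContinue
    if ($svc -and $svc.Status -ne 'Running') {
        Start-Service -Name "MySQL"
    }
    Start-Sleep -Seconds 10
}
```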

Thanks for assist :)


Why is clustering domain controllers a bad approach?


Hi Experts,

I would like to ask for your insights on why it is bad to cluster domain controllers and what the risks are. I have read some forums and pages about this, but I can't seem to get a clear picture.

I understand that DCs don't need to be clustered for a failover environment, but what if there are services in our environment that need to be clustered, such as a file server?

Thanks in advance.

Clustering with Exchange 2007


Hi

We have an Exchange 2007 server running on Windows 2008 R2. A spare server recently became available, and I have been asked to see if it's possible to create a failover cluster with our existing environment.

I appreciate that we have unsupported versions of Exchange and Windows, but my boss wants to keep costs down, so I want to see if it's possible and, if not, what version I need to go to. (He has also ruled out virtualization.)

I think it's not possible, as purchasing licences for old software could prove difficult.

Any feedback would be greatly appreciated.

Cheers

Colin
