Channel: High Availability (Clustering) forum
Viewing all 5654 articles

Windows cluster fileshare witness fails to configure - win 2019


I encounter this error when I try to configure the file share witness on a two-node Windows cluster running on AWS EC2 instances. I made sure the cluster and Windows node computer objects have sufficient permissions on the share folder, but I still get this error and have no clue why it occurs.

An error was encountered while modifying the quorum settings.
Your cluster quorum settings have not been changed.

There was an error configuring the file share witness '\\x.x.x.x\share folder'.

Unable to save property changes for 'File Share Witness'.

Unknown error (0xc000006b)
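For what it's worth, 0xc000006b appears to be an authentication-related NTSTATUS code, and one commonly reported cause with a witness path referenced by IP address is that it forces NTLM rather than Kerberos authentication. As a diagnostic sketch (the FQDN below is a placeholder for the file server's real name), it can help to test the path and configure the witness from PowerShell, which often surfaces a more detailed error than the wizard:

```powershell
# Check the share is reachable from a cluster node first; using an FQDN
# instead of an IP address allows Kerberos authentication.
Test-Path '\\fsw01.contoso.com\witness'

# Configure the witness from the FailoverClusters module (run on a node).
Set-ClusterQuorum -FileShareWitness '\\fsw01.contoso.com\witness'
```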


Problems with a volume in a Windows Server 2012 R2 Hyper-V cluster


Hi!

A problem has started to arise.
We have a Windows Server 2012 R2 Hyper-V cluster: three HP ProLiant DL380 servers with an HP MSA 2050 shared disk subsystem.
An error occurs on the cluster:
Cluster Shared Volume "Volume7" ("FS-FC") has entered a paused state on this node because of "(c00000b5)". All I/O will temporarily be queued until a path to the volume is re-established.
When I look in Failover Cluster Manager, the volume is present in the system again. It seems to be only temporarily unavailable, but it affects the operation of servers and users.

System Center logs an error:
Shared Volume IO is paused warning description
Warning source: Cluster Service
Path: SFH05.sth.local\Cluster service
Warning rule: Shared Volume IO is paused
Created: 03.04.2020 19:50:05
Cluster Shared Volume "Volume7" ("FS-FC") has entered a paused state on this node because of "(c00000b5)". All I/O will temporarily be queued until a path to the volume is re-established.

The affected volume varies; it is not always the same one.

Event Details

Event ID: 5120
Source: Microsoft-Windows-FailoverClustering
Symbolic Name: http://go.microsoft.com/fwlink/?LinkId=83027

Solutions

CSV - Review events related to communication with the volume

There has been an interruption to communication between a cluster node and a volume in Cluster Shared Volumes. This interruption may be short enough that it is not noticeable, or long enough that it interferes with services and applications using the volume. If the interruption persists, review other events in the System or Application event logs for information about communication between the node and the volume.

If you do not currently have Event Viewer open, see "To open Event Viewer and view events related to failover clustering."

To perform the following procedure, you must be a member of the local Administrators group on each clustered server, and the account you use must be a domain account, or you must have been delegated the equivalent authority.

...

It looks like there was a timeout caused by the DNS server being unavailable.
Does anyone have any ideas about this?
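As a side note, c00000b5 is the NTSTATUS code STATUS_IO_TIMEOUT, so the storage path between the node and the MSA (FC/SAS links, MPIO, controller failover) is usually the first place to look. A small diagnostic sketch, assuming the FailoverClusters module on a cluster node:

```powershell
# How each node currently reaches the CSV (Direct vs redirected access).
Get-ClusterSharedVolumeState -Name "Volume7"

# Recent 5120 events from the System log, to correlate pause times with
# storage-path or network events around them.
Get-WinEvent -FilterHashtable @{
    LogName      = 'System'
    ProviderName = 'Microsoft-Windows-FailoverClustering'
    Id           = 5120
} -MaxEvents 20 | Format-Table TimeCreated, Message -AutoSize
```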

VM is not listed to add to the failover clustering manager


I'm facing a particular issue:

I have an S2D solution running Hyper-V on Windows Server 2019. There are 4 hosts with 20 VMs in total.

I configured roles for 19 VMs without a problem, but the 20th VM, called "SRVAULV01", is not listed when configuring a VM role, even though it shows up in Hyper-V Manager.

It is on the same CSV as the other VMs.
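One thing worth trying before digging deeper: add the role explicitly from PowerShell, which often returns a concrete error where the Failover Cluster Manager list just silently omits the VM. A sketch (the VM name is taken from the post; run on the S2D host that owns it):

```powershell
# Attempt to cluster the VM directly by name.
Add-ClusterVirtualMachineRole -VMName "SRVAULV01"

# If it fails, compare where its files live against a VM that clustered fine;
# a configuration path outside the CSV is a common culprit.
Get-VM -Name "SRVAULV01" | Select-Object Name, Path, ConfigurationLocation
```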

S2D setup with cache drive


I've got 2 servers in a cluster running 2019 and I am trying to set up Storage Spaces Direct. Each server has 4 SSD drives for the S2D pool as well as an NVMe drive.

When I run Get-PhysicalDisk it shows the disks and everything appears right. However, I notice the media type is SSD and not NVMe, which I don't see as an option. Does that matter?

Number FriendlyName               SerialNumber                     MediaType   CanPool OperationalStatus HealthStatus Usage
------ ------------               ------------                     ---------   ------- ----------------- ------------ -----
5      SAMSUNG MZVLB256HBHQ-000L7 0025_3881_01C9_1EE0.             SSD         True    OK                Healthy      A
3      ATA INTEL SSDSC2BX80       BTHC609000ZY800NGN               SSD         True    OK                Healthy      A
4      DELL PERC H730 Mini        00f6a497031afb152600894b80a06d86 Unspecified False   OK                Healthy      A
1      ATA INTEL SSDSC2BX80       BTHC6090012G800NGN               SSD         True    OK                Healthy      A
0      ATA INTEL SSDSC2BX80       BTHC609000Z9800NGN               SSD         True    OK                Healthy      A
2      ATA INTEL SSDSC2BX80       BTHC609000YY800NGN               SSD         True    OK                Healthy      A

When I go to enable Storage Spaces Direct it works, but at the end it says there was no cache drive, and it doesn't add the cache drives.

Where have I gone wrong? These NVMe drives are in a Dell R730 with the x4x4x4x4 PCIe adapter card (Dell-branded).

Thank you
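A possible angle, offered as a sketch rather than a definitive fix: S2D picks cache devices automatically only when it can distinguish them from the capacity devices, and behind some adapters NVMe drives enumerate with an unexpected bus type. Also note that Microsoft's S2D guidance calls for a minimum of two cache drives per server, so a single NVMe per node may be rejected regardless. The model string below is taken from the Get-PhysicalDisk output above and may need to match the disk's Model property exactly:

```powershell
# See how the NVMe devices actually enumerate; cache selection keys off this.
Get-PhysicalDisk | Select-Object FriendlyName, BusType, MediaType, Size

# If they look fine, name them as cache devices explicitly by model.
Enable-ClusterStorageSpacesDirect -CacheDeviceModel "MZVLB256HBHQ-000L7"
```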

Domain rename and Hyper-V Cluster


Hello,

I'm planning an Active Directory domain rename. We also have a Hyper-V 2016 cluster. I've tried to research this operation, but some sources say it's supported while others say it's not and that I would need to rebuild the cluster.

Does anyone have experience with this? What is the correct path?

Thanks for your help


Do I need to use RAID 0 or 5 when I'm using an iSCSI virtual disk VSAN?


Hello,

I know that with VMware vSAN I don't need hardware RAID to share drives: the hard drives in the servers can be independent, with no RAID on the hardware SCSI controller, and vSAN takes care of the redundancy.

But now I'm setting up a failover cluster using 3 Windows Server 2019 Standard machines for file server and Hyper-V high availability.

My question is: do I need hardware RAID 0 or 5 on the SCSI controller and then set up the virtual disk using an iSCSI target, or will the iSCSI target take care of the redundancy?

Current scenario:

I have 3 servers with 4 x 2 TB SATA drives each. I'm currently using RAID 5, which gives me only 5.5 TB instead of 8 TB. On top of that I created an iSCSI target and a virtual hard drive; since I have 2 nodes, the virtual hard drive size is 11 TB. I'm wondering whether, if I use only iSCSI and no hardware RAID, I can use the full drive capacity and still have redundancy.

I hope my question is clear   :)  lol 

Thanks 

 


Cluster Name Object and Group policy


Hi All,

Are group policies applied to (virtual) cluster name objects?

I think not, but I can't find any evidence on Microsoft's sites.

Thanks,

GY

the cluster to which you are attempting to connect is not a version of the cluster supported on Windows 10

I am using the RSAT tools on Windows 10 2004 and am trying to manage a Server 2016 cluster (functional level 9), but I get the "the cluster to which you are attempting to connect is not a version of the cluster supported" message. Cluster Manager should be N-1 compatible, so a 2016 cluster should be supported. How can I fix this?
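As a quick sanity check (a sketch; "CLUS01" stands in for one of the cluster nodes), it may be worth confirming what functional level the cluster itself reports, since a version-mismatched FailoverClusters PowerShell module on the workstation can trigger the same message:

```powershell
# Ask a node directly what the cluster reports, bypassing the local RSAT bits.
Invoke-Command -ComputerName CLUS01 -ScriptBlock {
    Get-Cluster | Select-Object Name, ClusterFunctionalLevel
}
```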

Do failover clusters require Storport-controlled disks?

I have an MPIO-like product that virtualizes disks on a virtual (non-Storport) bus in Windows. Within Failover Cluster Manager, these disks cannot be added to a cluster. If the disks are instead attached to an FC HBA (with a Storport miniport driver), they can be added to a cluster. Do failover clusters require Storport-controlled disks?

New Architecture with Windows Failover Cluster


Hello
  I need to configure a new architecture on Windows Server 2019 with Windows 10 endpoints. This configuration will host domain controllers (AD DS & DNS), DHCP, KMS, certification services, a log server, SCCM and a mail server (Exchange). The mail section will be configured separately at the end, with a DAG.

  I'm a beginner with Failover Clustering, but I chose this solution in order to implement HA. I have tested the configuration with 3 nodes in the File Server role. It works wonderfully.

  I need some expert advice on how to configure the cluster's roles and how to implement the services.
  So ... can someone please help me with answers to the following questions:

  - If I have over 10,000 endpoints, are 3 cluster nodes enough? And a single iSCSI Target server?

  - Which is the best configuration for the DHCP server?
    I'm thinking about 3 possible solutions:
    * 2 DHCP servers with embedded DHCP failover installed on 2 VMs, using the DHCP role from the failover cluster;
    * 2 DHCP servers installed directly as a DHCP role in the failover cluster;
    * 2 DHCP servers in an embedded DHCP failover configuration (outside the cluster).

  - Should the iSCSI target server be a domain controller or just a member server?

  - Taking into account that the WSUS server generates a lot of traffic, should it be configured outside the cluster? Same question for SCCM and the log server (event forwarding/subscriptions).

  - I have over 50 Samba file servers (Red Hat) on different subnets which shall be migrated to Windows file servers. Given that the SCSI disks are local (not a UNC path), this results in a limited number of file servers. Which do you think is the best solution to this problem?

Best regards

VMs failed while draining roles


Hi Team,

We have a 10-node cluster environment. When I drain the roles from one node for patching maintenance, some of the VMs, around 10 to 15 of them from different nodes, end up in a Failed status in the Roles section.

I tried to power those VMs on from Hyper-V, but that failed as well.

And the entire cluster view keeps refreshing every 5 minutes.

Our onsite folks solved the problem after about 10 minutes, but when I ask how, they won't share the information with me.

I assume it is a known issue; I have researched many websites, but I am still curious how it could be solved within 10 minutes.

Cluster Event IDs:

1205,1254,1069,21502

Thanks and Regards 

Mohanbabu M

Windows Server 2016 Failover Cluster Fails Active Directory Validation


Hello All.

I have an environment with three domain controllers, all within the same site, replicating between each other. We set up a failover cluster on two Windows 2016 nodes and noticed that it failed the Active Directory Configuration validation tests. The nodes failed this one test 100% of the time. After digging in the Event Viewer, we noticed that the error messages for cluster creation included the message "A more secure authentication method is required." We require the group policy setting "Domain controller: LDAP server signing requirements" to be set to "Require signing", but out of curiosity we set it to "None" and, lo and behold, the nodes started connecting to Active Directory.

But it doesn't end there. Although the Active Directory validation tests started succeeding, they only succeeded sometimes. In other words, sometimes Node 1 would succeed and Node 2 would fail, sometimes Node 1 would fail and Node 2 would succeed, sometimes both would fail, and sometimes both would succeed. Through a long mess of troubleshooting, we found that if we removed one of the domain controllers from the DNS server list in the nodes' NIC IPv4 properties, the validation tests would succeed 100% of the time. This points to a DNS issue, I'm guessing, but I'm not sure.

When querying the domain suffix in nslookup, all three domain controllers return the correct IPs.  All three domain controllers respond to port 636 and offer the correct certificate.

So my question is two-fold:  what is preventing the nodes from connecting to Active Directory while LDAP server signing is required, and what manner of DNS issue prevents them from connecting if it is not?
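To narrow down the DNS side, a small sketch that queries each configured DNS server separately can show whether one of the three DCs returns different answers ("contoso.com" is a placeholder for the real domain suffix):

```powershell
# Query every configured DNS server individually rather than letting the
# resolver pick one, so per-server differences become visible.
$dnsServers = (Get-DnsClientServerAddress -AddressFamily IPv4).ServerAddresses |
    Select-Object -Unique
foreach ($server in $dnsServers) {
    Resolve-DnsName -Name 'contoso.com' -Type A -Server $server |
        Select-Object @{ n = 'DnsServer'; e = { $server } }, Name, IPAddress
}
```

The same loop with -Type SRV against _ldap._tcp.dc._msdcs.contoso.com checks the DC locator records each server hands out.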

Multicast or Unicast


Hi,

I am configuring NLB on two Windows 2016 virtual servers to host SharePoint. Which cluster operation mode is recommended? Do I have to enable MAC spoofing if I use unicast only?

Thanks.

Dup pings on one node


Hi,

We have a failover cluster on Windows Server 2019 with 2 nodes.

When I ping a VM from a Linux machine, I get duplicated ping replies if the VM is on the first node. All VMs on that node exhibit the same behaviour. If I move the VM to the second node, there are no duplicate pings anymore.

From what I've read, this situation is normal on an NLB cluster and can be avoided with "FilterIcmp=0". But this is not an NLB cluster, therefore I cannot find a FilterIcmp property.

HW environments are the same, driver versions are the same.

Do you have any recommendation on how I can track down the root cause of this issue?



Windows 2008 R2 Cluster refuses to start on one node.


Hi,

I'm having an interesting error on a SQL cluster.

The system is 2x Windows 2008 R2 SP1 & SQL 2008 R2 in an active/passive cluster configuration.

After a crash that happened for unknown reasons, one of the nodes refuses to start. The event log gives me the following error:

Event ID: 1574 - The failover cluster database could not be unloaded. If restarting the cluster service does not fix the problem, please restart the machine.

The KB articles about it hint that I should restart the server, and that has been done. I even took the whole cluster down and started the nodes in different orders, without any success.

Furthermore, when I look at the HKLM keys in the registry, I cannot find the "Cluster" entry. I've been pondering the idea of exporting the "HKLM\Cluster" key from the living cluster node.

The cluster log clearly indicates that something is fishy.

---------------

00000ad0.0000093c::2011/09/07-13:04:58.867 INFO  [NETFT] Disabling IP autoconfiguration on the NetFT adapter.
00000ad0.0000093c::2011/09/07-13:04:58.867 INFO  [NETFT] Disabling DHCP on the NetFT adapter.
00000ad0.0000093c::2011/09/07-13:04:58.867 DBG   [NETFT] Disabling DHCP on NetFT interface name ethernet_11.
00000ad0.0000093c::2011/09/07-13:04:58.867 INFO  [CS] Starting DM
00000ad0.0000093c::2011/09/07-13:04:58.867 INFO  [DM] Node 1: Reading quorum config
00000ad0.0000093c::2011/09/07-13:04:58.867 DBG   [DM] Unloading Hive, Key \Registry\Machine\Cluster.restored, discardCurrentChanges true
00000ad0.000005b4::2011/09/07-13:04:58.867 INFO  [CS] Disabling connection security.
00000ad0.00000960::2011/09/07-13:04:58.867 DBG   [NETFTAPI] received NsiAddInstance  for 169.254.1.91
00000ad0.0000093c::2011/09/07-13:04:58.867 INFO  [DM] Key \Registry\Machine\Cluster.restored does not appear to be loaded (status STATUS_OBJECT_NAME_NOT_FOUND(c0000034))
00000ad0.0000093c::2011/09/07-13:04:58.867 WARN  [DM] Node 1: Failed to unload restored hive from the registry with error STATUS_INVALID_PARAMETER(c000000d)
00000ad0.0000093c::2011/09/07-13:04:58.867 INFO  [DM] Node 1: loading local hive
00000ad0.0000093c::2011/09/07-13:04:58.867 ERR   [DM] Node 1: failed to unload cluster hive, error 2.
00000ad0.0000093c::2011/09/07-13:04:58.867 ERR   Hive unload failed (status = 2)
00000ad0.0000093c::2011/09/07-13:04:58.882 DBG   Hive unload failed: set netft heartbeat interval to 900 seconds
00000ad0.0000093c::2011/09/07-13:04:58.882 ERR   Hive unload failed (status = 2), executing OnStop
00000ad0.0000093c::2011/09/07-13:04:58.882 INFO  [DM]: Shutting down, so unloading the cluster database.
00000ad0.0000093c::2011/09/07-13:04:58.882 INFO  [DM] Shutting down, so unloading the cluster database (waitForLock: false).
00000ad0.0000093c::2011/09/07-13:04:58.882 WARN  [DM] Trying to Unload when no Hive is loaded, ignored
00000ad0.0000093c::2011/09/07-13:04:58.882 ERR   FatalError is Calling Exit Process.


MCITP, MCP, VCP, AASE & Insane!
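If the healthy node's database turns out to be the only good copy, one last-resort approach is to replace the broken node's cluster database file with the healthy node's copy while the cluster service is stopped. Strictly a sketch, not an official repair path: "GOODNODE" is a placeholder, paths are the 2008 R2 defaults, and backups first are essential.

```powershell
# LAST RESORT: reuse the healthy node's cluster database on the broken node.
Stop-Service ClusSvc
Copy-Item C:\Windows\Cluster\CLUSDB C:\Windows\Cluster\CLUSDB.bak
Copy-Item '\\GOODNODE\c$\Windows\Cluster\CLUSDB' C:\Windows\Cluster\CLUSDB -Force
Start-Service ClusSvc
```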

S2D IO TIMEOUT when rebooting node


I am building a 6-node cluster: 12 x 6 TB drives, 2 x 4 TB Intel P4600 PCIe NVMe drives, Xeon Platinum 8168 / 768 GB RAM, LSI 9008 HBA.

The cluster passes all tests, the switches are properly configured, and the cluster works well, exceeding 1.1 million IOPS with VMFleet. However, at the current patch level (as of April 18, 2018) I am experiencing the following scenario:

When no storage job is running, all vdisks are listed as healthy, and I pause and drain a node, all is well, until the server actually reboots or goes offline. At that point a repair job is initiated and IO suffers badly, and can even stop altogether, causing vdisks to go into a paused state due to IO timeout (listed as the reason in cluster events). Exacerbating this issue, when the paused node reboots and rejoins, it causes the repair job to suspend, stop, then restart (it seems; tracking this is hard, as all storage commands become unresponsive while the node is joining). At this point IO is guaranteed to stop on all vdisks at some point, long enough to cause problems, including VM reboots.

The cluster was initially formed using VMM 2016. I have tried manually creating the vdisks, using single resiliency (3-way mirror) and multi-tier resiliency, with the same effect. This behavior was not observed when I did my POC testing last year. It's frankly a deal breaker and unusable: if I cannot reboot a single node without entirely stopping my workload, I cannot deploy. I'm hoping someone has some info. I'm going to reinstall with Server 2016 RTM media, keep it unpatched, and see if the problem remains. However, it would be desirable to at least start the cluster fully patched. Any help appreciated. Thanks.
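One workflow worth ruling out (a sketch, assuming the standard FailoverClusters and Storage modules, with a placeholder node name): make sure the node is fully drained and no storage jobs are in flight before the actual reboot, so the repair triggered by the reboot is the only one running:

```powershell
# Drain the node and wait for all roles to move off it.
Suspend-ClusterNode -Name "Node1" -Drain -Wait

# Hold the reboot until any rebuild/repair jobs have finished.
while (Get-StorageJob | Where-Object { $_.JobState -eq 'Running' }) {
    Get-StorageJob | Format-Table Name, JobState, PercentComplete
    Start-Sleep -Seconds 30
}
Restart-Computer   # now reboot the drained node
```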


Limit of amount of ClusterGroups

Hi,

We have an issue with Windows Failover Cluster.
OS: Windows Server 2019 Standard (EN).
It is a test cluster, it was created on VMs on Windows Server 2019 Standard Hyper-V Cluster. All VMs are on the same node.
AD: 2 vCPUs, 4GB of RAM.
Nodes: 4 vCPUs, 16GB of RAM.

For test purposes we are creating cluster groups with two resources each: an IP Address and a Network Name.
It is possible to create about 423 groups. After that, the cluster can't bring the network name online for the next group. The only error we were able to find in the cluster log is a network error. However, the cluster doesn't generate much traffic (~10 Mbit/s), and we can't understand the root of the issue. According to monitoring everything is fine (CPU, RAM, disk, network); it looks like there are more than enough resources.

Can somebody tell us what the limitation is in this case?
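When a Network Name resource fails to come online, the cluster log usually says more than the event log, and since each Network Name also registers a computer object in AD and records in DNS, limits or permissions on that side are worth checking too. A diagnostic sketch (the destination path and resource name are placeholders):

```powershell
# Dump the last 15 minutes of the cluster log from every node.
Get-ClusterLog -Destination C:\Temp -TimeSpan 15

# Inspect the failing network name resource's private properties.
Get-ClusterResource -Name "NN424" | Get-ClusterParameter
```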

Permission only to do the cluster fail over activity


Hi Team,

I have a requirement to provide an AD account access only to failover cluster management on a few servers.
The account shouldn't have any permission to restart/shut down the servers, or any other high privileges on them.
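Failover clustering keeps its own access list separate from local group membership, so this may be achievable without granting the account any rights on the node OS itself. A sketch (the account name is a placeholder; whether -Full exposes more than the requirement allows should be tested):

```powershell
# Grant cluster-management rights via the cluster's own security descriptor;
# -ReadOnly instead of -Full gives view-only access.
Grant-ClusterAccess -User CONTOSO\clusterops -Full

# Review who currently has cluster access.
Get-ClusterAccess
```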

 

Network Load Balancing


Hi All,

I hope someone can help and offer some advice. I am studying for my MCSA. I have arrived at Chapter 5, Configuring High Availability; as part of the exercise I have to create two nodes and one client machine. I have installed and set up the necessary NLB software on both nodes, and set up the cluster. I am now able to test by logging on from the client machine and then accessing the IIS web page with no problems on node 1. I switch off node 1 and am able to access node 2 without any interruption or problems.

All good, but I am unsure how this all works. When setting up NLB, are the nodes the same? I.e., are the web pages identical, and how do they get updated if there are any changes?

Any help and advice would be greatly appreciated.

Regards. 
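On the question of identical pages: NLB only distributes incoming connections; it does not replicate content. Each node serves its own copy of the site, so updates must be deployed to every node, or the content kept in sync (for example with DFS Replication or a scheduled copy). A minimal sketch of the copy approach, with placeholder node names:

```powershell
# Mirror the web root from node 1 to node 2 (run from a host that can reach
# both admin shares; /MIR makes node 2 an exact copy of node 1).
robocopy \\NLB-NODE1\c$\inetpub\wwwroot \\NLB-NODE2\c$\inetpub\wwwroot /MIR /R:2 /W:5
```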

When adding a new node to an existing failover cluster it becomes a possible owner and takes ownership


Hello.

Apologies if this is not the correct place for this.

I have a WSFC that has multiple nodes which have SQL Server 2016 SP2 installed on them. We are running Windows Server 2016 Datacenter.

When I add a new node to the cluster, it automatically becomes a "possible owner" and the cluster immediately tries to make it the "Current Host Server". We do not want it to become a possible owner or the "Current Host Server".

Is there a way to add a node without this happening?

Thank you.
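Two mitigations worth considering, sketched with placeholder node and group names (note that SQL Server FCI setup normally manages possible owners itself, and removing a node from the list also blocks legitimate failover to it):

```powershell
# Keep the new node from hosting anything while it is being prepared.
Suspend-ClusterNode -Name "NewNode"

# Restrict possible owners on the SQL group's resources to the existing nodes.
Get-ClusterGroup -Name "SQL Server (MSSQLSERVER)" |
    Get-ClusterResource |
    Set-ClusterOwnerNode -Owners "Node1","Node2"
```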


