Channel: High Availability (Clustering) forum

Windows 2016 Hyper-V Cluster IP Addressing

I am trying to understand how and why this is working the way it is. We have a pair of Windows Server 2016 Standard servers running a Hyper-V cluster for Remote Desktop Services. This system has been running fine for almost a year and a half without any real problems. The servers are identical in hardware and software. They are updated at the same time to the same levels of update. They are both also running Starwinds iSCSI software.

Server 1
IP 1 - 10.0.0.70  (Primary/Native IP)
IP 2 - 10.0.0.200 (Cluster IP, assigned at cluster setup)
IP 3 - 10.0.254.70 (synchronize address for Starwinds)

Server 2
IP 1 - 10.0.0.71  (Primary/Native IP)
IP 2 - 10.0.0.200 (Cluster IP, assigned at cluster setup)
IP 3 - 10.0.254.71 (synchronize address for Starwinds)

Before I updated these servers last weekend, the .200 address was visible only via the Cluster Manager, ipconfig, or the DNS name 'ServerCluster'. I could not find anywhere to set it, though I know I configured it during the original cluster setup. The address did not appear in the Network Management tool: none of the adapters showed it, so it could not be changed there. Furthermore, there were no errors related to both servers having this IP address.

After applying several updates last weekend, I noticed that the .200 address is now shown on one adapter in the Network Manager, as well as in the places where it was visible before. I could even delete or change the address if I wanted to. What's strange is that the Cluster Manager is now logging a ton of 'duplicate IP address' errors. Yet each server is communicating with the other at the .200 address using the partner server's MAC address. So Server1 is sending cluster data to 10.0.0.200 at Server2's MAC address for .200.

Can anyone shed some light on this? Are there changes or other updates that I need to apply to rectify the addressing problem?


Cluster-Aware Updating giving FAILED HRESULT 0x80244022 on multiple clusters

I currently have 3 clusters (the total will be around 20 when done) running Hyper-V at our remote locations. All three are logging the following event:

Scan failed with HRESULT 0x80244022

We have a WSUS server that pushes updates across the domain and I am able to ping it from all of the various hosts.  I am still learning about MS Failover Clustering so if there is additional information needed please let me know.

I would include a screenshot but my account is still being verified.
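
For anyone reading the code in this post: an HRESULT can be split into its bit fields to see which component raised it. The sketch below is only an illustration in Python; `decode_hresult` and the two lookup tables are hypothetical helpers, not part of any Microsoft tool. The facility numbers and the name for 0x80244022 are taken from the published Windows error lists. That code corresponds to an HTTP 503 from the update server, so a successful ping alone may not show whether the WSUS web service is actually answering.

```python
# Facility numbers from winerror.h that matter for this example.
FACILITIES = {7: "FACILITY_WIN32", 36: "FACILITY_WINDOWSUPDATE"}

# Only the code from this post; extend the table as needed (assumption:
# you look the names up in the Windows Update error code list yourself).
KNOWN_CODES = {0x80244022: "WU_E_PT_HTTP_STATUS_SERVICE_UNAVAIL (HTTP 503)"}


def decode_hresult(hresult):
    """Split a 32-bit HRESULT into severity, facility and code."""
    value = hresult & 0xFFFFFFFF            # accept signed or unsigned input
    facility = (value >> 16) & 0x7FF        # bits 16-26
    return {
        "failure": bool(value >> 31),       # bit 31 set means failure
        "facility": FACILITIES.get(facility, str(facility)),
        "code": value & 0xFFFF,             # bits 0-15
        "name": KNOWN_CODES.get(value, "unknown"),
    }


print(decode_hresult(0x80244022))
```

If the facility comes back as Windows Update, the scan is failing inside the update client's conversation with the server rather than in the cluster itself, which narrows where to look next.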

File Cluster 2016 - New-SmbShare: The request is not supported.

Hi,

I have built a fresh Windows Server 2016 file server failover cluster. I have been trying to create file shares through PowerShell, and I get this error:

New-SmbShare : The request is not supported.
At line:1 char:1
+ New-SmbShare -Name "T1" -Path "e:\Test2" -FullAccess "example\Testuser1"
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo       : InvalidOperation: (MSFT_SMBShare:ROOT//Microsoft/Windows/SMB/MSFT_SMBShare) [New-SmbShare
   ], CimException
    + FullyQualifiedErrorId : Windows System Error 50,New-SmbShare

Please help. 

CLUSTER COMMUNICATION NETWORK FIGURE

Hello Microsoft,

I hope cluster configuration is improved in the next stable Windows Server version, so that setup is easier and clearer.

I spent a week trying to figure out the cluster communication network between two node servers. Here is my case study; I hope to get answers to my questions.

I tried different scenarios with the two nodes below, both on real physical hardware and in virtualization, to manage storage. Both nodes are PowerEdge R430 servers.

NODE#1

SERVER NAME: CLUSTER-A

OS: WINDOWS SERVER 2019 DATACENTER

NIC1: 192.168.2.1 as Management (Mgmt)

NIC2: 172.16.1.1 as (SMB01)

NIC3: 172.16.2.1 as (SMB02)

NIC4: 192.168.1.1 as Internet

NODE#2

SERVER NAME: CLUSTER-B

OS: WINDOWS SERVER 2019 DATACENTER

NIC1: 192.168.2.200 as Management (Mgmt)

NIC2: 172.16.1.2 as (SMB01)

NIC3: 172.16.2.2 as (SMB02)

NIC4: 192.168.1.200 as Internet

_________

Scenario#1

_________

1- Connect both nodes (CLUSTER-A, CLUSTER-B) through a real physical switch, between CLUSTER-A NIC1 and CLUSTER-B NIC1.

2- Install Hyper-V and the other cluster feature tools on both nodes.

3- Create virtual switch (Mgmt) and virtual switch (Internet) on both CLUSTER-A and CLUSTER-B.

4- Create a virtual primary domain server on CLUSTER-A and a virtual secondary domain server on CLUSTER-B.

5- Connect both virtual domain servers through virtual switch (Mgmt) and virtual switch (Internet).

6- Restart both and make sure they are working fine.

7- Create a virtual GATEWAY server on CLUSTER-A and another on CLUSTER-B to manage the nodes in the future, connected to virtual (Mgmt) and virtual (Internet).

8- Now directly connect CLUSTER-A NIC2 to CLUSTER-B NIC2 with a Cat6 cable, and CLUSTER-A NIC3 to CLUSTER-B NIC3 with another Cat6 cable.

9- Delete virtual (Mgmt) on CLUSTER-A, then manage the nodes from the GATEWAY on CLUSTER-B to create the cluster network configuration and a virtual SET switch on CLUSTER-A with PowerShell:

Invoke-Command -ComputerName "CLUSTER-A" -ScriptBlock { New-VMSwitch -Name SETSwitch -EnableEmbeddedTeaming $true -EnableIov $true -NetAdapterName NIC1,NIC2,NIC3 }

RESULT >>

NIC1 - (Mgmt)

NIC2 - (SMB01)

NIC3 - (SMB02)

10- Make sure all cluster configuration is complete, and check that the new virtual SETSwitch holds NIC1 as the management network.

11- Now connect the GATEWAY and the primary domain server to SETSwitch on CLUSTER-A.

12- Repeat the steps above on CLUSTER-B.

THE RESULT>>>

Everything works with no errors when I run this in PowerShell:

    Test-Cluster -Node "CLUSTER-A,CLUSTER-B" -Include "Storage Spaces Direct","Inventory","Network","System Configuration","Hyper-V Configuration"

### BUT ###

When I restart either node, I can't connect to it from the LAN, or even ping it, until I unplug both cables from NIC2 and NIC3; then the system works.

I checked from inside by logging on to either GATEWAY server, and there I can ping and everything works, even before I unplug the cables.

If I connect NIC2 and NIC3 of both nodes to a physical switch and run the cluster test, I get this error:

the same cluster network, yet address  is not reachable from 172.16.1.1 using UDP on port 3343.

MY QUESTION HERE >> Does my domain need to be outside the nodes, or must I use cables other than Cat6 to connect the two nodes directly? Or any other suggestions?

_________

Scenario#2

_________

1- Create a 4-node virtual environment in Hyper-V for CLUSTER-A, CLUSTER-B, DOMAIN and GATEWAY.

2- Create 3 virtual switch adapters: (Mgmt), (SMB01), (SMB02).

3- Connect all of them through the virtual adapter (Mgmt).

4- Add (SMB01) and (SMB02) to CLUSTER-A and CLUSTER-B.

5- Make sure (SMB01) and (SMB02) have MAC address spoofing enabled in the adapter features.

6- From the GATEWAY server, install the cluster feature and network configuration on both CLUSTER-A and CLUSTER-B, and test the cluster.

RESULT>>>

Fine, no errors.

BUT

When I disable MAC address spoofing in the adapter features and test the cluster on both nodes, I get:

the same cluster network, yet address  is not reachable from 172.16.1.1 using UDP on port 3343.

MY QUESTION

Is it possible to enable MAC address spoofing on a real physical adapter?

Or do you have any other ideas about this?



Windows Cluster 2012 R2 - Impersonation Level Error

Hello,

 

We are receiving an error saying "An unexpected error is keeping you from renaming the folder. If you continue to receive this error, you can use the error code to search for help with this problem. Error 0X80070542: Either a required impersonation level was not provided, or the provided impersonation level is invalid." The error occurs when attempting to rename a folder or file in the Volume folder. However, it only occurs on one of the nodes, depending on which node is the owner.

 

We have seen this with 2 of our customers. One of them upgraded the OS from 2012 R2 to 2016, after which the error did not occur again.
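
As a side note on the error code itself: 0x80070542 is a Win32 error wrapped in an HRESULT, and the low 16 bits give the underlying Win32 error number. Below is a small Python sketch of that unwrapping; `win32_from_hresult`, `describe` and the name table are hypothetical helpers for illustration, and the table holds only a few entries copied from the public Win32 error list.

```python
FACILITY_WIN32 = 7

# A few Win32 error names (from the public system error code list).
WIN32_NAMES = {
    5: "ERROR_ACCESS_DENIED",
    50: "ERROR_NOT_SUPPORTED",
    1346: "ERROR_BAD_IMPERSONATION_LEVEL",
}


def win32_from_hresult(hresult):
    """Return the wrapped Win32 error number, or None if not FACILITY_WIN32."""
    value = hresult & 0xFFFFFFFF
    facility = (value >> 16) & 0x7FF
    if facility != FACILITY_WIN32:
        return None
    return value & 0xFFFF


def describe(hresult):
    code = win32_from_hresult(hresult)
    if code is None:
        return "not a wrapped Win32 error"
    return f"Win32 error {code} ({WIN32_NAMES.get(code, 'unknown name')})"


print(describe(0x80070542))
```

Knowing the plain Win32 number (1346 here) tends to make searching for related reports easier than searching for the full HRESULT.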

netft.sys is causing bugcheck blue screens on a Windows Server 2008 R2 Datacenter server

Hi

We have a server that keeps getting rebooted by a bugcheck error in netft.sys. Please let me know if there is a fix for this issue. I am not sure what is causing it.

The server runs Windows Server 2008 R2 Datacenter and is part of a Hyper-V cluster.

Thanks in advance

Trying to add disk to Failover Clustering

The failover cluster was set up by a previous staff member. Since not all of the NAS was in use, we are trying to add another LUN to the cluster. The disk shows up in Disk Management, and we are then trying to create a simple partition and mount it under the Cluster Storage folder on the server. When we browse for the drive path to mount, we receive "Access is denied". Are we doing something wrong?

Disk Management - bring disk online

Mount in the following NTFS folder:

C:\clusterstorage make new folder "Volume 3"

Access is denied

Boy do I have a good one.

Over the past two weeks I have been having a lot of trouble with my cluster in my production and DR sites. To make a long story short, I have gotten it figured out, but I ran into something I have never seen before, and I want to see if any of you have.

The cluster, Cluster-Prod01, was down and would not come up. To see where the resource was assigned, I went into PowerShell, ran Get-ClusterResource, and found that the Cluster Name resource was assigned to a VM in the cluster. This VM was a casualty of the issues of the last two weeks, and I thought it was DOA. When I looked at the VM's resources in Cluster Manager, sure enough, there was the Cluster Name in an offline state. I debated whether to remove it or bring it online. Not knowing what removing it would do to the rest of the cluster, I opted to bring it online. Everything seems to be working, but I'm not sure for how long.

Has anyone else come across this, and if so, how did you fix it?

Thanks in advance for any help.

John    


Failover Cluster Voting problem

May I know why the down node still has a vote?

I shut down the nodes one by one, 04 -> 03 -> 02, and there is a file share witness (FSW).
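
To make the vote arithmetic in this scenario concrete, here is a rough Python sketch. It assumes a four-node cluster plus one witness, as the shutdown order suggests, and it models dynamic quorum in a deliberately simplified way (a cleanly shut-down node loses its vote, and the witness vote is counted only when the number of voting nodes is even). All function names are hypothetical. One possible explanation worth checking is that the vote still shown for a down node is its assigned vote (NodeWeight) rather than its current vote (DynamicWeight), which are separate properties on the cluster node object.

```python
def majority(total_votes):
    """Smallest number of votes that is more than half of the total."""
    return total_votes // 2 + 1


def static_quorum_ok(nodes_up, nodes_total, witness_up=True):
    """Static model: every configured node keeps its vote, up or down."""
    total = nodes_total + 1                       # assumes one witness vote
    online = nodes_up + (1 if witness_up else 0)
    return online >= majority(total)


def dynamic_votes(nodes_up):
    """Simplified dynamic model: only running nodes vote, and the witness
    vote is used only to keep the total odd."""
    witness_vote = 1 if nodes_up % 2 == 0 else 0
    return nodes_up + witness_vote


for nodes_up in (4, 3, 2, 1):
    print(nodes_up, static_quorum_ok(nodes_up, 4), dynamic_votes(nodes_up))
```

Under the static model the last remaining node would not have a majority, while the simplified dynamic model keeps shrinking the total so that the survivors still hold one.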

Cloning of existing Cluster and create same setup with new IP and hostname

Hello All,

Could you please help me understand whether cloning works as expected with a failover cluster?

I have one SQL cluster and one FS cluster, both with 2 nodes. We are planning to set up the same clusters for a different entity, so instead of creating and configuring new servers, we plan to clone the existing ones and change the server names and IP addresses. Is this possible? Will it work as expected? Please provide suggestions.

Regards

Gunalan

SQL cluster occurs ID 1207 error frequently

Our SQL cluster frequently logs error ID 1207; the error occurs four times every 15 minutes. The system is Windows Server 2008 R2, and there are two member servers in the cluster.

We checked the "userAccountControl" attribute of the cluster and member server accounts: the value was 0x1020. We changed it to the normal value of 0x1000, but the error was still observed 15 minutes later.

We checked the CNO computer object permissions, and they are normal.

We tried taking the cluster offline to repair the secure channel, but the repair option was still grayed out after going offline, so the repair could not be done.

The cluster logs contain "unable to update password for computer account on PC" and "automatic password rotation failed with status 5".

Another cluster in the environment now shows this error as well. No specific cause has been found. Do you have any other suggestions? Thank you!
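
For reference, the two userAccountControl values mentioned above differ by a single flag. The Python sketch below decodes the bitmask using a handful of the documented flag values; `decode_uac` and the trimmed-down table are hypothetical helpers for illustration, not a complete list.

```python
# A subset of the documented userAccountControl flags.
UAC_FLAGS = {
    0x0002: "ACCOUNTDISABLE",
    0x0020: "PASSWD_NOTREQD",
    0x0200: "NORMAL_ACCOUNT",
    0x1000: "WORKSTATION_TRUST_ACCOUNT",
    0x2000: "SERVER_TRUST_ACCOUNT",
    0x10000: "DONT_EXPIRE_PASSWORD",
}


def decode_uac(value):
    """Return the names of the known flags set in value, lowest bit first."""
    return [name for bit, name in sorted(UAC_FLAGS.items()) if value & bit]


print(hex(0x1020), decode_uac(0x1020))
print(hex(0x1000), decode_uac(0x1000))
```

So 0x1020 is a normal computer account (WORKSTATION_TRUST_ACCOUNT) with PASSWD_NOTREQD set in addition, and 0x1000 is the same account type without it.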

Change the account under which the Cluster service run in windows server 2008

Hi Experts,

In my environment, the Windows Cluster service runs under the Local Service account. I want to run this service under a newly created service account. This service account has full control on both nodes of the cluster.

I searched on Google, found a solution, and tried it, but it did not work for me. For example:

To change the account under which the Cluster service runs

  1. Stop the Cluster service on all nodes:

    • Make sure the account has membership in the local Administrators group on all nodes.
    • Open Local Security Policy and Grant the following rights to the account, or to the local Administrators group, on all nodes:
    Where? Security Settings/Local Policies/User Rights Assignments

    • Act as part of the operating system
    • Back up files and directories
    • Restore files and directories
    • Adjust memory quotas for a process
    • Log on as a service
    • Increase scheduling priority
    By default, the Cluster service account inherits the following user rights as a result of being a member of the local Administrators group:

    • Manage auditing and security log
    • Debug programs
    • Impersonate a client after authentication
    If your organization has removed these user rights from the default set of privileges assigned to the local Administrators group, you need to specifically assign these user rights to the Cluster service account.

    • Open Computer Management.
    • In Computer Management, double-click Services and Applications, and then click Services.
    • In the details pane, click Cluster Service.
    • On the Action menu, click Stop.
  2. Repeat step 1 on all other nodes.
  3. In the details pane of one node, double-click Cluster service.
  4. On the Log On tab, type the account name in This account, type the password in Password, and then confirm the password and click OK.
  5. On the Action menu, click Start.
  6. Repeat steps 2, 3, 4, and 5 on all other nodes.

Notes

  • To perform this procedure, you must be a member of the Administrators group on the local computer, or you must have been delegated the appropriate authority. If the computer is joined to a domain, members of the Domain Admins group might be able to perform this procedure. As a security best practice, consider using Run as to perform this procedure.
  • To open Computer Management, click Start, click Control Panel, double-click Administrative Tools, and then double-click Computer Management.
  • To open Local Security Policy, click Start, point to Settings, click Control Panel, double-click Administrative Tools, and then double-click Local Security Policy. Select Show Advanced User Rights to view all the available security settings.
  • The Cluster service on all nodes must be stopped and restarted during this procedure. The Cluster service must use the same account and password at all times on all nodes within the cluster.
  • You must use Active Directory Users and Computers to view account properties if the account is in a Windows 2000 or Windows Server 2003 family domain. You must use User Manager for the Domain to view account properties if the account is in a Windows NT 4.0 domain. User Manager for the Domain can be installed using the client-based Network Administration tools by running Setup.exe from the Windows NT Server 4.0 media in the Clients\Srvtools\Winnt\ directory or by running Usrmgr.exe from the Clients\Srvtools\Winnt\I386 or Clients\Srvtools\Winnt\Alpha directory.

But when I start the Cluster service, I get this error message:

Event ID:- 7000

The Cluster Service service failed to start due to the following error:

A privilege that the service requires to function properly does not exist in the service account configuration. You may use the Services Microsoft Management Console (MMC) snap-in (services.msc) and the Local Security Settings MMC snap-in (secpol.msc) to view the service configuration and the account configuration.

Pls Help



Balwan Singh



Event ID 1212 - Cluster network name resource 'Cluster Name' cannot be brought online.

Hello,

We recently implemented a Windows Server 2012 Hyper-V cluster, and all was working correctly. However, we then decommissioned one of our 2003 domain controllers. Since then, whenever I try to move the core cluster resources to another node, the following error is displayed in Event Viewer.

Cluster network name resource 'Cluster Name' cannot be brought online. Attempt to locate a writeable domain controller (in domain\\nameofdecomissionedomaincontroller.domainname.local) in order to create or update a computer object associated with the resource failed for the following reason:

The server is not operational.

The error code was '8250'. Ensure that a writeable domain controller is accessible to this node within the configured domain. Also ensure that the DNS server is running in order to resolve the name of the domain controller.

 But the core resource is brought online without any issue and the cluster is working correctly.

I searched the registry for nameofdecomissionedomaincontroller.domainname.local, and the only entry I found is below.

I guess this is where failover clustering is caching this setting and trying to contact the demoted DC every time I try to move a resource. I already tried restarting each cluster node and checked that the DC was decommissioned correctly.

Is it safe to edit the registry to point to an existing DC name? Any other solution is most welcome.


Irfan Goolab SALES ENGINEER (Microsoft UC) MCP, MCSA, MCTS, MCITP, MCT


Hyper-V Virtual Machine Move Fails to another CSV - General Access Denied Error (0x80070005)

Good Morning,

I set up a 2-node Hyper-V failover cluster on 2012 R2 with external SAN storage.

I am running into an issue where a VM fails to move from one CSV to another. The VM's VHDX that I am testing with is about 10 GB in size.

The error message (image attached) is General Access Denied Error (0x80070005).

The move fails in both situations where the Source Cluster storage (CSV) owner node and the destination cluster storage owner node are the same or different. 

I recreated the 2nd CSV volume and still receive the same error.

Any feedback would be appreciated. 

Thank You

Raed

Cloud Witness and Split Brain Question

Hi all,

I've been looking around but have not yet found an answer for the following scenario, which uses a Cloud Witness on a two-node, multi-site cluster.

If the two nodes are unable to communicate with each other but Internet access remains active in both data centers, will split brain occur, or will the secondary node in Site B be able to tell that Node 1 in Site A is active through its communication with the Cloud Witness?

Thank you!
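
The general idea behind a witness can be sketched as a toy model: with two nodes and one witness there are three votes, a majority is two, and only one side of a partition can hold the witness vote at a time. The Python below is a simplified illustration under those assumptions, not the actual cluster arbitration logic; the `Witness` class, `surviving_nodes`, and the node names are all hypothetical.

```python
class Witness:
    """Toy witness: the first node to ask gets the lock, others are refused."""

    def __init__(self):
        self.owner = None

    def try_lock(self, node):
        if self.owner is None:
            self.owner = node
        return self.owner == node


def surviving_nodes(nodes, witness, can_reach_witness):
    """Nodes that keep quorum when every node is cut off from the others."""
    total_votes = len(nodes) + 1              # one vote per node plus witness
    needed = total_votes // 2 + 1
    survivors = []
    for node in nodes:
        votes = 1                             # a node always has its own vote
        if can_reach_witness.get(node, False) and witness.try_lock(node):
            votes += 1
        if votes >= needed:
            survivors.append(node)
    return survivors


print(surviving_nodes(["NodeA", "NodeB"], Witness(),
                      {"NodeA": True, "NodeB": True}))
```

In this model, even when both sites can reach the witness, only the node that obtains the lock reaches two votes, so at most one side stays up.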



Failover and DNS name resolution

hi guys,

I have a primary and secondary failover setup on Windows Server 2016. Failover works successfully, and when it fails over, the server picks up the DR IP at the DR site just fine. However, the hostname isn't reachable for approximately 10 to 15 minutes, which is longer than the business will allow. Is there a way to ensure the server hostname is reachable just as fast as the IP, considering the applications use hostnames instead of IPs?
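
One way to narrow this down is to measure, from a client machine, how long it takes for the hostname to start resolving to the DR address after a failover. The Python sketch below polls the local resolver until the expected IP comes back; `wait_for_dns`, the hostname and the addresses are hypothetical placeholders. If the delay matches the DNS record's TTL, cached records are a likely suspect (for a failover cluster network name, the HostRecordTTL parameter would be worth checking, assuming that is what is in use here).

```python
import socket
import time


def wait_for_dns(hostname, expected_ip, resolver=socket.gethostbyname,
                 interval=5.0, timeout=1200.0, sleep=time.sleep):
    """Poll until hostname resolves to expected_ip.

    Returns the elapsed seconds, or None if timeout is reached first.
    """
    start = time.monotonic()
    while True:
        try:
            current = resolver(hostname)
        except OSError:                 # name not resolvable right now
            current = None
        elapsed = time.monotonic() - start
        if current == expected_ip:
            return elapsed
        if elapsed >= timeout:
            return None
        sleep(interval)


if __name__ == "__main__":
    # Placeholder values: replace with the real hostname and DR address.
    print(wait_for_dns("app01.example.local", "10.0.1.50", timeout=30.0))
```

Start it right after triggering the failover. Because it uses the operating system's resolver, the result includes whatever caching the client itself does.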


S2D 2 node cluster

Hello,

We have a 2-node S2D cluster running Windows Server 2019. Between the two nodes we have a directly connected RDMA storage network (Cluster only) and a client-facing network based on LACP teaming on each node (Client and Cluster).

We ran a failover test and it works: when we power off one node, the virtual machines migrate to the other host as expected. But when we unplug the client-facing adapters (two adapters in LACP) on the node where the VMs reside, VM migration fails, and after some time the Cluster network name and Cluster IP address also fail. When we plug the client-facing adapters back in on the failed node, the cluster IP address recovers and the VM client network works again.

So the problem is that cluster migration fails after an unexpected shutdown of the client-facing network on the node where the VMs reside. The nodes can still communicate with each other through the storage network, and all nodes show as up in Failover Cluster Manager. When the client network is down, the VMs should migrate to the other node, which has a working client-facing network. Instead, the cluster resources fail and the VMs do not migrate. Where can we fix this behaviour? Has anyone seen this before?

VMs Failing to Automatically Migrate

I come in every morning to find a handful of my VMs indicating "Live Migration was canceled." This seems to happen around 12:00 - 1:00 AM, but I can't find anything configured to tell them to migrate, so I'm not sure why it is happening to begin with.

The event logs are not helpful. The Cluster Event ID is 1155: "The pending move for the role 'server name' did not complete." The Hyper-V-High-Availability log shows Event ID 21150, "'Virtual Machine Cluster WMI' successfully taken the cluster WMI provider offline.", right before Event ID 21111, "Live migration of 'VM Instance Name' failed." It is typically the same VMs, but not always. I see the error on both nodes (2-node cluster, 2 CSVs).

The Hyper-V-VMMS logs show 1940, "The WMI provider 'VmmsWmiInstanceAndMethodProvider' has shut down.", then 20413, "The Virtual Machine Management service initiated the live migration of virtual machine 'VM Name' to destination host 'Other Node' (VMID).", for each of the VMs running on that node. Some succeed, but a few get 21014, "Virtual machine migration for 'VM Name' was not finished because the operation was canceled. (Virtual machine ID)", and finally 21024, "Virtual machine migration operation for 'VM Name' failed at migration source 'Host Name'. (Virtual machine ID)".

I can manually live migrate all VMs back and forth all day. I have plenty of resources on both nodes (RAM and CPU), and I have turned off the Hyper-V cluster balancer that automatically moves machines. We used to have SCVMM installed, but it was overkill for our small environment, so it was decommissioned. The cluster is not configured with CAU.

While I would like to resolve the failures, I would be happy just knowing what is causing the VMs to migrate in the first place, since it isn't necessary for them to do this every night. Any guidance would be greatly appreciated!

Duel: Non cluster aware APPLICATION vs SERVICE

Hey,

I need to solve another conundrum. Here are some beautiful pictures for all to behold:

Can somebody explain to me in plain language when to use "App" vs "Service"? It looks easy, but I don't see a clear "fine line" between the two. Question time:

1) Is it true that in at least some instances one can go with either app or service?

2) I understand my app must be installed first. Correct? 

3) Can I cluster this way (I guess "Service") AD CS, AD RMS, AD FS?

4) If there is a custom app- is it "service" or "app" and perhaps: why? why? why? :D

I need to draw the fine line between two of them. No general statements please. Let the experience speak! 

Thanks a lot. God bless.

Event id 260 - Hyper-V-VmSwitch - Failed to move RSS queue

Hello!

So, I have a Windows Server 2019 Datacenter Hyper-V failover cluster with two Broadcom 10GbE interfaces in a Switch Embedded Team, with RDMA enabled for Live Migration. Since I put them into production, I've noticed some intermittent messages like the ones below:

Failed to move RSS queue 1 from VMQ 3 of switch 752B1093-0029-4E22-8D90-FDFE839B99C2 (Friendly Name: SET_Team), ndisStatus = -1071448015 .

Failed to move RSS queue 8 from VMQ 3 of switch 752B1093-0029-4E22-8D90-FDFE839B99C2 (Friendly Name: SET_Team), ndisStatus = -1071448015 .

Failed to move RSS queue 9 from VMQ 3 of switch 752B1093-0029-4E22-8D90-FDFE839B99C2 (Friendly Name: SET_Team), ndisStatus = -1071448015 .

And so on.
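
The ndisStatus in these messages is printed as a signed 32-bit decimal, which is hard to search for. Converting it to the usual unsigned hex form is a one-liner; the Python sketch below shows the conversion only and makes no claim about what the status means. `to_unsigned_hex` is a hypothetical helper name.

```python
def to_unsigned_hex(status):
    """Render a signed 32-bit status value as 8-digit unsigned hex."""
    return f"0x{status & 0xFFFFFFFF:08X}"


print(to_unsigned_hex(-1071448015))
```

The hex form is what driver documentation and headers normally list, so it is the better search term when looking the status up.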

They always happen on a short time frame, like for 10 minutes, and I haven't noticed any degradation so far, but it worries me. 

I have distributed RSS and VMQ across the system's CPU as recommended, but I've never done it to a Server 2019 Cluster before, so I'm worried I might have missed something.

Here are RSS and VMQ settings for the physical interfaces:



Any thoughts?

Regards,

Giovani
