Channel: High Availability (Clustering) forum

New-Cluster: Static address was not found on any cluster network

Hi guys,

Recently my two-node cluster ran into a problem: one of the nodes was not able to start up. I tried running Start-ClusterNode -ForceQuorum

but I got this error:

Start-ClusterNode: The system cannot find the file specified.

I found no solution, so I went to the other node, which was still in the cluster, and removed the cluster.

Removing the cluster went fine. However, I hit another problem the moment I tried to recreate the cluster.

The error shown was:

New-Cluster: Static address 'x.x.x.x' was not found on any cluster network.

If anyone knows what is going on, please let me know. Thanks.
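
One common reading of this error is that the address passed to New-Cluster -StaticAddress does not fall inside the subnet of any network the nodes are attached to. The following is a minimal Python sketch of that check, not the logic the cmdlet itself uses. The function name and the example subnets (documentation ranges) are hypothetical; substitute the subnets actually configured on the node adapters.

```python
import ipaddress

def find_matching_network(static_address, cluster_networks):
    """Return the first network (CIDR string) that contains static_address, else None."""
    address = ipaddress.ip_address(static_address)
    for cidr in cluster_networks:
        # strict=False accepts input such as "192.0.2.15/24" (host bits set).
        network = ipaddress.ip_network(cidr, strict=False)
        if address in network:
            return cidr
    return None

# Hypothetical subnets of the node adapters (assumption, not from the post).
networks = ["192.0.2.0/24", "198.51.100.0/24"]

print(find_matching_network("192.0.2.50", networks))    # inside the first subnet
print(find_matching_network("203.0.113.10", networks))  # in neither subnet
```

If the function returns None for the intended cluster address, the address and the node subnets do not line up, which would be consistent with the error above.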



Windows 2012 rolling upgrade to 2016 file server

Hi Folks,

I am not even sure whether this is fully supported by Microsoft. I am doing a POC for this, and I am sharing what I have found so far to spare you some troubleshooting time.

1. Adding the 2016 node must be done from the Windows 2016 Failover Cluster Manager. Failing to do so causes the cluster to go offline.

2. Ironically, in mixed mode, configuring the file server role from the Windows 2016 Failover Cluster Manager does not work. I need to do it from a Windows 2012 node for it to work.

3. At this point everything seems OK. Both the cluster and the Client Access Point are up.

4. The problem arises when I want the 2016 node to take ownership of the file server role (I need to do this because I want to evict the 2012 nodes one by one). The file server role goes down immediately when I do this.

Troubleshooting steps taken:

1. Deleted the virtual name computer object and created a new one.

Any help and tips will be appreciated. Thank you.

Replication of VMs with shared VHDX is not supported

A typical Microsoft half-baked solution (shared VHDX): you can use them, but nothing else works, no backup, no replication.

Obviously, for DR purposes that is unacceptable!

How can I DR my environment offsite when this is not possible? All my services (file/print/SQL/MIS) run on 2012 R2 clusters spread across different hosts.

Any ideas?

Seb

how to check RCA for heartbeat missing

  • The cluster service was halted to prevent an inconsistency within the failover cluster. The error code was 1359.

  • Server: Windows 2016

    As per my investigation, a network adapter reset was observed at the same time, i.e., 3:18:26 AM on 07-01-2019. Please note that the cluster log timestamps are in GMT.


    00000c64.00001950::2019/07/01-07:18:33.587 INFO  [IM - Cluster Network 1] Resetting interface state calculation state

    00000c64.00001950::2019/07/01-07:18:33.587 INFO  [IM] Leader is sending request for all interfaces in the current view

    00000c64.00000b44::2019/07/01-07:18:33.587 INFO  [DCM] Force disconnect payload: netname \xxxxxxx, requested disconnect status (0), src <null>, dest <null>

    00000c64.00000b44::2019/07/01-07:18:33.587 ERR   [DCM] Force disconnect failed on DisconnectSmbInstance::CSV, status (c000000d)

    00000c64.00000b44::2019/07/01-07:18:33.587 INFO  [DCM] Force disconnect(DisconnectAll): server \169.254.2.228, DisconnectSmbInstance::CSV

    00000c64.00000b44::2019/07/01-07:18:33.587 INFO  [DCM] Releasing RDR handle for target node id 2

    .000006ec::2019/07/01-07:19:02.884 ERR   [NODE] Node 1: Connection to Node 2 is broken. Reason (10054)' because of 'channel to remote endpoint 169.254.2.228:~3343~ has failed with status 10054'

    00000c64.000006ec::2019/07/01-07:19:02.884 WARN  [NODE] Node 1: Initiating reconnect with n2.

    00000c64.000006ec::2019/07/01-07:19:02.884 INFO  [MQ-thpqhms0] Pausing

    00000c64.000008dc::2019/07/01-07:19:02.884 INFO  [Reconnector-thpqhms0] Reconnector from epoch 1 to epoch 2 waited 00.000 so far.

    00000c64.00001930::2019/07/01-07:19:03.012 INFO  [IM] got event: Node with FaultTolerantAddress xxxxx:~0~ has gone down with fatal error\crash

    00000c64.00001930::2019/07/01-07:19:03.013 ERR   [IM] Couldn't find node id for remote virtual IP xxxxxxxx:~0~

    0000194c::2019/07/01-07:19:14.683 DBG   [NETFTAPI] Signaled NetftRemoteUnreachable event, local address 10.81.64.153:3343 remote address 10.81.65.25:3343

    00000c64.00001930::2019/07/01-07:19:14.683 INFO  [IM] got event: Remote endpoint 10.81.65.25:~3343~ unreachable from xxxxx

    00000c64.00001930::2019/07/01-07:19:14.683 INFO  [NDP] Checking to see if all routes for route (virtual) local xxxxx:~0~ to remote 169.254.2.228:~0~ are down

    00000c64.00001930::2019/07/01-07:19:14.683 WARN  [NDP] All routes for route (virtual) local 169.254.1.43:~0~ to remote xxxxxxxxx:~0~ are down

    00000c64.00001924::2019/07/01-07:19:14.683 INFO  [CORE] Node 1: executing node 2 failed handlers on a dedicated thread

  • I also found this in the event logs:

    07-02-2019           7:20:42 AM           Warning thpqghs0.prod.travp.net     10400    Microsoft-Windows-NDIS   N/A         N/A         The network interface 'vmxnet3 Ethernet Adapter' has begun resetting.  There will be a momentary disruption in network connectivity while the hardware resets. Reason: The network driver detected that its hardware has stopped responding to commands. This network interface has reset 1 time(s) since it was last initialized.

Please let me know if this is causing the issue.
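
Because the cluster log is in GMT while the adapter reset was noted in local time, it can help to convert the log timestamps before lining up events. The following is a minimal Python sketch, not part of any cluster tooling. The UTC-4 offset is an assumption inferred from the 3:18 AM local time versus the 07:18 log entries above, and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumed local offset (UTC-4), inferred from the times quoted in the post.
LOCAL_TZ = timezone(timedelta(hours=-4))

def parse_cluster_log_time(line):
    """Return the UTC timestamp of a cluster log line, or None if it has none."""
    # Lines look like: 00000c64.00001950::2019/07/01-07:18:33.587 INFO  [IM] ...
    _, separator, rest = line.partition("::")
    if not separator:
        return None
    stamp = rest.split(" ", 1)[0]
    try:
        parsed = datetime.strptime(stamp, "%Y/%m/%d-%H:%M:%S.%f")
    except ValueError:
        return None
    return parsed.replace(tzinfo=timezone.utc)

def to_local(line, tz=LOCAL_TZ):
    """Return the log line's timestamp converted to the given time zone."""
    utc_time = parse_cluster_log_time(line)
    return None if utc_time is None else utc_time.astimezone(tz)

sample = ("00000c64.00001950::2019/07/01-07:18:33.587 INFO  "
          "[IM - Cluster Network 1] Resetting interface state calculation state")
print(to_local(sample))
```

With the assumed offset, the 07:18:33 log entry lands at 03:18:33 local time, a few seconds after the adapter reset noted above.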

Windows 2012 R2 rolling upgrade to 2016 file server

Hi Folks,

I am not even sure whether this is fully supported by Microsoft. I am doing a POC for this, and I am sharing what I have found so far to spare you some troubleshooting time.

1. Adding the 2016 node must be done from the Windows 2016 Failover Cluster Manager. Failing to do so causes the cluster to go offline.

2. Ironically, in mixed mode, configuring the file server role from the Windows 2016 Failover Cluster Manager does not work. I need to do it from a Windows 2012 node for it to work.

3. At this point everything seems OK. Both the cluster and the Client Access Point are up.

4. The problem arises when I want the 2016 node to take ownership of the file server role (I need to do this because I want to evict the 2012 nodes one by one). The file server role goes down immediately when I do this.

Troubleshooting steps taken:

1. Deleted the virtual name computer object and created a new one.

Any help and tips will be appreciated. Thank you.


File Cluster 2016 - New-SmbShare: The request is not supported.

Hi,

I have built a fresh Windows 2016 file server failover cluster. I have been trying to create file shares through PowerShell, and I get this error:

New-SmbShare : The request is not supported.
At line:1 char:1
+ New-SmbShare -Name "T1" -Path "e:\Test2" -FullAccess "example\Testuser1"
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo       : InvalidOperation: (MSFT_SMBShare:ROOT//Microsoft/Windows/SMB/MSFT_SMBShare) [New-SmbShare
   ], CimException
    + FullyQualifiedErrorId : Windows System Error 50,New-SmbShare

Please help. 

High Availability file service will not sync - no offline files are available.

I set redirected folders to a HA file service.  When I click on the documents folder the Status: Online icon shows up.  When I look at the offline files folder there is nothing there.

If I map a drive to the folder, the "Make available offline" option does not show up. If I use a non-HA file share, the "Make available offline" option appears.

Any ideas?

The cache is enabled on the share.


Guest file server cluster constant crashes

Hi

I have set up a guest file server cluster with Windows Server 2019. The cluster crashes constantly: it becomes very slow and finally crashes all my hypervisor servers.

Hypervisor infrastructure:

  • 3 hosts running Windows Server 2019 LTSC Datacenter
  • 10 Gb iSCSI storage with 11 LUNs
  • The cluster passes all validation tests

Guest file server cluster, 2 VMs with the same configuration:

  • Generation 2 VM running Windows Server 2019 LTSC
  • 4 virtual CPUs
  • 8 GB of non-dynamic RAM
  • 1 SCSI controller
  • Primary hard drive: VHDX format, SCSI controller, ID 0
  • Empty DVD drive on the SCSI controller, ID 1
  • 10 VHDS disks on the SCSI controller, IDs 2 to 11, same ID on each node
  • 1 network card on a virtual switch routed to 4 teamed physical network cards
  • The cluster passes all validation tests except the network test, which reports a single point of failure due to the lack of redundancy


After some time, the cluster becomes very slow, crashes, and brings down all my hypervisors. The only error returned by Hyper-V is that some LUNs became unavailable due to a timeout, with this message (translated from French):

Cluster Shared Volume 'VSATA-04' ('VSATA-04') has entered a paused state because of 'STATUS_IO_TIMEOUT(c00000b5)'. All I/O will temporarily be queued until a path to the volume is reestablished.

I have checked every single parameter in the VM and Hyper-V configuration and followed every hint the logs gave me, but I found nothing and the crashes remain.

Sorry for my poor English; it is not my first language.

CSV - Clustering Hyper-V Hosts!

I have a 4-node cluster of Hyper-V hosts (Win2012R2) with Cluster Shared Volumes.

I have around 22 VMs spread across these nodes, sitting on the CSVs.

Every now and then VMs go into a failed state and I have to cold boot my hosts to get them back online.

Is it because of CSV? I have a VM which is on the C drive and it remains fine, with no issues ever.

Please advise: what is the best way to make VMs highly available when moving to Win2019?

Thanks a lot

PS I have a 1TB SAN


SV

Failovercluster Management Network Interface

Hello Friends,

My question today revolves around failover clustering (which, in our case, sits on top of an S2D deployment). Everything is working fine in that regard.

Next week we will get new switches, and I plan to trunk/LACP both SFP+ ports on the servers (at the moment only one SFP+ port of every server is connected to the switch) and connect them to the new switches.

I am aware of how trunking/LACP works in PowerShell; that is not the problem. What I fear is losing the connectivity of the failover cluster itself. When I have to dissolve the Hyper-V Virtual Ethernet Adapter to form a new LACP team, so that I can create a new virtual adapter on top of it, the failover cluster will lose its only management interface.

When I look at the Failover Cluster MMC snap-in under "Network" I can see my interfaces, but I don't see any buttons to add interfaces and declare them as a management network (my idea was to use the copper ports of the server NIC to create a temporary management interface until I have built the LACP team and virtual adapter and integrated them as the primary management interface).

So the question remains: can I add additional interfaces as management interfaces to the failover cluster (maybe through PowerShell), or do I actually have to dissolve the whole cluster and build it anew?

Thanks in advance.

Best Regards,

Constantin


Unable to mount the File cluster resource shares ( Windows 2016 Cluster) on AIX systems

Hi 

I have been upgrading a file cluster from Windows 2008 R2 to 2016. I decided to build a new cluster on 2016 and remap the storage from the old cluster to the new one, which was successful. I have a file cluster resource name (Test-Batch) created on the 2016 cluster. We have some AIX systems which need to mount these shares. With my old 2008 R2 cluster this worked fine. With the 2016 cluster, the AIX systems can mount \\test-batch\ but nothing beyond it. That is, I have several shares under \\test-batch\, let's say T1, T2 .. T10. The AIX systems can mount \\test-batch\ but not \\test-batch\t1, \\test-batch\t2, etc. I have checked all the permissions and everything is in order.

I have also noticed that if I telnet to my resource name on port 139, it works for the 2008 R2 cluster but not for my 2016 cluster. The AIX team could mount the file share on the 2016 cluster using the Samba client. That works because Samba uses port 445 for file shares. But the AIX team says that for Windows shares it uses port 139, which is not working, and that we need to fix it. I am completely clueless how to fix that.

Port 139 is open on the new 2016 nodes and works; it just doesn't work for the file cluster resource name.

I even changed the SMB configuration on 2016 to use SMB 1.0, and it still didn't work.

Any suggestions on how to fix this, please?
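
The telnet comparison above can be repeated for several names and ports at once. The following is a minimal Python sketch of that TCP reachability check. It only shows whether something accepts a connection on ports 139 and 445, not whether SMB works, and the function names are hypothetical.

```python
import socket

def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False

def check_smb_ports(host, ports=(139, 445)):
    """Map each port to whether it accepted a TCP connection on the given host."""
    return {port: port_is_open(host, port) for port in ports}
```

For example, calling check_smb_ports("test-batch") and then the same function with each node name would show whether port 139 is closed only for the cluster resource name, as described above.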

Server2016 Cluster network traffic coming from host ip rather than role ip

Hello

I have two 2016 VMs in a Hyper-V environment that are clustered. Each VM is on a separate physical host.

Each VM has only one NIC. My cluster's IPs are as follows:

172.18.1.113 ProductionIP - Role IP
172.18.1.114 Cluster IP
172.18.1.115 VM Host A
172.18.1.116 VM Host B

I've added the role IP address (172.18.1.113) to an IPsec tunnel on my firewall, but my firewall sees the traffic as coming from one of the two host IP addresses (.115 or .116). If I ping the host at the remote end of the IPsec tunnel from either host A or B and source it from .113, the ping works, but by default it always uses the host IP and fails.

How do I get the cluster's nodes to always send traffic from the role IP, no matter which node is active?

Thanks

Dan
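
For outgoing connections the operating system normally picks the source address itself, unless the application binds its socket to a specific local address, which is what sourcing the ping from .113 does. The following is a minimal Python sketch of that explicit binding, assuming the application can be changed to do it. The function name is hypothetical, and the bind only succeeds on the node that currently owns the address.

```python
import socket

def connect_from(source_ip, host, port, timeout=3.0):
    """Open a TCP connection to host:port that originates from source_ip."""
    # Port 0 lets the OS pick any free local port on source_ip.
    return socket.create_connection(
        (host, port), timeout=timeout, source_address=(source_ip, 0)
    )
```

A call such as connect_from("172.18.1.113", remote_host, remote_port), where remote_host and remote_port stand for the far end of the tunnel, raises OSError on a node that does not hold 172.18.1.113.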

File Server Role in Cluster-to-Cluster Replica

I have an asynchronous cluster-to-cluster Storage Replica setup across a metro link, and I'm replicating two volumes that are assigned to a general-use file server role providing a couple of shared folders. That role (just called "storage1", accessible at \\storage1) is working fine and the volumes are all replicating correctly.

When I set this up, the second cluster was also given the same type of role (called "storage2", accessible at \\storage2). However, when I enabled storage replica, that role disappeared. Now I'm a bit confused as to how we actually are going to go about failing from one site to the other - I had assumed it would involve a powershell command and a change in DNS to point the storage target from storage1 to storage2 and wait for clients to update, but now I'm wondering if that's actually the case. Will the entire role (and subsequently the DNS) for "storage1" be migrated to the replica site? What are the implications of the two sites losing connection ... is there any automated movement I should be aware of that may take place or trigger a split-brain scenario?

Disks at site 2 assigned to storage2 file server role:

No role actually listed at site 2 now:






Paul Hite - MCSE, MCITP

After deploying a DHCP cluster, managing the DHCP server shows "Invalid extension name CLuster_SNAPIN_EXTENSION_NAME"

After deploying a DHCP cluster, managing the DHCP server shows "Invalid extension name CLuster_SNAPIN_EXTENSION_NAME".

How can this be resolved? Thank you.


Windows 2016 MPIO Does Not Appear (COMPLENTCompellent Vol)

Hi,

We added support for iSCSI, but no (COMPLENTCompellent Vol) entry appeared.

It's all working, but I was curious why this happened.

Two other servers appeared successfully and have the same version of Windows 2016.


Tks.

s2d down after adding hard drives

S2D newbie here. This is a test environment, so the hardware is probably not supported.

the setup

I am running two OptiPlex 3050s with Windows Server 2016. Each has a spinning disk and an SSD attached via SATA. The spinning disk is partitioned for the operating system, and I dumped the rest into the S2D pool. With this hardware I set up a failover cluster with the quorum witness on my domain controller and the cluster's storage coming from the S2D pool. Everything was working well, but it was terribly slow.

the issue

To cure the slowness I decided to add a PCIe M.2 drive to each machine. After I added it to one machine, the cluster and the S2D drive came back with an error, so I ran a repair from Server Manager. After that completed, I added the same drive to the other machine, and my S2D drive has been gone ever since. I have tried removing the last drive I added and have rebooted each machine several times, with no luck.

errors

When I look at the critical events for the cluster's disk in Failover Cluster Manager, there are a lot of repeating event IDs: 5142, 1069, 1793.

Any help would be greatly appreciated. I'd like to see if I can fix this in a test environment before I see it in production.

many thanks!


IT guy

Error applying Replication Configuration Windows Server 2019 Hyper-V Replica Broker

Hello,

Recently we started replacing our Windows Server 2016 Hyper-V clusters with Server 2019. On each cluster we have a Hyper-V Replica Broker that allows replication from any authenticated server and stores the replica files in a default location on one of the Cluster Shared Volumes.

With WS2019 we run into an issue where we get an error when applying the Replication Configuration settings. The error is as follows:
Error applying Replication Configuration changes. Unable to open specified location for replication storage. Failed to add authorization entry. Unable to open specified location to store Replica files 'C:\ClusterStorage\volume1\'. Error: 0x80070057 (One or more arguments are invalid).

When we point the default location to a CSV whose owner node is the same as the owner node of the Broker role, we don't get this error. However, I don't expect this to keep working in production (when roles move to other nodes).

Has anyone run into the same issue, and what might be a solution for it? Did anything change between WS2016 and WS2019 that might cause this?

Kind regards,

Malcolm

2012 R2 Scale-Out File Server Performance Issue

I'm implementing a product called AppLayering by Citrix in a VMware environment. It creates a unique .vhd for each piece of software you install and want to deploy to end users. We created a Scale-Out File Server for the share so that we could have 100% uptime through crashes and updates/reboots. The end-user machines mount the .vhds at login, usually anywhere from 5 to 15 of them, ranging from 1 GB to 12 GB in size.


Now that I'm increasing the number of machines accessing this share, I sometimes experience a very long delay, as much as 6 minutes, before the layers are mounted. They usually mount within seconds. The delay is not consistently worse with more machines logged in (occasionally the mount is still instant), but in general it does seem to get worse the more machines are mounting these layers.


The only performance setting I've tinkered with is MaxThreadsPerQueue, which I changed from 20 to 64. This registry entry was not present by default and I had to create it myself, so I'm not sure if that means anything. I'm also not sure whether 64 is even a good value; I'm just shooting in the dark here, so any help would be much appreciated!


Darin

Failover Cluster Validation Fails with SMB Share Access Error

The two-node cluster is set up and running without any problems as far as I can tell BUT: The cluster validation test gives this warning.

Failed to validate Server Message Block (SMB) share access through the IP address of the fault tolerant network driver for failover clustering (NetFT). The connection was attempted with the Cluster Shared Volumes test user account, from node Hyper-V22.contoso.local to the share on node Hyper-V23.contoso.local. Access is denied.

and

Failed to validate Server Message Block (SMB) share access through the IP address of the fault tolerant network driver for failover clustering (NetFT). The connection was attempted with the Cluster Shared Volumes test user account, from node Hyper-V23.contoso.local to the share on node Hyper-V22.contoso.local. Access is denied.

During the test, this error occurs in the SMBClient log of the node.

169.254.1.88 is the address of the other node's Tunnel Failover Cluster Virtual Adapter.

I cannot ping the remote node's Failover Cluster Virtual Adapter IP address. The firewall is off on all profiles. What is wrong here?
