Channel: High Availability (Clustering) forum
Viewing all 5654 articles

Dynamic Quorum - node votes not being dynamically removed

I'm posting this here as I think this is more of a clustering question than an Exchange issue.

Setup:

ServerA - 2012R2 with Exchange 2013 DAG Primary Active Mounted

ServerB - 2012R2 with Exchange 2013 DAG Passive Healthy

ServerC - 2012R2 File Share Witness

Name  DynamicWeight NodeWeight Id State
----  ------------- ---------- -- -----
ServerA              1          1 1     Up
ServerB              1          1 2     Up

If I shut down ServerB gracefully (without running any of the Exchange maintenance scripts), the vote is never removed from ServerB. In cluster manager the votes never drop to zero; the node goes offline, but its vote is not dynamically removed as I expect.

With Dynamic Quorum, I expect that if I shut down ServerB gracefully its vote is removed, and that if I then shut down the file share witness, ServerA does not lose quorum and stays up.

Can someone let me know if I am missing something here, or if there is a specific process/command I have to issue when I shut down ServerB to make sure it loses its vote?
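
To make the check repeatable, the node table pasted above can be parsed and compared before and after the shutdown. This is only a minimal sketch: it assumes the table is captured as plain text in exactly the whitespace-separated layout shown in this post, and that DynamicWeight is the column that should drop to 0 when a vote is removed (which is what the post expects). The helper names are hypothetical.

```python
# Table text copied from the post; in practice this would be captured again
# after ServerB has been shut down.
TABLE = """\
Name  DynamicWeight NodeWeight Id State
----  ------------- ---------- -- -----
ServerA              1          1 1     Up
ServerB              1          1 2     Up
"""


def parse_node_votes(text):
    """Return one dict per node row of the whitespace-separated table."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    # lines[1] is the dashed separator, so node rows start at index 2.
    return [dict(zip(header, line.split())) for line in lines[2:]]


def nodes_still_voting(rows):
    """Names of nodes whose DynamicWeight is currently non-zero."""
    return [row["Name"] for row in rows if int(row["DynamicWeight"]) != 0]


rows = parse_node_votes(TABLE)
print(nodes_still_voting(rows))  # prints ['ServerA', 'ServerB']
```

If the vote had been removed as expected, ServerB would no longer appear in the printed list.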

Thanks

Jason


WIMMount (HSM) causing cluster storage to go redirected (2012r2 DC)

Looking for options to resolve this error and prevent it in the future.

Thanks for any help.

Hardware:

2-node Dell HV cluster running 2012 R2 DC

8 NICs each in Multiplex mode: 4x Hyper-V, 4x hosts

Storage:

2x Synology NAS, accessed through iSCSI by the hosts and the cluster

Relevant logs:

Log Name:      Microsoft-Windows-FailoverClustering/Diagnostic
Source:        Microsoft-Windows-FailoverClustering
Date:         
Event ID:      2050
Task Category: None
Level:         Warning
Keywords:      
User:          SYSTEM
Computer:      l
Description:
[DCM] filter WIMMount found at unsafe altitude 180700

Log Name:      System
Source:        Microsoft-Windows-FailoverClustering
Date:        
Event ID:      5125
Task Category: Cluster Shared Volume
Level:         Warning
Keywords:      
User:          SYSTEM
Computer:
Description:
Cluster Shared Volume 'Volume1' ('Cluster Disk 1') has identified one or more active filter drivers on this device stack that could interfere with CSV operations. I/O access will be redirected to the storage device over the network through another Cluster node. This may result in degraded performance. Please contact the filter driver vendor to verify interoperability with Cluster Shared Volumes.

Active filter drivers found:
WIMMount (HSM)

PS C:\Windows\system32> fltmc instances
Filter                Volume Name                              Altitude        Instance Name       Frame   SprtFtrs  VlStatus
--------------------  -------------------------------------  ------------  ----------------------  -----   --------  --------
CCFFilter             \Device\Mup                               261160     CCFFilter                 0     00000003
CsvFlt                \Device\HarddiskVolume50                  404800     CsvFlt Instance           0     00000003
CsvNSFlt              C:                                        404900     CsvNSFlt Instance         0     00000003
FsDepends             C:\ClusterStorage\Volume1                 407000     FsDepends                 0     00000003
FsDepends                                                       407000     FsDepends                 0     00000003
FsDepends             C:                                        407000     FsDepends                 0     00000003
FsDepends             D:                                        407000     FsDepends                 0     00000003
FsDepends             I:                                        407000     FsDepends                 0     00000003
FsDepends                                                       407000     FsDepends                 0     00000003
FsDepends             \Device\HarddiskVolume50                  407000     FsDepends                 0     00000003
FsDepends             \Device\Mup                               407000     FsDepends                 0     00000003
ResumeKeyFilter                                                 202000     ResumeKeyFilter           0     00000003
ResumeKeyFilter       \Device\HarddiskVolume50                  202000     ResumeKeyFilter           0     00000003
WIMMount                                                        180700     WIMMount                  0     00000000
WIMMount              C:                                        180700     WIMMount                  0     00000000
WIMMount              D:                                        180700     WIMMount                  0     00000000
WIMMount              I:                                        180700     WIMMount                  0     00000000
WIMMount                                                        180700     WIMMount                  0     00000000
WIMMount              \Device\HarddiskVolume50                  180700     WIMMount                  0     00000000
luafv                 C:                                        135000     luafv                     0     00000003
npsvctrig             \Device\NamedPipe                          46000     npsvctrig                 0     00000000
svhdxflt              \Device\HarddiskVolume50                  135100     svhdxflt                  0     00000003
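
To see at a glance which filters sit on the CSV device stack, the `fltmc instances` listing above can be parsed and grouped by volume. The following is a minimal sketch, not a supported tool: it assumes the column layout shown in this post, volume names without spaces, and rows where the volume column may be empty. The function names are hypothetical, and only a few rows from the listing are embedded as sample data.

```python
# A few rows copied from the listing above (raw string keeps the backslashes).
FLTMC_OUTPUT = r"""
CsvFlt                \Device\HarddiskVolume50                  404800     CsvFlt Instance           0     00000003
CsvNSFlt              C:                                        404900     CsvNSFlt Instance         0     00000003
FsDepends             \Device\HarddiskVolume50                  407000     FsDepends                 0     00000003
ResumeKeyFilter       \Device\HarddiskVolume50                  202000     ResumeKeyFilter           0     00000003
WIMMount                                                        180700     WIMMount                  0     00000000
WIMMount              C:                                        180700     WIMMount                  0     00000000
WIMMount              \Device\HarddiskVolume50                  180700     WIMMount                  0     00000000
svhdxflt              \Device\HarddiskVolume50                  135100     svhdxflt                  0     00000003
"""


def parse_instances(text):
    """Extract filter name, volume and altitude from each data row."""
    instances = []
    for line in text.splitlines():
        tokens = line.split()
        if len(tokens) < 5:
            continue  # blank or truncated line
        if tokens[1].isdigit():
            # The volume column is empty, so the altitude comes second.
            volume, altitude = "", int(tokens[1])
        elif tokens[2].isdigit():
            volume, altitude = tokens[1], int(tokens[2])
        else:
            continue  # header or separator line
        instances.append(
            {"filter": tokens[0], "volume": volume, "altitude": altitude}
        )
    return instances


def filters_on_volume(instances, volume):
    """Filters attached to one volume, highest altitude first."""
    matches = [i for i in instances if i["volume"] == volume]
    return sorted(matches, key=lambda i: i["altitude"], reverse=True)


for inst in filters_on_volume(parse_instances(FLTMC_OUTPUT), r"\Device\HarddiskVolume50"):
    print(inst["altitude"], inst["filter"])
```

With the sample rows this lists FsDepends, CsvFlt, ResumeKeyFilter, WIMMount and svhdxflt for \Device\HarddiskVolume50, which makes it easy to spot the WIMMount instance named in event 5125.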

VM Windows deactivated

Hi ,

We are running a Hyper-V 2008 R2 failover cluster. The problem is that when we move virtual machines from one node to another, Windows in the VM is deactivated and needs to be activated again.

Please advise

Regards

Differentiate CNO - VCO from computer objects in AD.

In my attempts to distinguish ordinary computer objects from VCOs/CNOs, the only solution I have found is to check the (editable) description field.

Another option is to check the servicePrincipalName values, but those are editable too. In an unguarded environment anyone with admin rights can change them. There is also the OperatingSystem attribute, which is populated only for real machines and not for VCOs/CNOs.

  Can anyone help me find a PowerShell filter to do this properly?

yup

Removing Unused Network (Clustered Network Cards)

Good afternoon folks, 

I am trying to remove two old clustered network interfaces from Failover Cluster Manager, so they can be used for other purposes.

Both connections are set to not allow cluster network communication. These interfaces were previously connected to an old storage device on its own SAN; we now need the ports to serve as individual management interfaces for the physical servers.

Can anyone advise on this?

Regards
Neil Cotton

Quorum Failover Cluster - Windows Server 2008 r2

Hello,

I have a fileserver cluster environment with two members, NODEA and NODEB. The cluster name is FS01.

We have the quorum disk in the cluster, and it is active on the owner node.

My question is: when we fail over cluster resources to NODEB, must the quorum disk migrate to NODEB too? If the quorum disk is not migrated immediately, does that mean we have a problem in the cluster?

Another question: my cluster fails over resources fairly frequently. Is it possible to detect the cause? Is it possible to increase the cluster threshold?

Thank you

Remove node from cluster. It wont leave the cluster!

Had a cluster once...
If I open FCM I don't see any cluster.
I can't get rid of the cluster membership!

PS C:\Users\administrator.LAB> Remove-ClusterNode hyperv1 -force
Remove-ClusterNode : The remote server has been paused or is in the process of being started.
    The remote server has been paused or is in the process of being started

I have stopped and restarted the cluster service, and I am still not able to get the node out of the cluster.

Does anyone have a solution?

MSMQ options in WSFC 2012 R2

Hi,

If you want to make MSMQ highly available, most guides I have seen are about creating the Message Queuing role. That's fine.

However, I'm in the process of sorting out the other options for MSMQ in a failover cluster, and I have run these tests:

1. Created a Message Queuing resource in an existing application group, made it dependent on the disk and network name already in there. Test ok.

2. The same as above but used a separate shared disk only for MSMQ. Test ok.

3. Created a Message Queuing role and moved that one into an existing application group. Test ok.

4. The other way around from point 3; Moved an existing application group into the Message Queuing group. Test ok.

"Test ok" in my case is that it didn't produce any errors, but I have not tested the MSMQ functionality because I'm actually not an application guy, more on the infrastructure level.

Also, one thing about creating a separate role for MSMQ is that you get the "Manage Message Queuing" in the GUI, that's not the case with options 2 and 3 if you don't tweak the registry or use gwmi (http://blogs.msdn.com/b/clustering/archive/2010/01/12/9946994.aspx).

Now to my question :-)

Are any or all of the other options described valid ways to set up MSMQ in a clustered environment? What is your experience out there?


MSDTC fail to be online

Hi

I have a Windows Server 2008 R2 SQL cluster with MSDTC, but the MSDTC resource is offline and cannot be brought online.

TRACE_INFO] DtcOnlineThread (d:\w7rtm\com\complus\dtc\shared\mtxclu\src\dtcresource.cpp@563):Setting up resource files...

00000e10.00000f28::2015/06/05-23:17:07.009 ERR  [RES] Distributed Transaction Coordinator <MSDTC-PMDTC>: 06-06-2015 02:17:07:009 : [ e10. f28] 0x800710dc [TRACE_RESOURCE] [ TRACE_ERROR] DtcOnlineThread (d:\w7rtm\com\complus\dtc\shared\mtxclu\src\dtcresource.cpp@569): Failed to create resource files

00000e10.00001508::2015/06/05-23:17:58.845 ERR  [RHS] s_RhsRpcCreateResType: ERROR_NOT_READY(21)' because of 'Startup routine for ResType MSMQTriggers returned 21.'

000002d8.0000103c::2015/06/05-23:17:58.845 WARN [RCM] Failed to load restype 'MSMQTriggers': error 21.
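
The `0x800710dc` value in the "Failed to create resource files" line has the shape of an HRESULT. Assuming that is what it is, it can be split into its standard bit fields to find the underlying error number to look up. This is a minimal sketch; the function name is hypothetical, and it only decodes the number without saying what the error means for MSDTC.

```python
def decode_hresult(value):
    """Split a 32-bit HRESULT into its severity, facility and code fields."""
    return {
        "failure": bool((value >> 31) & 0x1),  # top bit set means failure
        "facility": (value >> 16) & 0x7FF,     # 11-bit facility field
        "code": value & 0xFFFF,                # low 16 bits: error code
    }


fields = decode_hresult(0x800710DC)
print(fields)  # prints {'failure': True, 'facility': 7, 'code': 4316}
```

Facility 7 is FACILITY_WIN32, so under this assumption the value wraps Win32 error 4316, which can then be looked up in the system error code list.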

Please, I need help.


MCP MCSA MCSE MCT MCTS CCNA


Windows 2012 Failover Clusters "An error occurred connecting to the Cluster"

Good Morning

I have 4 failover clusters: 1 SQL cluster, 1 Hyper-V cluster, 1 IIS cluster (not NLB) and 1 file cluster, all running Windows 2012. The file cluster is fully up to date, with all the latest firmware and drivers installed and a couple of Windows hotfixes for Windows clustering. The Hyper-V and SQL clusters are scheduled for updates in the next couple of weeks, but both have been updated within the last couple of weeks. The problem I have on all four clusters is that after a period of time we are no longer able to connect to the cluster from any node in the cluster. For example, on the SQL cluster after a period of time (this might be 3-4 weeks), if I connect to any of the nodes and open Failover Cluster Manager, I am unable to connect to the cluster. Failover Cluster Manager first shows "Connecting to Cluster" and "The Operations is taking longer than expected"; then after about 2-3 minutes an error comes up saying "The Operations has Failed", "An error occurred connecting to the Cluster '<Cluster-Name>'". If I then click "See details" I get "An error occurred Trying to Display the cluster information", "One or more errors occurred", "Provider load failure".

This same problem happens on all of our windows 2012 Clusters, we do have a 2012 R2 Hyper-V cluster but this is managed by a different team and I don't know if they have the same issue or not.

It seems to be one of the hosts that is causing the problem, normally the host that owns the quorum. If we reboot that host, which causes the cluster to fail over all the roles to a different node, the cluster is accessible again. Once the rebooted node comes back online everything is fine for a period of time, until the problem happens again.

If anyone has any suggestions I would be very grateful.

Richard 


DTCProxy is not running: java.net.ConnectException: Connection timed out

Hi All,

While starting the JBoss server we are facing the issue below with MSDTC. The DB used is SQL Server 2008 R2. This is a clustered DB environment; MSDTC works fine in non-clustered environments.

2015-06-09 23:48:18,444 ERROR [STDERR] (main) javax.transaction.xa.XAException: DTCProxy is not running: java.net.ConnectException: Connection timed out

2015-06-09 23:48:18,445 ERROR [STDERR] (main) at com.inet.tds.b.a(Unknown Source)

2015-06-09 23:48:18,445 ERROR [STDERR] (main) at com.inet.tds.b.start(Unknown Source)

2015-06-09 23:48:18,445 ERROR [STDERR] (main) at com.inet.tds.e.start(Unknown Source)

2015-06-09 23:48:18,445 ERROR [STDERR] (main) at org.jboss.resource.adapter.jdbc.xa.XAManagedConnection.start(XAManagedConnection.java:213)

Please help.

(resource type '', DLL 'vmclusres.dll') either crashed or deadlocked.

Hi

For the past few weeks we have been experiencing a strange issue with our failover cluster.

We currently have the following:

7 Nodes : Windows HyperV 2008 R2 Core Servers

SAM driven with 2 controllers, each have 2 network connections to each host.


We seem to be having an issue where random servers within the nodes are rebooting. This can happen at any time during the day, sometimes more than once a day and can easily be more than 1 host at a time or throughout the day. The actual nodes themselves do not restart and appear to be running fine with no connection loss. The servers within the nodes reboot but do not failover to another server.

When looking through the event logs in Failover Cluster Manager I see this happen every time:

Event ID: 1230
Cluster resource 'SCVMM RGVSVR031-T' (resource type '', DLL 'vmclusres.dll') either crashed or deadlocked. The Resource Hosting Subsystem (RHS) process will now attempt to terminate, and the resource will be marked to run in a separate monitor.

Event ID: 1146
The cluster resource host subsystem (RHS) stopped unexpectedly. An attempt will be made to restart it. This is usually due to a problem in a resource DLL. Please determine which resource DLL is causing the issue and report the problem to the resource vendor.

I have looked through the logs on the machines that caused the deadlock, but nothing is apparent. It never happens at a set time or day; it's completely random. The servers do come back online, but it's a pain having our systems taken out for at least 15 minutes.
It's not always the same server, but multiple servers have been logged more than once.

I'm really stuck on what to do next. Does anyone have any idea what's causing this? All nodes are fully patched with the latest Service Pack 1.

Windows 2012 R2 Cluster stays up after two of three Voters in Quorum fails

We have a five node windows 2012r2 stretched cluster, running SQL. Three nodes are on our production site and two nodes are on our DR site. We have multiple SQL instances running in production, we have SQL replication running over to our DR site, which is running multiple SQL instances too.

We have configured our three production nodes with quorum votes, the DR site does not have any votes.

So we have three quorum voters at our production site, the two DR nodes have no votes.

If two of the quorum voters fail, we would expect the cluster to fail, as the votes left (i.e. 1) are fewer than half of the three configured votes.

We have not configured any disk witness.

In a full DR scenario, we are planning to manually configure a disk witness and assign votes to the DR nodes.

We are puzzled why our cluster continues to function, when two of our three quorum voters are down. i.e. two of our production nodes go down, the third one continues to function.

The cluster validation report states "The quorum will be able to sustain failure of two nodes", I am reading "the quorum" being the nodes with votes, if this is the case, then the validation report is telling us, correctly, how the production nodes behave, when two go down, but we are still puzzled with this behavior.

Can anyone explain how the cluster stays up when two of three voters go down?
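
The expectation in this post follows from plain majority arithmetic over a fixed vote count. A minimal sketch of that arithmetic is below. It assumes a static vote total of three (the three production nodes) and no witness, as described above. It deliberately does not model Dynamic Quorum, which adjusts votes at runtime, so it shows the reasoning behind the expectation rather than what the cluster actually does. The function name is hypothetical.

```python
def has_majority(votes_online, votes_total):
    """True when strictly more than half of the configured votes are online."""
    return 2 * votes_online > votes_total


VOTES_TOTAL = 3  # three production nodes with votes, DR nodes have none

for failed in range(VOTES_TOTAL + 1):
    online = VOTES_TOTAL - failed
    state = "quorum" if has_majority(online, VOTES_TOTAL) else "no quorum"
    print(f"{failed} voter(s) down, {online} of {VOTES_TOTAL} votes online: {state}")
```

Under this static model, one voter down still leaves a majority (2 of 3), while two voters down leave 1 of 3, which is where the poster expects the cluster to stop.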

Some VM start unexpected rebooting on one node (Hyper-V cluster 2012R2)

Hi,

Hyper-V cluster environment:

JBOD->2* storage spaces server 2012 R2->…network share-> Hyper-V cluster 2012R2

Some VMs started rebooting unexpectedly on one node: 3 times in the last 2 days.

In the event logs I found these events:

Inside VM last event before crash: 2015-06-11 10:13:58
The system failed to register pointer (PTR) resource records (RRs) for network adapter
with settings: …..

Hyper-V host 2015-06-11 10:15:53
<my_vm> has encountered a fatal error.  The guest operating system reported that it failed with the following error codes: ErrorCode0: 0x7A, ErrorCode1: 0x7CC960, ErrorCode2: 0xC00000C0, ErrorCode3: 0x332D4880, ErrorCode4: 0xF992C850.  If the problem persists, contact Product Support for the guest operating system.  (Virtual machine ID ……)

Hyper-V host 2015-06-11 10:15:54
[RHS] Resource SCVMM <my_vm> IsAlive has indicated failure.

Hyper-V host 2015-06-11 10:15:54
Cluster resource 'SCVMM <my_vm>' of type 'Virtual Machine' in clustered role 'SCVMM <my_vm> Resources' failed.
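
When the same fatal-error event appears several times, pulling the five error codes out of each message makes the occurrences easier to compare. This is a small sketch that assumes the message keeps the "ErrorCodeN: 0x..." wording shown above. The function name is hypothetical, and the sample string is shortened from the event in this post.

```python
import re

# Shortened copy of the host event text from the post.
MESSAGE = (
    "The guest operating system reported that it failed with the following "
    "error codes: ErrorCode0: 0x7A, ErrorCode1: 0x7CC960, "
    "ErrorCode2: 0xC00000C0, ErrorCode3: 0x332D4880, ErrorCode4: 0xF992C850."
)


def parse_error_codes(message):
    """Map each ErrorCode index to its integer value."""
    pairs = re.findall(r"ErrorCode(\d+): (0x[0-9A-Fa-f]+)", message)
    return {int(index): int(value, 16) for index, value in pairs}


codes = parse_error_codes(MESSAGE)
for index in sorted(codes):
    print(f"ErrorCode{index} = 0x{codes[index]:X}")
```

ErrorCode0 is the value to look up first in the guest's bug check documentation; the remaining values are its parameters.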

After the reboot, the VM started on another node.

Where should I look for the cause of the problem?

Thanks

Question about DAG and half hour network outage

We have a 6-member DAG on Exchange 2010 SP3 RU9 and one hub witness server. We are planning a 30-minute complete network outage. What would be the best course of action to prevent a split-brain situation and confusion over dismounted databases in the DAG? We discussed failing over the active databases to 3 servers and stopping the Exchange services on the other 3 prior to the outage. Would this help, or would we also have to stop services on the witness server?

can not fix corrupt system files

I am doing a cluster node health check, but I cannot fix the corrupt files.

PS C:\Users\Administrator.000> Dism /Online /Cleanup-Image /RestoreHealth

Deployment Image Servicing and Management tool
Version: 6.3.9600.17031

Image Version: 6.3.9600.17031

[==========================100.0%==========================]

Error: 0x800f0906

The source files could not be downloaded.
Use the "source" option to specify the location of the files that are required to restore the feature. For more information on specifying a source location, see http://go.microsoft.com/fwlink/?LinkId=243077.

The DISM log file can be found at C:\Windows\Logs\DISM\dism.log

PS C:\Users\Administrator.000> sfc /scannow

Beginning system scan.  This process will take some time.

Beginning verification phase of system scan.
Verification 100% complete.

Windows Resource Protection found corrupt files but was unable to fix some
of them. Details are included in the CBS.Log windir\Logs\CBS\CBS.log. For
example C:\Windows\Logs\CBS\CBS.log. Note that logging is currently not
supported in offline servicing scenarios.
PS C:\Users\Administrator.000>

Please help

Hyper-V data cannot be stored on a disk witness that is not already used by another virtual machine.

I'm trying to move a VM from local to shared storage to make it highly available. The shared storage (S:) was active on the server I moved the files from. I imported it into Hyper-V and selected it as a VM when configuring it as a HA virtual machine role in Failover Cluster Manager. It failed with the following:

There was a failure configuring the virtual machine role for 'TEST'.
The path 'S:\Hyper-V' for a virtual machine configuration is on the disk witness for the cluster. Hyper-V data cannot be stored on a disk witness that is not already used by another virtual machine.

This seems to suggest that there must always be at least one VM on the shared storage before you can import another? How do you get the first VM on there?

Windows 2012 Cluster with Exchange 2013 DAG

Hi,

I have a question regarding Windows 2012 Failover clustering.

We have Windows 2012 running with Exchange 2013 DAG. 8 nodes 1 witness server 

There have been a few instances where one of the nodes lost quorum due to network issues. Whenever that happens, the cluster service goes into a restart loop (crashing). I tried changing the Cluster service to manual and then starting it, but it just keeps crashing until I restart the server; after that it works fine and the node is added back into the quorum without any issues.

My question: is it normal behavior that, if a node loses quorum, the cluster service keeps restarting until you restart the server? Or is there a way to bring that server back into the quorum without restarting the server?

clussvc.exe version 6.2.9200.21268

Error

The Cluster Service service terminated unexpectedly.  It has done this 15 time(s).  The following corrective action will be taken in 60000 milliseconds: Restart the service.

Thanks,



Raman

Issues with MSMQ over HTTP in Windows Cluster - Not Working

Background:

We have set up a cluster 'net_cluster' and configured the message queuing service 'net_clusterMsmq' in it. Please refer to the screenshot below for the cluster configuration. We have two physical servers in the cluster. The screenshot shows two virtual IP addresses, each of which points to one physical server. We have created non-transactional private queues on the physical servers.



Please see this image https://social.technet.microsoft.com/Forums/getfile/668427

net_clusterMsmq has two Virtual IPs,  VIP1 and VIP2 which point to .NET1 and .NET2 respectively. Currently .NET1 is up and running. 

VIP1 ----> PIP1 (.NET1) (PIP = Physical IP)

VIP2 ----> PIP2 (.NET2)



Please see this image https://social.technet.microsoft.com/Forums/getfile/668451

H$ - This is storage drive attached to either .NET servers whichever is active. It contains msmq\storage and msmq\mapping.



Issue

NOTE: Everything works, and has been working well for a long time, if I use OS:\net_clusterMsmq. The problem starts when I use HTTP.

We are having issues with MSMQ over HTTP in a Windows cluster environment. When the app sends a message to the queue using HTTP:

  1. The outgoing queues on the IIS (web app) server show the referenced queue in the "Waiting to Connect" state, with a "Connection is ready to transfer messages" message in the Connection History column, and the queued messages stay stuck forever.
  2. IIS logs on the .NET1 server show:
    [VIP1 here] POST /msmq/private$/queuename - 80 - [IIS - Web app server IP here] - 200 0 0 46
    This shows that the POST request from the web app IIS server was received by the .NET1 server with status 200. However, the s-ip field contains the virtual IP (VIP1) that points to the .NET1 server, because we send requests via the cluster node.


Below is what I have checked/tried so far with no luck:

  1. Checked if port 1801 is listening - Yes
  2. Modified sample_map.xml file in H$ (storage drive attached to active .NET server) as well as C:\Windows\System32\msmq\mapping and restarted MSMQ service but didn't work.
    This was done because I found a blog stating message request reaches msmq server but local queue manager does not recognize the Virtual IP (VIP1) and looks for the Physical IP (PIP1) in received message. Since it does not find it, discards the message.
  3. Added ANONYMOUS LOGON with full rights to destination queue on .NET1 server.
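
The port check in step 1 can also be run from the sending web server, so that the path through the virtual IP is exercised rather than only the local listener. The following is a minimal sketch using a plain TCP connect. The function name is hypothetical, and the host name in the usage comment is the cluster service name from this post, used only as an example.

```python
import socket


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# Example usage from the web app server (host name taken from the post):
#   is_port_open("net_clusterMsmq", 1801)
#   is_port_open("net_clusterMsmq", 80)
```

A successful connect only shows that something accepts connections on that port. Since the IIS log above shows the HTTP request arriving on port 80, it may be worth checking both ports through each virtual IP.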


NOTE: MSMQ over HTTP works fine in a non-cluster environment, so this is definitely a cluster-specific issue.




