Channel: High Availability (Clustering) forum
Viewing all 5654 articles

Network Load Balancing (NLB)

Dear Support,

I have installed the NLB feature on two Windows Server 2008 R2 servers, both configured with IIS. I have tested both servers; each is working fine and accepting requests successfully.

However, when we ran a load test, the load was not balanced equally between the servers:

Host 1: CPU utilization = 53%

Host 2: CPU utilization = 10%

Please suggest.

Regards

Naveed


Naveed Amir


Cannot connect to Cluster

Three questions. I had two physical Server 2012 R2 Hyper-V hosts (Primary and Replica) running VM replication. This was working perfectly; I even did a planned and an unplanned failover. Awesome. I then installed Failover Clustering, added the two hosts to a cluster, validated them, and created the cluster (the cluster name DAVGeoClusterTest gets truncated to DAVGeoclusterte). So far so good. It was the end of the day, so I stopped right there, at the end of the Create Cluster Wizard.

1. A day later I cannot see the cluster in Failover Cluster Manager. When I try to connect to it I get "The RPC Server is Unavailable - Exception from HRESULT: 0x800706BA". Any ideas?

2. I can only Remote Desktop into one of the two VMs at a time; the other loses connectivity. Is this to avoid split-brain problems?

3. Is there a way to totally undo/delete the cluster and go back to the simple VM replication I had going?
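
For reference, the HRESULT quoted in this post can be unpacked by hand. The sketch below assumes the standard HRESULT bit layout (failure flag in bit 31, an 11-bit facility starting at bit 16, a 16-bit code in the low word). The function name `decode_hresult` and the one-entry lookup table are illustrative only and not part of any Windows API.

```python
# Minimal sketch: split an HRESULT into its documented bit fields.
FACILITY_WIN32 = 7  # facility value used when a Win32 error is wrapped in an HRESULT

# Illustrative lookup; only the code that appears in this post is listed.
WIN32_NAMES = {1722: "RPC_S_SERVER_UNAVAILABLE"}


def decode_hresult(hresult: int) -> dict:
    """Return the failure flag, facility and code of a 32-bit HRESULT."""
    hresult &= 0xFFFFFFFF                     # normalise negative (signed) inputs
    return {
        "failure": bool(hresult >> 31),       # bit 31: 1 = failure
        "facility": (hresult >> 16) & 0x7FF,  # bits 16-26
        "code": hresult & 0xFFFF,             # bits 0-15
    }


info = decode_hresult(0x800706BA)
name = WIN32_NAMES.get(info["code"], "unknown")
print(info, name)
```

Under that layout the value is a failure in the Win32 facility with code 1722, which is consistent with the "RPC Server is Unavailable" text in the message.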

Windows Failover Cluster (Errors retrieving file shares)

I'm having an issue with a Windows failover cluster on Windows Server 2012 R2. I have two cluster nodes (nodeA and nodeB). When nodeA is the owner node and I open Failover Cluster Manager >> <clusterName> >> Roles >> <fileserver role> >> Shares tab, it hangs and says that it is loading, and this never finishes. However, when I go to nodeB (not the owner node) and open Shares, it shows me all of the shares that I have. Next, when I go to <clusterName> >> Nodes and click on the Roles tab, the information says "There were errors retrieving file shares."

Now when I switch nodeB to be the owner node, I cannot view the shares on that machine but can view them on nodeA.

We also have a test network where I have recreated the machines, the environment, and the failover cluster as closely to the production network as I can, except that everything works great in the test network.

Adding nodes to Windows Server 2008 R2 Hyper-V Cluster..

Currently we have a 3-node Windows Server 2008 R2 Hyper-V cluster in production. There are about 3 terabytes' worth of VMs running across these nodes.

It is over-committed, so I've set up two new nodes to add to the cluster.

I've done this before in a SQL cluster but never in a Hyper-V cluster.

If I don't run validation when adding the nodes, will there be downtime?

The quorum is set up for disk majority, and everything that needs to be identical is identical on all nodes. Shared storage is recognized and ready on the new nodes. I've gone through every checklist that Microsoft has. I'm just curious whether the virtual machines on the current nodes will go offline when I add the two new nodes.

Everything is identical, down to the WSUS updates installed. From networking to storage, everything is perfect.

I don't want to run validation, as I know that'll take everything offline.


WMI Permissions on Cluster Name

We have recently implemented SCOM 2012 with the SQL Management Pack.  On our SQL Clusters (SQL 2008), we have added a low level privilege account to the local administrators group on both nodes and have given read permissions to the databases in SQL (as recommended by Microsoft).   The servers are Windows Server 2008 R2 SP1.


The issue is that we were receiving a script error from getSQL2008DBFilesFreeSpace.vbs saying the script was terminated because it ran over the 300-second timeout. Upon further investigation, we found the real issue.

The low-level privilege account did not have the ability to browse WMI using the cluster name; we were receiving an access denied error. If we logged in as the low-level privilege account locally on one of the SQL cluster nodes, ran wbemtest, and connected to \\<clustername>\root\cimv2, we received an access denied error. If we connected to \\<clusternodename>\root\cimv2, it connected successfully.

The low-level privilege account is a local administrator of the server and is in the local Administrators group. If you look at the WMI permissions, the local Administrators group has full permissions on both the node name and the SQL cluster name.

To resolve this issue, we had to edit the DCOM permissions for the "My Computer" object (Launch and Activation Permissions > Edit Limits) and add the low-level privilege account explicitly with the Remote Activation permission.

My question: why do you have to add EXPLICIT permissions for an account in DCOM when the account is already a member of the local Administrators group, which HAS the Remote Activation permission defined?

I would appreciate it if anyone could tell me whether there is a patch or something I am missing that corrects this condition. I think it is silly that I have to add this account explicitly when it is already a member of a group that has the permissions defined.


Thanks!



SQL Server Instance is not coming online on windows 2008 R2 cluster

Hi,

We have implemented a two-node Windows failover cluster on which a SQL Server instance has been installed. As part of failover testing, we disabled the network adapter on the first node (the preferred owner), and the SQL instance moved to the second node without any issue. But when we re-enabled the network adapter and tried to move the SQL instance back to the first node, it failed. The status shows Online Pending for some time and then becomes Failed. We are getting cluster event log errors such as 1069 and 1205. We tried every solution we could find on the internet for these event IDs (1069 and 1205), with no luck.

We also tried the Microsoft support article "SQL Server cluster resource goes to a "failed" state when you try to bring the resource online in SQL Server" (http://support.microsoft.com/kb/883732) by checking the resource parameters in the registry, but we could not find any missing parameter there either.

So can anyone help us with this issue?

Regards,

Kiran

Understanding Quorum Disk

We've got a simple 3-node Hyper-V cluster set up, which initially passed the verify tool 100%. We created it, added the drives from the SAN, and everything works great (including planned/unplanned failovers), BUT when we now re-verify the cluster we get a quorum warning.

I understand that we can designate one of the disks on the SAN as a quorum disk, and if we do that warning goes away, but if we do that does that mean we cannot use that disk for actual storage of VMs?
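
As background for the question above, quorum comes down to vote counting. The sketch below assumes the simple static model in which every node and the witness (if present) holds one vote, and the cluster keeps running while more than half of all votes are reachable. `quorum_summary` is a hypothetical helper, and the model deliberately ignores features such as dynamic quorum in later Windows Server versions.

```python
# Minimal sketch of static majority-vote arithmetic (one vote per node/witness).
def quorum_summary(nodes: int, witness: bool = False) -> tuple[int, int, int]:
    """Return (total votes, votes needed for quorum, vote losses tolerated)."""
    votes = nodes + (1 if witness else 0)
    needed = votes // 2 + 1          # strict majority of all votes
    return votes, needed, votes - needed


for label, nodes, witness in [
    ("3 nodes, no witness", 3, False),
    ("3 nodes + disk witness", 3, True),
]:
    votes, needed, tolerated = quorum_summary(nodes, witness)
    print(f"{label}: {votes} votes, {needed} needed, {tolerated} loss(es) tolerated")
```

Under this model a 3-node cluster tolerates the loss of one vote with or without a disk witness, because the fourth vote also raises the majority threshold from 2 to 3.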

Clustered VM

Hi!

Is it possible to have a highly available VM also clustered so that if the VM itself goes down the clustered copy comes online?

Thanks.


creating cluster log file with C#

I can't find any help on the internet for creating a Windows 2008 cluster log via C#.

Can someone help with samples?

One cloud server doesn't fail over (Event ID 1205 and 1069); unable to move services and applications to another node

Anyone, please respond.

I've been working on creating a Windows Server 2008 R2 cluster for about a month now, and I keep getting an error whenever I try to add a second node: "The cluster node is not reachable." However, when I validate the configuration, everything passes successfully. I created and destroyed the cluster a couple of times, but no luck. I even reinstalled Windows Server 2008 R2 on both servers and reconfigured iSCSI, and the same warning keeps coming up.

I installed a new SQL Server 2008 R2 named instance on an existing SQL Server cluster (Node1 and Node2). When I try to manually fail over the services and applications to another node (Node1 to Node2) using Failover Cluster Manager, I get errors.

The cluster has these Events:

Event ID 1069: Cluster resource 'IP Address xx.xx.xx.xx' in clustered service or application 'ClusterDtc' failed.

Event ID 1205: The Cluster service failed to bring clustered service or application 'ClusterDtc' completely online or offline. One or more resources may be in a failed state. This may impact the availability of the clustered service or application.

Why doesn't the cloud server move to node2? You can see the screenshots below.

[Screenshots missing. Surviving caption: System Configuration > Warning > Validate All Drivers Signed]

After creating the cluster I got the errors below.

Any info you might know would be really helpful.


SAN Zoning for a Hyper-V Cluster

Best Practice for an FC SAN Zone has always been one Initiator Port and one Target Port per Zone. This ensures that communication between the Server and Storage remains "clean" and also makes Troubleshooting easier.

If you're lazy and do not feel like making so many Zones, you can also have a Zone with one Initiator Port and multiple Target Ports, if your Storage has multiple Controllers/Ports. Of course, when you have multiple Fabrics, you should not mix the two together.

That being said, I was recently in a MOC Hyper-V + VMM Course and the Instructor said that the Microsoft Best Practice for a Hyper-V Cluster is to put one Initiator Port from each Cluster Node (multiple Initiator Ports) together with multiple Target Ports all in one Zone. The reason, he said, was that the Cluster reacts faster in case of a Failover. I have tried to find Microsoft Documentation that supports this statement but have not been able to find any.

Can anyone point me to any Documentation that backs up this "Best Practice" Statement, or tell me whether you are doing the same thing in your Organisation?
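
To make the three zoning schemes described so far concrete, here is a minimal sketch that enumerates the zones each one produces for a single fabric. The port names (`node1_hba0`, `ctrlA_p0`, and so on) and the function names are made up for illustration; real zones would use WWPNs or aliases, and nothing here reflects vendor or Microsoft guidance.

```python
from itertools import product


def zones_one_to_one(initiators, targets):
    """One initiator port and one target port per zone."""
    return [(i, t) for i, t in product(initiators, targets)]


def zones_one_to_many(initiators, targets):
    """One initiator port plus all target ports per zone."""
    return [(i, *targets) for i in initiators]


def zone_all_in_one(initiators, targets):
    """Every node's initiator port and all target ports in a single zone."""
    return [(*initiators, *targets)]


# Hypothetical 3-node cluster, one HBA port per node, two storage ports.
initiators = ["node1_hba0", "node2_hba0", "node3_hba0"]
targets = ["ctrlA_p0", "ctrlB_p0"]

for name, build in [
    ("one-to-one", zones_one_to_one),
    ("one-to-many", zones_one_to_many),
    ("all-in-one", zone_all_in_one),
]:
    zones = build(initiators, targets)
    print(f"{name}: {len(zones)} zone(s), {len(zones[0])} members each")
```

For this example the schemes yield 6, 3, and 1 zones respectively. The trade-off is fewer zones to maintain against more ports that can see each other within a zone.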

 Thanks in advance!

Paolo

UPS Shutdown of cluster

We're setting up a simple Hyper-V cluster, just 3 nodes, and it occurred to me to give some thought to unattended shutdown during an extended power failure.

If I understand the workings of the cluster correctly, simply having each host independently shut down at 50% battery (or whatever) is not going to produce a graceful shutdown of the cluster itself.

Is there any common way to link the shutdown signal from the UPS software to the cluster shutdown command (and then subsequently shut down the machine itself)?

How to configure a separate network for heartbeat for AlwaysOn clustering

I need to configure a separate network for heartbeat communication (Server 2012, SQL 2012, file share witness). How can I do this?

Thank you for your help.

Nics for a Hyper-V cluster

Hello!

http://alexappleton.net/post/44748523400/step-by-step-configuration-of-2-node-hyper-v-cluster-in

Each server has a total of 8 NICs and they will be used for the following:

1 – Dedicated for management of the nodes, and heartbeat
1 – Dedicated for Hyper-V live migration
2 – To connect to the shared storage appliance directly
4 – For virtual machine network connections

Please tell me whether I'm right or not:

1) "1 – Dedicated for management of the nodes, and heartbeat" is not good: it's better to have one NIC dedicated to heartbeat and a separate NIC dedicated to management.

2) I can use the same NIC(s) for node management and for virtual machine network connections.

Thank you in advance,

Michael

SQL Server Failover Clustering error

Hi All,

I am currently setting up a 2-node SQL Server cluster, but I am getting an error when doing the failover test from Node2 to Node1.

Here is a quick overview of what I have so far.

1. Set up the failover cluster prerequisites for both nodes: public and private networks, cluster disks for quorum, MSDTC and SQL, etc.

2. Ran configuration validation before creating the cluster. The validation report completed successfully with no errors/warnings.

3. Created the cluster, created the MSDTC cluster, and installed SQL Server on both nodes.

Now I am running failover tests to check whether cluster resources will fail over from Node1 to Node2 and from Node2 to Node1.

Failover Test: Active Node is Node1.

1. Disable Public network on Node1.

2. Failover to Node2 -> successful

3. Enable Public network on Node1.

Problem:

After the failover to Node2, I tried to fail back the resources from Node2 to Node1 by disabling the public network on Node2 (which is the active node after the failover from Node1 to Node2), but the cluster resources won't fail back to Node1.

Failback from Node2 to Node1 -> failed

1. Disable Public network on Node2.

2. Failback to Node1 -> failed

                  - Cluster Name and Cluster IP ->failed

                  - SQL cluster group (SQL name, SQL IP address, Analysis, SQL server and SQL Server Agent) ->failed

                  -MSDTC cluster group -> failed back successfully to Node1

3. Enable Public network on Node 2.

4. Manually bring the Cluster Group and SQL cluster group online

I tried to manually bring the Cluster Group and SQL cluster group online, but they CANNOT come online unless I enable the public network on Node2. I have checked the cluster event log and I am getting event ID 1077, 1069, and 1205 errors.

Here are some of the cluster event log entries.

Event ID 1205: The Cluster service failed to bring clustered service or application 'SQL_Group' completely online or offline. One or more resources may be in a failed state. This may impact the availability of the clustered service or application.

Event ID 1069: Cluster resource 'SQL IP Address 2 (db-vip)' in clustered service or application 'SQL_Group' failed.

Has anyone experienced the same issue before? I would appreciate it if someone could point me in the right direction to resolve the issue.

Thanks in advance for your feedback.

BTW, failover and failback works perfectly when I try to reboot the Active node. Resources failed over successfully from Node1 to Node2 and vice versa when I reboot the server.

Thanks again.

Regards,

Ivan


Meaning and default value of WMI MSCluster_Resource class property ResourceClass

I cannot find a clear explanation of the ResourceClass property of the MSCluster_Resource WMI class. Does anybody know its meaning and its default value?

Guntars Svilpe

File Share Witness Resource Errors in a SQL 2012 AlwaysOn Availability Group Environment

Hi, I am getting the following errors in WFC Manager and in my system event log:

Event ID 1564:

File share witness resource 'File Share Witness' failed to arbitrate for the file share '\\SQL2012ClusterWitnessPath'. Please ensure that file share '\\SQL2012ClusterWitnessPath' exists and is accessible by the cluster.

Event ID 1069: 

Cluster resource 'File Share Witness' of type 'File Share Witness' in clustered role 'Cluster Group' failed.

Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it.  Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet.

Event ID 1205:

The Cluster service failed to bring clustered service or application 'Cluster Group' completely online or offline. One or more resources may be in a failed state. This may impact the availability of the clustered service or application.

These errors showed up every hour on the hour and then suddenly stopped.  I tried looking at the cluster.log file but there wasn't anything recorded there.  The file share witness shows to be online and my AG did not fail over to another node.  The cluster has read and write permissions to the share.  I did not find any error messages about the witness share on the remote server. 

I am wondering what caused this series of events to occur.

Thanks.

Cluster Shared Volume (CSV) not failing over on Hyper-V 2-node Cluster

Hi,

We have a 2-node Hyper-V cluster in testing and I'm having a problem with the disks on it. Both nodes have 1 storage HBA each, which connects to a SAN over 2 switches (the nodes stay connected as long as 1 switch is running). On the SAN I have a volume configured as a CSV that contains all VM info and VHDs. The quorum volume is also on the SAN.

In Failover Cluster Manager under Storage > Disks I have the Hyper-V volume, and it's assigned to Cluster Shared Volumes. It is set to owner node 1 in this instance.

Now whenever I simulate an HBA failure on Node 1 (I just remove the cables), the Hyper-V CSV goes offline and stays offline, taking the VM roles with it. Is it not supposed to fail over to Node 2 and redirect access via Node 2 to Node 1? Should it be in redirected access? I also have the owner node of the Disk Witness set to Node 2, and when I "fail" the HBA on Node 2 the quorum also goes offline. My quorum is set to Node and Disk Majority; should it be Node and File Share Majority for a SAN volume?

I appreciate all the help in advance.

EDIT: Just some more info: when the cluster fails to bring the CSV back online after the HBA fails and I try to move it to Node 2, it gives me an error: Move didn't complete can't find the file specified.

An error occured attempting to read properties for 'Cluster Group' group. The remote procedure call failed. Error ID:1726 (000006be).

Hi All,

I have a two-node 2003 cluster. When either cluster node holding resources goes down, the resources do not fail over to the running node.
On the running node I receive this pop-up error message:
"An error occured attempting to read properties for 'Cluster Group' group. The remote procedure call failed. Error ID:1726 (000006be)."

After I click OK on the pop-up, the resources come online on the running node. If I don't click OK on the pop-up, the Cluadmin screen stops responding and the resources do not come online.

In the cluster log I see the messages below related to Error ID 1726:

00000874.00000b10::2014/03/17-23:38:58.276 WARN [EVT] EvtBroadcaster: EvPropEvents for node 2 failed. status 1726
00000874.00000b10::2014/03/17-23:38:58.276 INFO [NM] RpcExtErrorInfo: ProcessId= 2164
00000874.00000b10::2014/03/17-23:38:58.276 INFO [NM] RpcExtErrorInfo: SystemTime= 3/17/2014 23:38:58:276
00000874.00000b10::2014/03/17-23:38:58.276 INFO [NM] RpcExtErrorInfo: GeneratingComponent= 2
00000874.00000b10::2014/03/17-23:38:58.276 INFO [NM] RpcExtErrorInfo: Status= 0xc002100b
00000874.00000b10::2014/03/17-23:38:58.276 INFO [NM] RpcExtErrorInfo: Detection Location= 641
00000874.00000b10::2014/03/17-23:38:58.276 INFO [NM] RpcExtErrorInfo: Flags= 0x0
00000874.00000b10::2014/03/17-23:38:58.276 INFO [NM] RpcExtErrorInfo: Number of Parameters= 2
00000874.00000b10::2014/03/17-23:38:58.276 INFO [NM] RpcExtErrorInfo: Long Val= 32000
00000874.00000b10::2014/03/17-23:38:58.276 INFO [NM] RpcExtErrorInfo: Long Val= 32000
00000874.00000b10::2014/03/17-23:38:58.276 INFO [NM] RpcExtErrorInfo: ProcessId= 2164
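
Entries like the ones above are easier to filter once they are split into fields. This is a minimal sketch, assuming each line follows the layout visible in this excerpt (hexadecimal process ID and thread ID, `::`, timestamp, level, bracketed component, message). The regular expression and the `parse_line` helper are illustrative and not an official cluster log schema.

```python
import re
from datetime import datetime

# Layout assumed from the excerpt: pid.tid::timestamp LEVEL [COMPONENT] message
LINE_RE = re.compile(
    r"^(?P<pid>[0-9a-fA-F]{8})\.(?P<tid>[0-9a-fA-F]{8})::"
    r"(?P<ts>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+\[(?P<component>[^\]]+)\]\s+(?P<message>.*)$"
)


def parse_line(line: str):
    """Parse one log line into a dict, or return None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["pid"] = int(record["pid"], 16)   # IDs are written in hex
    record["tid"] = int(record["tid"], 16)
    record["ts"] = datetime.strptime(record["ts"], "%Y/%m/%d-%H:%M:%S.%f")
    return record


sample = [
    "00000874.00000b10::2014/03/17-23:38:58.276 WARN [EVT] EvtBroadcaster: EvPropEvents for node 2 failed. status 1726",
    "00000874.00000b10::2014/03/17-23:38:58.276 INFO [NM] RpcExtErrorInfo: Status= 0xc002100b",
]
records = [parse_line(line) for line in sample]
warnings = [r for r in records if r and r["level"] == "WARN"]
print(warnings[0]["component"], warnings[0]["pid"], warnings[0]["message"])
```

The first field decodes to process ID 2164 (hex 874), which agrees with the `ProcessId= 2164` value the log itself reports, so the field split looks plausible for this excerpt.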

How can I fix this issue?

Regards,
Stunner.


Cluster network name resource 'Cluster Name' failed to update the DNS record

Hi,

I have created a two-node cluster in my environment. After I created the cluster, it reported the following warning and error:

Warning

Event ID – 1579

Cluster network name resource 'Cluster Name' failed to update the DNS record for name 'MyCluster' over adapter 'PRO'. The error code was 'DNS bad key. (9017)'. Ensure that a DNS server is accessible from this cluster node and contact your DNS server administrator to verify the cluster identity can update the DNS record 'MyCluster'.

Error

Event ID – 1196

Cluster network name resource 'Cluster Name' failed registration of one or more associated DNS name(s) for the following reason:

DNS bad key.

Both the cluster node servers are running Windows 2008 R2 Enterprise, Service pack 1

My environment has a third-party DNS setup. After some searching I found out that this problem can occur when a third-party server application is used for DNS resolution, but I could not apply the fix. It reports "This hot fix is not applicable to your system". Maybe I already have the fix.

Does anybody know how to fix this issue?

Thanks in advance.
