Channel: High Availability (Clustering) forum
Viewing all 5654 articles

Cluster Validation / UDP 3343


Hello,

I'm running into a problem running Cluster Validation (2012 R2).

I'm getting a network communications error stating that each node is unable to contact the other over UDP3343.

The only advice I've found online instructs me to open UDP 3343 on the firewall (or disable the firewall altogether).

All firewalls are disabled, all other IP communications are fine.

Netstat shows that UDP 3343 is not listening, and an Nmap scan shows the port is not open.

Why would UDP3343 not be listening?
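
For anyone who wants to repeat the netstat check from a script, here is a minimal sketch. It only tests whether some process on the local machine already holds a UDP port, by trying to bind to it. The function name `udp_port_is_bound` is made up for illustration, and a failed bind can have causes other than "port in use", so treat the result as a hint rather than proof.

```python
import socket


def udp_port_is_bound(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if binding a UDP socket to (host, port) fails.

    A failed bind usually means another process already holds the port.
    Hypothetical helper for illustration, not part of any cluster tooling.
    """
    probe = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        probe.bind((host, port))
    except OSError:
        # Most commonly "address already in use".
        return True
    else:
        return False
    finally:
        probe.close()


# 3343 is the port named in the validation error above.
print("UDP 3343 already bound locally:", udp_port_is_bound(3343))
```

Run it on each node. If it reports False, nothing is bound to the port on that node, which matches what netstat shows.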

Thank you!

-Ryan


-Ryan Biddle-


WMI Permissions on Cluster Name


We have recently implemented SCOM 2012 with the SQL Management Pack. On our SQL clusters (SQL 2008), we have added a low-privilege account to the local Administrators group on both nodes and have given it read permissions to the databases in SQL (as recommended by Microsoft). The servers are Windows Server 2008 R2 SP1.


The issue is that we were receiving a script error from getSQL2008DBFilesFreeSpace.vbs saying the script was terminated because it ran past the 300-second timeout. Upon further investigation, we found the real issue.

The low-privilege account could not browse WMI using the cluster name; we received an access denied error. If you log in locally as the low-privilege account on one of the SQL cluster nodes, run wbemtest, and connect to \\<clustername>\root\cimv2, you receive an access denied error. If you connect to \\<clusternodename>\root\cimv2, it connects successfully.


The low-privilege account is in the local Administrators group on the server. If you look at the WMI permissions, the local Administrators group has full permissions on both the node name and the SQL cluster name.


To resolve this issue, we had to edit the DCOM permissions for the "My Computer" object (Launch and Activation Permissions, Edit Limits) and add the low-privilege account explicitly with the Remote Activation permission.


My question: why do you have to add EXPLICIT permissions for an account in DCOM when the account is already a member of the local Administrators group, which HAS the Remote Activation permission defined?


I would appreciate it if anyone can tell me whether there is a patch or something else I am missing that corrects this condition. It seems silly to have to add this account explicitly when it is already a member of a group that has the permissions defined.


Thanks!



Hyper-V Cluster Name offline


We have a 2012 Hyper-V cluster that isn't online and we can't migrate VMs to the other Hyper-V host.  We see event errors in the Failover Cluster Manager:

The description for Event ID 1069 from source Microsoft-Windows-FailoverClustering cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.

If the event originated on another computer, the display information had to be saved with the event.

The following information was included with the event:

Cluster Name

Cluster Group

Network Name

The description for Event ID 1254 from source Microsoft-Windows-FailoverClustering cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.

If the event originated on another computer, the display information had to be saved with the event.

The following information was included with the event:

Cluster Group

The description for Event ID 1155 from source Microsoft-Windows-FailoverClustering cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.

If the event originated on another computer, the display information had to be saved with the event.

The following information was included with the event:

ACMAIL

3604536

Any help or info is appreciated.

Thank you!

Missing disk on a Hyper-V cluster

Hyper-V 2012 cluster with two nodes, on HP servers with a 3PAR disk system.
When I move or delete large (2 TB) disks, the server I'm working on loses contact with all SAN disks. A restart is required before the disks come online again. This sounds like a bug, doesn't it?

File Server role, changing settings on shares throws an error


Using Windows 2012R2 with Failover Cluster Manager, I am attempting to set up a File Server role.  So far I have been able to get the computer account created and bring the role online.  From there, I click the "Shares" tab at the bottom of the Failover Cluster Manager, then right-click any share listed and choose "Properties".  A new window appears with three tabs on the left labeled "General", "Permissions", and "Settings".  I click "Settings" and am presented with the following checkboxes:

  • Enable access-based enumeration
  • Enable continuous availability
  • Allow caching of share (checked by default)
  • Encrypt data access

At this point if I check or uncheck any options, when I apply them a red banner across the top gives me the message:

"The specific object cannot be updated either because the server is not available or because the required service is not present on the server."

I then moved the role to the other node and tried this again.  This time, I got a different error:

"Error occurred while updating an SMB share: The requested operation is not supported. The requested operation is not supported." (yes, it is repeated twice)

The event viewer doesn't have any entries.  I searched online for this error message, but it turned up nothing.  Any ideas what might be going on?



Cluster service not starting after reboot - The system cannot find the file specified


Hello,

I will describe the current situation, the lab and the cluster log.
I'm looking forward to any hint to get the cluster running again.

THE SITUATION

I am working on a cluster resource dll, porting it from Server 2003 to Server 2012R2.
After a reboot of my lab, the cluster service no longer starts on either node. The error messages in the event log and cluster log say that the system cannot find the file specified, but do not say which file.
I can access the file share witness from both nodes. The machines can ping each other.

Before the reboot, all cluster resources could be brought online and moved between the nodes.


THE LAB

The Lab has been set up by a colleague and consists of three VMs running in Virtual Box on Windows 7 Enterprise 64 Bit:
 1. Server 2008R2 as Domain Controller and provider of the file share witness
 2. Server 2012R2 as Node 1
 3. Server 2012R2 as Node 2
 
THE LOG

The last lines of the cluster log, containing the error messages, are below.
The line with the first occurrence of 'file not found' (code 2) contains ClRtlOpenFileEx, but an internet search gives no results at all for this function.

00000bd4.00000ab0::2014/08/08-08:50:31.504 ERR   mscs::ReservationAgent::UpdateClusDiskMembership: (87)' because of 'DeviceIoControl failed'
00000bd4.00000ab0::2014/08/08-08:50:31.504 ERR   [DCM] StartRdrCsvInstance: instance \Device\SmbCsv, status 5
00000bd4.00000ab0::2014/08/08-08:50:31.504 ERR   [CORE] Node 1: exception caught (2)' because of 'ClRtlOpenFileEx(&hControlRdr_, NULL, (LPWSTR)RdrDeviceName.c_str(), SYNCHRONIZE, FILE_SHARE_VALID_FLAGS, FILE_SYNCHRONOUS_IO_NONALERT)'
00000bd4.00000ab0::2014/08/08-08:50:31.504 ERR   Exception in the PostForm is fatal (status =2)
00000bd4.00000ab0::2014/08/08-08:50:31.504 ERR   Exception in the PostForm is fatal (status =2), executing OnStop
00000bd4.00000ab0::2014/08/08-08:50:31.504 INFO  [DM]: Shutting down, so unloading the cluster database.
00000bd4.00000ab0::2014/08/08-08:50:31.504 INFO  [DM] Shutting down, so unloading the cluster database (waitForLock: false).
00000bd4.00000ab0::2014/08/08-08:50:31.566 ERR   FatalError is Calling Exit Process.
00000af0.00000928::2014/08/08-08:50:31.566 WARN  [RHS] Cluster service has terminated. Cluster.Service.Running.Event got signaled.
00000b38.00000b20::2014/08/08-08:50:31.566 WARN  [RHS] Cluster service has terminated. Cluster.Service.Running.Event got signaled.
00000b38.00000b20::2014/08/08-08:50:31.566 INFO  [RHS] Exiting.
00000af0.00000928::2014/08/08-08:50:31.582 INFO  [RHS] Exiting.
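
To make the error sequence in a log like this easier to read, here is a minimal sketch that pulls out the ERR lines and their numeric status values. It assumes the line layout shown above (process.thread::timestamp LEVEL message) and assumes the numbers are Win32 error codes. The helper names and the small code-to-name table are illustrative only, not part of any cluster tooling.

```python
import re

# Layout assumed from the excerpt above: pid.tid::timestamp LEVEL message
LINE_RE = re.compile(
    r"^(?P<pid>[0-9a-f]{8})\.(?P<tid>[0-9a-f]{8})::"
    r"(?P<timestamp>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+(?P<message>.*)$"
)
# Matches "(87)", "status 5" and "status =2" as they appear in the excerpt.
CODE_RE = re.compile(r"\((\d+)\)|status\s*=?\s*(\d+)")

# Standard Win32 error names for the three codes seen in the excerpt.
WIN32_NAMES = {
    2: "ERROR_FILE_NOT_FOUND",
    5: "ERROR_ACCESS_DENIED",
    87: "ERROR_INVALID_PARAMETER",
}


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def error_codes(lines):
    """Yield (timestamp, code, name) for each ERR line that carries a numeric status."""
    for line in lines:
        entry = parse_line(line)
        if entry is None or entry["level"] != "ERR":
            continue
        found = CODE_RE.search(entry["message"])
        if found:
            code = int(found.group(1) or found.group(2))
            yield entry["timestamp"], code, WIN32_NAMES.get(code, "unknown")


SAMPLE = r"""
00000bd4.00000ab0::2014/08/08-08:50:31.504 ERR   mscs::ReservationAgent::UpdateClusDiskMembership: (87)' because of 'DeviceIoControl failed'
00000bd4.00000ab0::2014/08/08-08:50:31.504 ERR   [DCM] StartRdrCsvInstance: instance \Device\SmbCsv, status 5
00000bd4.00000ab0::2014/08/08-08:50:31.504 ERR   [CORE] Node 1: exception caught (2)' because of 'ClRtlOpenFileEx(&hControlRdr_, NULL, (LPWSTR)RdrDeviceName.c_str(), SYNCHRONIZE, FILE_SHARE_VALID_FLAGS, FILE_SYNCHRONOUS_IO_NONALERT)'
00000bd4.00000ab0::2014/08/08-08:50:31.504 ERR   Exception in the PostForm is fatal (status =2)
00000bd4.00000ab0::2014/08/08-08:50:31.504 INFO  [DM]: Shutting down, so unloading the cluster database.
"""

for timestamp, code, name in error_codes(SAMPLE.splitlines()):
    print(timestamp, code, name)
```

Read this way, the excerpt shows a status 87 and a status 5 on the \Device\SmbCsv instance immediately before the fatal code 2, so the missing "file" may be that device rather than a file on disk. That is only a reading of the log, not a confirmed diagnosis.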

Thank you for reading and supporting.

SchLois



Migrate SOFS Role between 2012 R2 Clusters


I have a SOFS role configured on a 2012R2 3-node cluster, and I need to move it to a different cluster so that the nodes in the current cluster can be rebuilt. Both clusters use a CSV. The share used by the SOFS sits on a CSV (connected by iSCSI on each node).

I'm trying to understand the advice here:

http://technet.microsoft.com/en-us/library/dn659431.aspx

Method 1 is confusing because it discusses migrating virtual machine storage with VMM, but I don't see how this relates to a SOFS.

Method 2 seems most applicable. However, when I use the Copy Cluster Roles Wizard and select the source cluster, it brings up a list of roles available to be migrated, and I cannot select the single SOFS I want to move. If I check one box, it automatically checks the box for every role on the cluster, and I don't want to move everything.

Also, I'm confused about the process of the migration and what exactly migrates once I get past the first issue above.

Should I run this wizard to migrate the role to the new cluster, then connect the iSCSI LUN to the 3 destination nodes, mount it as a CSV, and then disconnect the CSV from the source cluster? Or the other way around? Or something else?

I can't find a good guide for this. Any advice or pointers in the right direction would be much appreciated.

tks

Windows Server 2012 R2 Multi-Node Failover Cluster ( more than 2 nodes)


Hi ,

I have set up an 8-node failover cluster with a quorum configuration of Node and Disk Majority (recommended for clusters with an even number of nodes).

I have 2 questions regarding my cluster's behavior:

1. When will the cluster go down completely?

I powered off 4 of my Hyper-V hosts, leaving 4 active nodes. My cluster service was still running and I was still able to perform live migration.

I continued to power off servers until only 3 were running, and I manually took the cluster offline and brought it online again with no error.

I then powered off all my Hyper-V hosts, and this time my cluster and VMs went down completely.

I assume some timer or token was still active and kept the cluster up. Roughly how long does it take before we are alerted that the entire cluster will go down, even though we have set the quorum to Node and Disk Majority?

2. How many nodes need to be powered on for my cluster service to keep running?

I powered up my first Hyper-V node and could not restart my cluster.

Yet once I powered up my second Hyper-V node, my cluster service was up and running, and I could continue running my VMs and live migration.

How many nodes need to be powered on for my cluster service to keep running? And where does the statement that a six-node cluster with a failed disk witness could sustain two (3-1=2) node failures fit in?
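
The vote counting behind these questions can be sketched as follows. This is a minimal illustration of the static majority rule only, assuming one vote per node plus one vote for the disk witness. The function names are made up for illustration. Windows Server 2012 R2 can also adjust votes as nodes leave (dynamic quorum), so the behavior observed above may differ from this static calculation.

```python
def votes_needed(total_votes: int) -> int:
    """Smallest number of votes that forms a strict majority."""
    return total_votes // 2 + 1


def has_quorum(nodes_up: int, total_nodes: int, witness_online: bool = True) -> bool:
    """Static Node and Disk Majority check (assumes one vote per node plus one witness vote)."""
    total_votes = total_nodes + 1  # every configured node plus the disk witness
    votes_present = nodes_up + (1 if witness_online else 0)
    return votes_present >= votes_needed(total_votes)


# The 8-node cluster from the post, with the disk witness online.
for nodes_up in range(8, 0, -1):
    print(nodes_up, "nodes up ->", has_quorum(nodes_up, total_nodes=8))

# The six-node example: the disk witness has failed and two nodes are down.
print(has_quorum(4, total_nodes=6, witness_online=False))
```

Under this static rule, the 8-node cluster with its witness has 9 votes and needs 5, so it would lose quorum once fewer than 4 nodes plus the witness remain. The six-node example has 7 votes and needs 4, which is why it tolerates two node failures once the witness is gone.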



Ong


Create Cluster - install Cluster Disk 1 on LUN 1 disk (why not on LUN 0 disk) ?


hi,

The storage vendor set up the arrays and LUNs on the storage.

On the Windows 2008 Enterprise hosts we see these LUNs as hard drives.

In Device Manager, under Disk drives, they are sorted as follows:

Disk Drive LUN 8
Disk Drive LUN 1
Disk Drive LUN 9
Disk Drive LUN 7
Disk Drive LUN 0
Disk Drive LUN 10

 

On one host we brought the first two disks online, initialized them, and formatted them.

LUN 0 disk - we prepared as Quorum, Q:
LUN 1 disk - we prepared as MSDTC, T:

On the other host we scanned for changes, only brought the disks online,
and changed the drive letters.

 

Then we installed Failover Clustering on host 1 and then on host 2.

Cluster Validation was OK on both hosts.

Create Cluster was also OK, BUT afterwards in Failover Cluster Management
we have this situation:

Cluster Disk 1, Volume:(R)

Cluster Disk 2, Volume:(U)


At first we thought that we only needed to change the drive letters to Q and T,
as we did at the beginning.

But Cluster Disk 1, Quorum (R), was installed on disk LUN 1
and
Cluster Disk 2, MSDTC (U), was installed on disk LUN 0.


WHY did Create Cluster "choose" LUN 1 for Cluster Disk 1?

 


I checked the disk signatures:

C:\>cluster res "Cluster disk 1" /priv

Listing private properties for 'Cluster disk 1':

T  Resource             Name                           Value
-- -------------------- ------------------------------ -----------------------
D  Cluster disk 1       DiskIdType                     0 (0x0)
D  Cluster disk 1       DiskSignature                  3699015254 (0xdc7a7e56)
S  Cluster disk 1       DiskIdGuid
D  Cluster disk 1       DiskRunChkDsk                  0 (0x0)
B  Cluster disk 1       DiskUniqueIds                  10 00 00 00 ... (128 bytes)
B  Cluster disk 1       DiskVolumeInfo                 01 00 00 00 ... (48 bytes)
D  Cluster disk 1       DiskArbInterval                3 (0x3)
S  Cluster disk 1       DiskPath
D  Cluster disk 1       DiskReload                     0 (0x0)
D  Cluster disk 1       MaintenanceMode                0 (0x0)
D  Cluster disk 1       MaxIoLatency                   1000 (0x3e8)


C:\>cluster res "Cluster disk 2" /priv

Listing private properties for 'Cluster disk 2':

T  Resource             Name                           Value
-- -------------------- ------------------------------ -----------------------
D  Cluster disk 2       DiskIdType                     0 (0x0)
D  Cluster disk 2       DiskSignature                  3699015246 (0xdc7a7e4e)
S  Cluster disk 2       DiskIdGuid
D  Cluster disk 2       DiskRunChkDsk                  0 (0x0)
B  Cluster disk 2       DiskUniqueIds                  10 00 00 00 ... (128 bytes)
B  Cluster disk 2       DiskVolumeInfo                 01 00 00 00 ... (48 bytes)
D  Cluster disk 2       DiskArbInterval                3 (0x3)
S  Cluster disk 2       DiskPath
D  Cluster disk 2       DiskReload                     0 (0x0)
D  Cluster disk 2       MaintenanceMode                0 (0x0)
D  Cluster disk 2       MaxIoLatency                   1000 (0x3e8)

Cluster disk 2, on LUN 0, has the "lower" signature, so I thought it should be chosen first during Create Cluster. Or am I wrong?
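
To compare the signatures without reading them by eye, here is a small sketch that extracts the DiskSignature rows from a listing in the format shown above. It assumes the columns are separated by two or more spaces, as in that output, and the function name is made up for illustration. It only compares the values; it does not assume that signature order decides how Create Cluster numbers the disks.

```python
import re


def disk_signatures(listing: str) -> dict:
    """Map resource name to DiskSignature (as int) from a '/priv' style listing."""
    signatures = {}
    for line in listing.splitlines():
        # Columns: type letter, resource name, property name, value.
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) == 4 and fields[2] == "DiskSignature":
            signatures[fields[1]] = int(fields[3].split()[0])
    return signatures


SAMPLE = """
D  Cluster disk 1       DiskIdType                     0 (0x0)
D  Cluster disk 1       DiskSignature                  3699015254 (0xdc7a7e56)
S  Cluster disk 1       DiskIdGuid
D  Cluster disk 2       DiskIdType                     0 (0x0)
D  Cluster disk 2       DiskSignature                  3699015246 (0xdc7a7e4e)
S  Cluster disk 2       DiskIdGuid
"""

found = disk_signatures(SAMPLE)
for name, signature in sorted(found.items(), key=lambda item: item[1]):
    print(f"{name}: {signature} (0x{signature:08x})")
```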

We planned to put the Quorum on LUN 0, but now LUN 0 is Cluster Disk 2.

I am afraid that this could affect cluster "stability" ...


Where did we make a mistake?


There are some ideas to remove Cluster Disk 1 from Cluster Resources, but if anyone has good advice, links, or information about our situation, I would be grateful :).


best regards,

thank you for your time,


Keli

Automating a failover cluster using a batch file and PowerShell: will the names [Cluster Disk 1 and Cluster Disk 2] ever change?


hi,

I am trying to automate the whole failover cluster setup using a set of PowerShell commands in a single batch file plus a properties file. My problem is this: after I create a cluster, I have 2 available disks named Cluster Disk 1 and Cluster Disk 2. When I put these names in my properties file, I am afraid they may change in some cases. I also don't know which one will be used for the quorum and which one for the shared disk. Can you please confirm whether the names remain the same every time, so that I can use the same names in my properties file?

I would also like to retrieve these cluster resources with a command and have the script write them into the properties file by itself, but that does not seem possible. Kindly confirm. I used Get-ClusterResource, but it returns all the resources; I only want Cluster Disk 1 and 2. Is there a command for that?



Validate cluster fails


Hi

I installed a SQL Server failover cluster successfully. When I validated the cluster at installation time, it did not give me any errors; validation succeeded.

Now when I click 'validate this cluster' again, it gives the error 'Failed to connect to the service manager on 'NODE 2'.'

The firewall is off on all servers.

Why does it give me this error? Please give me suggestions.

Failover cluster VMs all grouped together under one server?


When I add VM servers to high availability using the wizard in Failover Cluster Manager, they are all added under the first server that I add instead of appearing individually as roles in the top section (see screenshot). Ignore "partially running"; one VM is currently in an off state.

Screenshot
http://i57.tinypic.com/2wgt6p0.jpg

From what I see of other sample setups, this isn't normal: there the VMs appear individually under Roles. We also can't migrate individual servers to another node.

 

The setup is 2 servers with 1 SAN CSV, successfully configured as a failover cluster.


NLB Web Front End (WFE) servers & clustering the IIS Application Pool between the two WFEs


Hi All,

I have just had a very bad experience in my production environment. My topology is as follows: NLB WFEs, MOSS APP server, clustered DBs, DCs.

NLB on the WFEs is working fine. However, for some unknown reason, one of my sites under the Application Pool in IIS Manager was stopped on one of the WFEs and the site was down. NLB did not redirect requests to the second WFE server.

My question is: how can I make sites under the Application Pool highly available? If any site is in a stopped state, there should be a mechanism in place to redirect requests to the Application Pool on the second NLB WFE.

Kindly help and advise.

Thanks all.


Can an NLB cluster be deployed across multiple sites?


Hi all,

I have an MPLS network between site A and site B, and I can get a VLAN that connects the two sites. In this scenario, can I set up the NLB cluster across both sites? If so, how do I do it? Thanks.

2 Node Cluster with Server 2008 R2 Cluster Disks Show as "Local Disk"


I have a 2-node cluster on Server 2008 R2 with 13 LUNs presented as clustered disks within Failover Cluster Manager.

When I view the hard disk drives in Windows Explorer, 10 drives show as Type = Clustered Disk and 3 drives show as Type = Local Disk.  The 3 drives still fail over and I am not seeing any ill effects, but why do they show as "Local Disk"?  I am wondering whether there are configuration issues that could cause these 3 disks to have failover issues.  How can I change this?

Thanks,

Richard


Cluster dies when 3rd node is on


Hi,

At work we have 3 servers within a cluster (Windows Server 2012 R2). On Monday the cluster failed and started to live migrate boxes to servers which were rebooting. We had a major site outage, where our proxy, Exchange and Lync went down. In Failover Cluster Manager all the virtual machines were stuck saying "loading"; the only console which was working properly was Hyper-V Manager. We managed to get everything back up by rebooting each server to allow it to install Windows Updates.

On Tuesday, we had a similar outage, caused by one of the servers trying to take ownership of a store. The cluster then went into a "zombie state", which only occurred when the 3rd node was on. We now have the option of evicting the node from the cluster and adding it back in.

Any ideas why this might have happened?



Windows 2003 File Share 4-node Cluster: Do cluster resources need to be brought offline prior to removing / unmapping any LUNs from the SAN end?


Hello All,

Recently, on a 4-node Windows 2003 File Share Cluster, we encountered a problem: removing a few shares (that were no longer in use) directly from the SAN end crashed the entire cluster (i.e., other shares also lost their SAN connectivity). I suppose the cluster resources need to be brought offline prior to removing them from the SAN, but I've been advised that these shares were not the root but a 'mount point' created within the share, and hence there is no need to bring any cluster resources offline.

Could someone please comment on the above and provide detailed steps for reclaiming SAN space from specific shares on a W2003 cluster?

p.s., let me know if you need any additional information.

Thanks in advance.

How to schedule Cluster logs to be generated for Microsoft Failover Clusters 2008 R2


Hi, as I understand it, we always have to generate cluster logs manually on the cluster nodes.

Is there any way to schedule cluster logs to be generated automatically, so that it is easier for us to analyze an issue?
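
One approach that might work is a scheduled task that runs the log generation command on each node. The sketch below only builds the argument list for `schtasks /Create`. It assumes `cluster.exe log /g` is the command used to generate the log on 2008 R2, and the task name and start time are placeholders. The function name is made up for illustration, and nothing is executed until the list is passed to something like `subprocess.run` on a node.

```python
import re


def schtasks_create_args(task_name: str, start_time: str,
                         command: str = "cluster.exe log /g") -> list:
    """Build the argument list for a daily scheduled task (hypothetical helper).

    start_time must be in 24-hour HH:MM form, as expected by schtasks /ST.
    """
    if not re.fullmatch(r"([01]\d|2[0-3]):[0-5]\d", start_time):
        raise ValueError(f"start_time must be HH:MM, got {start_time!r}")
    return [
        "schtasks", "/Create",
        "/TN", task_name,   # task name (placeholder)
        "/TR", command,     # command the task runs
        "/SC", "DAILY",     # run once a day
        "/ST", start_time,  # start time, HH:MM
        "/RU", "SYSTEM",    # run under the local SYSTEM account
    ]


args = schtasks_create_args("GenerateClusterLog", "00:30")
print(" ".join(args))
```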

Kevin

The file cannot be opened because it is in the process of being deleted


Hi,

I'm cleaning up a WS2012R2 Hyper-V cluster.  There is only 1 node in the cluster at the moment.  All VMs except 2 have already been moved to another cluster.  When I try to delete those 2, I get the message "The file cannot be opened because it is in the process of being deleted".

I want to destroy the cluster, but I can't because the 2 VMs haven't been removed yet.  I tried to remove them via PowerShell with "Remove-ClusterGroup VMName", which results in an error: The object has been deleted from the cluster.

How can I remove the 2 remaining VMs?

Inexplicable 2008 Failover Cluster Issues

Hi,

We have a 2008 failover SQL 2005 cluster using Node and Disk Majority.

There are 2 nodes in the cluster with 2008 Ent 64-bit SP2 installed.

At around 00:20 each morning we see various FailoverClustering errors in the event logs on both servers.

EventID: 1135, 1069, 1177

Before the FailoverClustering events are seen, 2 informational events appear regarding the 'Microsoft Failover Clustering Virtual Adapter'

EventID: 4201 'The system detected that network adapter Local Area Connection* 9 was connected to the network, and has initiated normal operation.'

This is causing the resources to fail over to the secondary node.

I have run the Cluster Validation Wizard and everything passes. I have disabled the Windows Firewall service on both nodes.

We are presenting the storage via NetApp, and the nodes have 3 NICs installed:

NIC1 - Server Vlan - Speed/Duplex Set to 1000Mb Full
NIC2 - Storage Vlan - Speed/Duplex Set to 1000Mb Full
NIC3 - Heartbeat - Speed/Duplex Set to 100Mb Full

Can anyone please help me troubleshoot these issues?

Thanks

Scott
