NIC teaming seems to be confused about what NIC belongs to what team

May 18, 2015, 5:47 am

I have a total of four physical servers: two hypervisors and two storage servers. All the servers are connected to the same switch, and all run an up-to-date installation of Windows Server 2012 R2. The two hypervisors each have two Ethernet ports, and both ports are teamed in Windows.

The two storage servers each have four Ethernet ports: two native ports on the motherboard and two on a PCI NIC we added. The two native ports are teamed in Windows and connect to the switch. The PCI NICs are connected to each other directly with an "ad-hoc" connection (two very short cables between the four Ethernet ports). The two Ethernet ports on the PCI NIC are also teamed on both storage servers.

When all the teaming is set up, everything looks fine and displays as active. When we copy a file over the network from hypervisor to hypervisor (thus using the switch), we only get a 1 Gbps connection; this should be 2 Gbps because the two Ethernet ports are teamed. When I copy a file over the ad-hoc connection between the two storage servers, the copy does run at 2 Gbps, but ever so gracefully Windows decides to use both teams on that server. So instead of using the dedicated ad-hoc team, it sends half of the traffic over that connection and the other half over the team that is connected to the switch. I am amazed by this, because it just doesn't make any sense.

But it gets even stranger. If I disable one of the Ethernet ports of the team that is connected to the switch on the storage server and then copy a file over the ad-hoc team, it does use the full 2 Gbps available.

I am completely lost. What am I doing wrong? Am I missing something?
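One possible factor in the 1 Gbps result, offered as an assumption rather than a diagnosis: teaming modes that distribute traffic by hashing addresses and ports place every packet of a single connection on the same team member, so one file copy cannot exceed one link's speed. The sketch below only illustrates that idea. The CRC32 hash, the member names, and the addresses are hypothetical and are not what Windows actually uses.

```python
import zlib

def pick_member(src_ip, dst_ip, src_port, dst_port, members):
    """Map a flow to one team member via a hash of its addresses and ports."""
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    # CRC32 is a stand-in hash for illustration, not the real algorithm.
    return members[zlib.crc32(key) % len(members)]

members = ["NIC1", "NIC2"]  # hypothetical team members

# One file copy is one TCP connection, so every packet hashes the same way.
flow = ("10.0.0.1", "10.0.0.2", 50000, 445)  # hypothetical addresses and ports
chosen = {pick_member(*flow, members) for _ in range(1000)}
print(chosen)  # a set containing exactly one member
```

Under this model, aggregate throughput only rises above one link when several connections hash to different members.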

↧

cluster nodes in the same level of windows update - Cluster 2012 R2

May 17, 2015, 3:38 pm

I have created a cluster of 3 nodes. The update source is the Internet.

After Windows Update completes, I find that the nodes are not at the same update level; some updates are not installed on some nodes.

Please advise what the problem is and how to fix it.
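To see exactly which updates differ between nodes, the installed update list from each node can be compared as sets. This is a minimal sketch that assumes the lists have already been exported from each node (for example from the output of the PowerShell `Get-HotFix` cmdlet); the node names and KB numbers are hypothetical placeholders.

```python
def missing_updates(installed):
    """Return, per node, the updates found on another node but absent here."""
    all_updates = set().union(*installed.values())
    return {node: sorted(all_updates - set(kbs)) for node, kbs in installed.items()}

# Hypothetical node names and KB numbers, for illustration only.
installed = {
    "node1": {"KB0000001", "KB0000002", "KB0000003"},
    "node2": {"KB0000001", "KB0000003"},
    "node3": {"KB0000001", "KB0000002"},
}

report = missing_updates(installed)
for node, kbs in report.items():
    print(node, "is missing:", ", ".join(kbs) or "nothing")
```

The report only shows where the nodes diverge; it does not explain why an update was skipped on a given node.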

Ramy

↧

I/O Performance throttled down consistently in every minute with the same pattern. (Windows Clustering 2012 R2 + SQL Clustering 2012 R2)

May 18, 2015, 8:15 am

Dear Experts,

I have just finished installing Windows Clustering 2012 R2 + SQL Clustering 2012 R2. However, when I run an I/O performance test, the result is not ideal. As the graph below shows, the I/O seems to be throttled down consistently every minute with the same pattern. Any idea?

Note: The cluster is connected to SAN storage. The SAN storage tested fine with another standalone server.

Many Thanks!

↧

Network Drops for 30 seconds During Hyper-V Live Migration

May 18, 2015, 7:25 am

I have 3 physical Hyper-V hosts set up with clustered storage. I disabled VMQ because I was getting errors when trying to do live migrations. I have also run the network portion of the cluster validation tests without errors. What happens is that when I do a live migration from any host to any other host, I lose network connectivity to every VM running on those hosts. During this time a SQL application that is running locks up and freezes for all the users. Many have to use Task Manager to kill the application to get back in, or even reboot their machines to free it up.

I have been doing a ton of reading on network settings and configurations and have made no progress. Any help to point me in a direction to get this solved will be appreciated. I need to be able to do Live Migrations on my cluster storage.

Thanks for any help.

↧

NLB Issue

May 18, 2015, 3:45 am

Hi,

We are facing an issue with NLB. Please find the setup details below.

1. 2 ADFS servers are in NLB.

2. 2 ADFS proxy servers are in the DMZ network (192.168.0.1, 192.168.0.2).

All are installed with Windows Server 2012 R2, and on the network side all ports are allowed from the DMZ to the ADFS servers.

Issue: When we configure NLB with IGMP multicast, we are able to ping/telnet from one DMZ machine (192.168.0.1) to the NLB IP, but from the other DMZ machine (192.168.0.2) we are not. When we changed the NLB setting to Multicast, we saw the same issue.

But with the Unicast setting, we are able to ping/telnet from 192.168.0.2, but not from 192.168.0.1.

Note: We are able to ping/telnet to the individual machines from both DMZ machines.

Please help us find where exactly the issue is.

Regards

Anil Anchuri

↧

quorum failure scenario question

April 30, 2015, 9:49 am

Hello TechNet friends,

I have a scenario that happened yesterday that leaves me stumped and I am not sure in which direction to go.

2-node active/passive 2008R2 file cluster (Node 1 & Node 2)
Nodes are vmguests on vsphere 5.5
path selection is round-robin
quorum node/disk majority (quorum disk is SAN...all drives are SAN in fact)
Node 1 owns cluster resources

Our VM environment re-balanced itself in the wee hours of the morning. Upon initiation of the migration of Node 1 to a different host, the VM system reported that there was no heartbeat coming from Node 1. This appears to be because the virtual switch used in VMware listed a different "Observable IP range" outside that of the heartbeat IP. We have noticed that the observable IP range changes, and apparently that is expected behavior due to broadcast packets being received and should not cause alarm. The guest migration then occurred.

Seconds later, the MS cluster reported the cluster service failed to update the cluster configuration on the witness disk. The witness disk then failed and dropped from the cluster. The cluster remained up with no errors being reported.

The newly migrated Node 1 showed all green in terms of the cluster and cluster resources and only showed this witness disk error in the logs. It wasn't until I was notified that the application could not reach its cluster resources that I drilled down into the cluster and noticed that the attached SAN drives only showed a unique identifier number and no longer had a drive letter. The drives also showed 0 bytes. I had to reboot Node 1 in order to restore connectivity.

So, I think I have a couple of questions:

A) Did the intermittent loss of a heartbeat during the migration cause the cluster service to fail to update the cluster config on the witness disk?

B) Why does A matter if the original cluster config is kept in c:\windows\cluster?

C) You can lose the witness disk and be OK, so why did Node 1 all of a sudden think it had the cluster resources but could not provide a drive letter?
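The premise of question C, that the cluster can lose the witness disk and stay up, follows from the vote arithmetic of node and disk majority: each node and the witness disk hold one vote, and the cluster keeps quorum while more than half of the votes are present. A minimal sketch of that arithmetic is below; the function name is hypothetical, and it covers only the vote count, not why the SAN drives lost their drive letters.

```python
def has_quorum(nodes_up, total_nodes, witness_up):
    """Node and disk majority: each node and the witness disk hold one vote."""
    total_votes = total_nodes + 1            # nodes plus the witness disk
    votes_present = nodes_up + (1 if witness_up else 0)
    return votes_present > total_votes // 2  # strict majority required

# The 2-node cluster from the post: 3 votes in total, so 2 are needed.
both_nodes_no_witness = has_quorum(2, 2, witness_up=False)
one_node_with_witness = has_quorum(1, 2, witness_up=True)
one_node_no_witness = has_quorum(1, 2, witness_up=False)
print(both_nodes_no_witness, one_node_with_witness, one_node_no_witness)
```

With both nodes up, losing the witness still leaves 2 of 3 votes, which matches the observation that the cluster remained up after the witness disk dropped.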

thank you.

↧

Server 2012 R2 Hyper-V Cluster - Losing connectivity to vLANs

April 22, 2015, 6:00 am

Hi Everybody,

I am experiencing a strange issue with a Server 2012 R2 hypervisor. There are two physical NICs teamed (one on-board and one on an expansion board). This team is assigned to a Hyper-V virtual switch, and there are four virtual network adapters assigned to the switch (Backend, Heartbeat, Management and CSV/Live Migration).

The server had been running without issue for months but has suddenly started to lose connectivity to the Backend and Heartbeat networks. The loss of connection only occurs once VMs are hosted on the server, and it doesn't happen immediately; it can take anywhere from 12 hours to several days.

There are no errors on the NICs or vNICs themselves and the configuration is intact. Once connectivity is lost to the Backend and Heartbeat networks, it remains that way, and the only solution so far seems to be rebooting the server. Hardware diagnostics have been run and revealed no issues. There are no useful events in Event Viewer; the alerts relate to the loss of connection to the cluster as a result of connectivity being lost on the Backend and Heartbeat networks.

I have 6 other hypervisors with the same network configuration that are performing without issue, so I don't believe the problem is the configuration. The only difference is that the hypervisor with the problems is a Dell PowerEdge R620 and the other 6 are Dell PowerEdge R610s.

Can anyone advise what might be the cause of these issues, or suggest any event logging to enable that might help isolate where the issue lies?

↧

Overlapping Cluster Disk Numbers on Windows 2012 R2 Cluster

May 11, 2015, 7:49 am

I have a Windows 2012 R2 cluster that is showing overlapping Cluster Disk numbers in the disk view. It doesn't seem to affect any cluster behavior. I am just wondering why this would happen and whether it is a bug rather than expected behavior.

↧

Options or paths to migrate a microsoft fail-over cluster to a new storage array

May 8, 2015, 5:21 am

Hello,

I am looking for some options or ideas on how to migrate my Microsoft failover cluster to a new storage array. Currently I have a Microsoft Server 2008 R2 failover cluster with two nodes and a number of VMs. Recently we decided to invest in new HDDs to increase the capacity of our existing array. After adding the HDDs to the array, I configured them as a RAID6 array. I am now looking for options or migration paths to move my failover cluster to the new storage volume, so that we can re-use the old volume for archived data. Has anyone ever done this before? Can you offer any pros and cons, or advice on how this should be completed?

I thought we could use two new nodes, build a new failover cluster on the new volume, then migrate the VMs from one volume to the next. However, this would require sourcing two additional nodes for the process. I have also seen some migration tools, but I want to do due diligence as this is a production failover cluster.

I can provide additional information if needed, and I would appreciate any help in this matter so that I can complete the task at hand. I have never migrated a failover cluster before.

Thanks,

Kristopher

↧

Invitation: Provide Cluster feedback on the new Cluster UserVoice page

May 11, 2015, 10:59 pm

The clustering team has a new UserVoice page here: http://windowsserver.uservoice.com/forums/295074-clustering that is part of the Windows Server UserVoice page: http://windowsserver.uservoice.com/forums/295047-general-feedback.

This is a new channel that we will be monitoring in parallel to the High Availability (Clustering) forum. Please feel free to add comments, suggestions, feedback, and vote on ideas posted on the new UserVoice page.

Many Thanks, TechNet Forum Users!!!

-Rob.

↧

Having multiple CAs share the same private key

April 28, 2015, 3:20 am

We are developing a system which implements an HA cluster across two separate geographical locations.
Each site will have several Windows Server 2012 machines and at least one DC, and we basically have to do a master-master replication between the two sites.
The entire system will be under a single domain.

We will be deploying AD CS since some of our sub-systems need certificates,
but we want to limit the variety of certificates to just one (i.e. we want all CAs to issue identical certificates).
To do that, we have to set up AD CS so that all the DCs (both intra-site and inter-site) share the same private key.
Is it possible to have all DCs in a domain share a single private key?

This article on TechNet suggests that we can do it within a cluster,
https://technet.microsoft.com/en-us/library/cc742450%28v=ws.10%29.aspx
but we are not sure if we can do it across different sites.

Any advice and comments are highly appreciated.

Wanko

↧

Storage Migration ONLY copies source disk (instead of move)

September 15, 2014, 1:17 am

If I use Move/VM Storage in Cluster Manager, I would expect the storage to be MOVED.

But instead it just gets copied to the destination (leaving the source untouched).

Surely that is at least misleading.

How can I make the move behave as a move?

Seb

↧

Hyper-V (Guest) cluster on single machine...!!

May 19, 2015, 10:19 am

Hi All,

I am new to cluster technology. I want to improve my skills in Windows 2012 clustering and am planning to create a test lab at home.

Is it possible to build a Hyper-V failover cluster (guest) on a single PC (server)? If the answer is yes, what are the important instructions for that?

It would be a great help for me.

Thanks

↧

Cluster Shared Volume error after server not shutting down properly

May 14, 2015, 2:34 am

Hi,
We have two IBM X240 servers (we call them server A and server B) connected to an IBM V3700 disk system via fibre HBAs.

Both servers have Windows 2012 R2 installed.

We had implemented a VM cluster and everything was working well.

Last week both servers went down due to a power outage in my server room.

After turning on server A, the following error appeared:

Windows failed to start. A recent hardware or software change might be the cause.
File: \windows\system32\drivers\msdsm.sys
Status: 0xc0000017
Info: The operating system couldn't be loaded because a critical system driver is missing or contains errors.

After using the Last Known Good Configuration, we could log in to the system and turn on the clustered virtual machines.

It seemed everything was fine.

So I started server B and logged in to the system using the same method as with server A.

I found that all the VMs were shut down or running with errors due to a Cluster Shared Volume error.

Below are some errors captured from the system logs.

* Event 5142, Cluster Shared Volume 'Volume7' ('Cluster Disk 10') is no longer accessible from this cluster node because of error '(1460)'. Please troubleshoot this node's connectivity to the storage device and network connectivity.

* Event 5120, Cluster Shared Volume 'Volume3' ('Cluster Disk 4') has entered a paused state because of '(c00000be)'. All I/O will temporarily be queued until a path to the volume is reestablished.

Now we can only turn on one server and must keep the other shut down. If I turn on both servers, the errors appear again and the server goes down.

Any suggestions? Let me know if I need to provide more information.

Thanks.

↧

Microsoft-Windows-FailoverClustering event id: 5120

May 20, 2015, 12:30 am

Hello All,

Every few days we receive the following message from our Hyper-V Cluster.

Cluster Shared Volume 'disk 1' ('disk 1') has entered a paused state because of '(c000020c)'. All I/O will temporarily be queued until a path to the volume is reestablished.

We are running Windows 2012 R2 with all patches up to 01-05-2015.

All the VMs on the CSV volume stay online and there are no interruptions within the VMs. Does anybody have an idea how I can resolve this message?

↧

W2K12 R2 - Cluster Resource IP Address Replace Node IP Address In DNS

May 15, 2015, 1:35 pm

I have a two-node Windows 2012 R2 Standard cluster. When the cluster resource is placed on a node in the cluster, that node's IP address on our Windows 2008 R2 DNS server is updated to the cluster resource's IP address. When I move the resource from node 1 to node 2, node 1's IP address is changed back to the correct IP address, but node 2's IP address is updated in DNS to the resource's IP address, and vice versa.

↧

Cluster migration, storage question.

May 20, 2015, 4:44 am

I plan to migrate a Server 2008 R2 cluster to Server 2012 R2. The clusters do NOT share the same storage. I guess I only need to transfer the roles to the new 2012 R2 cluster, but I am unsure what happens with the storage. Will it automatically be copied to the Server 2012 R2 storage LUN, or how does it work?

↧

Failover Cluster: CSV Network Migration

May 20, 2015, 6:02 am

Our main VM CSV is connected to our iSCSI on the wrong network. We have three Cluster networks:

5.x - iscsi

40.x - cluster network / live migration

50.x - client vm network

For some reason the iSCSI is connected via the 50.x network at the moment on both of our CSVs, which means it's passing all that traffic through our router. The second CSV is just backups, so it was easy for me to shut down and reconnect to the iSCSI volume. Doing the same for the CSV with dozens of our other VMs is not so easy and would require a scheduled outage on a weekend.

Is there a graceful way to make it roll over to the correct network without bringing everything offline? I found this article and set the metric for the right network. But how do I make it reconnect without dropping connections?

https://technet.microsoft.com/en-us/library/ff182335%28WS.10%29.aspx

↧

Can you cluster two nodes from subnet A with the cluster itself residing in subnet B?

March 31, 2015, 2:38 pm

We have a situation in a cloud deployment where the servers are automatically provisioned through a cloud portal with IP addresses from subnet A. Because we require a static IP for clustering, and static IPs can only be provisioned from subnet B, the cluster IP has to come from subnet B. Is it possible to create a cluster from the two nodes in subnet A with a cluster IP in subnet B? I cannot find any mention of this particular network configuration for a Windows cluster. We are using the cluster to support SQL Server 2012 AlwaysOn Availability Groups.
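To make the layout concrete, the sketch below checks which subnet each address falls in, using Python's standard `ipaddress` module. All subnets and addresses are hypothetical, since the post gives no real ranges. It only describes the topology and does not show whether Windows clustering accepts it.

```python
import ipaddress

def on_subnet(ip, subnet):
    """Return True if the address falls inside the given subnet."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(subnet)

# Hypothetical addressing; the post does not give real ranges.
subnet_a = "10.0.1.0/24"   # where the portal places the nodes
subnet_b = "10.0.2.0/24"   # where static IPs can be provisioned
node_ips = ["10.0.1.11", "10.0.1.12"]
cluster_ip = "10.0.2.50"

nodes_in_a = all(on_subnet(ip, subnet_a) for ip in node_ips)
cluster_in_a = on_subnet(cluster_ip, subnet_a)
cluster_in_b = on_subnet(cluster_ip, subnet_b)
print(nodes_in_a, cluster_in_a, cluster_in_b)  # True False True
```

In this layout no node has an interface on the subnet that holds the cluster IP, which is the situation the question asks about.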

↧

Windows 2008 R2 Cluster drive letter changed unexpectedly

May 20, 2015, 8:32 pm

On my Windows 2008 R2 cluster, a drive letter changed unexpectedly during failover. From what I understand, this can happen if the drive letter is already in use on the node at the time of failover. But I did not find any error related to the drive letter in the cluster log. I have seen this happen before on other Windows 2008 R2 clusters too. I want to understand what causes this unexpected drive letter change and what I can do to prevent it from happening.

ARASKAS

↧