Channel: High Availability (Clustering) forum

Node of a 2008 R2 cluster fails to join its cluster, causing bogus file share witness errors and loss of protection.


Hello.

I’m trying to get a 2008 R2 Enterprise cluster running as part of an Exchange Server deployment: a two-node DAG with a witness server. The cluster loses one node at random intervals (from 30 minutes to 1-2 days), and the node can stay down anywhere from a few minutes to several days. Stopping and restarting the whole cluster usually helps, but that is not a solution, because stability and automatic failover are essential for a usable e-mail service. I have been digging into the problem for quite a while and have found a way to reproduce the situation, but I have reached the point where I cannot find any more clues about what breaks the join process.

The first symptoms are failure-to-arbitrate and witness-unreachable events:

Event 1564: “File share witness resource 'File Share Witness (\\witness01.company.com\DAG01.company.com)' failed to arbitrate for the file share '\\witness01.company.com\DAG01.company.com'.”

Event 1069: “Cluster resource 'File Share Witness (\\witness01.company.com\DAG01.company.com)' in clustered service or application 'Cluster Group' failed”.

Event 1573: “Node 'node002' failed to form a cluster. This was because the witness was not accessible. Please ensure that the witness resource is online and available”.


I monitored all traffic between the nodes and the witness and found that NOTHING IS WRONG with the witness. node002 was trying to access and create folders and lock files in the witness share, while the folders were actually present and the files were still locked by node001. node001 was holding the majority and the locks on the file share witness, and running the remaining part of the cluster. HENCE THE KEY QUESTION BECAME: WHY DOES node002 TRY TO TAKE OVER THE WITNESS INSTEAD OF JOINING THE CLUSTER?

I downloaded the cluster log and found that, without any preceding error, node001 and node002 log “DBG [CHANNEL 192.168.2.22:~51189~] Close()” and “DBG [CHANNEL 192.168.1.11:~3343~] Close()” respectively. Later node002 logs “INFO Shutdown lock acquired, proceeding with shutdown”, and then, after some further activity, node002 starts knocking on the witness. Maybe the event on node001, “WARN [FTI][Initiator] Ignoring duplicate connection: usable route already exists”, is somehow related to the problem. This behavior can be reproduced with some probability by stopping and starting the cluster service on node002. Sometimes node002 joins the cluster within a few seconds, but in many cases it takes much longer (hours, and a few times even days), and then I know that node002 has gone hunting for the file share witness.
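For anyone who wants to pull the same markers out of a full cluster log, here is a minimal Python sketch. It assumes every log line has the layout seen in the excerpts below (process ID, thread ID, timestamp, level, message) and tolerates the `***` that was added by hand after the level. The function names and the `MARKERS` list are illustrative choices, not part of any cluster tooling.

```python
import re
from datetime import datetime

# Layout assumed from the excerpts: pid.tid::YYYY/MM/DD-HH:MM:SS.mmm LEVEL message
LINE_RE = re.compile(
    r"^(?P<pid>[0-9a-f]{8})\.(?P<tid>[0-9a-f]{8})::"
    r"(?P<ts>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>[A-Z]+)\**\s+(?P<msg>.*)$"
)

# Substrings of the messages discussed in this post (an illustrative selection).
MARKERS = (
    "Close()",
    "graceful close",
    "Shutdown lock acquired",
    "was vetoed",
    "Beginning arbitration",
)


def parse_line(line):
    """Split one cluster-log line into its fields; return None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return {
        "pid": m.group("pid"),
        "tid": m.group("tid"),
        "ts": datetime.strptime(m.group("ts"), "%Y/%m/%d-%H:%M:%S.%f"),
        "level": m.group("level"),
        "msg": m.group("msg"),
    }


def key_events(lines):
    """Return the parsed records whose message contains one of MARKERS."""
    events = []
    for line in lines:
        rec = parse_line(line)
        if rec is not None and any(marker in rec["msg"] for marker in MARKERS):
            events.append(rec)
    return events
```

Calling `key_events(open(path))` on a saved log (where `path` is wherever the log was exported) yields the channel-close, shutdown-lock, veto, and arbitration lines in file order, which makes it easier to see how far apart they are.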

I found many discussions about the “CHANNEL ….. graceful close, status” event, but those cases usually involve some errors, and most of them are about security or duplicate account names. My log does not have any errors before the event, and there are no duplicate names. In fact, the cluster periodically working fine suggests that there is no permanent permission or name-duplication problem.

This is a single-network Exchange DAG cluster. It has two nodes located in two different sites connected over a VPN, and a file share witness located in a third site. Logs from node002 and an excerpt from node001 (it follows the ----NODE001----- marker) are below. The log from node001 shows some periods of inactivity around the moment the channel is closed.
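The inactivity periods mentioned above can be located mechanically. The following is a rough Python sketch, assuming the timestamp sits between the first `::` and the next space on each line, as in the excerpts below. The 200 ms default threshold is an arbitrary assumption, and the helper names are made up for illustration. Timestamps from node001 and node002 come from different clocks, so gaps should only be compared within one node's log.

```python
from datetime import datetime, timedelta


def timestamp_of(line):
    """Return the timestamp of a cluster-log line, or None if the line has none."""
    try:
        stamp = line.split("::", 1)[1].split(" ", 1)[0]
        return datetime.strptime(stamp, "%Y/%m/%d-%H:%M:%S.%f")
    except (IndexError, ValueError):
        # Blank lines, separators, or lines without the expected prefix.
        return None


def find_gaps(lines, min_gap=timedelta(milliseconds=200)):
    """List (earlier, later, duration) for consecutive entries at least min_gap apart."""
    stamps = [t for t in map(timestamp_of, lines) if t is not None]
    gaps = []
    for earlier, later in zip(stamps, stamps[1:]):
        if later - earlier >= min_gap:
            gaps.append((earlier, later, later - earlier))
    return gaps
```

Running `find_gaps` over the node001 log with a threshold suited to the VPN latency would show whether the quiet periods line up with the channel close seen on node002.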

I marked some key events with *** (including: 15:52:15.651 INFO*** Shutdown lock acquired, proceeding with shutdown).

Thanks  for any help.

 

 

-------------------------------------------------------NODE002-----------------------------------------------------------

000010d4.000015e4::2013/07/15-15:52:12.480 INFO  [IM] Route from 192.168.2.22:~3343~ to 192.168.1.11:~3343~ is already up, not sending report

000010d4.000016f4::2013/07/15-15:52:12.480 INFO  [NODE] Node 2: New join with n1: stage: 'Wait for Heartbeats on Initial NetFT Route'

000010d4.000016f4::2013/07/15-15:52:12.495 DBG   [FTW] NetFT address fe80::71f7:22a3:89fb:ab11:~3343~ is ready.

000010d4.000016f4::2013/07/15-15:52:12.495 INFO  [FTW] NetFT is ready after 0 msecs wait.

000010d4.000016f4::2013/07/15-15:52:12.495 INFO  [NODE] Node 2: New join with n1: stage: 'Wait for NetFT Duplicate Address Detection'

000010d4.000016f4::2013/07/15-15:52:12.511 DBG   [NETFTAPI] received NsiParameterNotification for fe80::71f7:22a3:89fb:ab11 (IpDadStatePreferred )

000010d4.000016f4::2013/07/15-15:52:12.511 DBG   [NETFTAPI] Signaled NetftLocalAdd event for fe80::71f7:22a3:89fb:ab11

000010d4.000016f4::2013/07/15-15:52:12.511 DBG   [NETFTEVM] FTI NetFT event handler got event: Local endpoint fe80::71f7:22a3:89fb:ab11:~0~ added

000010d4.000016f4::2013/07/15-15:52:12.511 DBG   [NETFTEVM] TM NetFT event handler got event: Local endpoint fe80::71f7:22a3:89fb:ab11:~0~ added

000010d4.000016f4::2013/07/15-15:52:12.511 DBG   [NETFTEVM] IM NetFT event handler got event: Local endpoint fe80::71f7:22a3:89fb:ab11:~0~ added

000010d4.000016f4::2013/07/15-15:52:12.511 DBG   [WM] Filtering event NETFT_LOCAL_ADD? 1

000010d4.0000149c::2013/07/15-15:52:12.511 DBG   [NETFTEVM] FTI NetFT event dispatcher pushing event: Local endpoint fe80::71f7:22a3:89fb:ab11:~0~ added

000010d4.00000d34::2013/07/15-15:52:12.511 DBG   [NETFTEVM] TM NetFT event dispatcher pushing event: Local endpoint fe80::71f7:22a3:89fb:ab11:~0~ added

000010d4.000015e4::2013/07/15-15:52:12.511 DBG   [NETFTEVM] IM NetFT event dispatcher pushing event: Local endpoint fe80::71f7:22a3:89fb:ab11:~0~ added

000010d4.000015e4::2013/07/15-15:52:12.511 INFO  [IM] got event: Local endpoint fe80::71f7:22a3:89fb:ab11:~0~ added

000010d4.000016f4::2013/07/15-15:52:12.526 DBG   [NETFTAPI] Signaled NetftLocalConnect event for fe80::71f7:22a3:89fb:ab11

000010d4.000016f4::2013/07/15-15:52:12.526 DBG   [NETFTEVM] FTI NetFT event handler got event: Local endpoint fe80::71f7:22a3:89fb:ab11:~0~ connected

000010d4.000016f4::2013/07/15-15:52:12.526 DBG   [NETFTEVM] TM NetFT event handler got event: Local endpoint fe80::71f7:22a3:89fb:ab11:~0~ connected

000010d4.000016f4::2013/07/15-15:52:12.526 DBG   [NETFTEVM] IM NetFT event handler got event: Local endpoint fe80::71f7:22a3:89fb:ab11:~0~ connected

000010d4.000016f4::2013/07/15-15:52:12.526 DBG   [WM] Filtering event NETFT_LOCAL_CONNECT? 1

000010d4.0000149c::2013/07/15-15:52:12.526 DBG   [NETFTEVM] FTI NetFT event dispatcher pushing event: Local endpoint fe80::71f7:22a3:89fb:ab11:~0~ connected

000010d4.00000d34::2013/07/15-15:52:12.526 DBG   [NETFTEVM] TM NetFT event dispatcher pushing event: Local endpoint fe80::71f7:22a3:89fb:ab11:~0~ connected

000010d4.000015e4::2013/07/15-15:52:12.526 DBG   [NETFTEVM] IM NetFT event dispatcher pushing event: Local endpoint fe80::71f7:22a3:89fb:ab11:~0~ connected

000010d4.000015e4::2013/07/15-15:52:12.526 INFO  [IM] got event: Local endpoint fe80::71f7:22a3:89fb:ab11:~0~ connected

000010d4.0000163c::2013/07/15-15:52:12.901 INFO  [ACCEPT] :::~3343~: Accepted inbound connection from remote endpoint fe80::7964:c5c5:3b5:7833%13:~37758~.

000010d4.000016f4::2013/07/15-15:52:12.901 INFO  [SV] Route local (fe80::71f7:22a3:89fb:ab11%13:~3343~) to remote (fe80::7964:c5c5:3b5:7833%13:~37758~) exists. Forwarding to alternate path.

000010d4.000016f4::2013/07/15-15:52:12.901 INFO  [SV] Securing route from (fe80::71f7:22a3:89fb:ab11%13:~3343~) to remote (fe80::7964:c5c5:3b5:7833%13:~37758~).

000010d4.000016f4::2013/07/15-15:52:12.901 INFO  [SV] Got a new incoming stream from fe80::7964:c5c5:3b5:7833%13:~37758~

000010d4.000016f4::2013/07/15-15:52:12.901 DBG   [SM] SrvCtxt initialized with package Kerberos, MaxTokenSize = 12000, RequiredCtxAttrib = 165910, HandShakeTimeout = 30000

000010d4.00000a84::2013/07/15-15:52:12.901 DBG   [SM] Handling auth handshake posted by thread id 5876

000010d4.0000154c::2013/07/15-15:52:13.042 WARN  [API] s_ApiOpenGroupEx: Group Cluster Group failed, status = 70

000010d4.0000147c::2013/07/15-15:52:13.651 DBG   [JPM] Node 2: contacts size for node node001 is 1, current index 0

000010d4.0000147c::2013/07/15-15:52:13.651 DBG   [JPM] Node 2: Trying to connect to node node001 (IP: 192.168.1.11:~0~)

000010d4.0000147c::2013/07/15-15:52:13.651 DBG   [HM] Trying to connect to node001 at 192.168.1.11:~3343~

000010d4.0000147c::2013/07/15-15:52:13.776 INFO  [CONNECT] 192.168.1.11:~3343~: Established connection to remote endpoint 192.168.1.11:~3343~.

000010d4.0000147c::2013/07/15-15:52:13.776 INFO  [SV] Securing route from (192.168.2.22:~51189~) to remote node001 (192.168.1.11:~3343~).

000010d4.0000147c::2013/07/15-15:52:13.776 INFO  [SV] Got a new outgoing stream to node001 at 192.168.1.11:~3343~

000010d4.0000147c::2013/07/15-15:52:13.776 DBG   [SM] Joiner: Initialized with SPN = node001, Package = Kerberos, RequiredCtxAttrib = 83990, HandShakeTimeout = 30000

000010d4.0000127c::2013/07/15-15:52:13.776 DBG   [SM] Handling auth handshake posted by thread id 5244

000010d4.0000127c::2013/07/15-15:52:13.776 DBG   [SM] Joiner: ISC returned status = 590610 output Blob size 1578

000010d4.0000127c::2013/07/15-15:52:13.917 DBG   [SM] Joiner: Received SSPI blob from the Sponsor of size 156

000010d4.0000127c::2013/07/15-15:52:13.917 DBG   [SM] Joiner: ISC returned status = 0 output Blob size 0

000010d4.0000147c::2013/07/15-15:52:13.933 INFO  [SV] Authentication and authorization were successful

000010d4.0000147c::2013/07/15-15:52:13.933 DBG   [SM] Joiner: Initialized with SPN = node001, Package = Kerberos, RequiredCtxAttrib = 67586, HandShakeTimeout = 30000

000010d4.0000127c::2013/07/15-15:52:13.933 DBG   [SM] Handling auth handshake posted by thread id 5244

000010d4.0000127c::2013/07/15-15:52:13.933 DBG   [SM] Joiner: ISC returned status = 590610 output Blob size 1578

000010d4.0000127c::2013/07/15-15:52:14.073 DBG   [SM] Joiner: Received SSPI blob from the Sponsor of size 156

000010d4.0000127c::2013/07/15-15:52:14.073 DBG   [SM] Joiner: ISC returned status = 0 output Blob size 0

000010d4.0000147c::2013/07/15-15:52:14.073 INFO  [SV] Security Handshake successful while obtaining SecurityContext for NetFT driver

000010d4.0000147c::2013/07/15-15:52:14.073 INFO  [VER] Got new TCP connection. Exchanging version data.

000010d4.0000147c::2013/07/15-15:52:14.073 DBG   [VER] Calculated cluster versions: highest [Major 6 Minor 7601 Upgrade 7 ClusterVersion 0x00061DB1], lowest [Major 6 Minor 7601 Upgrade 7 ClusterVersion 0x00061DB1] with exclude node list: (1)

000010d4.0000147c::2013/07/15-15:52:14.073 INFO  [VER] Checking version compatibility for node node001 id 1 with following versions: highest [Major 6 Minor 7601 Upgrade 7 ClusterVersion 0x00061DB1], lowest [Major 6 Minor 7601 Upgrade 7 ClusterVersion 0x00061DB1].

000010d4.0000147c::2013/07/15-15:52:14.073 INFO  [VER] Version check passed: node and cluster highest supported versions match.

000010d4.0000147c::2013/07/15-15:52:14.198 INFO  [SV] Negotiating message security level.

000010d4.0000147c::2013/07/15-15:52:14.198 INFO  [SV] Already protecting connection with message security level 'Sign'.

000010d4.0000147c::2013/07/15-15:52:14.198 INFO  [FTI] Got new raw TCP/IP connection.

000010d4.0000147c::2013/07/15-15:52:14.339 INFO  [FTI][Follower] This node (2) is not the initiator

000010d4.0000147c::2013/07/15-15:52:14.339 DBG   [FTI] Stream already exists to node 1: false

000010d4.0000147c::2013/07/15-15:52:14.339 DBG***   [CHANNEL 192.168.1.11:~3343~] Close().

000010d4.0000147c::2013/07/15-15:52:14.339 INFO***  [CHANNEL 192.168.1.11:~3343~] graceful close, status (of previous failure, may not indicate problem) ERROR_SUCCESS(0)

000010d4.0000147c::2013/07/15-15:52:14.339 INFO***  [CORE] Node 2: Clearing cookie bfc27345-e777-418e-b69d-1f1b8fe89bcf

000010d4.0000147c::2013/07/15-15:52:14.339 DBG***   [CHANNEL 192.168.1.11:~3343~] Not closing handle because it is invalid.

 

000010d4.0000147c::2013/07/15-15:52:14.339 WARN***  cxl::ConnectWorker::operator (): GracefulClose(1226)' because of 'channel to remote endpoint 192.168.1.11:~3343~ is closed'

 

000010d4.0000147c::2013/07/15-15:52:14.511 DBG   [NETFTAPI] received NsiParameterNotification for 169.254.171.17 (IpDadStateInvalid )

000010d4.0000147c::2013/07/15-15:52:14.511 DBG   [NETFTAPI] received NsiDeleteInstance for 169.254.171.17

000010d4.0000147c::2013/07/15-15:52:14.511 WARN  [NETFTAPI] Failed to query parameters for 169.254.171.17 (status 80070490)

000010d4.0000147c::2013/07/15-15:52:14.511 DBG   [NETFTAPI] Signaled NetftLocalAdd event for 169.254.171.17

000010d4.0000147c::2013/07/15-15:52:14.511 DBG   [NETFTEVM] FTI NetFT event handler ignoring PnP add event for IPv4 LinkLocal address 169.254.171.17:~0~

000010d4.0000147c::2013/07/15-15:52:14.511 DBG   [NETFTEVM] TM NetFT event handler ignoring PnP add event for IPv4 LinkLocal address 169.254.171.17:~0~

000010d4.0000147c::2013/07/15-15:52:14.511 DBG   [NETFTEVM] IM NetFT event handler ignoring PnP add event for IPv4 LinkLocal address 169.254.171.17:~0~

000010d4.0000147c::2013/07/15-15:52:14.511 DBG   [WM] Filtering event NETFT_LOCAL_ADD? 1

000010d4.0000147c::2013/07/15-15:52:14.526 WARN  [NETFTAPI] Failed to query parameters for 169.254.171.17 (status 80070490)

000010d4.0000147c::2013/07/15-15:52:14.526 DBG   [NETFTAPI] Signaled NetftLocalRemove event for 169.254.171.17

000010d4.0000147c::2013/07/15-15:52:14.526 DBG   [NETFTEVM] FTI NetFT event handler ignoring PnP remove event for IPv4 LinkLocal address 169.254.171.17:~0~

000010d4.0000147c::2013/07/15-15:52:14.526 DBG   [NETFTEVM] TM NetFT event handler ignoring PnP remove event for IPv4 LinkLocal address 169.254.171.17:~0~

000010d4.0000147c::2013/07/15-15:52:14.526 DBG   [NETFTEVM] IM NetFT event handler ignoring PnP remove event for IPv4 LinkLocal address 169.254.171.17:~0~

000010d4.0000147c::2013/07/15-15:52:14.526 DBG   [WM] Filtering event NETFT_LOCAL_REMOVE? 1

000010d4.0000147c::2013/07/15-15:52:14.526 DBG   [NETFTAPI] received NsiParameterNotification for 169.254.2.47 (IpDadStatePreferred )

000010d4.0000147c::2013/07/15-15:52:14.526 DBG   [NETFTAPI] Signaled NetftLocalAdd event for 169.254.2.47

000010d4.0000147c::2013/07/15-15:52:14.526 DBG   [NETFTEVM] FTI NetFT event handler ignoring PnP add event for IPv4 LinkLocal address 169.254.2.47:~0~

000010d4.0000147c::2013/07/15-15:52:14.526 DBG   [NETFTEVM] TM NetFT event handler ignoring PnP add event for IPv4 LinkLocal address 169.254.2.47:~0~

000010d4.0000147c::2013/07/15-15:52:14.526 DBG   [NETFTEVM] IM NetFT event handler ignoring PnP add event for IPv4 LinkLocal address 169.254.2.47:~0~

000010d4.0000147c::2013/07/15-15:52:14.526 DBG   [WM] Filtering event NETFT_LOCAL_ADD? 1

000010d4.0000147c::2013/07/15-15:52:14.526 DBG   [NETFTAPI] Signaled NetftLocalConnect event for 169.254.2.47

000010d4.0000147c::2013/07/15-15:52:14.526 DBG   [NETFTEVM] FTI NetFT event handler got event: Local endpoint 169.254.2.47:~0~ connected

000010d4.0000147c::2013/07/15-15:52:14.526 DBG   [NETFTEVM] TM NetFT event handler got event: Local endpoint 169.254.2.47:~0~ connected

000010d4.0000147c::2013/07/15-15:52:14.526 DBG   [NETFTEVM] IM NetFT event handler got event: Local endpoint 169.254.2.47:~0~ connected

000010d4.00000d34::2013/07/15-15:52:14.526 DBG   [NETFTEVM] TM NetFT event dispatcher pushing event: Local endpoint 169.254.2.47:~0~ connected

000010d4.0000147c::2013/07/15-15:52:14.526 DBG   [WM] Filtering event NETFT_LOCAL_CONNECT? 1

000010d4.000015e4::2013/07/15-15:52:14.526 DBG   [NETFTEVM] IM NetFT event dispatcher pushing event: Local endpoint 169.254.2.47:~0~ connected

000010d4.000015e4::2013/07/15-15:52:14.526 INFO  [IM] got event: Local endpoint 169.254.2.47:~0~ connected

000010d4.0000149c::2013/07/15-15:52:14.526 DBG   [NETFTEVM] FTI NetFT event dispatcher pushing event: Local endpoint 169.254.2.47:~0~ connected

000010d4.00000a84::2013/07/15-15:52:15.261 DBG   [SM] Sponsor: Received SSPI blob from the Joiner of size 1577

000010d4.00000a84::2013/07/15-15:52:15.261 DBG   [SM] Sponsor: SSPI ASC returned status = 0

000010d4.00000a84::2013/07/15-15:52:15.261 DBG   [SM] Sponsor: Sending SSPI blob of size 155 to Joiner

000010d4.00000a84::2013/07/15-15:52:15.261 DBG   [SM] Sponsor: Authentication handshake final Status 0

000010d4.000016f4::2013/07/15-15:52:15.401 INFO  [SV] Authentication and authorization were successful

000010d4.000016f4::2013/07/15-15:52:15.401 DBG   [SM] SrvCtxt initialized with package Kerberos, MaxTokenSize = 12000, RequiredCtxAttrib = 133122, HandShakeTimeout = 30000

000010d4.00000a84::2013/07/15-15:52:15.401 DBG   [SM] Handling auth handshake posted by thread id 5876

000010d4.00000a84::2013/07/15-15:52:15.401 DBG   [SM] Sponsor: Received SSPI blob from the Joiner of size 1577

000010d4.00000a84::2013/07/15-15:52:15.401 DBG   [SM] Sponsor: SSPI ASC returned status = 0

000010d4.00000a84::2013/07/15-15:52:15.401 DBG   [SM] Sponsor: Sending SSPI blob of size 155 to Joiner

000010d4.00000a84::2013/07/15-15:52:15.401 DBG   [SM] Sponsor: Authentication handshake final Status 0

000010d4.000016f4::2013/07/15-15:52:15.401 INFO  [SV] Security Handshake successful while obtaining SecurityContext for NetFT driver

000010d4.000016f4::2013/07/15-15:52:15.542 DBG   [SV] Incoming (second) connection from node001 is secure

000010d4.000016f4::2013/07/15-15:52:15.542 INFO  [ReM] Got stream info from fe80::71f7:22a3:89fb:ab11%13:~3343~ to fe80::7964:c5c5:3b5:7833%13:~37758~.

000010d4.000016f4::2013/07/15-15:52:15.542 DBG   [ReM] Exchanging local info.

000010d4.000016f4::2013/07/15-15:52:15.542 DBG   [ReM] Sending local info.

000010d4.000016f4::2013/07/15-15:52:15.542 DBG   [ReM] Local info sent, receiving remote info.

000010d4.000016f4::2013/07/15-15:52:15.542 DBG   [ReM] Remote info received from 1:node001.

000010d4.000016f4::2013/07/15-15:52:15.542 DBG   [ReM][Follower] I am the follower with n1.

000010d4.000013d4::2013/07/15-15:52:15.651 INFO  [DM] Node 2: Loaded

000010d4.000013d4::2013/07/15-15:52:15.651 DBG   [RCM] Form is called (lightweight form = true)

000010d4.000013d4::2013/07/15-15:52:15.651 DBG   [RCM] rcm::RcmAgent::Unload()

000010d4.000013d4::2013/07/15-15:52:15.651 INFO***  Shutdown lock acquired, proceeding with shutdown

000010d4.000013d4::2013/07/15-15:52:15.651 DBG   [RCM] orphan group handlers: requested=0, started=0, finished=0

000010d4.000013d4::2013/07/15-15:52:15.651 INFO  [GUM] Node 2: shutting down gum handling

000010d4.000013d4::2013/07/15-15:52:15.651 DBG   [RCM] Disabling API calls

000010d4.000013d4::2013/07/15-15:52:15.651 DBG   [RCM] Enabled API calls

000010d4.000013d4::2013/07/15-15:52:15.651 INFO  [GUM] Node 2: reenabling gum handling

000010d4.000013d4::2013/07/15-15:52:15.651 DBG   [RCM] rcm::RcmResType::InitializeFromDb()

000010d4.000013d4::2013/07/15-15:52:15.651 DBG   [RCM] Deleting stale key SYSTEM\CurrentControlSet\Services\ClusSvc\Parameters\Rhs\3b8c4a5e-220e-4753-859c-967862ba62d4

000010d4.000013d4::2013/07/15-15:52:15.651 DBG   [RCM] Deleting stale key SYSTEM\CurrentControlSet\Services\ClusSvc\Parameters\Rhs\b1750551-cf79-41c0-bfa7-b383cb5e40ec

000010d4.000013d4::2013/07/15-15:52:15.651 INFO  [RCM] Created monitor process 4464 / 0x1170

00001170.00000f74::2013/07/15-15:52:15.651 INFO  [RHS] Initializing.

000010d4.000013d4::2013/07/15-15:52:15.667 DBG   [RCM] Scheduling wait callback for monitor process 4464

00001170.00000da0::2013/07/15-15:52:15.667 DBG   [RHS] s_RhsRpcCreateResType(DFS Replicated Folder, dfsrclus.dll)

000010d4.000016f4::2013/07/15-15:52:15.683 INFO  [ReM][Follower] Got remote data from n1, epoch: 0, sn: 0, Fault Tolerant Session Id: 00000000-0000-0000-0000-000000000000

000010d4.000016f4::2013/07/15-15:52:15.683 DBG   [NODE] Node 2: To n1 getting epoch (currently 0)

000010d4.000016f4::2013/07/15-15:52:15.683 DBG   [ReM][Follower] Current state with n1, epoch: 0, sn: 0

000010d4.000016f4::2013/07/15-15:52:15.683 DBG   [ReM][Follower] Successfully sent current state to 1.

00001170.00000da0::2013/07/15-15:52:15.714 DBG   [RHS] s_RhsRpcCreateResType(DHCP Service, clnetres.dll)

00001170.00000da0::2013/07/15-15:52:15.714 DBG   [RHS] s_RhsRpcCreateResType(Distributed File System, clusres.dll)

00001170.00000da0::2013/07/15-15:52:15.714 DBG   [RHS] s_RhsRpcCreateResType(Distributed Transaction Coordinator, mtxclu.dll)

00001170.00000da0::2013/07/15-15:52:15.714 DBG   [RHS] s_RhsRpcCreateResType(File Server, clusres.dll)

00001170.00000da0::2013/07/15-15:52:15.714 DBG   [RHS] s_RhsRpcCreateResType(File Share Witness, clusres.dll)

00001170.00000da0::2013/07/15-15:52:15.714 DBG   [RHS] s_RhsRpcCreateResType(Generic Application, clusres.dll)

00001170.00000da0::2013/07/15-15:52:15.714 DBG   [RHS] s_RhsRpcCreateResType(Generic Script, clusres.dll)

00001170.00000da0::2013/07/15-15:52:15.714 DBG   [RHS] s_RhsRpcCreateResType(Generic Service, clusres.dll)

00001170.00000da0::2013/07/15-15:52:15.714 DBG   [RHS] s_RhsRpcCreateResType(IP Address, clusres.dll)

00001170.00000da0::2013/07/15-15:52:15.730 DBG   [RHS] s_RhsRpcCreateResType(IPv6 Address, clusres.dll)

00001170.00000da0::2013/07/15-15:52:15.730 DBG   [RHS] s_RhsRpcCreateResType(IPv6 Tunnel Address, clusres.dll)

00001170.00000da0::2013/07/15-15:52:15.730 DBG   [RHS] s_RhsRpcCreateResType(Microsoft iSNS, isnsclusres.dll)

00001170.00000da0::2013/07/15-15:52:15.730 DBG   [RHS] s_RhsRpcCreateResType(MSMQ, mqclus.dll)

00001170.00000da0::2013/07/15-15:52:15.730 ERR   [RHS] s_RhsRpcCreateResType: ERROR_NOT_READY(21)' because of 'Startup routine for ResType MSMQ returned 21.'

000010d4.000013d4::2013/07/15-15:52:15.730 WARN  [RCM] Failed to load restype 'MSMQ': error 21.

00001170.00000da0::2013/07/15-15:52:15.730 DBG   [RHS] s_RhsRpcCreateResType(MSMQTriggers, mqtgclus.dll)

00001170.00000da0::2013/07/15-15:52:15.730 ERR   [RHS] s_RhsRpcCreateResType: ERROR_NOT_READY(21)' because of 'Startup routine for ResType MSMQTriggers returned 21.'

000010d4.000013d4::2013/07/15-15:52:15.730 WARN  [RCM] Failed to load restype 'MSMQTriggers': error 21.

00001170.00000da0::2013/07/15-15:52:15.730 DBG   [RHS] s_RhsRpcCreateResType(Network Name, clusres.dll)

00001170.00000da0::2013/07/15-15:52:15.730 DBG   [RHS] s_RhsRpcCreateResType(NFS Share, nfssh.dll)

00001170.00000da0::2013/07/15-15:52:15.730 DBG   [RHS] s_RhsRpcCreateResType(Physical Disk, clusres.dll)

00001170.00000da0::2013/07/15-15:52:15.730 DBG   [RHS] s_RhsRpcCreateResType(Print Spooler, clusres.dll)

00001170.00000da0::2013/07/15-15:52:15.730 DBG   [RHS] s_RhsRpcCreateResType(Virtual Machine, vmclusres.dll)

00001170.00000da0::2013/07/15-15:52:15.745 DBG   [RHS] s_RhsRpcCreateResType(Virtual Machine Configuration, vmclusres.dll)

00001170.00000da0::2013/07/15-15:52:15.745 DBG   [RHS] s_RhsRpcCreateResType(Volume Shadow Copy Service Task, vsstask.dll)

00001170.00000da0::2013/07/15-15:52:15.745 DBG   [RHS] s_RhsRpcCreateResType(WINS Service, clnetres.dll)

000010d4.000013d4::2013/07/15-15:52:15.745 DBG   [RCM] rcm::RcmGroup::InitializeFromDb()

000010d4.000013d4::2013/07/15-15:52:15.745 DBG   [RCM] rcm::RcmDependency::InitializeFromDb()

000010d4.000013d4::2013/07/15-15:52:15.745 DBG   [RCM] rcm::RcmResource::AddDependency(Cluster Name, 79eb390e-4ac4-4088-8057-aa778415e0b5)

000010d4.000013d4::2013/07/15-15:52:15.745 INFO  [API] Online read only

000010d4.000013d4::2013/07/15-15:52:15.745 DBG   RcmGroup::TakeOwnershipOfAllGroups

000010d4.000013d4::2013/07/15-15:52:15.745 DBG   [RCM] rcm::RcmGroup::TransitionToState: Available Storage: Offline->ClusterGroupChoosingOwner.

000010d4.000013d4::2013/07/15-15:52:15.745 DBG   [RCM] rcm::RcmGroup::TransitionToState: Cluster Group: Offline->ClusterGroupChoosingOwner.

000010d4.000013d4::2013/07/15-15:52:15.745 INFO  [RCM] Created monitor process 4168 / 0x1048

00001048.00001214::2013/07/15-15:52:15.761 INFO  [RHS] Initializing.

000010d4.000013d4::2013/07/15-15:52:15.776 DBG   [RCM] Scheduling wait callback for monitor process 4168

000010d4.000013d4::2013/07/15-15:52:15.776 DBG   [RCM] rpc binding handle for File Share Witness (\\witness01.company.com\DAG01.company.com): HDL(18e9e60)

000010d4.000013d4::2013/07/15-15:52:15.776 DBG   Sending control 1

000010d4.0000147c::2013/07/15-15:52:15.808 INFO  [NM] Received request from client address node002.

000010d4.0000147c::2013/07/15-15:52:15.808 DBG   [API] Authenticated client--Client: NT AUTHORITY\SYSTEM Interface: b97db8b2-4c63-11cf-bff6-08002be23f2f Server: (null) Level: RPC_C_AUTHN_LEVEL_PKT_PRIVACY Service: RPC_C_AUTHN_WINNT Protocol Sequence: ncalrpc Client Address: node002 Network Option: .

000010d4.0000147c::2013/07/15-15:52:15.808 DBG   [API] s_ApiClusterControl(GET_COMMON_PROPERTIES)

000010d4.0000147c::2013/07/15-15:52:15.808 DBG   [API] s_ApiClusterControl(GET_COMMON_PROPERTIES)

000010d4.000016f4::2013/07/15-15:52:15.808 INFO  [ReM][Follower] Got direction from 1. Epoch is now 1, will resume from SN 0.  Fault Tolerant Session ID is 58d685e0-1c0b-4027-959f-04b2fb2f8ecc

000010d4.000016f4::2013/07/15-15:52:15.808 INFO  [ReM] Sending connection down normal path.

000010d4.000016f4::2013/07/15-15:52:15.808 INFO  [NODE] Node 2: New join with n1: stage: 'Update NetFT Route'

000010d4.000016f4::2013/07/15-15:52:15.808 INFO  [JPM] Received a new stream from node001

000010d4.000016f4::2013/07/15-15:52:15.808 INFO  [NODE] Node 2: New join with n1: stage: 'Send Current Membership Status for Join Policy'

000010d4.000016f4::2013/07/15-15:52:15.808 INFO  [MM] Node 2: Adding a stream to existing node 1

000010d4.000016f4::2013/07/15-15:52:15.808 INFO  [NODE] Node 2: n1 node object adding stream

000010d4.000016f4::2013/07/15-15:52:15.808 DBG   [NODE] Node 2: n1 node object got a channel

000010d4.000016f4::2013/07/15-15:52:15.808 DBG   [NODE] Node 2: Using new stream to n1, setting epoch to 1

000010d4.000016f4::2013/07/15-15:52:15.808 DBG   [NODE] Node 2: Done closing stream to n1

000010d4.000016f4::2013/07/15-15:52:15.808 DBG   [NODE] Node 2: My Fault Tolerant Session Id is now 58d685e0-1c0b-4027-959f-04b2fb2f8ecc

000010d4.000016f4::2013/07/15-15:52:15.808 INFO  [NODE] Node 2: No reconnect in progress to n1, updating send queue based on new stream.

000010d4.000016f4::2013/07/15-15:52:15.808 DBG   [NODE] Node 2: Treating stream with n1 as new connection because epoch (1) is <= 1.

000010d4.000016f4::2013/07/15-15:52:15.808 INFO  [MQ-node001] Clearing 0 unsent and 0 unacknowledged messages.

000010d4.000016f4::2013/07/15-15:52:15.808 INFO  [NODE] Node 2: Highest version with n1 = Major 6 Minor 7601 Upgrade 7 ClusterVersion 0x00061DB1, lowest = Major 6 Minor 7601 Upgrade 7 ClusterVersion 0x00061DB1

000010d4.000016f4::2013/07/15-15:52:15.808 INFO  [NODE] Node 2: Done processing new stream to n1.

000010d4.0000147c::2013/07/15-15:52:15.808 INFO  [PULLER node001] Just about to start reading from <refcounted count='2' typeid='.?AVBufferedStream@cxl@@'/>

000010d4.000016f4::2013/07/15-15:52:15.808 DBG   [CORE] Node 2: sending jpm/welcome to JPMA at node001

000010d4.000016f4::2013/07/15-15:52:15.808 INFO  [JPM] Node 2: Selected partition 802(1) as a target for join

000010d4.000016f4::2013/07/15-15:52:15.808 DBG   [JPM] Node 2: join attempt 802(1) was vetoed. Will retry

000010d4.000016f4::2013/07/15-15:52:15.808 DBG   [CORE] Veto Cancel Requested

000010d4.0000123c::2013/07/15-15:52:15.808 DBG   [NODE] Node 2: just about to send a message of size 226 to 1

000010d4.0000123c::2013/07/15-15:52:15.808 DBG   [NODE] Node 2: message to node 1 sent

000010d4.0000154c::2013/07/15-15:52:15.823 DBG   [RCM] rpc binding handle for Cluster Name: HDL(18e9fe0)

000010d4.0000154c::2013/07/15-15:52:15.823 DBG   Sending control 3

00001048.00000eec::2013/07/15-15:52:15.823 INFO  [RES] Network Name <Cluster Name>: NetNameOpen Invoked

00001048.00000eec::2013/07/15-15:52:15.823 INFO  [RES] Network Name <Cluster Name>: Successful open of resid 3244336

000010d4.00000e30::2013/07/15-15:52:15.823 INFO  [NM] Received request from client address node002.

000010d4.00000e30::2013/07/15-15:52:15.823 DBG   [API] Authenticated client--Client: NT AUTHORITY\SYSTEM Interface: 299bc84a-de09-49e9-a240-8a1042d5d60a Server: (null) Level: RPC_C_AUTHN_LEVEL_PKT_PRIVACY Service: RPC_C_AUTHN_WINNT Protocol Sequence: ncalrpc Client Address: node002 Network Option: .

000010d4.00000e30::2013/07/15-15:52:15.823 INFO  [RCM] HandleMonitorReply: OPENRESOURCE for 'Cluster Name', gen(0) result 0.

00001048.00000eec::2013/07/15-15:52:15.823 INFO  [RES] Network Name <Cluster Name>: Getting a virtual computer account token.

00001048.00000eec::2013/07/15-15:52:15.839 INFO  [RES] Network Name <Cluster Name>: Resource object did not contain the cached AD Domain. Obtaining.

000010d4.0000147c::2013/07/15-15:52:15.948 INFO  [JPM] Node 2: Node 1 is in view 802(1) and hasQuorum = true

00001048.00000eec::2013/07/15-15:52:15.995 INFO  [RES] Network Name <Cluster Name>: Got new Logon Session.

000010d4.0000154c::2013/07/15-15:52:15.995 INFO  [RCM] HandleMonitorReply: OPENRESOURCE for 'File Share Witness (\\witness01.company.com\DAG01.company.com)', gen(0) result 0.

000010d4.000013d4::2013/07/15-15:52:15.995 INFO  [QUORUM] Node 2: setting quorum id to bd0940e0-a02c-4622-86e0-d9418f055e02 (storage-capable: false)

000010d4.000013d4::2013/07/15-15:52:15.995 DBG   [RCM] rcm::RcmAgent::SetQuorumResource(bd0940e0-a02c-4622-86e0-d9418f055e02)

000010d4.000013d4::2013/07/15-15:52:15.995 INFO  [QUORUM] Node 2: online quorum bd0940e0-a02c-4622-86e0-d9418f055e02

000010d4.000013d4::2013/07/15-15:52:15.995 DBG   [RCM] rcm::RcmAgent::Online(bd0940e0-a02c-4622-86e0-d9418f055e02)

000010d4.000013d4::2013/07/15-15:52:15.995 DBG   [RCM] rcm::RcmResource::IsReadyToGoOnline=> (File Share Witness (\\witness01.company.com\DAG01.company.com), true)

000010d4.000013d4::2013/07/15-15:52:15.995 INFO  [RCM] TransitionToState(File Share Witness (\\witness01.company.com\DAG01.company.com)) Offline-->OnlineCallIssued.

000010d4.000013d4::2013/07/15-15:52:15.995 INFO  [RCM] rcm::RcmGroup::UpdateStateIfChanged: (Cluster Group, ClusterGroupChoosingOwner --> Pending)

000010d4.000013d4::2013/07/15-15:52:15.995 DBG   [RCM] rcm::RcmResource::WaitForState(File Share Witness (\\witness01.company.com\DAG01.company.com), Online)

000010d4.00000e30::2013/07/15-15:52:15.995 DBG   [CM] mscs::CheckpointManager::PreOnline: File Share Witness (\\witness01.company.com\DAG01.company.com)

000010d4.00000e30::2013/07/15-15:52:15.995 DBG   [RCM] Issuing Arbitrate(File Share Witness (\\witness01.company.com\DAG01.company.com)) to RHS.

00001048.00000eec::2013/07/15-15:52:15.995 INFO  [RES] File Share Witness <File Share Witness (\\witness01.company.com\DAG01.company.com)>: Beginning arbitration ...

000010d4.0000154c::2013/07/15-15:52:16.042 DBG   [API] s_ApiMoveGroupToNode(Cluster Group, 1)

000010d4.0000154c::2013/07/15-15:52:16.042 INFO  [RCM] rcm::RcmApi::MoveGroup: (Cluster Group, 1)

000010d4.0000154c::2013/07/15-15:52:16.042 DBG   [RCM] rcm::RcmGroup::WaitForStableState(Cluster Group, MustBeOfflineOrFailed::No)

000010d4.0000154c::2013/07/15-15:52:16.042 DBG   [RCM] rcm::RcmGroup::WaitForStableState: Group Cluster Group is Pending; group is not moving.

000010d4.0000154c::2013/07/15-15:52:16.042 DBG   [RCM] rcm::RcmGroup::WaitForStableState: Resources which are not in stable state:

000010d4.0000154c::2013/07/15-15:52:16.042 DBG   [RCM] File Share Witness (\\witness01.company.com\DAG01.company.com): OnlineCallIssued,

000010d4.00000ddc::2013/07/15-15:52:16.808 INFO  [JPM] Node 2: Selected partition 802(1) as a target for join

000010d4.00000ddc::2013/07/15-15:52:16.808 DBG   [JPM] Node 2: join attempt 802(1) was vetoed. Will retry

00001048.00000eec::2013/07/15-15:52:17.261 INFO  [RES] File Share Witness <File Share Witness (\\witness01.company.com\DAG01.company.com)>: Opening file \\witness01.company.com\DAG01.company.com\bd0940e0-a02c-4622-86e0-d9418f055e02\Witness.log.

00001048.00000eec::2013/07/15-15:52:17.526 INFO  [RES] File Share Witness <File Share Witness (\\witness01.company.com\DAG01.company.com)>: Attempting to lock file \\witness01.company.com\DAG01.company.com\bd0940e0-a02c-4622-86e0-d9418f055e02\Witness.log, try 1 of 30.

000010d4.00000ddc::2013/07/15-15:52:17.808 INFO  [JPM] Node 2: Selected partition 802(1) as a target for join

000010d4.00000ddc::2013/07/15-15:52:17.808 DBG   [JPM] Node 2: join attempt 802(1) was vetoed. Will retry

000010d4.00000ddc::2013/07/15-15:52:18.808 INFO  [JPM] Node 2: Selected partition 802(1) as a target for join

000010d4.00000ddc::2013/07/15-15:52:18.808 DBG   [JPM] Node 2: join attempt 802(1) was vetoed. Will retry

000010d4.0000154c::2013/07/15-15:52:19.167 DBG   [RCM] File Share Witness (\\witness01.company.com\DAG01.company.com): OnlineCallIssued,

-------------------------------------------------------------NODE001-----------------------------------------------------------------

000019a0.00001924::2013/07/15-15:52:14.203 INFO  [VER] Got new TCP connection. Exchanging version data.

000019a0.00001924::2013/07/15-15:52:14.343 DBG   [VER] Calculated cluster versions: highest [Major 6 Minor 7601 Upgrade 7 ClusterVersion 0x00061DB1], lowest [Major 6 Minor 7601 Upgrade 7 ClusterVersion 0x00061DB1] with exclude node list: (2)

000019a0.00001924::2013/07/15-15:52:14.343 INFO  [VER] Checking version compatibility for node node002 id 2 with following versions: highest [Major 6 Minor 7601 Upgrade 7 ClusterVersion 0x00061DB1], lowest [Major 6 Minor 7601 Upgrade 7 ClusterVersion 0x00061DB1].

000019a0.00001924::2013/07/15-15:52:14.343 INFO  [VER] Version check passed: node and cluster highest supported versions match.

000019a0.00001924::2013/07/15-15:52:14.343 INFO  [SV] Negotiating message security level.

000019a0.00001924::2013/07/15-15:52:14.484 INFO  [SV] Already protecting connection with message security level 'Sign'.

000019a0.00001924::2013/07/15-15:52:14.484 INFO  [FTI] Got new raw TCP/IP connection.

000019a0.00001924::2013/07/15-15:52:14.484 INFO  [FTI][Initiator] This node (1) is initiator

000019a0.00001924::2013/07/15-15:52:14.484 WARN  [FTI][Initiator] Ignoring duplicate connection: usable route already exists

000019a0.00001924::2013/07/15-15:52:14.484 DBG   [CHANNEL 192.168.2.22:~51189~] Close().

000019a0.00001924::2013/07/15-15:52:14.484 DBG   [CHANNEL 192.168.2.22:~51189~]/send: Attempting to perform I/O on closed stream.

000019a0.00001924::2013/07/15-15:52:14.484 DBG   [CHANNEL 192.168.2.22:~51189~] Not closing handle because it is invalid.

000019a0.00001924::2013/07/15-15:52:14.484 INFO  [CHANNEL 192.168.2.22:~51189~] graceful close, status (of previous failure, may not indicate problem) ERROR_SUCCESS(0)

000019a0.00001924::2013/07/15-15:52:14.484 DBG   [CHANNEL 192.168.2.22:~51189~] Not closing handle because it is invalid.

000019a0.00001924::2013/07/15-15:52:14.484 WARN  mscs::ListenerWorker::operator (): GracefulClose(1226)' because of 'channel to remote endpoint 192.168.2.22:~51189~ is closed'

000019a0.00001924::2013/07/15-15:52:15.062 DBG   [NETFTAPI] received NsiParameterNotification for 169.254.1.204 (IpDadStatePreferred )

000019a0.00001924::2013/07/15-15:52:15.062 DBG   [NETFTAPI] Signaled NetftLocalConnect event for 169.254.1.204

000019a0.00001924::2013/07/15-15:52:15.062 DBG   [NETFTEVM] FTI NetFT event handler got event: Local endpoint 169.254.1.204:~0~ connected

000019a0.00001924::2013/07/15-15:52:15.062 DBG   [NETFTEVM] TM NetFT event handler got event: Local endpoint 169.254.1.204:~0~ connected

000019a0.00001924::2013/07/15-15:52:15.062 DBG   [NETFTEVM] IM NetFT event handler got event: Local endpoint 169.254.1.204:~0~ connected

000019a0.00001924::2013/07/15-15:52:15.062 DBG   [WM] Filtering event NETFT_LOCAL_CONNECT? 1

000019a0.00001820::2013/07/15-15:52:15.062 DBG   [NETFTEVM] TM NetFT event dispatcher pushing event: Local endpoint 169.254.1.204:~0~ connected

000019a0.000013a0::2013/07/15-15:52:15.062 DBG   [NETFTEVM] FTI NetFT event dispatcher pushing event: Local endpoint 169.254.1.204:~0~ connected

000019a0.000019d4::2013/07/15-15:52:15.062 DBG   [NETFTEVM] IM NetFT event dispatcher pushing event: Local endpoint 169.254.1.204:~0~ connected

000019a0.000019d4::2013/07/15-15:52:15.062 INFO  [IM] got event: Local endpoint 169.254.1.204:~0~ connected

000019a0.00001bfc::2013/07/15-15:52:15.390 DBG   [SM] Joiner: ISC returned status = 590610 output Blob size 1577

000019a0.00001bfc::2013/07/15-15:52:15.531 DBG   [SM] Joiner: Received SSPI blob from the Sponsor of size 155

000019a0.00001bfc::2013/07/15-15:52:15.531 DBG   [SM] Joiner: ISC returned status = 0 output Blob size 0

000019a0.000019a8::2013/07/15-15:52:15.546 INFO  [SV] Authentication and authorization were successful

000019a0.000019a8::2013/07/15-15:52:15.546 DBG   [SM] Joiner: Initialized with SPN = node002, Package = Kerberos, RequiredCtxAttrib = 67586, HandShakeTimeout = 30000

000019a0.00001bfc::2013/07/15-15:52:15.546 DBG   [SM] Handling auth handshake posted by thread id 6568

000019a0.00001bfc::2013/07/15-15:52:15.546 DBG   [SM] Joiner: ISC returned status = 590610 output Blob size 1577

000019a0.00001bfc::2013/07/15-15:52:15.671 DBG   [SM] Joiner: Received SSPI blob from the Sponsor of size 155

000019a0.00001bfc::2013/07/15-15:52:15.671 DBG   [SM] Joiner: ISC returned status = 0 output Blob size 0

000019a0.000019a8::2013/07/15-15:52:15.671 INFO  [SV] Security Handshake successful while obtaining SecurityContext for NetFT driver

000019a0.000019a8::2013/07/15-15:52:15.671 DBG   [SV] Incoming (second) connection from node002 is secure

000019a0.000019a8::2013/07/15-15:52:15.671 INFO  [ReM] Got stream info from fe80::7964:c5c5:3b5:7833%12:~37758~ to fe80::71f7:22a3:89fb:ab11%12:~3343~.

000019a0.000019a8::2013/07/15-15:52:15.671 DBG   [ReM] Exchanging local info.

000019a0.000019a8::2013/07/15-15:52:15.671 DBG   [ReM] Sending local info.

000019a0.000019a8::2013/07/15-15:52:15.671 DBG   [ReM] Local info sent, receiving remote info.

000019a0.000019a8::2013/07/15-15:52:15.812 DBG   [ReM] Remote info received from 2:node002.

000019a0.000019a8::2013/07/15-15:52:15.812 DBG   [ReM][Leader] I did not initiate connection, getting epoch from stream NodeObject.

000019a0.000019a8::2013/07/15-15:52:15.812 DBG   [NODE] Node 1: To n2 getting epoch (currently 0)

000019a0.000019a8::2013/07/15-15:52:15.812 DBG   [ReM][Leader] I am the leader, my epoch = 0, sn = 0

000019a0.000019a8::2013/07/15-15:52:15.953 DBG   [ReM][Leader] The follower's epoch = 0, SN = 0, Fault Tolerant Session ID = 00000000-0000-0000-0000-000000000000

000019a0.000019a8::2013/07/15-15:52:15.953 DBG   [ReM][Leader] My node did not initiate the connection.

000019a0.000019a8::2013/07/15-15:52:15.953 INFO  [ReM][Leader] Allowing new connection through to n2 (initiatorEpoch <0>, receiverEpoch <0>).

000019a0.000019a8::2013/07/15-15:52:15.953 INFO  [ReM] Sending connection down normal path.

000019a0.000019a8::2013/07/15-15:52:15.953 INFO  [JPM] Received a new stream from node002
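
The excerpts from both nodes share one line layout (process.thread IDs, timestamp, level, an optional bracketed component, then the message), so the recurring events can be tallied mechanically rather than by eye. Below is a minimal sketch, assuming that layout holds for the whole cluster log. The names `parse_line` and `summarize` and the counter keys are hypothetical, chosen for illustration only; they are not part of any cluster tooling.

```python
import re
from collections import Counter

# Layout assumed from the excerpts above:
#   <pid>.<tid>::YYYY/MM/DD-HH:MM:SS.mmm LEVEL [COMPONENT] message
# The bracketed component is optional (some WARN lines have none).
LINE_RE = re.compile(
    r"^(?P<pid>[0-9a-f]{8})\.(?P<tid>[0-9a-f]{8})::"
    r"(?P<ts>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"(?:\[(?P<component>[^\]]+)\])?\s*(?P<msg>.*)$"
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    return m.groupdict() if m else None


def summarize(lines):
    """Count the three event types discussed in this post (hypothetical keys)."""
    counts = Counter()
    for line in lines:
        rec = parse_line(line)
        if rec is None:
            continue  # blank lines, separators, anything not in the assumed layout
        msg = rec["msg"]
        if rec["component"] == "JPM" and "was vetoed" in msg:
            counts["join_vetoed"] += 1
        elif rec["component"] == "RES" and "Attempting to lock file" in msg:
            counts["witness_lock_attempt"] += 1
        elif "Ignoring duplicate connection" in msg:
            counts["duplicate_connection"] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage (file name is an assumption): python tally.py cluster.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        for key, value in sorted(summarize(fh).items()):
            print(key, value)
```

Run against the node002 log, this would show how many join attempts were vetoed next to how many witness lock attempts were made; run against the node001 log, it would show how often the duplicate-connection warning appears in the same time window.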



