ouch! CRS-1612:Network communication with node … Node node, number 1, was manually shut down

December 7, 2012

another ouch with GI 11.2.0.3 on Solaris 10 SPARC 64-bit: if one of your cluster nodes restarts and you cannot find any evident reason for it apart from entries like these in the logs:

[cssd(1084)]CRS-1612:Network communication with node1 node (1) missing for 50% of timeout interval. Removal of this node from cluster in 14.258 seconds
[cssd(1084)]CRS-1625:Node node1, number 1, was manually shut down
[cssd(1084)]CRS-1601:CSSD Reconfiguration complete. Active nodes are node2 .
[ctssd(1117)]CRS-2407:The new Cluster Time Synchronization Service reference node is host node2.
[crsd(1522)]CRS-5504:Node down event reported for node 'node1'.
[cssd(1084)]CRS-1601:CSSD Reconfiguration complete. Active nodes are node1 node2 .

… and:

[ CSSD][20](:CSSNM00018:)clssnmvDiskCheck: Aborting, 0 of 1 configured voting disks available, need 1
[ CSSD][20]###################################
[ CSSD][20]clssscExit: CSSD aborting from thread clssnmvDiskPingMonitorThread
[ CSSD][20]###################################
[ CSSD][20](:CSSSC00012:)clssscExit: A fatal error occurred and the CSS daemon is terminating abnormally
[ CSSD][20]
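
to check a pile of logs for this signature quickly, something like the following python sketch might help. it only looks for the messages quoted above; the function names are made up for illustration, and a match is a hint to dig further, not proof of the bug.

```python
import re

# Patterns copied from the log excerpts quoted in this post.
EVICTION = re.compile(r"CRS-1612:Network communication with node")
SHUTDOWN = re.compile(r"CRS-1625:Node \S+, number \d+, was manually shut down")
DISK_CHECK = re.compile(
    r"clssnmvDiskCheck: Aborting, (\d+) of (\d+) configured voting disks "
    r"available, need (\d+)"
)


def parse_disk_check(line):
    """Return the voting disk counts from a clssnmvDiskCheck abort line, or None."""
    match = DISK_CHECK.search(line)
    if match is None:
        return None
    available, configured, needed = (int(group) for group in match.groups())
    return {"available": available, "configured": configured, "needed": needed}


def matches_signature(lines):
    """True if the lines contain both halves of the pattern described here:
    a CRS-1612/CRS-1625 message and a CSSD abort with 0 of 1 voting disks."""
    saw_node_message = False
    saw_single_disk_abort = False
    for line in lines:
        if EVICTION.search(line) or SHUTDOWN.search(line):
            saw_node_message = True
        counts = parse_disk_check(line)
        if counts and counts["configured"] == 1 and counts["available"] == 0:
            saw_single_disk_abort = True
    return saw_node_message and saw_single_disk_abort
```

feed it the lines of the relevant log files, for example `matches_signature(open(path).read().splitlines())` for each file combined into one list.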

you probably hit bug 13869978. this seems to happen only if the cluster diskgroup uses external redundancy, so that only one voting disk was created.

two solutions are available:

  • migrate the voting disk to an asm mirrored diskgroup ( normal or high redundancy )
  • or apply PSU4 on top of 11.2.0.3
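
for the first option, a rough python sketch of the commands involved could look like this. it assumes `crsctl` is on the PATH, that you run it with sufficient privileges, and that a normal or high redundancy diskgroup already exists; the name `VOTE_NR` is purely hypothetical. by default it only returns the command line instead of executing anything, so check the commands against the oracle documentation for your version before switching `dry_run` off.

```python
import subprocess


def query_votedisk_cmd():
    """Command that lists the currently configured voting disks."""
    return ["crsctl", "query", "css", "votedisk"]


def replace_votedisk_cmd(diskgroup):
    """Command that moves the voting disks into the given ASM diskgroup."""
    name = diskgroup.lstrip("+")
    if not name:
        raise ValueError("diskgroup name must not be empty")
    return ["crsctl", "replace", "votedisk", "+" + name]


def run(cmd, dry_run=True):
    """Return the command line as text, or execute it when dry_run is False."""
    if dry_run:
        return " ".join(cmd)
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    # VOTE_NR is a made-up diskgroup name; replace it with your own.
    print(run(query_votedisk_cmd()))
    print(run(replace_votedisk_cmd("VOTE_NR")))
```

running the query command again after the migration should show more than one voting disk if the target diskgroup is mirrored.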

the same issue seems to exist on Linux.
