Tuesday, March 27, 2012

Cluster recovery from node failure

Cluster services give us the high availability we need, and that is
great. But I have never seen any discussion of what happens after a
node fails: what do you do to get everything back to the
active-passive tandem?

I imagine the recovery procedure is much the same whether the active or
the passive node fails, so I'll describe a scenario we have actually
encountered. The system hard drive (not the shared disk) on the primary
node fails, and the cluster fails over to the passive node. These are
the problems I have at hand:
- After reinstalling Windows, I need to install the driver and
configure the permissions to access the SAN. I see no way to do that,
since the secondary node has exclusive access to the disks.
- Supposing I get that working, is there any way to install SQL Server
so that it knows this server used to be the primary node and attaches
the database and transaction log automatically?
- Finally, there is no proper way to apply SQL 2000 Service Pack 3a.
Originally, when the cluster was fully functional, the service pack was
applied to the active node and that automatically upgraded the passive
node. Now we have one machine without SP3a and one machine with SP3a
already installed. See the problem?

Consider all of the above as one big question: what is the proper
procedure to restore a cluster when one of the nodes goes down, whether
it's the active or the passive node?

"gotdough" <praemonitus@.hotmail.com> wrote in message
news:1ad01306.0409120058.3df26726@.posting.google.com...

This KB article might help:

http://support.microsoft.com/defaul...0&Product=sql2k

You should probably also post this in microsoft.public.sqlserver.clustering
to see if you get a better response.

Simon
