Sunday, March 25, 2012

Cluster not responding

We have been having a problem where the cluster controller
does not get a response from SQL Server to the "are
you alive" request (select @@SERVERNAME). It seems that
SQL Server does not respond within a couple of minutes
and the request times out.
The cluster controller then shuts down SQL Server,
thinking it has failed, and tries to restart it.
Has anyone got a way to increase the timeout, or any idea
why SQL Server is not responding? (P.S. It does not
seem to be that busy.)
When does this happen? Is it intermittent or do you have a pattern?
What type of activities are taking place at that time?
IsAlive happens every 60 seconds by default. It can be changed to a higher value in the properties of the SQL Server resource. Rather than increasing the interval between IsAlive checks, it's important to find out what is going
on when you see this behavior.
Best Regards,
Uttam Parui
Microsoft Corporation
This posting is provided "AS IS" with no warranties, and confers no rights.
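To make the timeout question concrete, here is a minimal Python sketch of an "are you alive" style check. It is not how the cluster service is implemented; `is_alive`, `slow_check`, and the timeout values are hypothetical names and numbers chosen for illustration, and `slow_check` only stands in for a query such as select @@SERVERNAME that answers slowly.

```python
import concurrent.futures
import time


def is_alive(check, timeout_s):
    """Run check() in a worker thread.

    Returns True only if check() completes without error within timeout_s
    seconds; a slow or failing check is reported as "not alive".
    """
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(check)
    try:
        future.result(timeout=timeout_s)
        return True
    except Exception:
        # Covers both the timeout and any error raised by check() itself.
        return False
    finally:
        # Do not block on a check that is still running.
        pool.shutdown(wait=False)


def slow_check():
    """Hypothetical stand-in for a server that answers, but slowly."""
    time.sleep(0.2)
    return "SERVER01"


if __name__ == "__main__":
    print(is_alive(slow_check, timeout_s=0.05))  # short timeout: reported as failed
    print(is_alive(slow_check, timeout_s=1.0))   # longer timeout: reported as alive
```

The same slow response is treated as a failure or as healthy depending only on the timeout, which is why a longer timeout hides the symptom without explaining why the server was slow to answer.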
|||Not a real pattern, but the suggestion is that it happens when
the backups are in progress, though not every time,
i.e. the backups run every night at 3pm but the problem occurs
only about once a month.
What I'd like to do is tell the cluster controller to wait
longer before assuming the database is down.
The application is not cluster aware, and a failover
requires us to log the overnight batch applications
in again. But it does mean the service is available
for the online users in the morning.
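One way to test the "it happens during backups" suspicion is to line up the failure times against the backup windows. The Python sketch below assumes you have already collected both lists by hand; the function name and all timestamps are made up for illustration. On SQL Server the windows could be taken from the backup_start_date and backup_finish_date columns of msdb.dbo.backupset, and the failure times from the cluster or event logs.

```python
from datetime import datetime


def failures_during_backups(backup_windows, failure_times):
    """Return the failure timestamps that fall inside any backup window.

    backup_windows is a list of (start, end) datetime pairs;
    failure_times is a list of datetimes.
    """
    return [
        t for t in failure_times
        if any(start <= t <= end for start, end in backup_windows)
    ]


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the thread.
    windows = [
        (datetime(2004, 9, 1, 15, 0), datetime(2004, 9, 1, 16, 30)),
        (datetime(2004, 9, 2, 15, 0), datetime(2004, 9, 2, 16, 20)),
    ]
    failures = [
        datetime(2004, 9, 1, 15, 45),  # inside the first backup window
        datetime(2004, 9, 3, 9, 10),   # outside every window
    ]
    for t in failures_during_backups(windows, failures):
        print("failure during backup:", t)
```

If every failure lands inside a backup window, the backup job (or whatever else runs at that time) is the place to look; if some fall outside, the cause is probably elsewhere.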
|||I have seen something like this when an Anti-Virus scanner is running on the
backup target server. It is something to check.
Geoff N. Hiten
Microsoft SQL Server MVP
Senior Database Administrator
Careerbuilder.com
I support the Professional Association for SQL Server
www.sqlpass.org
|||"David" <daldlay@.hotmail.com> wrote in message
news:35e801c4a53c$55a60160$a301280a@.phx.gbl...
> Not a real pattern but suggestion is that it happens when
> the backups are in progress, but not every time.
> ie the backups happen every night at 3pm but only once a
> month.
Are backups going over the same wire that the heartbeat is using?
