Tuesday, March 27, 2012

Cluster problems

I was wondering if any of you could offer some help with my cluster. I'm
troubleshooting an existing problem within the company where the website
crashes because the clustered SQL Servers' 8 CPUs reach 100%.
I've looked in the event logs of the SQL Servers and the only error is
one in the System Log:
User: N/A
Computer: SQLCLA
Source: ClusSvc
Category: Failover Mgr
Event ID: 1069
Description: Cluster resource 'SQL Server (PRODUCTION)' in Resource Group
'PRODUCTION Disk Group' failed.
Any help is gratefully received!
The simple answer is that the SQL instance didn't respond to the Cluster
Service LooksAlive and IsAlive tests within the allowed timeout period. The
cluster interpreted that as a failure and acted accordingly.
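To illustrate why a CPU-saturated instance looks "failed" to the cluster, here is a rough Python model of a health check with a deadline. It is only a sketch of the general idea, not how the Cluster Service actually implements LooksAlive or IsAlive; the function name `is_alive`, the probes, and the timeout values are all made up for the example.

```python
import concurrent.futures
import time


def is_alive(probe, timeout_s):
    """Return True only if probe() completes within timeout_s seconds.

    Hypothetical model: a probe that is healthy but too slow to answer
    is treated exactly like a probe that failed outright.
    """
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(probe)
    try:
        future.result(timeout=timeout_s)
        return True
    except concurrent.futures.TimeoutError:
        # No answer before the deadline: reported as a failure.
        return False
    except Exception:
        # The probe itself raised an error: also a failure.
        return False
    finally:
        # Do not block on a probe that is still running.
        pool.shutdown(wait=False)


def responsive_probe():
    """Stand-in for a server with spare CPU: answers immediately."""
    return None


def saturated_probe():
    """Stand-in for a server too busy to answer promptly."""
    time.sleep(0.5)


healthy_result = is_alive(responsive_probe, timeout_s=0.2)
saturated_result = is_alive(saturated_probe, timeout_s=0.05)
```

The point of the model is that the check cannot tell "down" from "too busy to answer", which is why the resource is marked failed even though SQL Server is still running.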
The complex answer is to find out why your server is running at CPU
saturation. Until you find the answer to that question, you will still
experience cluster failures. You need to do a performance impact analysis
of your server to determine the cause of the load. One common cause is
complex math or string manipulation code running in the server. That work
is usually better done at the application layer than at the data layer.
Unfortunately, fixing that generally requires an application change. If that
is off the table, start shopping for a 16-way or higher server.
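As a small illustration of moving string work to the application layer, the Python sketch below formats a display label from raw column values instead of asking the server to do it per row with functions such as UPPER, LTRIM and RTRIM. The column layout (first name, last name, city) and the function name `format_customer` are invented for the example; the original post does not say what code is actually causing the load.

```python
def format_customer(row):
    """Build a display label from raw values returned by a plain SELECT.

    row is assumed (hypothetically) to be (first_name, last_name, city).
    """
    first, last, city = (value.strip() for value in row)
    return f"{last.upper()}, {first.title()} ({city.title()})"


# Rows as they might come back from the database, untrimmed and unformatted.
rows = [("  ada ", "lovelace", "london  ")]
labels = [format_customer(row) for row in rows]
```

The database then only has to return the stored values, and the CPU cost of trimming and casing is paid on the application servers, which are usually easier to scale out.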
Geoff N. Hiten
Microsoft SQL Server MVP
Senior Database Administrator
"savvy95" <savvy95@.discussions.microsoft.com> wrote in message
news:554168C0-0879-45A4-9D43-1079F4E8A6AC@.microsoft.com...
> I was wondering if any of you could offer some help with my cluster. I'm
> troubleshooting an existing problem within the company where the website
> crashes because the clustered SQL Servers' 8 CPUs reach 100%.
> I've looked in the event logs of the SQL Servers and the only error is
> one in the System Log:
> User: N/A
> Computer: SQLCLA
> Source: ClusSvc
> Category: Failover Mgr
> Event ID: 1069
> Description: Cluster resource 'SQL Server (PRODUCTION)' in Resource Group
> 'PRODUCTION Disk Group' failed.
> Any help is gratefully received!
Followup to my previous post.
If this is a Hyper-Threading-enabled host, restrict MAXDOP (Maximum Degree
of Parallelism) to the actual physical CPU count or lower. This won't help
query processing but may give the server a chance to respond to the cluster
checks and keep it from failing over. The really good news is you can
change this without restarting your server.
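As a sketch of what that change could look like, the Python helper below builds the `sp_configure` statements for a given CPU count. It assumes two logical CPUs per physical core when Hyper-Threading is on, which is my assumption and should be checked against the real hardware; the helper name `maxdop_statements` is made up, and the statements would still need to be run by an administrator against the instance.

```python
def maxdop_statements(logical_cpus, hyperthreading):
    """Return T-SQL statements that cap MAXDOP at the physical CPU count.

    Assumption for this sketch: with Hyper-Threading enabled, every
    physical core shows up as two logical CPUs.
    """
    if logical_cpus < 1:
        raise ValueError("logical_cpus must be at least 1")
    physical = logical_cpus // 2 if hyperthreading else logical_cpus
    maxdop = max(1, physical)
    return [
        # 'max degree of parallelism' is an advanced option.
        "EXEC sp_configure 'show advanced options', 1;",
        "RECONFIGURE;",
        f"EXEC sp_configure 'max degree of parallelism', {maxdop};",
        # RECONFIGURE applies the new value without a restart.
        "RECONFIGURE;",
    ]


# The server in the question reports 8 CPUs; if those are hyper-threaded:
statements = maxdop_statements(logical_cpus=8, hyperthreading=True)
script = "\n".join(statements)
```

If the 8 CPUs are real physical processors rather than hyper-threaded ones, the same helper returns a cap of 8, and a lower value could be chosen by hand to leave headroom for the cluster checks.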
Geoff N. Hiten
Microsoft SQL Server MVP
Senior Database Administrator