One more day with the SQL Server Cluster Resource not coming Online

Self-healing computer, a DBA's friend

Last Saturday I was out on an official outing. Within a few minutes of the “formal” presentation starting, my phone started ringing. One of my team members was on a Disaster Recovery (DR) exercise and the SQL Server instance was not listening to him. It was a SQL Server 2005 instance running on a Windows Server 2003 cluster. The problem was that the SQL Server Resource would stay in the Online Pending state for a very long time and then go into the Offline state.

As always, I first checked the SQL Server Error Log, but there was no clue in it. After starting the MSDTC service, the instance would not go any further. The Cluster Error Log (cluster.log) did not have any useful information either. The System Event Log had an entry like this.

Since the Cluster Resource was not coming online, I started the instance from the command prompt using sqlservr.exe. Started this way, the instance came up without any errors.

Since the SQL Server instance itself was intact, this had to be an issue with the SQL Server service not responding in a timely fashion. In the meantime Gurpreet Singh Sethi (blog), who was with me at the time, pointed to an interesting message in the System Event Log.

The cluster resource was timing out while trying to come online. Since this is a DR box, its performance might at times not be on par with the Production server. Gurpreet also suggested that we raise the Pending Timeout value from its default of 180 seconds. The error message in the System Event Log pointed to the same thing. Because this instance is DR, I did not have to worry about processing Change Controls to make the requisite changes, so I changed the Pending Timeout value from 180 to 360.
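
I made this change through the resource properties, but the same change can be scripted. Below is a minimal Python sketch, assuming the cluster.exe command-line tool is available on the node and that the resource is named "SQL Server" (a placeholder, use your own resource name). As far as I know, cluster.exe stores PendingTimeout in milliseconds while the properties dialog shows seconds, so the helper does the conversion. The function names are my own and are only for illustration.

```python
import subprocess


def pending_timeout_command(resource_name, timeout_seconds):
    """Build the cluster.exe argument list that sets PendingTimeout.

    Assumption: cluster.exe expects the value in milliseconds, so the
    value given here in seconds is multiplied by 1000.
    """
    if timeout_seconds <= 0:
        raise ValueError("timeout_seconds must be positive")
    timeout_ms = int(timeout_seconds * 1000)
    return [
        "cluster", "resource", resource_name,
        "/prop", f"PendingTimeout={timeout_ms}",
    ]


def set_pending_timeout(resource_name, timeout_seconds):
    """Run the command on a cluster node; raises if cluster.exe fails."""
    subprocess.run(
        pending_timeout_command(resource_name, timeout_seconds),
        check=True,
    )
```

Calling set_pending_timeout("SQL Server", 360) on a cluster node would correspond to the change from 180 to 360 seconds described above. On a Production instance this would still need to go through Change Control.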

Bingo! The SQL Server Resource came online as if nothing had happened earlier. Increasing the Pending Timeout value helped in this scenario because DR servers are usually slower than Production, so anything that waits on them, such as an application trying to connect, may time out.

Now that the cluster resource was online, I did not waste a minute in dropping off the call and getting back to enjoying the team outing on a Saturday!

3 thoughts on “One more day with the SQL Server Cluster Resource not coming Online”

  1. Pingback: SQL Server Cluster resource doesn't come online | Service control: stop before startup | SQL Server DBA Diaries of Pradeep Adiga

  2. Daniel Imamura

    Thanks! This solved my problem.
    After doing an IP change (networking team request), my SQL cluster instances were stuck in Online pending with this error showing up in the event logs…
    [sqsrvres] OnlineThread: ResUtilsStartResourceService failed (status 5b4)
    [sqsrvres] OnlineThread: Error 5b4 bringing resource online.
    Turns out that for some reason it’s taking longer (about 5 mins) to come online now…

    Anyway, everything is AOK. Thanks!

  3. Pingback: SQL Server Cluster resource fails with “Data source name not found and no default driver specified” error - SQL Server - SQL Server - Toad World
