On Friday one of the DBAs from another team pinged me that he was having trouble in starting SQL Server 2000 service. The SQL Service was continued to be in starting state after the server was rebooted.
There were no error messages in the System/Application Event Log. The SQL Server Error Log was not even initialized (the latest error log had the information till the time the Server was rebooted). Since the Service Control Manager fails to show up any interactive pop-ups from sqlservr.exe, I started the SQL Server from the command prompt.
Within a few seconds after I entered sqlserve.exe in the command prompt the following error message was displayed.
You SQL Server installation is either corrupt or has been tampered with (unknown package id).
Please uninstall then re-run the setup to correct this problem
I have seen this error message earlier but it would state some files are missing, hence SQL Server would treat the installation as corrupt. In this case the reason was unknown package id. The fix for this error on SQL Server 7.0 is to run the SQL Server setup again. But this instance is running SQL Server 2000. Moreover I would consider running the Setup again on a Production Instance as the last option. I started digging deep.
There were no changes implemented on this server/instance in the recent past. But some missing setting in the SQL Server installation was brought to light by the server reboot. I compared the files in
C:Program FilesMicrosoft SQL Server80ToolsBinn folder with a working instance. All the files were in tact.
Since there were no error messages anywhere, I was left with a very few options. The best option to track down the root cause was to follow each and every action of sqlservr.exe till it returned the error. The best tool to do this is the Process Monitor ( I simply love all the sysinternals tools) .
Opening the Process Monitor tool would trace each and every action of all the processes running on the server. It is important to Filter the requisite events failing which it would consume a lot of resources on the server.
In the Process Monitor Filter window, I included sqlservr.exe as in the below screenshot.
The events related to sqlservr.exe were captured and as expected the event collection came to a stand still as soon as the SQL Server process terminated with error. The Result column of the events collection did not have any Failed results. But just before the error message popped up, there was a name not found result for an entry in the Registry related to SQL Server.
The checksum key in the Registry was reported to be missing. This value was present in the Registry of a working instance whereas it was missing for this instance.
Since I found out the missing link here, I exported the checksum key from the *working* machine’s Registry to a .reg file and imported it onto the instance having trouble. When I tried starting the SQL Server service again, it started successfully as if nothing had happened earlier!
I don’t know how this key got deleted from the Registry but finding it out was a good challenge. After accepting a few thanks and exchanging a few smilies on the IM window, I ended my Friday with a sense of accomplishment.