Today at work, while I was in my own world little ping on the IM brought me back to the real world. One of my team members pasted this screenshot in the IM.
Since the Agent was red, I told “Please start the SQL Agent and check why it went down”. But he told me that he was pointing to the “SQL Server Logs” being empty. His confidence that I would read his mind was remarkable!
Bang came the next error message when he clicked on “SQL Server Logs” in SSMS
A severe error occurred on the current command. The results, if any, should be discarded. (Microsoft SQL Server, Error: 0)
This error message was too generic. Since reading Error Log from the GUI had failed, he tried reading it through sp_readerrorlog. It also threw the below error.
Msg 22004, Level 16, State 1, Line 0 Failed to open loopback connection. Please see event log for more information. Msg 22004, Level 16, State 1, Line 0 error log location not found
The “error log location not found” message gave us some direction. The first step was to look if the ErrorLog existed in the path. It was accessible through Windows Explorer and could open it in Notepad. But there were no entries for the last 20 days! Then it had to be a permission issue with the SQL Server service account. The SQL Server services were running under “Local System Account”. NT Authority\SYSTEM account had the correct permissions on all the registry keys and the folder containing the Error Log.
The next step was to check the permissions of this account in SQL Server. Surprise! NTAUTHORTY\SYSTEM was neither granted explicitly through a login nor through Group Membership. And it was the service account for SQL Services!
Since we are going great guns on SQL Server hardening, it had be someone from DBA team who had messed this up. Both of us started walking down the memory lane. Since this TEST box was running under Local System Account we had a Change Control to change the SQL Services to run under a Domain Account. After making this change the privileges of NTAUTHORITY\SYSTEM account was supposed to be revoked. At that time, the service account could not be changed due to some cross-domain authentication issue on which our System Admins are still working.
Since the first step was a failure, the change control was supposed to be rolled back. But the DBA went ahead with Step 2 and revoked the sydadmin privileges of NTAUTHORITY\SYSTEM. To validate the changes, the services were supposed to be restarted. He did not. No one knew about this fiasco till we encountered this error today! I assume the DBA was in deep night shift, so performed only 30% of the implementation. For sure this post could not have inspired him, since it was posted very recently.
Now the SQL Service is up but in a bad condition. If we were to restart the services there would have been many more surprises. Now we will be granting back the privileges to NT AUTHORITY\SYSTEM account, till the System Admins “fix” the Domain authentication problem.
I should thank the DBA that he did not do this goof up on a Production Instance.