Yesterday one of the Data Warehouse servers running SQL Server 2000 went down suddenly. It was found out that the Power Supply to that box had some issues. In order to reduce the down time business decided to move all the LUNs attached to that server to a new Server. Once all the LUNs are moved, the DBA was supposed to attach the databases. The DBA tasks were relatively easy. On the new server SQL Server was already installed and the build matched that of the server having issues. Once all the LUNs were attached to the new server they were named appropriately. All the DBA had to do was to modify the startup parameters of the SQL Server service to point to the correct location of the master database files.
After modifying the startup parameters the SQL Server service was restarted. Since the drives were properly named all the databases came online. The DBA job was over. After all the support teams did the validation the server was handed over to the Application Team to verify. The Application Team runs their load jobs using a third party scheduling agent. That software has a client component installed which runs the job locally after the job is triggered from the Central Server. The jobs are usually configured to run DTS packages which are stored locally on the server. The application job started failing with the Access denied error.
DBA quickly checked if the service account running the scheduling agent had requisite permissions on the instance. Permissions were not an issues since that account had sysadmin privileges on the instance and it was also part of Local Administrators on the Local Server. On checking the job definition, it was found that the job was referring to the alias (CName) of the server. The alias name was a human friendly short name for the server (not the SQL Server instance) since the actual server name was a bit lengthy. It was properly set at the DNS but it was referring to the old IP address. For anyone trying to access the alias using \servername it would prompt for the user name and password. Even after giving the correct login credentials, it would return the access denied error message.
When flushing the DNS did not yield results, the Windows Administrators removed this server from the domain and re-registered it in the Active Directory. After this everyone was able to access the server using the CName from other servers. But the job was getting executed locally from the server and it used the CName locally to execute the DTS packages. It still failed with the same error. The issue was that when someone tried to access the path \CNameOfTheServer within the server locally, it was returning the below error.
While the Windows Admins were trying to resolve this issue, one of my team members Pravin Nair recollected that he had come across this issue while this instance was de-clustered last year. He referred to this KB article which recommends modifying the registry to workaround this issue which is caused by the Server not listening on the configured CName.
The key DisableStrictNameChecking was created with a value of 1 and the server was rebooted. Still the same issue persisted. Later on as explained in the Method 2 of this KB article, the Loop Back check was disabled by creating a new registry key named DisableLoopbackCheck. After setting a value of 1 to this key the server was rebooted one more time. The combination of these two registry changes finally fixed this issue and job started completing successfully.
Even though there wasn’t much of DBA work involved in this incident, it was worth the time to be on the bridge to watch teams working together to resolve this issue.