On a production instance, we had issues of stale read. Everyone knew that it was due to faulty disk subsystems but there was no evidence of it in the logs. Hence a case was raised with Microsoft PSS. The MS Engineer suggested that PSSDiag be run on the instance on a continuous basis until further notice.
In most of the cases PSSDiag is run manually from the command prompt. Since we needed it to be run continuously it had to be automated. PSSDiag has several features which will ensure that it runs without human intervention. One of them is running it as a Windows Service. But in our environment as per process we need to involve the System Administrators team to install any new service on a Production box. Involving them was not feasible for us at that time, the DBA team had to look for other alternatives. One of the requirement was to stop PSSDiag execution at 2200 hrs and start it again at 2330 hrs.
Hence I wrote a small piece of code which runs PSSDiag. This code was put in a SQL Agent job which has 2 schedules. One starts at 2330 hrs and the other one starts whenever the SQL Agent starts. The second schedule was to ensure that the PSSDiag collection does not halt because of the SQL Service restart.
declare @cmd varchar(100) declare @outputpath varchar(100) set @cmd = 'rmdir F:pssdiagoutput' + convert(varchar(10),dateadd(dd,-2,getdate()),112) + ' /S /Q' exec xp_cmdshell @cmd SELECT @outputpath='F:pssdiagoutput'+CONVERT(VARCHAR(10),GETDATE(),112) set @cmd= 'F:pssdiagpssdiag.exe /O '+ @outputpath + ' /E ' + CONVERT(VARCHAR(10),GETDATE(),112) + '_22:00:00 /N2' exec xp_cmdshell @cmd
Line number 3 is for house keeping. It deletes the output files older than 2 days. Line number 3 will execute PSSDiag as in the below example.
F:pssdiagpssdiag.exe /O F:pssdiagoutput20100912 /E 20100912_22:00:00 /N2
/O has an output path which has a folder created for the current date. /E will ensure that the PSSDiag data collection stops at 2200 hrs on that day. /N2 is a very useful parameter. This ensures that the output folder does not get ovewritten, instead it is rolled over if the job is executed more than once on a given day. The earlier output folder is renamed to OutputFolderName_00001 and so on.
With this we could get PSSDiag running exactly as per the requirement. After a few days, we got the error message related to storage and the firmware upgrade of the disk subsystems eventually resolved the stale read issue.