Tag Archives: trace

Remove SQL Server Trace messages from Error Log

It is common practice for a DBA to run Profiler/Server-side traces in order to troubleshoot performance issues. At times some server-side traces are also run for Auditing purposes. Whenever a SQL Server Trace is started or stopped, the following messages are logged in the SQL Server Error Log.

SQL Trace ID 2 was started by login "loginname"
SQL Trace stopped. Trace ID = '2'. Login Name = 'loginname'.

These are informational messages. But if traces are started and stopped frequently, they can really bloat the SQL Server Error Log. I have seen a couple of instances in my environment which log these messages every 5 minutes. The relevant information can still be searched in the SQL Server Error Log, as I had explained here. But if these messages are causing the Error Log to grow and thereby consume a lot of drive space, the DBA has to think of an alternative.
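
For a quick ad-hoc search, the undocumented xp_readerrorlog procedure can also filter the current Error Log by a search string; since it is undocumented, its behaviour may vary between versions, so treat this only as a convenience:

-- Read the current Error Log (0) of type SQL Server (1),
-- returning only entries that contain the search string.
EXEC master.dbo.xp_readerrorlog 0, 1, N'SQL Trace ID';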

The easiest alternative is to stop these informational messages from getting logged in the SQL Server Error Log. Fortunately, from SQL Server 2005 onwards the trace flag 3688 is available. When SQL Server is started with this trace flag, trace-related messages are not logged in the Error Log. Here is how to add this trace flag as a startup parameter in SQL Server Configuration Manager.

After adding this trace flag, the SQL Server service needs to be restarted for the change to take effect. This trace flag will stop all trace-related messages (including those for the default trace) from appearing in the SQL Server Error Log.
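
Once the service is back up, a quick check confirms the flag is active globally (this assumes it was added as a -T3688 startup parameter):

-- Shows the status of trace flag 3688; -1 checks the global scope
DBCC TRACESTATUS (3688, -1);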

How many times did my database auto grow?

Today as I entered the office, I came to know that we were receiving alerts stating that the transaction log file of tempdb on one of the instances was almost full. This instance was running SQL Server 2000. I quickly looked at the transaction log file of tempdb and was shocked to see that the initial size was set at 250 MB with an auto growth of 100 MB! There were also a lot of 833 errors in the Event Log. Every time, templog would take a long time to grow because of this, and as a result the transactions which needed more tempdb transaction log space would get delayed/terminated. Even though I was able to diagnose the issue, I was not able to get the number of times the templog file had tried to “Auto Grow”. I wish this instance was running SQL Server 2005 or above.

What changed in SQL Server 2005 in this regard? The Default Trace. SQL Server 2005 and above run the Default Trace, which captures almost all the events that a DBA would need to troubleshoot.

Check if the Default Trace is running:

select name, value_in_use
from sys.configurations
where name='default trace enabled'

If value_in_use = 1 then the Default Trace is enabled.

Enable Default Trace if not running:

sp_configure 'show advanced options', 1
go
reconfigure with override
go

sp_configure 'default trace enabled', 1
go
reconfigure with override
go

Once the default trace starts capturing data, executing the query below will show how many times the database auto grew, and by how much.

DECLARE @trcfilename VARCHAR(1000);
SELECT @trcfilename = path FROM sys.traces WHERE is_default = 1;

-- EventClass 92 = Data File Auto Grow, 93 = Log File Auto Grow
SELECT StartTime,
       DB_NAME(DatabaseID) AS DatabaseName,
       Filename,
       (IntegerData*8)/1024 AS [Growth in MB],       -- IntegerData = number of 8 KB pages grown
       (Duration/1000000) AS [Duration in seconds]   -- Duration is reported in microseconds
FROM ::fn_trace_gettable(@trcfilename, default)
WHERE EventClass IN (92, 93)
ORDER BY StartTime;

This output can be further analyzed to check whether the initial size and auto growth settings of a database have been set properly.
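
If the output shows frequent small growths, the usual fix is to increase the initial size and set a sensible fixed growth increment. Here is a minimal sketch with hypothetical sizes for the tempdb log file:

-- Hypothetical sizes: pre-size the tempdb log and use a fixed growth increment
-- instead of relying on many small auto growths.
ALTER DATABASE tempdb
MODIFY FILE (NAME = templog, SIZE = 4096MB, FILEGROWTH = 512MB);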

C2 Auditing mode brought down the instance

Yesterday the application team reached out to our team stating that the drive hosting the SQL Server databases was running out of space. It is nothing new for our team that the Application Support team blames the database for any issue they face with the application. But this time even we were alerted that the E drive on that server was running out of space. When the DBA logged on to the server, the drive had 0 KB of free space left!

A quick glance at the drive showed roughly 30 GB of .trc files. The folder hosting these files looked like this.

Since all these files followed the naming convention audittrace*.trc, it was very clear that they were generated because C2 Audit mode was enabled. When the DBA tried connecting to the instance to check the configuration, it came to light that the instance was down! The SQL Server Error Log had this information at the end.

Error: 566, Severity: 21, State: 1.
An error occurred while writing an audit trace.
SQL Server is shutting down. Check and correct error conditions such as insufficient disk space,
and then restart SQL Server. If the problem persists, disable auditing by starting the server at the command prompt with the "-f" switch,
and using SP_CONFIGURE.
SQL Trace was stopped due to server shutdown. Trace ID = '1'. This is an informational message only; no user action is required.

SQL Server had stopped while trying to write to an audit trace file. Troubleshooting this issue is very easy if one knows how C2 Audit mode works. Here are some of the features of C2 Audit mode in SQL Server.

  • The output of C2 auditing trace files is saved to the Default Data directory for an instance.
  • There is no option to change the path of the C2 auditing trace files to a directory other than this.
  • When the trace file size reaches 200MB, it is automatically rolled over and a new trace file is created.
  • There is no built-in functionality to delete the old trace files. SQL Server will keep generating the trace files until the drive gets full.
  • A custom job needs to be setup which deletes trace files older than x number of days.
  • If the drive where the trace files are saved gets full, SQL Server will shut itself down.

The last point was the key in this case. If the drive where the audit trace files are saved runs out of space, SQL Server shuts itself down. There are two options to overcome this issue. The first is to clear space on the drive where the audit trace files are saved. The second is to start the SQL Server service with the -f flag, which bypasses auditing, and then disable C2 Audit mode.

C2 audit mode can be disabled either by T-SQL or SSMS as below.

T-SQL:

sp_configure 'show advanced options', 1
go
reconfigure with override
go
sp_configure 'c2 audit mode', 0
go
reconfigure with override
go
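
After the reconfigure, the same sys.configurations check used earlier confirms the change:

-- value_in_use should now be 0
select name, value_in_use
from sys.configurations
where name='c2 audit mode'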

SSMS:

In our case, when we asked the client whether C2 Audit needed to be enabled, he was clueless as to why it had been enabled in the first place. Most likely the DBA enabled this feature by mistake when the instance was built. Now that the client does not require this feature, we are processing a Change Control to disable it.

Stopping the trace because the current trace file is full

For some time now, we had been receiving alerts from one of the Production instances. They read:

Trace ID '4' was stopped because of an error.
Cause: Stopping the trace because the current trace file is full
and the rollover option is not specified..
Restart the trace after correcting the problem

As the message itself indicates, one of the server-side trace output files on the instance had become full and the trace was stopping because the rollover option was not specified. Despite knowing the resolution, I found it difficult to fix the issue because I could not find out from where this trace was being triggered. After searching for trace-related stored procedures in all the databases using this method, I was able to track down the one which was triggering this trace.

The resolution for the above error lies in the parameters used while executing sp_trace_create. The details of this stored procedure are documented in this MSDN article. Since the error is related to the trace file getting full, the parameters of interest are @maxfilesize and @options. @maxfilesize, as the name suggests, restricts the maximum size of the trace output file. @options decides the action to be taken when the trace file size reaches the @maxfilesize value. @options has three possible values (a sketch showing the rollover option follows the list below).

  • TRACE_FILE_ROLLOVER (2)
  • SHUTDOWN_ON_ERROR (4)
  • TRACE_PRODUCE_BLACKBOX (8)
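
For reference, here is a minimal sketch of creating a trace with the rollover option enabled; the output path, file size and rollover file count are hypothetical:

-- A sketch only: path, size and rollover file count are made up
declare @TraceID int, @maxfilesize bigint;
set @maxfilesize = 100;                    -- roll over to a new file every 100 MB

exec sp_trace_create @TraceID output,
     2,                                    -- @options = 2 (TRACE_FILE_ROLLOVER)
     N'E:\Traces\AuditTrace',              -- output file; .trc is appended automatically
     @maxfilesize,
     NULL,                                 -- no stop time
     5;                                    -- keep at most 5 rollover files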

In my case @options had a value of 0 and @maxfilesize was 100 MB. This trace was run by the audit team, and the file would fill up in a few minutes because the instance it was running on is the busiest warehouse server in our environment.

I was not sure whether the audit front-end application would be able to read “rolled over” trace files. Hence the only option I was left with was to increase @maxfilesize from 100 MB to 20 GB. The count of these alerts has since come down drastically, but the warehouse instance still eats up even 20 GB once in a while!

DBA why did you run the trace?

Usually on the first of every month I prepare the Inventory Report for all the databases across the Enterprise. A highly technically challenging task: I need to compare the number of databases found in the enterprise with that in last month's report! This report is the first thing I work on come the 1st.

I had almost finished preparing it when an audit report dropped into my inbox: “A DBA Team member’s id was seen running a trace on a very critical and controlled server”. The statement that was caught in the audit report was something like this:

select @status=status, @curr_tracefilename=path from sys.traces where is_default = 1 ;
insert into @temp_trace.......

From this it was clear that he was not “running” a trace but reading the default trace; the is_default = 1 in the WHERE clause confirms it. This information had been captured exactly four hours earlier. But the DBA confirmed that he had logged on to the server to find out the reason for the growth of a transaction log file. He had not run this script. So how did this information get logged against his name?

Started to rewind the time machine. He had logged on to the server, opened SSMS, and verified the free space on the drive hosting the transaction log file. Then he had checked the database properties and the User Statistics report for that database. That's all, and then he had logged off the server.

Like I mentioned earlier, some of the Standard Reports read the default trace running on the instance. I felt that he might have accidentally clicked on one of those reports which read the default trace. Looking at the Reports menu confirmed my suspicion.

The report that he had viewed was “User Statistics”, and the “Schema Changes History” report is just above it. What happens when one clicks on this report? To verify that, I ran a trace while clicking on the report. Here is the snip from the trace output.

select @curr_tracefilename = path from sys.traces where is_default = 1 ;
set @curr_tracefilename = reverse(@curr_tracefilename)
        select @indx  = PATINDEX(''%\%'', @curr_tracefilename)
        set @curr_tracefilename = reverse(@curr_tracefilename)
        set @base_tracefilename = LEFT( @curr_tracefilename,len(@curr_tracefilename) - @indx) + ''\log.trc''; 

        insert into @temp_trace
        select ObjectName
        ,       ObjectID
        ,       DatabaseName
        ,       StartTime
        ,       EventClass
        ,       EventSubClass
        ,       ObjectType
        ,       ServerName
        ,       LoginName
        ,       NTUserName
        ,       ApplicationName
        ,       ''temp''
        from ::fn_trace_gettable( @base_tracefilename, default )
        where EventClass in (46,47,164) and EventSubclass = 0  and DatabaseID = db_id() ;

I had guessed it right. The first two lines of this output matched the ones captured in the Audit Log. I have already sent a lengthy explanation to those concerned. Most likely they will buy my theory 😉

Error Code 12 while running Server Side Trace

Recently we received a request from the customer to run a SQL Profiler trace on a Production instance for a week's time. No, not to track any performance issue. They wanted to check if someone was still using one of the databases on that instance. If not, they were planning to decommission it.

I had discussed the benefits of server-side traces over SQL Profiler traces here. The team was well aware of this and was all set to run a server-side trace for one week. They had scripted out a standard Profiler trace and made the modifications. When the script was executed, it returned Error Code 12.

As per the official documentation, Error Code 12 means “File not created”. Since the error was about the trace output file not being created, the sp_trace_create statement in the script, which defines the trace output file, had to have an issue.

The missing link in the script was glaring: the output file parameter did not contain the trace file name; it ended with only the folder name. This path was quickly modified to \\ServerName\ShareName\TraceOutputFolder\TraceName. The script was executed again, but surprisingly it returned the same error. Error Code 12! The folder existed and the service account had full permissions on it. What was missing now?

A closer look at the trace output path revealed a leading space just before the \\ServerName. Huh! I removed that space and the trace started running smoothly. This is an example of missing the small things while the focus is on the bigger issue.
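
For reference, a sketch of the relevant call with hypothetical names; the trace file parameter must be the full path including the trace file name, with no .trc extension and no stray spaces:

declare @TraceID int, @maxfilesize bigint;
set @maxfilesize = 100;

exec sp_trace_create @TraceID output,
     0,
     -- full path including the trace file name (no .trc extension, no leading space)
     N'\\ServerName\ShareName\TraceOutputFolder\TraceName',
     @maxfilesize;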

Get trace definition from a running trace

On one of the Production SQL Server 2005 instances we were getting “Access Violation” errors. Each occurrence would generate a huge .mdmp file, and during that time the instance would become unresponsive. Since this was occurring at least 5 times a week, we opened a case with Microsoft PSS.

On investigation, the engineer opined that the traces running on the instance were triggering the Access Violation error while capturing OLEDB-related events. Event IDs 61, 119, 120, 121 and 194 are the ones related to OLEDB. On our instance, we have 3 traces running almost every 30 minutes as part of various Audit/Application requirements. We needed to know which trace was capturing these events. Here are the scripts that I used to get that information.

SELECT distinct traceid 
FROM ::fn_trace_getinfo(default);

This query will list the trace IDs of all the traces that currently exist on the instance.

select trace.eventid,
       events.name as EventName,
       trace.columnid,
       cols.name as ColumnName
from ::fn_trace_geteventinfo(2) as trace
inner join sys.trace_events events
        on trace.eventid = events.trace_event_id
inner join sys.trace_columns cols
        on trace.columnid = cols.trace_column_id

In this query I am fetching the trace definition of traceid 2. The output was something similar to the one below.

With this output it was easy to filter out the OLEDB events, and it was trace ID 2 that was capturing them. We will be implementing a Change Control to exclude these events from being traced. Hopefully this will resolve the issue permanently. Keeping my fingers crossed!
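
Instead of checking each trace ID one at a time, a variant like the sketch below (assuming the OLEDB event IDs mentioned above) checks all existing traces in one go:

-- List every trace that captures any of the OLEDB-related events
select distinct t.id as traceid,
       e.trace_event_id,
       e.name as EventName
from sys.traces t
cross apply sys.fn_trace_geteventinfo(t.id) ei
inner join sys.trace_events e
        on e.trace_event_id = ei.eventid
where ei.eventid in (61, 119, 120, 121, 194);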