Is It Me or Is That an Error in Your Log?

This week I promised my DBA buddy that I’d behave myself. So any mayhem that occurred was not a direct result of anything I did… at least this week… for the most part… I think…

Anyway, we thought we were going to get an Oracle database that would be used for the new pass-through web portal. *psych!* Nope! It’s back to SQL Server… first it was Oracle then it was SQL Server… back to Oracle then SQL Server…Oracle again… now we’re back to SQL Server… Actually, I’ve lost count. This is supposed to be it, though. They just need to get approval for the price tag. I’m actually a bit bummed. I was starting to look forward to learning how to administer an Oracle database. That’s okay because I’m sure there will be other opportunities to spread mayhem…*cough* I mean, to learn other platforms…

For some reason earlier this week, I thought back to when I first became a DBA. It was sorta-kinda-not-really accidental.  The company I worked for as a developer had an opening for a SQL Server DBA. At the time there were two SQL Server DBAs and one Oracle DBA. The Oracle DBA left for a job with Oracle. So the senior SQL Server DBA took over as the Oracle DBA. Which left an opening. *woo hoo!* The DBAs recommended me as having some kind of potential. *who? me?* Possibly because they let me manage my own SQL jobs and thankfully I didn’t crash the server. I waited until I became a DBA to do that. Well, not really. One of the server guys actually dubbed me “Server Crasher” because I was usually the one who reported a server that had crashed. No one told me that if you report it, you must have done something to it. So I applied for the job figuring it couldn’t hurt… this time.  Besides, there was talk of steering me towards some sort of contract management position. At the time I wanted to stay technical and it sounded like fun! *What do you mean I just inherited a cluster? You mean like a cluster of nuts? hmm…that would explain all the secret squirrel meetings…* 

So they interviewed me and then, lo and behold, offered me the job. *welcome to the dark side… we have cookies!* I happily accepted not really having an inkling of what I was getting myself into. *drat! no cookies!* Don’t get me wrong. I don’t regret it. In fact, I’ve enjoyed the challenges the job has posed over the years. I remember feeling very overwhelmed at first. *what the heck is merge replication and why do I care?*  Although, after 5 years the feeling comes back every so often. *time to reindex the brain*  At the time, I was fortunate to have two DBAs to help me get started. Not everyone is as lucky.

One recommended that I write myself a script that I could use to check the servers daily for problems or potential problems such as oh, I don’t know… a SQL injection attack?  *fun story for another day* Since we had over 40 database servers/instances dedicated to SQL Server at the time and only two dedicated SQL Server DBAs (myself included), it sounded like a great idea.  By the way, it’s also a good idea to know certain administrative T-SQL commands in case the GUI isn’t working for whatever reason. *like that one time no one could access the SQL cluster and the server people panicked wondering why it was slower than a snail in a snowstorm… turned out to be a virutal memory issue* So she gave me some ideas as to what to look for but left writing the script up to me.  At the time, I didn’t really know what I was doing. I tried to find examples on the Internet but came up short. *insert off-the-wall jokes about my height or lack thereof* Somehow I managed to write one on my own without bribing anyone. I still use a version of that script at my current job seeing as we have over 50 servers/instances dedicated to SQL Server and three DBAs (my DBA buddy, our boss, and yours truly). Even though we’ve split the servers up amongst ourselves (roughly 12 each), I find the script helps to get a quick feel in the mornings on how the servers are doing. *it’s 8 am… do you know what your servers did last night?* It’s a habit I’ve gotten into over the years and I find that it saves me some time. Some like it GUI and some like it script. Personally, I like scripts.

I know there are lots of great checklists out there on what to look for during server checks. So I then wondered if anyone else out there has or had the same problem I had when I started out. How does one check for these things efficiently if you have a lot of servers? I’m sure everyone has their own method.  This one is mine. Or at least it’s part of my process. *Please bear with me, I’m getting to the point*

So I thought I’d do an experiment for this week’s blog. I thought I’d do something different and offer up my script for anyone to hack apart, criticize, try out, chew up/spit out, line the bottom of the cat box with it, or whatever floats your boat. It needs some major updating, though. I wrote it for SQL Server 2000 and then started to update it for SQL Server 2005. I haven’t implemented all the great DMVs yet so it is most definitely a work in progress. When I get to updating it, I’ll try to post the changes. It’s most definitely a work in progress.

Disclaimer: If you decide to use this script in whole or in pieces, please try it out on a development or test instance first. I’ve never had issues with it but one never knows how it’ll work on another system.  Since it’s a bit long and could be a bit overwhelming for someone, I thought I’d break it out into pieces.

Still interested? Really? Seriously? Okay, then! You have been warned! 🙂 Since it’s so long, I will write separate posts on parts of it.  If you’re feeling brave and want the whole enchilada, just click “sql_server_quick_check“. Otherwise, below is the listing for the next several weeks unless I get flamed for spreading crappy code all over the place. I will be the first to admit it’s not the greatest code ever. I’m not an expert. This is just my stab at it and is meant to be more of a guideline. 🙂 Depending on the feedback, I may stop and move onto something else more fun. If no one says anything good or bad, I’ll probably just continue to post my code and see how it goes.

  1. SQL Server error log and agent log (this week)
  2. System level triggers & How to check for new or missing linked servers
  3. Failures: Backup jobs & Database mail (SQL 2005/2008)
  4. Drive space, Transaction log sizes, & How many databases do you see? (I have been surprised with new mysterious databases showing up unannounced in the past or some disappearing)
  5. Logins: Admin logins, Logins with default database set to master, Logins created/updated in the last month (I see you!)
  6. Processes: What processes are running, Multiple processes with same spid, How those processes are connecting to the instance
  7. File growth, Databases created in the last week, & Databases not being backed up
  8. Jobs: Job failures, Job owners, Jobs that are currently running (i.e. jobs that suck, err… I mean could be stuck; I once had a job that ran the entire weekend because the procedure was stuck… no it didn’t suck)

1. Check your SQL Server Error Log and Agent Log (i.e. Is it me or is that an error in your log?)

When I run the script as a whole, I like to send the results to text. For me, it’s easier to read than scrolling through all the separate grids. To show the results as text, you can use the Ctrl + T keyboard shortcut, or you can puruse the menu via Query –> Results –> Results to Text. Note: This is using Management Studio.

Since I usually run the script as a whole on SQL Server 2000 and 2005 instances, I declare and set a couple of variables to check the version throughout the script. This allows me to run SQL 2000 or 2005 code as needed. I haven’t checked this against 2008 yet but I plan to. I would think it’d work on 2008 without problems, though.

declare @chvServer varchar(50), @chvVersion varchar(20)
select @chvServer = convert(varchar, SERVERPROPERTY('ServerName'))
select @chvVersion = convert(varchar, SERVERPROPERTY('ProductVersion'))
select 'Server'=@chvServer , 'Version'=@chvVersion

Displaying the server name just confirms to me that I’m on the server that I think I’m on and that the version hasn’t changed. If it has changed, then at least I know about it and can start interrogating, err… I mean, investigating who did what to my server and why.

So when I first log onto a server (you’ll probably need admin privileges by the way), I like to check the SQL Server error logs and the agent logs. The code is different depending if you’re on SQL Server 2000 or 2005.

----SQL 2005--------
if left(@chvVersion,1) = 9 or left(@chvVersion,2) = 10
begin
    --Read SQL error log
    exec master..xp_readerrorlog 0, 1, 'error', NULL, NULL, NULL
    --Read agent log
    exec master..xp_readerrorlog 0, 2, 'error', NULL, NULL, NULL
end
else
----SQL 2000--------
if @chvServer = 'SQLTEST'
begin
    --sql error log
    exec master..xp_readerrorlog 1,'g:\MSSQL\log\ERRORLOG','error'
    --agent
    exec master..xp_readerrorlog 1,'g:\MSSQL\LOG\SQLAGENT.OUT','error'
end
else if @chvServer = 'SQLTEST\PRINCESS'
begin
    --sql server error 
    exec master..xp_readerrorlog 1,'f:\MSSQL$PRINCESS\log\ERRORLOG','error'
    --agent
    exec master..xp_readerrorlog 1,'f:\MSSQL$PRINCESS\LOG\SQLAGENT.OUT','error'
end
else
--check the default location
begin
    --sql error log
    exec master..xp_readerrorlog 1,'C:\Program Files\Microsoft SQL Server\MSSQL\log\ERRORLOG','error'
    --agent
    exec master..xp_readerrorlog 1,'C:\Program Files\Microsoft SQL Server\MSSQL\log\SQLAGENT.OUT','error'
end

Parameters for SQL Server 2000:

  1. Indicates which error log number to read with “0” being the current log. I included “1” as an example. 
  2. The file name.
  3. The line number.
  4. A string value to search for.

Parameters for SQL Server 2005: 

  1. Indicates which error log number to read with “0” being the current log.
  2. Tells SQL Server which log to read. For the error log, use 1. For the agent log, use 2.
  3. A string value to search for.
  4. Another string value to search for.

Note: xp_readerrorlog is an undocumented procedure. However, there is a lot of good info on it out there. I could probably do an entire blog post on reading the logs alone but for this purpose, I’m just listing the code I use. I usually search for more than the word “error” but for the sake of brevity, I just put a few examples here.

Depending on feedback, if any, next week I’ll show you more of my code and why you might want to check for system level triggers and checking for new or missing linked servers. If you don’t like where I’m headed, just let me know. 🙂

Next in the series: Ready, Aim, Fire That Trigger!

Advertisements

5 thoughts on “Is It Me or Is That an Error in Your Log?

  1. how are you I was fortunate to discover your Topics in digg
    your post is wonderful
    I obtain much in your blog really thank your very much
    btw the theme of you blog is really wonderful
    where can find it

    1. I’m glad you like it. 🙂 The theme for my blog is actually from wordpress. It’s called MistyLook by Sadish. However, the picture of the mountains was taken by my husband last year in Estes Park, Colorado.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s