Friday, November 07, 2008

Whoops! The SQL DBA Quiz

Before the PASS Summit, various twitter and blogger folks were passing sharing SQL DBA mistakes that they had made in their careers.  
Though i was never tagged by others, I did write some up, and now publish those for your enjoyment.

Mistake #1

When i was a wanna be DBA, i had implemented an import process of medical data.  Once in the db, the data was scrubbed, run thru a series of edits, and the resulting data was output to another system.  There could be up to 25 edits performed on data, depending on the contents of the data.
I ingeniously designed a system of procs and tables that would push and shove the data around.  But sometimes, specific records would fail the edits, and get left in an odd state.  So, i needed to know from step to step which edits had been performed.  So i setup a history table of edit steps per record.  This was awesome, in the event of a failure, we could locate the record(s) and determine where they stopped in their processing.  We could then take action to get them moving again.  The we i talk about was me and 1 other developer.  Time went on, and we suffered a lot less failures as we shored up the system.  
Then one day, the PM notices that instead of taking 5 seconds to process a record, its taking 5+ minutes, and asks me to look at it.  I immediately look into my history table for clogged records, and find none.  All seem to be flowing well thru the edits.  This is when i realize that the history table is rather large... and has no indexes... and has no primary key... and is taking longer and longer to query and process history into...  doh!
A couple sql statements later, a PK and some indexes... and viola! the system is up and running again speedily.
So, not to self, follow best practices on all tables all the time, regardless of their minimal existence on a system, or level of duty on the systems.

Mistake #2

I was in a hurry to get to my lunch workout at the gym.  I was releasing some updates to produciton.  As always, I start with a transation.  Make sure i'm updating the correct set of data with a select statment.  Perform the Update.  Then select the data, to ensure it was altered.  I report this to those that care, that the release was completed.  I bolt out the door, late to mu lunch gym appointment.  
On the way down the road, i get a call from a guy back east that monitors the db.  He says something is odd, and its really slow.  I hope i can look at it when i get back from the gym.  As i start listening to his description, i remember the beginning of my story.  Begin Transaction....
uh, I sheepishly tell start, I believe i left my maching in a transaciton on the main table on the prod system... no Commit.  oops.  
After revealing my password so he could log into my box, and letting him commit, i made some mental notes to not perform open ended transactions again on prod systems.  

Mistake #3

I start a trace on prod system to monitor for durations on a specific proc.  I don't use profiler, as the GUI can be a resource hog.  I ensure that a server side trace will take less resources, and be a controlled trace.  I code the trace up, and give it a stop time.  I dont want this thing running too long on the prod system.  45 minutes should be sufficient.  I'm about to goto the chiropractor anyway, and this is the perfect amount of time between me starting it, and finishing there, and making sure it completed successfully.  I add into my sql code 45m to my dateadd function, to generate the end datetime of the trace, and kick off the trace.  I goto the chiropractor and get fixed up.  afterwards, i look at the traces running with ::fn_trace_getinfo, and notice the default trace, and mine.  But mine has an odd endtime.  
This is when i realize that my dateadd value had added 45 months to the date it was supposed to stop at... not 45 minutes.  DATEADD( m, 45, GETDATE()) is way different from DATEADD( mi, 45, GETDATE()).
Good thing i was checking on this trace's completion anyway, and not leaving it to the computer to decide.  After all, the computer was only doing what i told it to do, and we all know I can't be trusted.  So, another mental note to self.  Don't assume I know what I am doing.  Ensure that controls are in place to double check my own work.

No comments: