Collect and Baseline Wait Statistics (Dear SQL DBA Episode 14)

What are the best tools to collect and baseline wait statistics? Should you write your own? Watch the 18-minute video or read the episode transcript below.


Dear SQL DBA…

I am getting into performance troubleshooting on SQL Server. I hear you talk about wait stats a lot, and how important they are to the process of troubleshooting.

What ways are there to check the wait stats for a given time? How would you go about creating a baseline for a system you have just taken over?

Sincerely,

Waiting on Stats

I do love wait stats!

If you listened to the performance tuning methodology I outlined in an earlier episode, you heard how important I think wait stats are for troubleshooting performance.

If you missed that episode, it’s called Lost in Performance Tuning. (I’ve got an outline of the discussion in the blog post, as always.)


If I’m going to manage the system for a long time, I would buy a vendor tool to baseline wait stats

SQL Server is a mature database. There are a lot of vendors out there who have tapped into the need to track and baseline wait stats.

They’ve honed tools to:

  • Collect the wait stats in a lightweight manner (a minimal sketch of this step follows the list)
  • Store them in a repository and groom the data over time, so it doesn’t explode
  • Build reports for you to see big picture data
  • Build fancy UIs for you to zoom in on a point in time
  • Find queries that were running when the waits were occurring
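
To make the first two bullets concrete: at its core, collection is just snapshotting the cumulative counters in sys.dm_os_wait_stats into a table on a schedule. Here's a minimal sketch of that step; the DBA repository database is a hypothetical name, and the real vendor tools do far more than this (filter out benign waits, groom history, correlate queries).

    -- Create the snapshot table once. DBA is a hypothetical repository database.
    CREATE TABLE DBA.dbo.WaitStatsSnapshot
    (
        collection_time datetime2 NOT NULL DEFAULT SYSDATETIME(),
        wait_type nvarchar(60) NOT NULL,
        waiting_tasks_count bigint NOT NULL,
        wait_time_ms bigint NOT NULL,
        signal_wait_time_ms bigint NOT NULL
    );
    GO

    -- Run this on a schedule, for example every five minutes from an Agent job.
    INSERT DBA.dbo.WaitStatsSnapshot
        (wait_type, waiting_tasks_count, wait_time_ms, signal_wait_time_ms)
    SELECT wait_type, waiting_tasks_count, wait_time_ms, signal_wait_time_ms
    FROM sys.dm_os_wait_stats
    WHERE wait_time_ms > 0;

Remember that the counters are cumulative since the last restart (or since someone cleared them), so a baseline comes from diffing consecutive snapshots, not from reading any single one.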

Example vendors – I’m listing three that I’ve used before to solve problems:

  • SQL Sentry Performance Advisor
  • Idera Diagnostic Manager
  • Dell Software (formerly Quest) Spotlight on SQL Server Enterprise

I haven’t listed these in order of preference. I know people who swear by each of them.

Since monitoring systems for SQL Server are pretty mature, the differences are in the details.

Details can be very important, of course. Research and trials will help you find which one is the best fit for your team, processes, and applications.

Should DBAs write their own tools?

There are some people out there who think you should roll your own tools, and that doing so makes you more legitimate.

I’ve written a lot of my own tools. It takes a lot of time.

To get feature parity with what vendors are offering, we’re talking years of investment.

It’s really easy to negatively impact performance with your tools. Tool vendors work very hard to avoid this, and it even happens to them sometimes.

The difference is that the vendor has a bunch of engineers who can quickly fix the issue and release a new version.

It’s only worth it to write your own tools when nobody offers a solution that fits you.

It’s a little bit like monitoring your heart rate for your own health

I wear a heart rate monitor to help me estimate how active I am during the day, and how hard I work during my workouts. Heart rate monitors are pretty affordable, and you can choose between wearing them on your wrist and wearing a chest strap. Some are more accurate than others, and they have different reporting tools.

I could learn to take my own heart rate and sample and record it myself. I could probably build some reports off it. But I’m really happy having spent $150 for a device that does it for me.

This leaves me free to spend my time interpreting the heart rate and calorie burn data it gives me, and customizing my activity to fit my health plan.

How to get budget for a performance monitoring tool

Do two things:

  • Outline the business cases that a performance monitoring tool will help with. Link to specific customer incidents that it would help resolve.
  • Pick the top 2 or 3 vendor tools you’d like to test, and map their features to the business cases.

Bam, your request is looking a lot more legitimate.

Test them one at a time. Start with a non-production server.

Your best bet is to write some code to reproduce performance problems against that server.

Ideally these map to your business cases.

Other ideas:

  • Find sample code with searches to simulate blocking and deadlocks, if you’d like to start there (a minimal blocking sketch appears after this list)
  • Modify my sample code for testing inserts for race conditions with Microsoft’s free ostress tool for more fun (here it is)
  • Write some queries that read a lot of data and possibly run them from an Agent job (maybe it calls ostress)
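
For the blocking case, a minimal repro needs nothing more than two sessions and an open transaction. The table and column names below are hypothetical stand-ins:

    -- Session 1: take an exclusive lock and hold it by leaving the transaction open.
    BEGIN TRAN;
    UPDATE dbo.TestTable SET SomeValue = SomeValue WHERE Id = 1;
    -- (don't commit yet)

    -- Session 2: this blocks behind session 1's lock and accumulates
    -- LCK_M_S waits until session 1 commits.
    SELECT SomeValue FROM dbo.TestTable WHERE Id = 1;

    -- Back in session 1: release the lock and end the blocking.
    COMMIT;

Let session 2 sit for a minute or two before you commit, so the lock waits are big enough to show up clearly in the tool you're evaluating.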

Review how your use cases all look in the tool you’re testing.

Are the wait stats recorded and displayed well? Are they useful to you?

How easy is it for you to find the queries related to the wait stats?
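
One way to sanity-check the tool's query correlation is to compare it against the live DMVs while your repro is running. This uses only the standard DMVs:

    -- What is each running query waiting on right now?
    -- Compare this against what the monitoring tool reports.
    SELECT r.session_id,
           r.status,
           r.wait_type,
           r.wait_time AS wait_time_ms,
           r.blocking_session_id,
           t.text AS query_text
    FROM sys.dm_exec_requests AS r
    CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
    WHERE r.session_id <> @@SPID;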

Reach out to the vendor during your trial if you’re having problems. Sometimes the tools are smart in ways that aren’t obvious. This also gives you some insight into their support processes.

Tip: check whether the tool you’re testing sends monitoring data to the cloud. If so, make sure you get that approved by management before putting the tool into production. In sensitive environments, get that approved before you test it, too.

If I’m troubleshooting a system for a short time, or if there’s no budget, I’ll use and contribute to an open source tool

Sometimes there’s good reasons for budgetary limitations– maybe you work for a non-profit and that money is literally feeding children.

Or maybe you’re doing a short term analysis and you just need to collect information over a couple of days, and there’s no time to test and deploy a more robust tool.

In that case, I’d start with sp_BlitzFirst from Brent Ozar Unlimited:

  • It’s free
  • It’s open source
  • It’s got some documentation to get you started
  • It’s already wired up to support different sample lengths, and write to tables
  • It looks at running queries as well as some system metrics to help point out critical information related to the wait stats
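
Here’s a sketch of both the interactive and the logging modes. The parameter names are from the versions of the First Responder Kit I’ve used, so check the current documentation, and the DBA output database is a hypothetical name:

    -- Interactive: take a 60-second sample and report on it.
    EXEC sp_BlitzFirst @Seconds = 60, @ExpertMode = 1;

    -- Logging: write each sample to tables, typically from an Agent job.
    -- DBA is a hypothetical repository database name.
    EXEC sp_BlitzFirst
        @Seconds = 60,
        @OutputDatabaseName = 'DBA',
        @OutputSchemaName = 'dbo',
        @OutputTableName = 'BlitzFirst';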

You can start with what others have built, and slowly contribute on your own as well. Much nicer than starting from scratch.

6 Comments

  • For this specific flavor of monitoring tool, I would toss on one other suggestion…it’s not terribly cheap, and they don’t pay me a dime (no, sadly the flow of money is reversed)…SolarWinds DPA, which I still find myself referring to by its old Confio name, Ignite. I’ve used it and Spotlight for years, and Spotlight does a lot of things well…maybe -pretty well- is a better descriptor, but I’ve not seen anything to touch Ignite (see, there I did it again) in terms of being able to drill into a performance problem, and it is a wait stats logger at its core. Really great interface, one of those things I would hate to be without on a critical instance. For systems monitoring and alerting, it’s not the best (I’m sure not the worst either), but if you can afford a couple of tools (I know, I know), having DPA specifically for monitoring performance and wait stats, and something else to handle alerting and “miscellaneous”, is rather a nice setup.

  • I have had to create my own performance monitoring tools (with help from both Brent Ozar and Paul Randal scripts) starting with SQL 2000. I have also used SQL Sentry in the past few years as well. For me and the DBA teams I am on, a combination of vendor tools and custom code is what works best. We are always responsible for SQL performance. However, we do not always have the tools we need when we need them in all environments (prod/pre-prod/QC/Test). Custom code always has been and always will be a necessity for me and my DBA teams until SQL performance monitoring tools are ubiquitous and no cost (highly unlikely).

  • […] Kendra Little on collecting wait stats as part of a baseline: […]

  • Hi Kendra,

    thank’s your podcast is awesome. I didn’t manage to post a review on iTunes yet as I haven’t figured out apple’s review mechanism yet.

    I have been using Microsoft Management Data Warehouse for performance analysis for more than a year now. It ships free with the product and offers a rich source of information. Some say it is one of the hidden gems most unused and unknown to the SQL folks. The reports may not be perfect, but they address my needs well enough. What do you think about this, and why didn’t you mention it? Brent Ozar once told me I would have to look out for performance regression due to the MDWH collector jobs. However, I haven’t seen any so far.

    One challenge that eventually turned up was the licensing of another SQL Server (just to host the monitoring data). Unfortunately (as I understand it) Microsoft doesn’t allow you to use Developer Edition for this. Express is not an option either, due to data volumes (> 10GB) and the missing SQL Server Agent. I have put it on a separate server from the beginning; processing and querying the data involves some load which I wouldn’t like to see on a production server. I guess the data hosting question is relevant for all of the commercial tools (Idera, Dell, SQLSentry, …) as well. I don’t believe these solutions will happily use something other than a database to store their results in. What’s your experience with this, and how would you justify the (perhaps additional) expense of another database server?

    • Hi Martin,

      Great question. I didn’t mention Management Data Warehouse because I agree with you about hosting the monitoring database separately: for general performance, but also because you don’t want to be querying the live instance for perf data when you have bad performance, possibly making it worse.

      Issues with the MDW: it doesn’t get much active development. When you buy from one of the third party vendors, you get support and they release new features; MDW has that licensing cost AND doesn’t include either.

      One of the signs of immaturity that has jumped out over the years is the fact that there was no supported way to remove the data collector on an instance after you installed it: it left behind jobs with dependencies. They did finally release a stored procedure that lets you clean up, but the connect item reports that people are running into issues with that and still have to use a workaround: https://connect.microsoft.com/SQLServer/feedback/details/334180/data-collector-remove-data-collector-to-remove-associated-objects

      Essentially, it’s one of those features that was a good idea, but the implementation was done very narrowly and it’s never been brought up to a level where it can compete with what third party tools can do.

      • Hi Kendra,

        Thanks for your answer. I guess I should look at the products from specialised vendors once I get the time. Having just started out with MDWH, I might not know what I’m missing. Originally I am a BI professional who has grown to be an accidental DBA, and sometimes MDWH performance really sucks; I wish they had built a nice SSAS cube on top of it. Of course I could do this myself…however, I agree with your statement that a DBA doesn’t have to build everything on his own.

        For me it’s greatly important to buy a tool where I an opt out from sending data to the cloud. That is perhaps because I am German and we have some stricter rules regarding data privacy here than in the US. Dell recently presented its free Spotlight Developer…but of course there’s nothing such as a free meal and it comes with a personal account and will send all your data to the cloud.

        Looking forward to more great Dear SQL DBA podcasts,

        Martin

