Monday, June 23, 2014

Can that report run faster?

We have a scenario where a report is causing us some grief. Let's call this report 'Revenue'. Let's assume that this report is shown in a 'Leadership' meeting where revenue is an important topic. Let's assume that the presenter in this meeting shows the 'Leadership' team this report by launching it in Report Manager, along with several other reports. All these reports will be used to discuss points of interest in said meeting.
So put yourself in the seat of one of the observers of said report. It is kicked off, and you watch the little spinny do its thing, while you wait for this report to materialize the data and render it visually for you to consume. You wait. And wait.

How long do you wait before you open up your device and kick off the same report to 'see if you can run it faster' than the presenter? I'm betting a few minutes, maybe 4-5. And if you do that, who else in the room is doing the same thing? Let's assume more than one individual does this, which can cause the other executions to slow, or at least causes your execution to slow.

This scenario played out a few weeks ago. Presenter kicked off report, which timed out after 8 minutes or so, but at the same time, 4 other individuals also kicked off the report. It was in the middle of these executions that I was contacted to look at said report, and see if I could make it faster. Later that day, as everyone else stopped executing it, the speed of execution dropped to a couple minutes, under 2 minutes. But during the peak of multiple executions, it was taking 8 minutes. (also, that morning, 4 times the normal quantity of reports were being generated. 4 times. as in 4000 reports, instead of 1000 reports)

Over the past few weeks, I have been digging at this report and looking for solutions to its speed issue. It seems that when normal circumstances prevail, this report can retrieve and render its data in under a minute. That seems acceptable. But under load, it takes time, and causes frustration in the team that needs to consume it.

This report is written in SSRS and hits our production reporting system that is a SQL Server 2008 R2 system that utilizes several FusionIO drives to speed things along. It is a nice machine and is much faster than our old reporting system. The data seems to be properly indexed. The queries seem to function as expected. So I jumped into caching and snapshotting to investigate alternatives to 'speeding up the report'.

Let's look at the past execution of this report to see what we are dealing with. Looking into the ExecutionLog view allows us a view into what has transpired with this particular report. I format the time to perform the tasks, and convert it to minutes, so i don't have to think too hard. I have to join to the Catalog to limit the results to the actual report that I want to see. And for sorting, let's order it by the time the report started in descending order to see the latest executions to the older ones.

  2.     c.[Name],
  3.     (TimeDataRetrieval + TimeProcessing + TimeRendering)/60000.00 AS [Time],
  4.     [RequestType],
  5.     [TimeStart]
  6.   from ReportServer.dbo.ExecutionLog el
  7.     JOIN ReportServer.dbo.Catalog c ON c.ItemID = el.ReportID  
  8.   WHERE c.[Name] = 'revenue'
  9.   ORDER BY [TimeStart] desc;
This query shows me a lot of good information. I can see how many times the report was executed, how long it took, and so on. When I did this for the time frame of issues, I see exactly what I expected. A ton of executions taking a long time, and many many within the same time frames. I can expand this query to add other fields, like UserName to see who executed the report, status to see success or otherwise, among other. Use the fields you want to tell the story of the data. My story showed me that a few individuals had executed the report repeatedly and upon failure, attempted again, and again. As I mentioned, later in the day, it sped up, once everyone stopped hitting it incessantly.

I went to SSRS Report Manager to drill down to the properties of this report. On the 'Snapshot Options' tab, I enabled 'Allow report history to be created manually' and 'Store all report snapshots in history' checkbox items. In the 'Report History' tab, I saw nothing, as expected. I planned on revisiting this tab to see the progress of adding caching and snapshotting. On the 'Processing Options' tab I chose 'Cache a temporary copy of the report. Expire copy of report after a number of minutes:' and chose 240 as the minutes. I also took this opportunity to select the 'Do not timeout report' option, hoping to allow the slow report to simply continue, instead of eventually dying, which prompts others to try the same thing. And fail.
The last option that I did was to setup a cache plan on the 'Cache Refresh Options' tab. I chose 2 plans, after careful review of the executions seen from the query above. This query helped me see the times that were executed, and when a cache would be most profitable to this report.

With these 2 cache plans created, and the rest of the options selected, I was happy that the report would run better next week. As Monday rolled around, I was interested to see the results. So I grabbed the above query and executed it mid day Monday, only to encounter report executions. Meaning I could only see individuals that had fired off this report, and the time it took them to render. (which was fast). I was unable to see cache executions. I would have thought that there would be an execution at the times indicated in my cache plan, and I could see how long those took. But this query failed me. To the twitterverse!!!

After receiving some help from my #SQLFamily via @markvsql, @jasonhorner, @RateControl, and @tameraclark, I was able to create this query.

  1. SELECT 
  2.     ItemPath,
  3.     (TimeDataRetrieval + TimeProcessing + TimeRendering)/60000 AS [Time],
  4.     [RequestType], 
  5.     [TimeStart]
  6.   from ReportServer.dbo.ExecutionLog3 el
  7.   WHERE [ItemPath] like '%revenue%'
  8.   ORDER BY [TimeStart] desc

                  This uses ExecutionLog3, which has different fields in its view. Happily, some of these fields are exactly what I need. I can add into the where clause options to show 'Interactive' (ran by a human) or 'Refresh Cache' (ran by the caching mechanism) and see how long each took.

                  Looking at these results showed that the cache took longer than the interactives, as expected. But when you compare the 'Interactive' times to pre cache times, the results are astounding. The caching takes a chunk of time, and then subsequent executions are tremendously smaller in execution time. Meaning, let the system create the cache on its cache plan schedule, and then the humans can simply run the report, and get the cached results quickly. No more multiple executions. We went down from 28 executions that never seemed to complete, to 4-5 that are a lot faster.

                  After a few Mondays, and reviews of the resultant data, I believe I have narrowed it down to the appropriate options. I kick off the cache about a half hour before the meeting on Monday morning, and then again at noon. This allows for caching of the report data twice on a Monday, allowing the 'Leadership' meeting to have quick report executions, and cached data in the morning, and the noon cache allows for other teams to utilizes a fresher cache in their afternoon meetings. But as soon as the cache reaches 240 minutes, it expires, which reverts users back to non cached data retrieval. The rest of the week the report should run quickly in the random times it needs to run.

                  Point of this story is that it is important to use data to make data faster. And not a single solution is better than the other, and the appropriate solution may be something you are unfamiliar with. I did not know or utilize caching in reports much prior to this. I would have normally added indexes and tweeked the queries used to perform the data retrieval, trying to squeeze all the juice out of them that I could. But a simpler solution was to understand the use of this report, compare to other non peak time of use, and adjust appropriately.

                  I believe that this solution is valuable, and suits my needs, and satisfies the users of this report. Thus I share it with you in the hopes you too can learn as I did.

                  No comments: