“Multi-threading” the Sql Server Analysis Services Formula Engine – I #ssas #mdx

The SSAS single threaded Formula Engine (FE)

It’s a known (also unfortunate…) fact that analysis services (multidimensional) formula engine (FE) is single threaded.

Now, for anyone not familiar with SSAS this should come as a very unusual  limiting factor. Honestly, in a world dominated by the cloud, map/reduce, MPP/shared nothing platforms, (imo) the only plausible reason to still have a single threaded SSAS FE, is that we are really looking at an amazing product from a group of amazing minds that, unfortunately, hasn’t got any major upgrade since its very first 2005 release. (aside from 2008 block computation improvements ), and yes the reason for that reason would need another kind of post…

Well, not really single threaded…

A relevant note nevertheless is that, when we say SSAS FE is single threaded, we’re not talking about a single threaded engine FE per server, where we would have a single thread handling all session queries. That would be obviously unsustainable. What seems to happen -as far as I can guess-  is that the FE is single thread per query. When the work is heavily FE bounded, a single thread will handle the FE load. Other simultaneous queries will get their own FE thread from the thread pool. (note: the number of total available worker threads is not unlimited of course)

A very simple query will show this behavior a (using with clause to inhibit the FE cache –a tip from Chris I use a lot – and median calculation to make it very CPU/FE bound):

with
   member [Measures].[Test1] as
   ‘median([Customer].[Customer].[Customer],[Measures].[Internet Sales Amount])’
   member [Measures].[Test2] as
   ‘median([Customer].[Customer].[Customer],[Measures].[Internet Tax Amount])’
   member [Measures].[Test3] as
   ‘median([Customer].[Customer].[Customer],[Measures].[Internet Order Quantity])’

select
non empty {[Measures].[Test1],[Measures].[Test2],[Measures].[Test3] } on 0,
non empty [Date].[Calendar].[Month]*[Product].[Category].[Category] on 1
from [Adventure Works]

This query running on my laptop (SQL 2012 SSAS) will average 45-55 secs duration, with a very constant 20%-25% cpu usage (4 listed cores):

image

   

image

 

image

In a multi-core server environment  this would be even more noticeable, with very low total cpu% usage during that query FE execution.

The real issue (for end users at least) is that adding more cores/cpus will not have any significant performance  improvement for queries like this –but faster cpus should help a bit-. A single thread will handle all the load, all the cpu% power on recent hardware will be completely wasted. You will see several FE heavy queries taking several seconds to complete, almost without any kind of cpu/io pressure. “Uncomfortable” to watch at least.

“Partitioning” MDX queries & multi threading

If a “final round” can be expected for SSAS multidimensional I really hope that this could be somehow fixed (configurable at least), in a way that multiple threads could handle the FE load when appropriate.

(relevant update, see comments below, thanks Chris! don’t miss this post from Chris Webb, Parallelism and MDX Queries, an amazing piece of work using ssas proc/clr to multi thread some specific query patterns, may be this can be fine tuned for a wider range of scenarios)

Partitioning the workload by unrelated measure makes for a very strong candidate. (related measures will share intermediate calculations,cache, and probably won’t benefit so much from multi threading).

An example, let’s split that previous query into three, one for each measure requested, (note that I used 3 unrelated base measures)

(query 1, the others are similar)

image

Executing them in parallel will result in something like: query1: 24 secs, query2: 23 secs, query 3: 23 secs. But as they executed simultaneously, we only have to wait for the longest one to get all the data we need, ~24 secs. Less 50% of the original single query duration.

CPU usage now spikes predictably to ~75% during the ~25 secs queries duration.

 

image

Possible scenarios for query partitioning

So, what do we take from here? Well, it depends a lot on each specific query workload… But it can come handy in some scenarios:

  • you have a system were cpu capacity is clearly underused most of time
  • you control the queries sent to ssas, through a web frontend for example or you are using a frontend that can use multiple parallel requests (tip: performance point)
  • queries are easy “partitionable“, eg: by measure
  • you already did everything possible regarding the usual ssas performance tuning best practices (at least everything -you think- is possible)

In these scenarios,  you can try to partition the queries, execute and aggregate the results. There’s a chance that your users can benefit from better response times.

(and of course partitioning queries like this is a poor solution for badly written MDX that tends to be very FE intensive. For that my recipe is always refer back to Chris, Marco, Mosha, Piasevoli, Greg, Kasper, Richard, and many others, never forgetting our amazing BI team at DevScope :) )

Next post: “Multi-threading” the Sql Server Analysis Services Formula Engine II – Performance Point

Stay tuned!

Rui

Advertisements

SmartPivot–Free Little Excel Add-In for Olap Pivot Tables, Easily create Excel dashboards from olap cubes,Auto Synchronize Excel Pivot Tables filters

(update: SmartPivot version 2.3 – a much improved version- has just been published, please visit SmartPivot Product Page for the latest version features including instant search, quick connect , a stunning pivot viewer visualization for Excel tables & many others)

image

SmartPivot Latest Version for Excel 2007, 2010, 2013 & PowerPivot

 

Note: the content below is now outdated

So, following my previous post “Excel 2007/2010 Pivot Tables: Getting detailed/granular table reports from OLAP… in seconds! “ , I’ve decided to test the feature a little further, including it on our little add-on for Excel, DevScope SmartPivot. And doing that also decided to get some public feedback, making this little Excel  add-on publicly available. :)

So… what’s SmartPivot? It’s an Excel add-on that targets olap pivot tables & charts, resulted mainly from some requests from our business intelligence customers. Little things like this:

image

Easily create graphical dashboards over olap cubes

-Excel 2007 or 2010!
-Create connected charts
-Automatic filter configuration
-Filter sync
-(no, no slicers in the screenshot :) )

image

Synchronize filters across pivot tables

-Creates connections between different pivot table filters
-Auto-Update

image

Create tabular reports from olap cubes **BETA**

-Following my last post  on this
-Works great with standard, not calculated, measures
-Free ad-hoc filtering & column selection

 

As you can see, not a lot of features. :) We provide our add-on to existing customers, there is not a any kind of roadmap for some a full featured, as in “paid”, product (we also lack the time for it :) ). It’s basically an exercise of “creativity”, trying to find simple, smart, ways  to make Excel a better BI tool for handling OLAP browsing. No promises . :)

So, how to use it? First download at http://www.devscope.net/products/SmartPivot and install it.

 

Then you’ll need an Excel Pivot Table to access SmartPivot features, simple right click and explore the features:

 

image

 

Example: Creating Dashboards with SmartPivot

Start with a pivot table and add some filters

 

image

Then select SmartPivot > Create Connected Chart:

 

image

 

The pivot chart is created, properly connected to the filters pivot table, the pivot filters are replicated to the pivot chart:

 

image

 

Now build the chart as usual, let’s try a trend chart by  product :

 

image

 

 

So, let’s repeat this a few more times in order to get some interesting “360º view” style dashboard, adding more connected charts:

 

image

 

Now I’ll make it a little bit larger, with 3 more easily connected charts we get something like this:

 

image

 

Now the fun begins :) , just change the main pivot filter as needed, let’s pick some specific years and geography (time to get local :) and see how’s Portugal doing):

 

 

image

 

As you see, chart filters are in sync with the top pivot filters. In fact, let’s add some new filters:

 

image

 

The new filters we’re also replicated to the charts below. So that’s it, almost no effort and you have a more useful view of your data directly in Excel!  You can save and reopen the file later, change the filters, the behavior is what you would expect, the filters always drive the charts below.

 

So… now you may be asking… but hey, that’s the point of Excel 2010 slicers!!! :) And you’re partially right… but please note the following:

 

  • The first working draft of this feature dates back to 2009, before Excel 2010 slicers (although it as been rather interesting seeing that slicers use the same, cof, “dirty” trick ;) )
  • SmartPivot works with Excel 2007 & Excel 2010, Slicers are Excel 2010 only
  • Is it me or there is no really simple way of creating dashboards from OLAP in Excel 2010, even with slicers/powerpivot, without either build a powerpivot model or hacking through all the charts, connections, pivots and slicers manually? If my understanding is right…it’s a mess!!!! (but may be I’m missing something here)
  • Try to get Excel slicers working with complex deep 3,4,5 level hierarchies (even the simple example above has uses lots of levels)…
  • Excel 2010 has a great search member features, you don’t get that with slicers, but as SmartPivot uses only standard pivot filters, all existing features apply, search included

For example, let’s search some projectors in another sample dashboard:

image

 

The filter selection is sent to the dashboard charts :

 

image

 

So, that’s it for the dashboards the part. Now the rather risky *beta* feature of olap table reports, based on my previous post.

 

Example: Creating “faster” tabular reports from olap data

Let’s say we need a regular table style report, high detail report, from our olap cube. We’ve seen how Excel has some problems handling this type of queries in pivot tables, let’s how we could do this using SmartPivot.

Again, start with a pivot and some filters, and select Smart Pivot >  Create Table Report:

image

 

Select the columns you want in the report (not the most amazing column select screen I guess, hey, it’s beta :) ). Let’s make it heavy and select a bunch of columns like date, product, sales manager, etc, ie near to drilltrough style of query :

image

 

And that’s it, a few moments later we get the report:

image

 

Let’s add some sorting (date), formatting, size the columns, etc:

image

 

Ok, 74 rows it’s really not that impressive, let’s increase our filter scope  a little bit, let’s say Europe & All Products:

 

image

 

So, now that’s more like it, 338,584 rows from contoso_retail cube, in about 20 seconds, getting 11 columns. Kind of cool right? :)

That’s the fun part, and although it’s been interesting to explore this rather “hacky” feature and seeing what it can do, use it with caution. Due to internals of the olap engine it works rather well with standard measures, not so well with calculated measures (don’t expect to get thousands of rows when requesting the calculation engine…).

Anyway it’s been a life saver on some projects, being able to allow users to see data in this way, let they choose the filters,etc. At least it shows that the “for that use reporting services with some hard coded filters and query the dw database directly“ has other strong alternatives.

So the public download listed above allows you to install, test  & use this Add-On. As we don’t have any kind of roadmap for it at this point (may be we’ll put parts of it on codeplex, how knows)  the public version allows the creating of dashboards up to 4 charts (kind of usual right?) & table reports with up to 8 columns. Anyone who wish to take it to some deep dive testing just drop my a note  and I’ll send you a license.

All feedback will be greatly appreciated!

 

(Related posts & reference material)

 

Excel 2007/2010 Pivot Tables: Getting detailed/granular table reports from OLAP… in seconds! – Rui Quintino BI –
http://biresort.net/blogs/rquintino/archive/2010/07/25/excel-2007-2010-pivot-tables-getting-detailed-granular-table-reports-from-olap.aspx

Richard Lees on BI: Improving Excel’s Cube Performance
http://richardlees.blogspot.com/2010/04/improving-excels-cube-performance.html

Binding an Excel table to the results of an MDX query « Chris Webb’s BI Blog
http://cwebbbi.wordpress.com/2009/12/18/binding-an-excel-table-to-the-results-of-an-mdx-query/

 

 

 

RQ

Excel 2007/2010 Pivot Tables: Getting detailed/granular table reports from OLAP… in seconds!

(update: SmartPivot for Excel version 2.3 – a much improved version- has just been published, please visit SmartPivot Product Page for the latest version Excel BI features including fast tabular reports, instant search, quick connect , a stunning pivot viewer visualization for Excel tables & many others)

image6[3]

SmartPivot Latest Version for Excel 2007, 2010, 2013 & PowerPivot

 

Well, I’ve been delaying  & postponing this post. Being the main reason having some assurance that this could have some actual value :). Really, I’m not sure, but anyway, here it goes!

 

A few weeks (actually I think I now can say months… :) ) I was called to help a customer solve a set of somehow advanced scenarios on Analysis Services. One of the issues, actually the big one, was that end users were having a real hard time getting detailed reports (near to drill trough, maximum granularity “detail” queries) working with Excel 2007 Pivot Tables.

 

My first thought was, honestly :) , “sure… olap & analysis services are mostly aggregate oriented, not the best option for creating large facts/rows detailed reports“. We could say, if being a little “naive”, that there other good options like using reporting services,etc.  IMO that really doesn’t cut it anymore, my reasons:

 

  • End users really don’t care if they want aggregates or detail rows, they want a single BI repository to their information needs. And they’re right! :)
  • Solutions like building  custom reports have the problem that they don’t allow free adhoc report authoring & filtering
  • From a “single version of the truth” perspective (whatever that means :) ) using several tools & repositories to show almost the same information is the best way to go (IMO)

 

So perceiving that end users have good reasons I went a little further to see why, when getting back *only* near 10.000 records with 8/9 fields on a pivot table, Excel 2007 mostly….ahh…”crashed”, or took several minutes to get the “tiny” 10.000 rows pivot table back.

 

And the results were somehow surprising. In this specific scenario it really has nothing to do with OLAP/Analysis Services, but with the mdx query excel 2007 generates, that even with hidden grand totals (to mimic a table result), requests aggregated values to the olap (All member).

 

The problem with is this? Well  for a result of 784 real result rows (what end user sees), using 13 columns, olap processes aprox. 125.000 total cells, and returns aprox 10.000 total rows. But if the result rows are now 6.207 we get 1.211.896  total cells from olap, and  86.564   total olap rows. Why is that?

 

The problem is that Excel still requests all member aggregations even when Excel is not showing that values (that’s a “table” report layout remember?). In this case the yellow rows are returned from analysis services but never shown.

image

The query goes something like this (note drilldown member style of query):

 

image

 

This is unfortunate because a simple custom mdx query using standard cross joins to get the same “table” style results, actually returns on just a few seconds!!! Something like:

image

 

(note: a few weeks later Richard Lees posted exactly the same problem, with great insights & tips: using properties or custom “reporting” hierarchies. Please see here . That’s a great post. This was after my issue, it sure would have helped me at the time! :) )

 

So, although testing this on Excel is difficult due to performance maybe we’re talking of something like (relation between real rows vs olap rows with 13 columns)

 

image

 

So can we do something do to ease things out? And still maintaining end user authoring & filtering ? :) Well, yes we can! We really can’t tailor the MDX query generated by Excel…but.. we can change the mdx query from an external data table query, and we can get the filters from the pivot and using them on our external table query! :)

 

The results are really impressive, actually I was deeply surprised, warm cache queries for 6.207 rows are under a second, for 73.094 rows under 15 seconds, we were actually  able to return more than 1million rows from analysis services, using this technique, in a few minutes.

 

(note: that when using native pivot table for the 73.094 rows scenario I couldn’t get the query to end without first crashing my excel – if you use Prism HUD (great tool btw) the memory & cpu footprint of this native pivot table queries are *very* noticeable…

in fact be careful when exploring several columns excel pivot tables with analysis services, particularly  if both are in the same box…  I got the impression that my disk almost “fried” due to excessive memory usage…).

So, here’s the sample I’ve worked on, I would really like to get some feedback. :)  You can filter the table report like a regular pivot table. It checks the results so that grand totals check, and show the refresh time in seconds. It also does the trick of changing the column labels to the friendly ones.

 

image

Some results (note excel result rows vs olap rows):

 

image

 

Some notes:

 

  • When opening you must enable macros, that’s the way custom mdx query is changed when adding/removing filters on the fast sample
  • I use a named instance for SQL2008, so if you have default instance you have to change it on the connections tab
  • The sample is bases on the Contoso Retail Olap sample
  • If you need to add more columns to the sample just edit the query on the config sheet. As long as you leave the {filters} token
  • Doesn’t support multi select filters (hey, it’s just a sample! actually my customer did an awesome job with this startup, and the final workable excel was great)
  • There’s a strange result with employee jae,pak witch doesn’t show up on the native pivot table (fell free to send feedback on this, my time is limited :) )

image

 

 

 

 

So, that’s it. I think there’s some potential here I and would like to share & receive feedback. I talked with several people on the possibility that we were missing some excel option to fix this, and that it’s still open :) , if you know one please share it.

 

Richard Lees mentions that this should be fixed on 2010, but I’m still getting the same behavior on the sample, again, maybe we missed something. I also would like to share some final notes:

 

  • End users really shouldn’t be worrying about using the analysis services only for aggregated values. What they want is a single repository for their information needs, specially if the data scope is the same! I actually think there should be a really, cof, :) Unified Model that covers both type of scenarios.
  • As it seems that we can actually use olap to do, at least some, more granular queries… why doesn’t excel offer a real table style, no totals, end user authored pivot table? Doesn’t seem difficult to me…

 

Thanks!

 

Rui

 

Update Aug-2010:

Update Sep-2011: (some notes from Chris Webb on the issue, now regarding denali tabular & dax scenarios-surprisingly…still a problem! -)

Sample Download:

 http://cid-8b11b4c8b8ea33ad.skydrive.live.com/redir.aspx?resid=8B11B4C8B8EA33AD!223&Bpub=SDX.Docs&Bsrc=GetSharingLink

DevDays DAT302, Tuning & understanding MDX Query Performance in SQL Server 2008 Analysis Services

 

 

image

 

A little bit late I know , but now in english,. Thanks for the “heads up” Pedro.

 

Anyway, here it is, all the materials I’ve used on the DevDays session.Notes of interest:

  • Report/Dashboard that shows the mdx trace log, properly organized by query
  • Possible to filter, using sub report links
  • Use of a custom report in management studio
  • I think it should run fine in sql 2005 management studio (needs sp2 for custom reports)
  • Sample code, use as is, can  be buggy! :)

All feedback is welcome!

 

Materials

http://cid-8b11b4c8b8ea33ad.skydrive.live.com/embedrowdetail.aspx/DevDays2009-DAT302
http://cid-8b11b4c8b8ea33ad.skydrive.live.com/self.aspx/DevDays2009-DAT302

 

Resources

MDX Studio
http://www.mosha.com/msolap/mdxstudio.htm

Blog Mosha Pasumansky
http://sqlblog.com/blogs/mosha/

BIDS Helper
http://www.codeplex.com/bidshelper

A Solution for Collecting Analysis Services Performance Data for Performance Analysis
http://www.codeplex.com/SQLSrvAnalysisSrvcs

Microsoft SQL Server Community Samples: Analysis Services
http://www.codeplex.com/SQLSrvAnalysisSrvcs

SQL Server 2008 White Paper: Analysis Services Performance Guide
http://www.microsoft.com/downloads/details.aspx?FamilyID=3be0488d-e7aa-4078-a050-ae39912d2e43&displaylang=en

SQL Server Best Practices Article: Identifying and Resolving MDX Query Performance Bottlenecks in SQL Server 2005 Analysis Services
http://www.microsoft.com/DOWNLOADS/details.aspx?FamilyID=975c5bb2-8207-4b4e-be7c-06ac86e24c13&displaylang=en

Microsoft SQL Server 2008 Analysis Services Unleashed (Paperback)
http://www.ssas-info.com/announcements/1296-new-book-released-qmicrosoft-sql-server-2008-analysis-services-unleashedq