“Multi-threading” the Sql Server Analysis Services Formula Engine II-a parallel query msmdpump proxy
December 5, 2013 Leave a comment
Following a previous post (“Multi-threading” the Sql Server Analysis Services Formula Engine – I #ssas #mdx ), we returned recently to the issue of multi threading and SSAS formula engine. See that last post or this post by James Serra for reference.
Honestly it’s kind of amazing that something like the SSAS engine can run incredibly well using a single threaded model for each query. It’s damn smart, and as long it runs smart you usually don’t need brute force. :)
Until… you find yourself with a very (very) complex cube, together with a rather complex scorecard model also completely built in SSAS/mdx, and…. PerformancePoint with its “with clauses” (and like Chris Webb pointed before, with clause disables FE cache)
Sample query generated by PerformancePoint (a scorecard model, kpis on rows), running in 50-60 secs.
To return a complex and FE intensive scorecard with several different KPIs (completely different measure groups), with values/metrics like value, ytd value, prior year, end of year projection, target, ytd target, score and so on… well, requires a not so usual amount of computation by SSAS. Worst, due to the, let’s call it a “reasonable”, :) amount of mdx scopes involved it triggered some internal thresholds i, and it stops being smart… maybe switching to cell by cell mode? Query 3 kpis individually, none exceeds 2-3 secs, get the 3 together in the same query, –> 30-40 secs….
After exhausting all the tuning we could possibly remember… reducing down the query from 1m20 secs, to under 40-50 secs, but still annoying, why all the CPU power if we are waiting for damn 50 secs?
Another thing to note was that every row was a different KPI, completely isolated from all the other rows returned, really a very good candidate for partitioning and multi threading. But SSAS doesn’t do that (query hint would be great SSAS team ;) ),
(and this where I have to say the usual disclaimer, please do this at home, never at work! it’s not supported… aside from tolerating that 50 secs what follows is the worst possible thing you could do…)
We had previously built some SSAS http msmdpump proxies before (it allows you to query SSAS server over http) adding some “extra” features we needed…
So why not trying intercepting that query in a “fake” msmdpump (proxy) built in asp.net, partitioning it by member rows requested, run a bunch of parallel MDX queries against the SSAS DB, get the results, join the cell set together, and return as a SSAS XMLA pump reply….? (kinda…nuts….yes)
And well, far, far, far away from being a reusable module for this scenarios we built it for only this specific one, changed the connections in PerformancePoint pointing to the http endpoint (only for this kpis/scorecards), and gave it a go:
From here, query running to 50 secs, almost no cpu activiy:
To here, query now running under 14 secs (multi-threading is very noticeable, obviously you should have spare cpu available, not for a already heavily loaded server ):
Short story, how it works?
1-a web app gets the XMLA http requests targeted to the isapi SSAS msmdpump, inspects them for a very specific pattern. everything else passthrough to the regular msmdpump (debug/testing) or blocked (production server, we only need to answer a very specific pattern from performance point)
2-it splits the member list requested in the mdx rows (on rows), generates an identical MDX query for each group of N members
3-using .net parallel apis then sends x simultaneous queries/threads to the local SSAS db:
4-and the tricky part, getting the result cell set xml for each query, and aggregating that in a unique cellset xmla that then is returned to the original http request made by performance point (you can also fire a man studio mdx query and passing it through the proxy)
Honestly I don’t even know if it’s a good snippet to share , but well we confirmed that it can work in very, very specific scenarios, as a last resource when you can’t influence queries being made (tried that also… didn’t work out so good…) …. If you risk being fired for not tuning that query that the CEO runs every day, you may want to test something like this… otherwise forget what you’ve read. :)
But if this post happens to generate some requests where it can be put to good use I consider to share it privately or even on our codeplex sample site.
Please be warned that in most cases you won’t need anything like this, SSAS FE engine should work just fine because it’s very smart (and we are not! :) ), only for very complex cubes and specific scenarios something as risky as this can be possibly useful.
ps-this also opens a rather interesting possibility, spreading the inner requests to secondary servers in a scale out farm… imagine that working automatically for some queries… will we get this for SQL 2016? ;)
As for the code, its goes something like this (lots of complexity removed, just the core steps):