Friday, June 24, 2016

SSAS Cube Aggregation Design


Several features of your cube design must be borne in mind when designing aggregations, because they can influence how Analysis Services storage engine queries are made and therefore which aggregations will be used. These include:

·         There's no point building aggregations above the granularity you are slicing your partitions by. Aggregations are built on a per-partition basis, so for example if you're partitioning by month there's no value in building an aggregation at the Year granularity since no aggregation can contain more than one month's worth of data. It won't hurt if you do it, it just means that an aggregation at month will be the same size as one at year but useful to more queries. It follows from this that it might be possible to over-partition data and reduce the effectiveness of aggregations, but we have anecdotal evidence from people who have built very large cubes that this is not an issue.

·         For queries involving a dimension with a many-to-many relationship to a measure group, aggregations must not be built using any attributes from the many-to-many dimension, but instead must be built at the granularity of the attributes with a regular relationship to the intermediate measure group. When a query is run using the Sales Reason dimension Analysis Services first works out which Sales Orders relate to each Sales Reason, and then queries the main measure group for these Sales Orders. Therefore, only aggregations at the Sales Order granularity on the main measure group can be used. As a result, in most cases it's not worth building aggregations for queries on many-to-many dimensions since the granularity of these queries is often close to that of the original fact table.

·         Queries involving measures which have semi-additive aggregation types are always resolved at the granularity attribute of the time dimension, so you need to include that attribute in all aggregations.

·         Queries involving measures with measure expressions require aggregations at the common granularity of the two measure groups involved.

·         You should not build aggregations including a parent/child attribute; instead you should use the key attribute in aggregations.

·         No aggregation should include an attribute which has AttributeHierarchyEnabled set to False.

·         No aggregation should include an attribute that is below the granularity attribute of the dimension for the measure group.

·         Any attributes which have a default member that is anything other than the All Member, or which have IsAggregatable set to False, should also be included in all aggregations.

·         Aggregations and indexes are not built for partitions with fewer than 4096 rows. This threshold is set by the IndexBuildThreshold property in msmdsrv.ini; you can change it but it's not a good idea to do so.

·         Aggregations should not include redundant attributes, that is to say attributes from the same 'chain' of attribute relationships. For example if you had a chain of attribute relationships going from month to quarter to year, you should not build an aggregation including month and quarter—it should just include month. This will increase the chance that the aggregation can be used by more queries, as well as reducing the size of the aggregation.

·         Aggregations usability % is higher if the aggregated attribute is at a lower/Medium level in a user defined hierarchy. So it becomes imperative to scan through the attribute relationship usage of Dimension before we add any attribute to aggregation design.

·         To gauze the usage of an aggregation,  Look  out for Query Sub cube , Query sub cube Verbose  and Get data from aggregation events in AS Trace . If these are getting hit , then your MDX query is using the aggregations.

·         Calculated Measure Expressions cause queries to be evaluated at the common granularity of the two measure groups involved. Unary operators other than + and Custom Rollups will also force query granularity down. For Example If Total Impressions =AdExpert Impressions+ AdRevenue Impression . Then Any Mdx query will utilize an aggregation if the designs on the Measure Groups are having aggregation defined at same level of querying Dim attribute else it won’t be utilized.

·         Above All, Aggregations addition/deletion do not UnProcess a cube .But You have to do a process index on selective partition/Measure Group/ whole cube  to see them in effect.

·         Calculated Measure Expressions cause queries to be evaluated at the common granularity. (Aggregation not defined or defined at same grain in appnexus/mocean and adexpert.)

·         Aggregations should not include redundant attributes, that is to say attributes from the same 'chain' of attribute relationships.