

The expression filters all the columns, and the requirement is that the value of all the columns belongs to a row in the cluster. Therefore, the expression computed for the Blue cluster looks like this: Instead, it creates a filter context using all the columns in the cluster, filtering the values that are present in the cluster.įor the Blue cluster we have three bikes for 300.00 USD and four shirts accounting for 400.00 USD. Here is the first of the many shenanigans SUMMARIZE tends to engage in: to restrict the calculation to an individual cluster, SUMMARIZE does not create a filter context containing the cluster header only. Once the clusters are ready, SUMMARIZE computes the value of SUM ( Sales ) for the three clusters. The cluster header can contain multiple columns, even though in this first scenario we have only one column. The cluster header is the set of columns used in the groupby section of SUMMARIZE. In our scenario, Sales is the cluster header. Because we are grouping by Sales, SUMMARIZE splits the Sales table into three clusters based on the color.īecause we grouped by color, each cluster is identified by one color. SUMMARIZE first clusters the table based on the color, and then computes the expression for each cluster by creating a filter context that limits the calculation to the iterated cluster. Clustering is the creation of partitions based on the columns used to group by. What it does is more complex.īecause the query needs to group by color, SUMMARIZE splits the table in partitions – one for each color. Unfortunately, this would be an approximation of the steps performed by SUMMARIZE. Intuitively, we might think that SUMMARIZE started by grouping Sales by Sales that it then iterated the result by computing the sum of Amount for all the rows with the same color, by creating a filter context over the color. This query produces the sales amount by color. On this simple table, we run a SUMMARIZE query that works just as expected:

We use a table with only seven rows to introduce clustering. Introducing clusteringĬlustering is a technique used by SUMMARIZE to compute its result. We introduce the concepts one step at a time: first clustering, then the role of expanded tables in clustering, and finally the effect of the presence of both a row and a filter contexts in the same formula. In order to correctly understand SUMMARIZE you must understand how clustering works, what impact the presence of both a row and a filter context has, and the role of expanded tables in clustering. It could be even worse: The results might seem correct during the first tests, and yet this might only mean that SUMMARIZE is silently waiting to stab you in the back, by producing an incorrect result – nearly impossible to debug – as soon as you go in production. The thing is that the semantics of SUMMARIZE are so convoluted that the results might be unexpected. It is not that these functionalities do not work. The other two operations – creating new columns and computing subtotals – should be avoided. Out of the three main operations of SUMMARIZE, only the first one is safe.

It can create new columns, computing expressions in row and filter contexts.It can group a table by any column, of the table itself or of related tables.Then, in 2016 SUMMARIZE abdicated its role as query king, leaving the throne to the easier and more powerful SUMMARIZECOLUMNS.īeing designed as a querying function, SUMMARIZE performs several operations: SUMMARIZE was the main query function in DAX to produce reports for a few years. Still, there is value in being aware of the details. Once you have digested the content of this article, you are likely to no longer use SUMMARIZE to compute expressions.
DAX SUMMARIZE FULL
If this is the case, then the goal of this full article is to explain exactly how SUMMARIZE works and why using it to compute new columns is a bad idea. With that said, you might be a curious reader: you might want to understand why best practices are indeed best practices.
DAX SUMMARIZE CODE
We understand how SUMMARIZE works but we do not want your code to return inaccurate results, just because you use a function without understanding when its result might be different from the result you expected. Open your existing DAX code, search for SUMMARIZE and if you find that you are using SUMMARIZE to compute new columns, add them instead by using ADDCOLUMNS.Īt SQLBI we are so strong on this position that we deliberately omitted a part of the detailed description of the behavior of SUMMARIZE in our book.

If you are using SUMMARIZE to calculate new columns, stop. If you like to follow best practices, you can just read this paragraph out of the entire article.
