PIVOT, Multi Pivot & Dynamic Pivot in SQL Server


Summary

Pivoting is a technique used to rotate(transpose) rows to columns. It turns the unique values from one column in one table or table expression into multiple columns in another table. SQL Server 2005 introduced the PIVOT operator as a syntax extension for table expression in the FROM clause. PIVOT, the relational operator is a T-Sql proprietary operator and is not part of ANSI SQL Standard.

PIVOT operator structure

Rotating(Pivoting) one table or table expression into another  table requires three different elements

  1. Grouping element
  2. Aggregating element
  3. Spreading element

The PIVOT operator accepts only Aggregating and Spreading elements. To avoid possible logical errors we must have a clear understanding of all three parameters, especially the Grouping element.

The following example demonstrates the three elements in action.

Let’s say we want to present the sum of freight(Shipping cost) values per order year for each country that ordered our products.
Set up dbo.Orders_TestPivot table. The script can be found here.

The PIVOT queries below transpose columns from a table expression (ShipCountry, Freight, and OrderYear) into a new table.
The queries are logically identical although they use different types of table expressions. The version on the left uses a Derived query and the one on the right uses Common table expression(CTE).
More on table expressions can be found here:
My personal preference is the CTE version, so i’ll use that in the following examples. 🙂

Derived query table expression Common Table Expression

The figure below visually maps the elements of the PIVOT operator and the final result set.

Figure 1, PIVOT Operation

My personal way of thinking when creating a PIVOT query is;

  1. Sketch the final result-set and visualise all three elements required for PIVOT operation
  2. Define a table expression(CTE) that returns:
    1. Spreading element – what we want to see on columns – OrderYear
    2. Aggregate element – what we want to see in the intersection of each row and column – Freight
    3. Grouping element* – what we want to see on rows – ShipCountry
  3. Add  PIVOT operator. The pivot operator returns a table result – in our example, the table result has alias PVT.
    1. Include aggregate function applied to the aggregate element – SUM(Freight).
    2. Include the FOR clause and the spreading column – FOR OrderYear.
    3. Specify the IN clause and the list of distinct, comma-separated values that appear in the spreading element. [2018],[2019],[2020] . In our example we have a list of irregular identifiers* that needs to be delimited.
      If we added a non-existing value to the IN list e.g [2099], the query would execute with no error but with the NULL aggregated values 🙂
    4. Specify an alias for the PIVOT result table – PVT
  4. Specify the final SELECT. The columns are selected from the PIVOT result table. The sequence of the selected columns is not relevant.

Note: Irregular identifiers:
We use identifiers to name(identify) Sql Server’s objects i.e stored procedures, tables, views, constraints, column names, attributes ..etc. There is a set of rules for creating identifiers i.e The first character cannot be numeric, so e.g 2018 is an Irregular identifier. To be able to use irregular identifiers we need to “fix” their boundaries/limits or to deLimit them. To do that we can use double quotation marks – 2018 or tSQL specific – square brackets;  [2018]. More about Sql Server Identifiers can be found here.

An interesting thing about the PIVOT operator is that it does not include grouping element. The grouping element is “everything else” that is not a spreading or an aggregating element. In our example, the grouping element is ShipCountry column selected in the table expression.
If we selected e.g ShipCity along with ShipCountry as the two columns that are not a spreading or an aggregate element, the result would be different.


Figure 2, Group By ShipCountry and ShipCity

This behavior can cause logical errors, especially if we apply the PIVOT operator directly on a table.

In the next experiment, we are not using a table expression to prepare the data-set for the PIVOT operator. Instead, PIVOT now operates over the entire table. It implicitly(automatically) groups data by all columns except the orderDate and Freght columns. As we can see in Figure 3, the query produces an unexpected result


Figure 3, PIVOT operation directly on a table

To avoid possible logical errors, it is always a good practice to first construct a table expression with the implicitly defined PIVOT elements(grouping, spreading, and aggregating), and then apply the PIVOT operator on the prepared data set.

Multi aggregate pivot

A PIVOT operator can handle only one aggregate element at a time.  This means that if we want to use more aggregate elements we need to add more PIVOT operators to our query – a PIVOT operator per aggregate element 😐
In the previous example our aggregate element was Freight when we calculated the total shipping costs in different countries per year.
This time, we want to calculate the average value of the orders placed in different countries per year and to add the results to our query.
Figure 4 shows the desired result

Figure 4, Multi aggregate PIVOT- two aggregate elements

From the result, we can see that the second result-set is just “appended” to the first. Basically, we just combined the two PIVOT results using an INNER JOIN table operator and an equality predicate on the ShipCountry column.
The final query uses column aliases to indicate the different data sets.
Figure 6, Multi aggregate PIVOT operation

The query in Figure 6 can be found here.

Dynamic PIVOT

A disadvantage of the PIVOT operator is that its IN clause only accepts a static list of spreading values. It does not support e.g a sub-query as input. This means that we need to know in advance all the distinct values in the spreading element. The “hard-coding” may not necessarily be a problem in cases when we deal with a spreading element with the known spreading values e.g OrderYear.
Going back to the first example, we can easily expand the IN list with the spreading values that are not available yet.

Things get more complex when we cannot predict all possible spreading values. In these situations, we can first design a query that will give us a distinct list of spreading values and then use that list to dynamically construct the final PIVOT query, the Dynamic Pivot.
A typical scenario in which we use Dynamic pivoting is when transposing attributes of an EAV*(Entity-Attribute-Value) data model.

EAV* is one of the open-schema data models (xml, json, clr) that, in some cases, can provide more flexibility than the relational model. Here is an interesting post about EAV.

Let’s say we have a list of Products. Each product is different and can have a specific set of attributes. e.g a bicycle can have a specific type of tires and a hard drive can have a specific capacity..etc. Business frequently adds new products and product attributes. In the next example, I used a simplified EAV model to store the products. The table script can be found here.

Our next task is to return a row for each distinct product, a column for each distinct product attribute, and in the intersection of each product and attribute, we want to see the value of the attribute.

Figure 7 shows the desired output for all products and for a specific product
Figure 7, Dynamic pivot result

In this scenario, we cannot know all the possible Attributes(the spreading element values). Moreover, the list of attributes is constantly changing, so hard-coding the IN list is no longer an option.
The following is a  dynamic pivot query that can give us the result in Figure 7.

NOTE: To extract a known Attribute value, in this case, we can use MAX() or MIN() aggregate functions. Both functions will operate on a single value and will return a single value. Keep in mind that MIN and MAX as well as all other aggregate functions except COUNT(*), ignores NULL values.

The new attributes will be automatically handled by the dynamic query.

A couple of versions of the dynamic query can be downloaded here.

Conclusion

Pivoting is a technique used to transpose rows to columns. PIVOT is tSql proprietary operator and is not part of the ANSI Standard. PIVOT operator accepts two parameters; Spreading element or what we want to see on columns and aggregating element or what we want to see in the intersection of each distinct row and column. The grouping element is the third parameter involved in pivot operation. It is what we want to see on rows. The grouping element is not a formal part of the PIVOT operator and represents all columns that are not defined as spreading or aggregating elements. The implicit nature of the grouping element can lead to logical errors. This is why is recommended to construct a table expression for the PIVOT operator that provides exactly three elements needed for the operation.
A PIVOT operator is limited to only one aggregate function. To perform multi aggregate pivot we need to introduce a PIVOT operator per aggregation.
The IN clause of the PIVOT operator accepts only a hard-coded, comma-separated list of spreading element values. In the situations when the values are not known, we use dynamic sql to construct the query.

 

Thanks for reading.

Dean Mincic

6 thoughts on “PIVOT, Multi Pivot & Dynamic Pivot in SQL Server”

  1. Hello,

    very great and particulary the dynamic pivot. I’m searching a long time for this

    thank’s

  2. A really great article with good example and step by step clear explanation.
    Thanks for sharing the knowledge.

  3. In your multi-pivot example, the spreading element remains the same, but what if you want to spread over 3 different columns? For example, if you have employee data and want to group by Office and show employee counts for each combo of job cagtegory, region, department. Assuming there are 2 categories, 2 departments and 2 regions.
    So the output columns for each row would be:

    office, cat1-east-sales, cat1-east-tech, cat1-west-sales, cat1-west-tech, cat1 totals ,cat2-east-sales, cat2-east-tech, cat2-west-sales, cat2-west-tech, cat2-totals

    Is that even something applicable for the PIVOT operator?

    1. Hi BoaMike,
      Thanks for reading the post.
      One approach may be to prepare Emp counts per “combos” i.e Category + Region + Department before pivoting. Each combo can be represented by a unique name eg. cat1-east-sales.
      This is just an idea 🙂

      DROP TABLE IF EXISTS #multiSpreadPivot
      GO

      SELECT tab.OfficeId
      ,tab.Category
      ,tab.Region
      ,tab.Department
      ,tab.EmpCount
      INTO #multiSpreadPivot
      FROM ( VALUES ('Office_1','Cat1','East','Sales',10),
      ('Office_1','Cat1','East','Tech',5),
      ('Office_1','Cat2','East','Sales',15),
      ('Office_1','Cat2','East','Tech',8),
      ('Office_2','Cat1','East','Sales',12),
      ('Office_2','Cat2','East','Sales',22),--
      ('Office_2','Cat1','East','Tech',7),
      ('Office_2','Cat2','East','Sales',3),
      ('Office_2','Cat2','East','Tech',11),
      ('Office_3','Cat1','East','Accnt',30),

      ('Office_4','Cat1','West','Sales',17),
      ('Office_4','Cat1','West','Tech',22),
      ('Office_4','Cat2','West','Sales',15),
      ('Office_4','Cat2','West','Tech',8),
      ('Office_5','Cat1','West','Sales',12),
      ('Office_5','Cat1','West','Tech',7),
      ('Office_6','Cat2','South','Sales',3),
      ('Office_6','Cat2','West','Tech',11),
      ('Office_7','Cat1','North','Marketing',43)
      )AS tab(OfficeId,Category,Region,Department,EmpCount)

      SELECT * FROM #multiSpreadPivot

      ;WITH tblExpr AS
      (
      SELECT OfficeId --- grouping
      ,Combo = Category +'_' + Region + '_'+ Department --spreading
      ,EmpCountPerCombo = SUM(EmpCount) --aggregate
      FROM #multiSpreadPivot
      GROUP BY OfficeId,Category +'_' + Region + '_'+ Department
      ), catTotals AS --prepare totals
      (
      SELECT OfficeId
      ,Category
      ,TotalPerCat = SUM(EmpCount)
      FROM #multiSpreadPivot
      GROUP BY OfficeId,Category
      ), piv1 AS
      (
      SELECT OfficeId
      ,p1.Cat1_East_Sales
      ,p1.Cat1_East_Tech
      ,p1.Cat1_West_Sales
      ,p1.Cat1_West_Tech
      ,p1.Cat1_East_Accnt
      ,p1.Cat2_East_Sales
      ,p1.Cat2_East_Tech
      ,p1.Cat2_West_Sales
      ,p1.Cat2_West_Tech
      ,p1.Cat2_South_Sales
      ,p1.Cat1_North_Marketing
      FROM tblExpr
      PIVOT (SUM(EmpCountPerCombo)
      FOR Combo
      IN ( [Cat1_East_Sales]
      ,[Cat1_East_Tech]
      ,[Cat1_West_Sales]
      ,[Cat1_West_Tech]
      ,[Cat1_East_Accnt]
      ,[Cat2_East_Sales]
      ,[Cat2_East_Tech]
      ,[Cat2_West_Sales]
      ,[Cat2_West_Tech]
      ,[Cat2_South_Sales]
      ,[Cat1_North_Marketing] )
      ) p1
      ),piv2 AS
      (
      SELECT OfficeId
      ,Cat1Total = p2.Cat1
      ,Cat2Total = p2.Cat2
      FROM catTotals

      PIVOT (SUM(TotalPerCat)
      FOR Category IN ([Cat1],[Cat2])
      )p2
      )

      SELECT p.OfficeId
      ,p.Cat1_East_Sales
      ,p.Cat1_East_Tech
      ,p.Cat1_West_Sales
      ,p.Cat1_West_Tech
      ,p.Cat1_East_Accnt
      ,p.Cat1_North_Marketing
      ,p2.Cat1Total --total Cat1
      ,p.Cat2_East_Sales
      ,p.Cat2_East_Tech
      ,p.Cat2_West_Sales
      ,p.Cat2_West_Tech
      ,p.Cat2_South_Sales
      ,p2.Cat2Total --total Cat2

      FROM piv1 p
      INNER JOIN piv2 p2
      ON p2.OfficeId = p.OfficeId
      go

Leave a Reply

Your email address will not be published.