The PIVOT
and UNPIVOT
relational operators in SQL Server are powerful tools that allow you to reshape data, transforming table-valued expressions into new table formats. Specifically, PIVOT
rotates a table by converting the unique values from a chosen column into distinct columns in the output. Furthermore, it performs aggregations on any remaining column values, providing summarized and restructured data in your final result. Conversely, UNPIVOT
performs the inverse operation, turning columns back into row values.
For those familiar with creating cross-tabulation reports or needing to dynamically reshape data for easier analysis, understanding PIVOT
in SQL Server is essential. This article will delve into the intricacies of the PIVOT
operator, providing clear explanations, syntax breakdowns, and practical examples to empower you to effectively utilize this feature in your SQL queries. We will also touch upon UNPIVOT
to give you a complete picture of data reshaping in SQL Server.
This guide is applicable to various SQL Server environments, including SQL Server, Azure SQL Database, Azure SQL Managed Instance, Azure Synapse Analytics, Analytics Platform System (PDW), SQL analytics endpoint in Microsoft Fabric, Warehouse in Microsoft Fabric, and SQL database in Microsoft Fabric.
Understanding PIVOT Syntax in SQL Server
The PIVOT
operator simplifies what would otherwise be complex SELECT...CASE
statement constructions, offering a more readable and efficient syntax for data transformation. Here’s a breakdown of the PIVOT
syntax:
SELECT [ <non-pivoted column> [ AS <column name> ] , ] ...
[ <first pivoted column> [ AS <column name> ] ,
[ <second pivoted column> [ AS <column name> ] , ] ...
[ <last pivoted column> [ AS <column name> ] ] ]
FROM ( <table-valued expression> ) AS <alias for source query>
PIVOT ( <aggregation function>(<value column>)
FOR <pivot column>
IN ( [ <first pivot column name> ], [ <second pivot column name> ],
[ <last pivot column name> ] )
) AS <alias for pivot table>;
Let’s dissect each part of this syntax:
<table-valued expression>
: This is the source query or table from which data will be pivoted. It’s the initial dataset you want to reshape.<alias for source query>
: A temporary name assigned to your source query, making it referenceable within thePIVOT
operation.<aggregation function>(<value column>)
: This specifies the aggregate function (e.g.,SUM
,AVG
,COUNT
,MAX
,MIN
) to be applied to the<value column>
. The<value column>
is the column whose values will be aggregated. The result of this aggregation becomes the values in the new pivoted columns.<pivot column>
: This is the column in the source data whose unique values will become the new column headers in the pivoted table.IN ( [ <first pivot column name> ], [ <second pivot column name> ], [ <last pivot column name> ] )
: This list explicitly defines the unique values from the<pivot column>
that you want to transform into columns. SQL Server will create columns based on these specified values.<alias for pivot table>
: An alias for the resulting pivoted table, allowing you to further query or manipulate the transformed data.<non-pivoted column>
: Columns from the source query that are not pivoted but are included in the final output. These columns essentially act as grouping columns, maintaining context for the pivoted data.<pivoted column>
: These are the new columns created by thePIVOT
operation, derived from the unique values in the<pivot column>
.
Basic SQL Server PIVOT Example: Averaging Product Costs
Let’s illustrate PIVOT
with a simple example using the AdventureWorks2022
database. Suppose we want to find the average standard cost of products, categorized by the number of days it takes to manufacture them (DaysToManufacture
).
First, let’s look at the non-pivoted data:
USE AdventureWorks2022;
GO
SELECT DaysToManufacture, AVG(StandardCost) AS AverageCost
FROM Production.Product
GROUP BY DaysToManufacture;
This query gives us a straightforward result set:
DaysToManufacture AverageCost
------------------ ------------
0 5.0885
1 223.88
2 359.1082
4 949.4105
Now, let’s use PIVOT
to transform the DaysToManufacture
values into column headers:
SELECT 'AverageCost' AS CostSortedByProductionDays,
[0], [1], [2], [3], [4]
FROM (SELECT DaysToManufacture, StandardCost
FROM Production.Product) AS SourceTable
PIVOT (AVG(StandardCost)
FOR DaysToManufacture IN ([0], [1], [2], [3], [4])) AS PivotTable;
Here’s the pivoted result:
CostSortedByProductionDays 0 1 2 3 4
--------------------------- ----------- ----------- ----------- ------- -----------
AverageCost 5.0885 223.88 359.1082 NULL 949.4105
In this pivoted table:
DaysToManufacture
values (0, 1, 2, 3, 4) have become column headers.- The
AVG(StandardCost)
is calculated for eachDaysToManufacture
category. - Notice the
NULL
value under column[3]
. This indicates that there are no products withDaysToManufacture
equal to 3 in theProduction.Product
table.PIVOT
still creates the column as requested but fills it withNULL
when no data exists for that pivot column value.
Alt text: Pivoted table showing average product cost categorized by days to manufacture, with days to manufacture values (0, 1, 2, 3, 4) as column headers and average cost as values.
Advanced PIVOT Example: Purchase Order Summary by Employee
Let’s consider a more complex scenario. We want to analyze purchase orders in the AdventureWorks2022
database and create a cross-tabulation report showing the number of purchase orders placed by specific employees, categorized by vendor.
USE AdventureWorks2022;
GO
SELECT VendorID, [250] AS Emp1, [251] AS Emp2, [256] AS Emp3, [257] AS Emp4, [260] AS Emp5
FROM (SELECT PurchaseOrderID, EmployeeID, VendorID
FROM Purchasing.PurchaseOrderHeader) p
PIVOT (COUNT(PurchaseOrderID)
FOR EmployeeID IN ([250], [251], [256], [257], [260])) AS pvt
ORDER BY pvt.VendorID;
This query pivots the purchase order data based on EmployeeID
. The COUNT(PurchaseOrderID)
aggregates the number of orders for each employee and vendor combination. The IN ([250], [251], [256], [257], [260])
clause specifies that we are interested in employees with these IDs.
A snippet of the result set looks like this:
VendorID Emp1 Emp2 Emp3 Emp4 Emp5
----------- ----------- ----------- ----------- ----------- -----------
1492 2 5 4 4 4
1494 2 5 4 5 4
1496 2 4 4 5 5
1498 2 5 4 4 4
1500 3 4 4 5 4
...
This pivoted result clearly shows, for each VendorID
, the count of purchase orders placed by each specified employee (Emp1 to Emp5, corresponding to EmployeeIDs 250, 251, 256, 257, and 260 respectively). This cross-tab report provides a summarized view of employee purchasing activity per vendor.
UNPIVOT: Columns to Rows
UNPIVOT
is the inverse of PIVOT
. It transforms columns back into rows. Imagine we have the pivoted table from the previous purchase order example, and we want to revert it back to a more normalized form.
First, let’s create a temporary table pvt
to represent the pivoted data:
-- Create the table and insert sample values
CREATE TABLE pvt (
VendorID INT,
Emp1 INT,
Emp2 INT,
Emp3 INT,
Emp4 INT,
Emp5 INT
);
GO
INSERT INTO pvt VALUES (1, 4, 3, 5, 4, 4);
INSERT INTO pvt VALUES (2, 4, 1, 5, 5, 5);
INSERT INTO pvt VALUES (3, 4, 3, 5, 4, 4);
INSERT INTO pvt VALUES (4, 4, 2, 5, 5, 4);
INSERT INTO pvt VALUES (5, 5, 1, 5, 5, 5);
GO
Now, let’s use UNPIVOT
to transform the employee columns (Emp1
, Emp2
, etc.) back into rows:
SELECT VendorID, Employee, Orders
FROM (SELECT VendorID, Emp1, Emp2, Emp3, Emp4, Emp5
FROM pvt) p
UNPIVOT
(Orders
FOR Employee IN (Emp1, Emp2, Emp3, Emp4, Emp5)
) AS unpvt;
GO
The UNPIVOT
syntax is as follows:
SELECT [ <non-pivoted column> [ AS <column name> ] , ] ...
[ <output column for values> [ AS <column name> ] , ]
[ <output column for column names> [ AS <column name> ] ]
FROM ( <table-valued expression> ) AS <alias for source query>
UNPIVOT
( <value column>
FOR <pivot column name> IN ( <column1>, <column2>, ... <columnN> )
) AS <alias for unpivot table>;
In our UNPIVOT
example:
Orders FOR Employee IN (Emp1, Emp2, Emp3, Emp4, Emp5)
: This is the core of theUNPIVOT
operation.Orders
will be the new column holding the values from the columns being unpivoted (Emp1
,Emp2
, etc.).Employee
will be the new column holding the names of the columns that were unpivoted (e.g., ‘Emp1’, ‘Emp2’).
The UNPIVOT
query produces results like this:
VendorID Employee Orders
----------- ----------- ------
1 Emp1 4
1 Emp2 3
1 Emp3 5
1 Emp4 4
1 Emp5 4
2 Emp1 4
2 Emp2 1
...
Alt text: Unpivoted table displaying VendorID, Employee (representing employee columns from pivoted table), and Orders (order counts corresponding to employees).
As you can see, the columns Emp1
, Emp2
, Emp3
, Emp4
, Emp5
have been transformed into rows under the Employee
column, and their corresponding values are now in the Orders
column.
Important Considerations for PIVOT and UNPIVOT
- Aggregation in PIVOT:
PIVOT
inherently involves aggregation. You must specify an aggregate function. If no aggregation is needed (e.g., you just want to reshape without summarizing),PIVOT
might not be the direct tool, and alternative methods might be more suitable. - NULL Values and Aggregation: When using aggregate functions with
PIVOT
,NULL
values in the<value column>
are ignored during aggregation calculations. - UNPIVOT is not a perfect reverse:
UNPIVOT
is not always a perfect reversal ofPIVOT
, especially whenPIVOT
involves aggregations that collapse multiple rows. Also,NULL
values present before aPIVOT
operation might disappear in theUNPIVOT
output. - Performance: Repeated use of
PIVOT
/UNPIVOT
in a single query can impact performance. Consider optimizing complex queries involving these operators. - Column Collation: Be mindful of column collations, especially when using
UNPIVOT
. In Azure SQL Database, the collation isSQL_Latin1_General_CP1_CI_AS
. For partially contained databases in SQL Server, it’sLatin1_General_100_CI_AS_KS_WS_SC
. UseCOLLATE DATABASE_DEFAULT
to avoid collation conflicts if needed.
Conclusion
PIVOT
and UNPIVOT
are valuable SQL Server operators for reshaping data, making it easier to generate cross-tabulation reports and transform data for analysis. PIVOT
allows you to rotate unique column values into new columns and perform aggregations, while UNPIVOT
does the reverse, converting columns into rows. By mastering these operators, you can significantly enhance your data manipulation capabilities in SQL Server and gain deeper insights from your datasets. Remember to consider performance implications and the nuances of aggregation and NULL
value handling when working with PIVOT
and UNPIVOT
.