The PIVOT
and UNPIVOT
relational operators in SQL Server are powerful tools for reshaping data within table-valued expressions. PIVOT
transforms unique values from a column into distinct columns in the output, effectively rotating the table. It also performs aggregations on any remaining column values to produce a summarized result. Conversely, UNPIVOT
performs the inverse operation, converting columns into rows of values.
For tasks like generating cross-tabulation reports or preparing data for specific applications, understanding and utilizing PIVOT
and UNPIVOT
can significantly simplify your SQL queries. Instead of resorting to complex SELECT...CASE
statement series, these operators offer a more readable and efficient syntax.
Note: While
PIVOT
andUNPIVOT
are incredibly useful, excessive use within a single T-SQL statement can impact query performance. It’s important to use them judiciously and consider alternative approaches for very complex transformations.
The examples in this article are based on the AdventureWorks2022
or AdventureWorksDW2022
sample databases, readily available for download from Microsoft SQL Server Samples.
Understanding the Syntax of PIVOT and UNPIVOT
Let’s break down the syntax for both PIVOT
and UNPIVOT
operators to understand their structure and components.
PIVOT Syntax:
SELECT [ <non-pivoted column> [ AS <column name> ] , ] ...
[ <first column pivoted> [ AS <column name> ] ,
[ <second column pivoted> [ AS <column name> ] , ] ...
[ <last column pivoted> [ AS <column name> ] ] ]
FROM ( <table-valued expression> ) AS <alias for the source query>
PIVOT ( <aggregation function>(<value column>)
FOR <pivot column>
IN ( [ <first column> ], [ <second column> ],
[ <last column> ] ) ) AS <alias for the pivot table>
[ ORDER BY <column name> ];
UNPIVOT Syntax:
SELECT [ <non-pivoted column> [ AS <column name> ] , ] ...
<pivot column name> [ AS <column name> ],
<value column name> [ AS <column name> ]
FROM ( <table-valued expression> ) AS <alias for the source query>
UNPIVOT
( <value column name>
FOR <pivot column name>
IN ( [ <first pivoted column> ], [ <second pivoted column> ],
[ <last pivoted column> ] ) ) AS <alias for the unpivot table>
[ ORDER BY <column name> ];
Key Considerations for UNPIVOT Column Identifiers:
It’s important to note that column identifiers within the UNPIVOT
clause adhere to the catalog collation.
- In Azure SQL Database, the collation is consistently
SQL_Latin1_General_CP1_CI_AS
. - For SQL Server partially contained databases, the collation is always
Latin1_General_100_CI_AS_KS_WS_SC
.
When combining these columns with others, you might need to use a COLLATE DATABASE_DEFAULT
clause to prevent collation conflicts and ensure consistent behavior.
Also, in Microsoft Fabric and Azure Synapse Analytics pools, using PIVOT
with a GROUP BY
clause on the nonpivot column output might lead to query failures. A workaround is to remove the redundant GROUP BY
clause, as it often doesn’t affect the query results.
Basic PIVOT Operation: Transforming Rows to Columns
Let’s start with a simple example to illustrate how PIVOT
works. Consider the Production.Product
table in the AdventureWorks2022
database. We want to find the average standard cost for products based on their DaysToManufacture
.
First, let’s execute a basic GROUP BY
query to see the initial data:
USE AdventureWorks2022;
GO
SELECT DaysToManufacture, AVG(StandardCost) AS AverageCost
FROM Production.Product
GROUP BY DaysToManufacture;
This query yields a table with two columns and four rows, showing the average cost for each distinct DaysToManufacture
value.
DaysToManufacture AverageCost
------------------ ------------
0 5.0885
1 223.88
2 359.1082
4 949.4105
Notice that there are no products with DaysToManufacture
equal to 3.
Now, let’s use PIVOT
to transform the DaysToManufacture
values into column headers.
-- Pivot table to transform DaysToManufacture values into columns
SELECT 'AverageCost' AS CostSortedByProductionDays,
[0], [1], [2], [3], [4]
FROM
(
SELECT DaysToManufacture, StandardCost
FROM Production.Product
) AS SourceTable
PIVOT
(
AVG(StandardCost)
FOR DaysToManufacture
IN ([0], [1], [2], [3], [4])
) AS PivotTable;
In this PIVOT
query:
- We select a descriptive column
'AverageCost'
and specify the desired pivoted columns:[0]
,[1]
,[2]
,[3]
,[4]
, representing theDaysToManufacture
values. - The
SourceTable
subquery retrieves theDaysToManufacture
andStandardCost
fromProduction.Product
. - The
PIVOT
operator then aggregates theStandardCost
usingAVG()
for eachDaysToManufacture
value, specified in theIN
clause.
Here’s the pivoted result:
CostSortedByProductionDays 0 1 2 3 4
--------------------------- ----------- ----------- ----------- ----------- -----------
AverageCost 5.0885 223.88 359.1082 NULL 949.4105
As you can see, the DaysToManufacture
values (0, 1, 2, 3, 4) are now column headers, and the corresponding average costs are displayed under each column. Even though there were no products with DaysToManufacture = 3
, a column [3]
is still included, with a NULL
value indicating no data for that category.
Advanced PIVOT for Cross-Tabulation Reports
PIVOT
is particularly useful for creating cross-tabulation reports that summarize data in a matrix format. Let’s consider a more complex scenario using the PurchaseOrderHeader
table in AdventureWorks2022
. Suppose we want to analyze the number of purchase orders placed by different employees, categorized by vendor.
The following query generates a report showing the count of purchase orders for specific employees, broken down by VendorID
.
USE AdventureWorks2022;
GO
SELECT VendorID, [250] AS Emp1, [251] AS Emp2, [256] AS Emp3, [257] AS Emp4, [260] AS Emp5
FROM
(
SELECT PurchaseOrderID, EmployeeID, VendorID
FROM Purchasing.PurchaseOrderHeader
) AS PurchaseOrders
PIVOT
(
COUNT(PurchaseOrderID)
FOR EmployeeID
IN ([250], [251], [256], [257], [260])
) AS PivotTable
ORDER BY PivotTable.VendorID;
In this query:
- We aim to pivot the results based on
EmployeeID
to see purchase order counts per employee and vendor. - The subquery
PurchaseOrders
selectsPurchaseOrderID
,EmployeeID
, andVendorID
fromPurchasing.PurchaseOrderHeader
. - The
PIVOT
clause aggregatesPurchaseOrderID
usingCOUNT()
for eachEmployeeID
listed in theIN
clause ([250]
,[251]
,[256]
,[257]
,[260]
).
Here’s a snippet of the resulting cross-tabulation report:
VendorID Emp1 Emp2 Emp3 Emp4 Emp5
----------- ----------- ----------- ----------- ----------- -----------
1492 2 5 4 4 4
1494 2 5 4 5 4
1496 2 4 4 5 5
1498 2 5 4 4 4
1500 3 4 4 5 4
...
This output clearly shows the number of purchase orders (counts) placed by each employee (Emp1 to Emp5, corresponding to EmployeeIDs 250, 251, 256, 257, and 260) for each VendorID
. This is a typical cross-tabulation, summarizing data across two dimensions.
Important: When using aggregate functions with
PIVOT
, anyNULL
values in the value column (in this case,PurchaseOrderID
) are not considered in the aggregation. This is a crucial point to remember when interpreting pivoted results.
UNPIVOT: Transforming Columns Back to Rows
UNPIVOT
is the reverse of PIVOT
. It rotates columns back into rows. Let’s take the pivoted table pvt
from the previous example and transform the employee columns (Emp1
, Emp2
, Emp3
, Emp4
, Emp5
) back into rows. We’ll create a table pvt
mimicking the pivoted output and then unpivot it.
-- Create a sample pivoted table
CREATE TABLE pvt (
VendorID INT,
Emp1 INT,
Emp2 INT,
Emp3 INT,
Emp4 INT,
Emp5 INT
);
GO
INSERT INTO pvt VALUES (1, 4, 3, 5, 4, 4);
INSERT INTO pvt VALUES (2, 4, 1, 5, 5, 5);
INSERT INTO pvt VALUES (3, 4, 3, 5, 4, 4);
INSERT INTO pvt VALUES (4, 4, 2, 5, 5, 4);
INSERT INTO pvt VALUES (5, 5, 1, 5, 5, 5);
GO
-- Unpivot the table
SELECT VendorID, Employee, Orders
FROM
(
SELECT VendorID, Emp1, Emp2, Emp3, Emp4, Emp5
FROM pvt
) AS p
UNPIVOT
(
Orders
FOR Employee IN (Emp1, Emp2, Emp3, Emp4, Emp5)
) AS unpvt;
GO
In this UNPIVOT
query:
- We want to unpivot the columns
Emp1
,Emp2
,Emp3
,Emp4
, andEmp5
. - The
UNPIVOT
operator takes the values from these columns and puts them into a new column namedOrders
(thevalue_column
). - It also creates a new column named
Employee
(thepivot_column
) to store the original column names (Emp1
,Emp2
, etc.).
Here’s a partial result set after unpivoting:
VendorID Employee Orders
----------- ----------- ------
1 Emp1 4
1 Emp2 3
1 Emp3 5
1 Emp4 4
1 Emp5 4
2 Emp1 4
2 Emp2 1
2 Emp3 5
2 Emp4 5
2 Emp5 5
...
The result is a table where each employee column is transformed into rows, making it easier to analyze data in a normalized format.
Important:
UNPIVOT
is not a perfect reversal ofPIVOT
.PIVOT
involves aggregation, potentially merging multiple rows into one.UNPIVOT
cannot reconstruct the original table-valued expression exactly, especially if aggregations were involved. Also,NULL
values in the input ofUNPIVOT
will disappear in the output.
Real-world Example: Sales Data Analysis
The Sales.vSalesPersonSalesByFiscalYears
view in AdventureWorks2022
itself uses PIVOT
to display sales data. It shows total sales for each salesperson across different fiscal years. You can examine the view’s script in SQL Server Management Studio to see a practical application of PIVOT
in a reporting context. In Object Explorer, navigate to AdventureWorks2022
database, then Views, and find Sales.vSalesPersonSalesByFiscalYears
. Right-click and select Script View as to inspect its definition.
Conclusion
PIVOT
and UNPIVOT
are essential SQL Server operators for data transformation. PIVOT
allows you to rotate rows into columns, creating summarized, cross-tabular reports, while UNPIVOT
does the opposite, converting columns into rows for normalized data views. Understanding and effectively using these operators can significantly enhance your ability to manipulate and analyze data in SQL Server. By leveraging these tools, you can write more concise, readable, and powerful SQL queries for various data transformation needs.