Common Table Expressions (CTEs) in SQL Server provide a powerful way to simplify complex queries and improve readability. A CTE is a temporary named result set, derived from a simple query and defined within the execution scope of a single SELECT, INSERT, UPDATE, DELETE, or MERGE statement. A CTE can also be used in a CREATE VIEW statement as part of the view’s defining SELECT statement. A key feature of CTEs is their ability to reference themselves, forming what are known as recursive CTEs, which are particularly useful for handling hierarchical data.
Understanding CTE Syntax in SQL Server
The foundation of using CTEs lies in understanding their syntax. Here’s the basic structure:
[ WITH <common_table_expression> [ ,...n ] ]
<common_table_expression>::=
expression_name [ ( column_name [ ,...n ] ) ]
AS
( CTE_query_definition )
Let’s break down the components:
- WITH <common_table_expression>: This clause initiates the definition of one or more CTEs. Multiple CTEs can be defined in a single WITH clause, separated by commas.
- <common_table_expression>::=: This defines the structure of a single CTE.
- expression_name: This is the identifier or name you assign to your CTE. It must be unique within the WITH clause but can reuse the name of an existing base table or view; in that case, any reference to the name within the query resolves to the CTE, not the base object.
- ( column_name [ ,...n ] ): Optionally, you can specify column names for the CTE. If omitted, column names are derived from the CTE_query_definition, which must then supply a distinct name for every column. The number of specified column names must match the number of columns returned by the CTE query. Duplicate column names within a CTE definition are not allowed.
- AS ( CTE_query_definition ): This is where you define the query that generates the CTE’s result set. This SELECT statement follows the same rules as a view definition, with the notable exception that a CTE cannot define another CTE inside it. When a single CTE contains more than one query definition, the definitions must be combined using a set operator: UNION ALL, UNION, EXCEPT, or INTERSECT.
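As a rough illustration of the syntax above, the following sketch defines two CTEs in one WITH clause. The names Orders and CustomerTotals and the inline sample rows are hypothetical, chosen only so the query runs without any existing tables.

```sql
-- Hypothetical sample data is supplied inline with VALUES,
-- so this query does not depend on any existing table.
WITH Orders (CustomerID, Amount) AS          -- expression_name with an explicit column list
(
    SELECT v.CustomerID, v.Amount
    FROM (VALUES (1, 50.00), (1, 25.00), (2, 10.00)) AS v (CustomerID, Amount)
),                                           -- a comma separates CTE definitions
CustomerTotals AS                            -- no column list: names come from the query
(
    SELECT CustomerID, SUM(Amount) AS TotalAmount
    FROM Orders                              -- references the CTE defined before it
    GROUP BY CustomerID
)
SELECT CustomerID, TotalAmount
FROM CustomerTotals
ORDER BY CustomerID;
```

The second CTE omits the column list, so every column in its query needs a distinct name, which is why SUM(Amount) carries an alias.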
Guidelines for Non-Recursive CTEs
Non-recursive CTEs are foundational for simplifying queries. Here are key guidelines for their creation and usage:
- Execution Scope: A CTE must be followed by a single SELECT, INSERT, UPDATE, DELETE, or MERGE statement that references some or all of the CTE’s columns. A CTE can also be specified in a CREATE VIEW statement as part of the view’s defining SELECT statement.
- Multiple Definitions: A non-recursive CTE can contain multiple query definitions, which must be combined with UNION ALL, UNION, INTERSECT, or EXCEPT.
- Referencing: A CTE can reference itself and CTEs defined earlier in the same WITH clause, but forward referencing is not permitted.
- Single WITH Clause: Nested WITH clauses are not allowed. If a CTE_query_definition contains a subquery, that subquery cannot include another WITH clause defining a nested CTE.
- Restricted Clauses: The CTE_query_definition cannot contain:
  - ORDER BY (unless used with TOP)
  - INTO
  - OPTION clause with query hints
  - FOR BROWSE
- Semicolon Requirement: When a CTE is used in a statement that is part of a batch, the statement preceding the WITH clause must be terminated with a semicolon (;).
- Cursor Definition: A query that references a CTE can be used to define a cursor.
- Remote Tables: CTEs can reference tables located on remote servers.
- Hint Conflicts: Hints referencing CTEs can conflict with hints discovered when the CTE accesses underlying tables, similar to views, potentially leading to query errors.
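A few of the guidelines above can be seen together in one small sketch: the statement before WITH ends with a semicolon, a single CTE combines two query definitions with a set operator, and ORDER BY appears in the outer query rather than inside the CTE. The variable @MinAmount, the CTE name Amounts, and the sample values are all made up for illustration.

```sql
DECLARE @MinAmount INT = 20;   -- the statement before WITH ends with a semicolon

WITH Amounts (Channel, Amount) AS
(
    -- Two query definitions inside one CTE, combined with a set operator.
    SELECT 'online', v.Amount FROM (VALUES (10), (30)) AS v (Amount)
    UNION ALL
    SELECT 'store', v.Amount FROM (VALUES (20), (40)) AS v (Amount)
)
SELECT Channel, Amount
FROM Amounts
WHERE Amount >= @MinAmount
ORDER BY Amount;               -- sorting is done by the outer query, not the CTE
```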
Diving into Recursive CTEs
Recursive CTEs are designed to handle hierarchical or recursive data structures, such as organizational charts or bills of materials. They achieve this by repeatedly executing the recursive part of the query until it returns no more rows.
Here are the specific guidelines for defining recursive CTEs:
- Anchor and Recursive Members: A recursive CTE must have at least two query definitions: an anchor member and a recursive member. Anchor members (one or more) come first and establish the base case of the recursion. Recursive members then reference the CTE itself to iterate.
- Set Operators: Anchor members are combined using UNION ALL, UNION, INTERSECT, or EXCEPT. Crucially, UNION ALL is the only operator allowed between the last anchor member and the first recursive member, and between multiple recursive members.
- Column Consistency: Anchor and recursive members must have the same number of columns.
- Data Type Matching: Each column in the recursive member must have the same data type as the corresponding column in the anchor member.
- Single Self-Reference: The FROM clause of a recursive member must reference the CTE expression_name only once.
- Recursive Member Restrictions: The CTE_query_definition of a recursive member cannot contain:
  - SELECT DISTINCT
  - GROUP BY
  - PIVOT (when the database compatibility level is 110 or higher)
  - HAVING
  - Scalar aggregation
  - TOP
  - LEFT, RIGHT, or OUTER JOIN (INNER JOIN is allowed)
  - Subqueries
  - A hint applied to the recursive reference to the CTE inside the CTE_query_definition
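The rules above can be sketched with a minimal recursive CTE that needs no tables. The names Numbers, n, and Trail are hypothetical. The explicit CAST in both members keeps the Trail column at exactly the same type, since SQL Server rejects a recursive member whose column types differ from the anchor's.

```sql
WITH Numbers (n, Trail) AS
(
    -- Anchor member: the single starting row.
    SELECT 1, CAST('1' AS VARCHAR(100))
    UNION ALL
    -- Recursive member: references Numbers exactly once and stops
    -- producing rows once n reaches 5, which ends the recursion.
    SELECT n + 1,
           CAST(Trail + ' > ' + CAST(n + 1 AS VARCHAR(10)) AS VARCHAR(100))
    FROM Numbers
    WHERE n < 5
)
SELECT n, Trail
FROM Numbers;
```

The query returns the values 1 through 5, each with a trail of the numbers that led to it. Without the outer CAST in the recursive member, the concatenation would produce a wider VARCHAR than the anchor's column and the statement would fail with a type mismatch error.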
And here are guidelines for using recursive CTEs:
- Nullability: All columns returned by a recursive CTE are nullable, irrespective of the nullability of columns from the constituent SELECT statements.
- Infinite Loops: Incorrectly formed recursive CTEs can lead to infinite loops. This can occur if the recursive member query definition returns the same values for parent and child columns.
- MAXRECURSION Hint: To prevent infinite loops, use the MAXRECURSION hint in the OPTION clause of the INSERT, UPDATE, DELETE, or SELECT statement following the CTE. Set a limit between 0 and 32,767 to control recursion levels and halt execution if needed. The server default is 100; 0 means no limit. Only one MAXRECURSION hint can be used per statement.
- View Updates: Views containing recursive CTEs cannot be used for data updates.
- Cursor Types: Cursors defined on queries using recursive CTEs are restricted to fast forward-only and static (snapshot) types. Other cursor types will be implicitly converted to static.
- Remote Server References: Remote tables can be referenced in recursive CTEs. If a remote table is in the recursive member, a spool is created for local repeated access. In query plans, look for “Index Spool/Lazy Spools” with the WITH STACK predicate to confirm recursion.
- Analytic and Aggregate Functions: Analytic and aggregate functions within the recursive part of a CTE operate on the current recursion level’s set, not the entire CTE result set. Functions like ROW_NUMBER() apply only to the data subset passed in the current recursion level.
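As a small sketch of the MAXRECURSION hint described above, the query below generates 500 rows, which takes 499 recursion levels. With the default limit of 100 the statement would stop with an error, so the hint raises the limit for this statement only. The CTE name Numbers and the figure 500 are arbitrary choices for illustration.

```sql
WITH Numbers (n) AS
(
    SELECT 1                      -- anchor member
    UNION ALL
    SELECT n + 1                  -- recursive member
    FROM Numbers
    WHERE n < 500
)
SELECT COUNT(*) AS RowsGenerated
FROM Numbers
OPTION (MAXRECURSION 500);        -- raise the limit from the default of 100
```

Choosing a finite limit slightly above the expected depth, rather than 0, keeps the safety net in place if the data ever contains a cycle.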
Practical Examples of CTEs in SQL Server
Let’s explore some examples to illustrate the use of CTEs.
Example A: Basic Non-Recursive CTE for Sales Data
This example demonstrates a simple CTE to calculate the total sales orders per year for each sales representative.
-- Define the CTE expression name and column list.
WITH Sales_CTE (SalesPersonID, SalesOrderID, SalesYear) AS
(
-- Define the CTE query.
SELECT
SalesPersonID,
SalesOrderID,
YEAR(OrderDate) AS SalesYear
FROM
Sales.SalesOrderHeader
WHERE
SalesPersonID IS NOT NULL
)
-- Define the outer query referencing the CTE name.
SELECT
SalesPersonID,
COUNT(SalesOrderID) AS TotalSales,
SalesYear
FROM
Sales_CTE
GROUP BY
SalesYear,
SalesPersonID
ORDER BY
SalesPersonID,
SalesYear;
This CTE, Sales_CTE, simplifies the query by first selecting the necessary sales data (SalesPersonID, SalesOrderID, SalesYear); the outer query then aggregates and presents this data.
Example B: CTE for Calculating Averages
Building on the previous example, this CTE calculates the average number of sales orders per sales representative across all years.
WITH Sales_CTE (SalesPersonID, NumberOfOrders) AS
(
SELECT
SalesPersonID,
COUNT(*)
FROM
Sales.SalesOrderHeader
WHERE
SalesPersonID IS NOT NULL
GROUP BY
SalesPersonID
)
SELECT
AVG(NumberOfOrders) AS "Average Sales Per Person"
FROM
Sales_CTE;
Here, Sales_CTE first determines the number of orders for each salesperson, and then the main query calculates the average from this CTE.
Example C: Multiple CTEs in a Single Query
This example shows how to use two CTEs within one query to compare total sales against sales quotas.
WITH Sales_CTE (SalesPersonID, TotalSales, SalesYear) AS
(
-- Define the first CTE query.
SELECT
SalesPersonID,
SUM(TotalDue) AS TotalSales,
YEAR(OrderDate) AS SalesYear
FROM
Sales.SalesOrderHeader
WHERE
SalesPersonID IS NOT NULL
GROUP BY
SalesPersonID,
YEAR(OrderDate)
)
, -- Use a comma to separate multiple CTE definitions.
Sales_Quota_CTE (BusinessEntityID, SalesQuota, SalesQuotaYear) AS
(
-- Define the second CTE query, which returns sales quota data by year for each sales person.
SELECT
BusinessEntityID,
SUM(SalesQuota) AS SalesQuota,
YEAR(QuotaDate) AS SalesQuotaYear
FROM
Sales.SalesPersonQuotaHistory
GROUP BY
BusinessEntityID,
YEAR(QuotaDate)
)
-- Define the outer query by referencing columns from both CTEs.
SELECT
s.SalesPersonID,
s.SalesYear,
FORMAT(s.TotalSales,'C','en-us') AS TotalSales,
sq.SalesQuotaYear,
FORMAT (sq.SalesQuota,'C','en-us') AS SalesQuota,
FORMAT (s.TotalSales - sq.SalesQuota, 'C','en-us') AS Amt_Above_or_Below_Quota
FROM
Sales_CTE s
JOIN
Sales_Quota_CTE sq ON sq.BusinessEntityID = s.SalesPersonID AND s.SalesYear = sq.SalesQuotaYear
ORDER BY
s.SalesPersonID,
s.SalesYear;
Partial result set showing sales performance against quota.
This example uses Sales_CTE to get sales totals and Sales_Quota_CTE for quota data, then joins them to produce a comparative report.
Example D: Recursive CTE for Hierarchical Data
This example demonstrates a recursive CTE to display an organizational hierarchy.
-- Create an Employee table.
CREATE TABLE dbo.MyEmployees (
EmployeeID SMALLINT NOT NULL,
FirstName NVARCHAR(30) NOT NULL,
LastName NVARCHAR(40) NOT NULL,
Title NVARCHAR(50) NOT NULL,
DeptID SMALLINT NOT NULL,
ManagerID SMALLINT NULL,
CONSTRAINT PK_EmployeeID PRIMARY KEY CLUSTERED (EmployeeID ASC),
CONSTRAINT FK_MyEmployees_ManagerID_EmployeeID FOREIGN KEY (ManagerID) REFERENCES dbo.MyEmployees (EmployeeID)
);
-- Populate the table with values.
INSERT INTO dbo.MyEmployees
VALUES
(1, N'Ken', N'Sánchez', N'Chief Executive Officer',16, NULL)
,(273, N'Brian', N'Welcker', N'Vice President of Sales', 3, 1)
,(274, N'Stephen', N'Jiang', N'North American Sales Manager', 3, 273)
,(275, N'Michael', N'Blythe', N'Sales Representative', 3, 274)
,(276, N'Linda', N'Mitchell', N'Sales Representative', 3, 274)
,(285, N'Syed', N'Abbas', N'Pacific Sales Manager', 3, 273)
,(286, N'Lynn', N'Tsoflias', N'Sales Representative', 3, 285)
,(16, N'David', N'Bradley', N'Marketing Manager', 4, 273)
,(23, N'Mary', N'Gibson', N'Marketing Specialist', 4, 16);
WITH DirectReports(ManagerID, EmployeeID, Title, EmployeeLevel) AS
(
-- Anchor member: Select top-level managers (no ManagerID).
SELECT
ManagerID,
EmployeeID,
Title,
0 AS EmployeeLevel
FROM
dbo.MyEmployees
WHERE
ManagerID IS NULL
UNION ALL
-- Recursive member: Join employees to their managers in the CTE.
SELECT
e.ManagerID,
e.EmployeeID,
e.Title,
EmployeeLevel + 1
FROM
dbo.MyEmployees AS e
INNER JOIN
DirectReports AS d ON e.ManagerID = d.EmployeeID
)
-- Select from the CTE to display the hierarchy.
SELECT
ManagerID,
EmployeeID,
Title,
EmployeeLevel
FROM
DirectReports
ORDER BY
ManagerID;
This recursive CTE, DirectReports, starts with top-level managers (anchor member) and recursively adds employees reporting to each manager (recursive member), building a hierarchical list.
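One possible variation, sketched here as an assumption rather than as part of the original example, adds a SortPath column so the output can be ordered depth-first and indented by level. It relies on the dbo.MyEmployees table created above; the SortPath and IndentedTitle names, the zero-padding width, and the VARCHAR(200) length are illustrative choices.

```sql
WITH DirectReports (EmployeeID, Title, EmployeeLevel, SortPath) AS
(
    -- Anchor member: top-level managers start the path with their own ID.
    SELECT
        EmployeeID,
        Title,
        0,
        CAST(RIGHT('00000' + CAST(EmployeeID AS VARCHAR(5)), 5) AS VARCHAR(200))
    FROM dbo.MyEmployees
    WHERE ManagerID IS NULL
    UNION ALL
    -- Recursive member: append each employee's zero-padded ID to the manager's path.
    SELECT
        e.EmployeeID,
        e.Title,
        d.EmployeeLevel + 1,
        CAST(d.SortPath + '/' + RIGHT('00000' + CAST(e.EmployeeID AS VARCHAR(5)), 5) AS VARCHAR(200))
    FROM dbo.MyEmployees AS e
    INNER JOIN DirectReports AS d ON e.ManagerID = d.EmployeeID
)
SELECT
    REPLICATE('    ', EmployeeLevel) + Title AS IndentedTitle,  -- indent by depth
    EmployeeID,
    EmployeeLevel
FROM DirectReports
ORDER BY SortPath;  -- each employee sorts directly under their manager
```

Zero-padding the IDs keeps the string sort consistent with numeric order, and casting SortPath to the same VARCHAR(200) in both members satisfies the requirement that anchor and recursive columns share a data type.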
Example E: Recursive CTE for Bill of Materials
This example uses a recursive CTE to explore the components of a product assembly.
USE AdventureWorks2022;
GO
WITH Parts(AssemblyID, ComponentID, PerAssemblyQty, EndDate, ComponentLevel) AS
(
-- Anchor member: Top-level components for ProductAssemblyID = 800.
SELECT
b.ProductAssemblyID,
b.ComponentID,
b.PerAssemblyQty,
b.EndDate,
0 AS ComponentLevel
FROM
Production.BillOfMaterials AS b
WHERE
b.ProductAssemblyID = 800
AND b.EndDate IS NULL
UNION ALL
-- Recursive member: Find components of components.
SELECT
bom.ProductAssemblyID,
bom.ComponentID,
p.PerAssemblyQty,
bom.EndDate,
ComponentLevel + 1
FROM
Production.BillOfMaterials AS bom
INNER JOIN
Parts AS p ON bom.ProductAssemblyID = p.ComponentID
AND bom.EndDate IS NULL
)
-- Select from the CTE to display the bill of materials.
SELECT
AssemblyID,
ComponentID,
Name,
PerAssemblyQty,
EndDate,
ComponentLevel
FROM
Parts AS p
INNER JOIN
Production.Product AS pr ON p.ComponentID = pr.ProductID
ORDER BY
ComponentLevel,
AssemblyID,
ComponentID;
Hierarchical list of product assemblies and components.
The Parts CTE recursively navigates the BillOfMaterials table to list all parts needed for product assembly 800 and their sub-components.
Example F: Updating Data with a Recursive CTE
CTEs can also be used in UPDATE statements. This example doubles the PerAssemblyQty value for the parts used to build ‘Road-550-W Yellow, 44’ (ProductAssemblyID 800).
USE AdventureWorks2022;
GO
WITH Parts(AssemblyID, ComponentID, PerAssemblyQty, EndDate, ComponentLevel) AS
(
SELECT
b.ProductAssemblyID,
b.ComponentID,
b.PerAssemblyQty,
b.EndDate,
0 AS ComponentLevel
FROM
Production.BillOfMaterials AS b
WHERE
b.ProductAssemblyID = 800
AND b.EndDate IS NULL
UNION ALL
SELECT
bom.ProductAssemblyID,
bom.ComponentID,
p.PerAssemblyQty,
bom.EndDate,
ComponentLevel + 1
FROM
Production.BillOfMaterials AS bom
INNER JOIN
Parts AS p ON bom.ProductAssemblyID = p.ComponentID
AND bom.EndDate IS NULL
)
UPDATE Production.BillOfMaterials
SET PerAssemblyQty = c.PerAssemblyQty * 2
FROM
Production.BillOfMaterials AS c
JOIN
Parts AS d ON c.ProductAssemblyID = d.AssemblyID
WHERE
d.ComponentLevel = 0;
This example leverages the Parts CTE to identify the relevant components and then updates their PerAssemblyQty in the Production.BillOfMaterials table.
CTEs in Azure Synapse Analytics and Analytics Platform System (PDW)
In Azure Synapse Analytics and Analytics Platform System (PDW), CTEs have specific features and limitations:
- Supported Statements: CTEs are supported in SELECT, CREATE VIEW, CREATE TABLE AS SELECT (CTAS), CREATE REMOTE TABLE AS SELECT (CRTAS), and CREATE EXTERNAL TABLE AS SELECT (CETAS) statements.
- External and Remote Tables: CTEs can reference both remote and external tables.
- Multiple CTE Definitions: Multiple CTE query definitions are allowed.
- Usage in DML: CTEs can be followed by SELECT, INSERT, UPDATE, DELETE, or MERGE statements.
- No Recursive CTEs: Recursive CTEs (CTEs that reference themselves) are not supported in Azure Synapse Analytics and PDW.
- Single WITH Clause: Nested WITH clauses are disallowed. Subqueries within a CTE cannot contain nested WITH clauses.
- ORDER BY Restriction: ORDER BY is not permitted in CTE_query_definition unless a TOP clause is also specified.
- Semicolon Requirement: When a CTE is used in a statement that is part of a batch, the statement before it must end with a semicolon.
- Prepared Statements: CTEs behave like other SELECT statements when used with sp_prepare. However, CETAS statements containing CTEs that are prepared by sp_prepare can behave differently from SQL Server because of how binding is implemented. If the SELECT references a column that does not exist in the CTE, sp_prepare succeeds without detecting the error, and the error is raised during sp_execute instead.
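As a hedged sketch of the CTAS support listed above, the statement below materializes a CTE’s result into a new table. It assumes a dedicated SQL pool and a hypothetical source table dbo.FactSales with SalesPersonID and TotalDue columns; the target table name and the ROUND_ROBIN distribution are likewise illustrative choices, not recommendations.

```sql
-- Assumes a hypothetical source table dbo.FactSales (SalesPersonID, TotalDue).
CREATE TABLE dbo.SalesPersonTotals
WITH (DISTRIBUTION = ROUND_ROBIN)      -- CTAS requires a distribution option
AS
WITH Sales_CTE (SalesPersonID, TotalSales) AS
(
    SELECT SalesPersonID, SUM(TotalDue)
    FROM dbo.FactSales
    WHERE SalesPersonID IS NOT NULL
    GROUP BY SalesPersonID
)
SELECT SalesPersonID, TotalSales
FROM Sales_CTE;
```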
Conclusion: Leveraging CTEs for Efficient SQL Server Queries
SQL Server common table expressions are invaluable tools for writing cleaner, more understandable SQL queries. They simplify complex logic, especially when dealing with hierarchical data or multi-step data transformations. While non-recursive CTEs improve query structure and readability, recursive CTEs make it possible to traverse and manipulate hierarchical datasets. Understanding and effectively using CTEs is a crucial skill for any SQL Server developer or database professional aiming to write robust and maintainable database queries.