In the realm of database management, ensuring optimal query performance is paramount. SQL Server relies heavily on query optimizer statistics to craft efficient execution plans. When these statistics become outdated, query performance can suffer significantly. This article delves into the crucial practice of updating statistics in SQL Server, focusing on the UPDATE STATISTICS
command and its best practices. Whether you’re managing a small database or a large data warehouse, understanding and implementing effective statistics updates is essential for maintaining a responsive and efficient SQL Server environment.
The UPDATE STATISTICS
Transact-SQL statement is your primary tool for refreshing these vital statistics on tables and indexed views. While SQL Server automatically updates statistics as needed, proactively managing these updates can provide substantial performance gains. This guide will explore the syntax, options, and practical considerations for using UPDATE STATISTICS
to keep your database running smoothly.
-- Syntax for SQL Server and Azure SQL Database
UPDATE STATISTICS table_or_indexed_view_name
[ { { index_or_statistics__name } | ( { index_or_statistics_name } [ ,...n ] ) } ]
[ WITH
[ FULLSCAN [ [ , ] PERSIST_SAMPLE_PERCENT = { ON | OFF } ]
| SAMPLE number { PERCENT | ROWS } [ [ , ] PERSIST_SAMPLE_PERCENT = { ON | OFF } ]
| RESAMPLE [ ON PARTITIONS ( { <partition_number> | <range> } [, ...n] ) ]
| <update_stats_stream_option> [ ,...n ]
]
[ [ , ] [ ALL | COLUMNS | INDEX ] ]
[ [ , ] NORECOMPUTE ]
[ [ , ] INCREMENTAL = { ON | OFF } ]
[ [ , ] MAXDOP = max_degree_of_parallelism ]
[ [ , ] AUTO_DROP = { ON | OFF } ]
];
<update_stats_stream_option> ::=
[ STATS_STREAM = stats_stream ]
[ ROWCOUNT = numeric_constant ]
[ PAGECOUNT = numeric_constant ]
-- Syntax for Azure Synapse Analytics and Parallel Data Warehouse
UPDATE STATISTICS [ schema_name . ] table_name
[ ( { statistics_name | index_name } ) ]
[ WITH { FULLSCAN | SAMPLE number PERCENT | RESAMPLE } ]
[;]
-- Syntax for Microsoft Fabric
UPDATE STATISTICS [ schema_name . ] table_name
[ ( { statistics_name } ) ]
[ WITH { FULLSCAN | SAMPLE number PERCENT } ]
[;]
Understanding SQL Server Statistics and Query Optimizer
SQL Server’s query optimizer relies on statistics to estimate the distribution of data within tables and indexes. These statistics are crucial for the optimizer to choose the most efficient query execution plan. Outdated or inaccurate statistics can lead to suboptimal plans, resulting in slow query performance and increased resource consumption.
Think of statistics as a roadmap for the query optimizer. Just as an outdated map can lead you down inefficient routes, stale statistics can misguide the optimizer, causing it to select less-than-ideal paths for retrieving data. Regularly updating these statistics is like updating your roadmap, ensuring the query optimizer has the most accurate information to navigate your database effectively.
When to Update Statistics in SQL Server
SQL Server automatically updates statistics through the AUTO_UPDATE_STATISTICS
database option. However, there are scenarios where manual intervention with UPDATE STATISTICS
is beneficial or even necessary:
- Significant Data Modifications: After large data loads, bulk inserts, or substantial changes due to ETL processes, the existing statistics may no longer accurately reflect the current data distribution. Manually updating statistics ensures the query optimizer is aware of these changes.
- After Index Maintenance: Operations like rebuilding or reorganizing indexes can alter data distribution. Updating statistics post-index maintenance ensures statistics are aligned with the new index structure.
- Performance Tuning: If you identify slow-running queries and suspect outdated statistics are the culprit, manually updating statistics for the tables involved can be a quick and effective troubleshooting step.
- Scheduled Maintenance: Incorporating
UPDATE STATISTICS
into your regular database maintenance routines ensures consistent and proactive statistics management.
While frequent updates might seem beneficial, remember that updating statistics does come with a cost. It requires scanning data and can lead to query recompilations. Therefore, striking a balance and updating statistics strategically is key. Avoid overly frequent updates that might introduce unnecessary overhead.
Exploring UPDATE STATISTICS
Options
The UPDATE STATISTICS
command offers various options to fine-tune how statistics are updated. Let’s examine the key options:
1. Target Specification: Table, Index, or Statistics Name
You can target your update at different levels:
- Table Level (
UPDATE STATISTICS table_name
): Updates all statistics associated with the specified table, including statistics on indexes, single-column statistics, and statistics created usingCREATE STATISTICS
. - Specific Statistics or Index (
UPDATE STATISTICS table_name index_or_statistics_name
): Allows you to update statistics for a particular index or named statistics object. This provides granular control when you know specific statistics are likely to be outdated.
2. Sampling Methods: FULLSCAN, SAMPLE, RESAMPLE
Choosing the right sampling method is crucial for balancing accuracy and performance:
-
FULLSCAN: Scans all rows in the table or indexed view to compute statistics. This provides the most accurate statistics but is also the most resource-intensive, especially for large tables.
FULLSCAN
andSAMPLE 100 PERCENT
are equivalent.UPDATE STATISTICS Sales.SalesOrderDetail WITH FULLSCAN;
-
SAMPLE
number
{PERCENT | ROWS}: Updates statistics based on a sample of the data. This is faster thanFULLSCAN
and often provides sufficient accuracy for the query optimizer. SQL Server automatically samples data by default when statistics are updated without specifyingFULLSCAN
orSAMPLE
.UPDATE STATISTICS Production.Product(Products) WITH SAMPLE 50 PERCENT;
Using
SAMPLE
is generally recommended for most scenarios as it provides a good balance between performance and accuracy. The query optimizer intelligently determines the default sample size, making manual specification often unnecessary. -
RESAMPLE: Updates each statistic using its previously used sampling rate. This can be useful when you want to maintain consistency in sampling but still refresh the statistics. Note that
RESAMPLE
can sometimes lead to a full table scan depending on how the statistics were initially created. In Warehouse in Microsoft Fabric,RESAMPLE
is not supported.
3. PERSIST_SAMPLE_PERCENT
Introduced in later SQL Server versions, PERSIST_SAMPLE_PERCENT
allows you to control whether the specified sample percentage is persisted for subsequent automatic statistics updates.
- PERSIST_SAMPLE_PERCENT = ON: Retains the specified sampling percentage for future automatic updates.
- PERSIST_SAMPLE_PERCENT = OFF (Default): Resets the sampling percentage to the default sampling rate for subsequent automatic updates.
This option is beneficial when you’ve determined that a specific sampling percentage is optimal for certain statistics and want to ensure that automatic updates adhere to this setting.
4. NORECOMPUTE
The NORECOMPUTE
option disables automatic statistics updates (AUTO_UPDATE_STATISTICS
) for the specified statistics.
UPDATE STATISTICS Production.Product(Products) WITH NORECOMPUTE;
Caution: Using NORECOMPUTE
is strongly discouraged in most situations. Disabling automatic updates can lead to severely outdated statistics and significantly degrade query performance over time. This option should only be used in very specific, controlled scenarios by experienced database administrators who fully understand the implications. Re-enable automatic updates by running UPDATE STATISTICS
without NORECOMPUTE
or using sp_autostats
.
5. INCREMENTAL
The INCREMENTAL
option is relevant for partitioned tables. When INCREMENTAL = ON
, statistics are updated and stored at the partition level. When INCREMENTAL = OFF
(default), existing partition statistics are dropped, and SQL Server recomputes statistics for the entire table as a single unit. Incremental statistics are generally recommended for partitioned tables as they can improve the efficiency of statistics maintenance. However, incremental statistics have limitations and are not supported for all types of statistics (e.g., statistics on non-partition-aligned indexes, filtered indexes, views, etc.).
6. MAXDOP
The MAXDOP
option, available in newer SQL Server versions, allows you to control the maximum degree of parallelism for the statistics update operation.
UPDATE STATISTICS Sales.SalesOrderDetail WITH MAXDOP = 4;
Limiting parallelism can be useful in environments with high CPU utilization or when you want to throttle the resource consumption of statistics updates.
7. AUTO_DROP
Introduced in SQL Server 2022, AUTO_DROP = ON
allows statistics objects created by third-party tools to be automatically dropped if they would block schema changes. This behavior mirrors auto-created statistics and helps prevent schema modification blocking issues.
UPDATE STATISTICS Customer (CustomerStats1) WITH AUTO_DROP = ON
Practical Examples of UPDATE STATISTICS
Let’s look at some practical examples of using UPDATE STATISTICS
:
Example 1: Update all statistics on the SalesOrderDetail
table.
USE AdventureWorks2022;
GO
UPDATE STATISTICS Sales.SalesOrderDetail;
GO
Example 2: Update statistics for the AK_SalesOrderDetail_rowguid
index.
USE AdventureWorks2022;
GO
UPDATE STATISTICS Sales.SalesOrderDetail AK_SalesOrderDetail_rowguid;
GO
Example 3: Update statistics using 50% sampling for named statistics Products
.
USE AdventureWorks2022;
GO
CREATE STATISTICS Products ON Production.Product ([Name], ProductNumber) WITH SAMPLE 50 PERCENT
-- Time passes. The UPDATE STATISTICS statement is then executed.
UPDATE STATISTICS Production.Product(Products) WITH SAMPLE 50 PERCENT;
Example 4: Update statistics with FULLSCAN
and disable automatic updates (NORECOMPUTE
). (Use with caution!)
USE AdventureWorks2022;
GO
UPDATE STATISTICS Production.Product(Products) WITH FULLSCAN, NORECOMPUTE;
GO
Example 5: Update statistics on Azure Synapse Analytics using FULLSCAN
.
UPDATE STATISTICS Customer (CustomerStats1) WITH FULLSCAN;
Best Practices for Managing SQL Server Statistics
- Regularly Update Statistics: Implement a proactive strategy for updating statistics, especially after significant data modifications or index maintenance.
- Use Default Sampling for Most Cases: In most scenarios, rely on SQL Server’s default sampling for statistics updates. It provides a good balance of accuracy and performance.
- Consider
FULLSCAN
for Critical Tables: For tables involved in performance-critical queries or those with highly skewed data distributions, consider usingFULLSCAN
periodically, especially during off-peak hours. - Monitor Statistics Age: Use
STATS_DATE()
function to monitor the last update time of statistics to identify potentially outdated statistics. - Automate Statistics Maintenance: Utilize SQL Server Agent jobs or maintenance solutions like Adaptive Index Defrag to automate statistics updates and index maintenance.
- Avoid
NORECOMPUTE
in General: Reserve theNORECOMPUTE
option for very specific and well-justified scenarios. Disabling automatic updates can lead to severe performance problems. - Understand Incremental Statistics for Partitioned Tables: Leverage incremental statistics for partitioned tables to improve the efficiency of statistics management.
- Test and Monitor: After implementing any statistics update strategy, thoroughly test query performance and monitor the impact to ensure the changes are beneficial.
Conclusion
Effectively managing SQL Server statistics is a cornerstone of maintaining optimal database performance. The UPDATE STATISTICS
command provides the necessary tools to refresh and fine-tune these crucial components of the query optimizer. By understanding the various options, sampling methods, and best practices outlined in this guide, you can proactively manage your SQL Server statistics and ensure your queries execute efficiently, delivering a responsive and performant database environment. Regularly updating statistics, especially after significant data changes, is a simple yet powerful step towards optimizing your SQL Server database for peak performance.
Note: This article is for informational purposes and based on the provided source material. Always refer to the official Microsoft SQL Server documentation for the most up-to-date and accurate information.
Here is the image of the syntax section.
Here is the image to represent the importance of statistics.