Boost SQL Query Performance with SQL Server Update Statistics

In the realm of database management, ensuring optimal query performance is paramount. SQL Server relies heavily on query optimizer statistics to craft efficient execution plans. When these statistics become outdated, query performance can suffer significantly. This article delves into the crucial practice of updating statistics in SQL Server, focusing on the UPDATE STATISTICS command and its best practices. Whether you’re managing a small database or a large data warehouse, understanding and implementing effective statistics updates is essential for maintaining a responsive and efficient SQL Server environment.

The UPDATE STATISTICS Transact-SQL statement is your primary tool for refreshing these vital statistics on tables and indexed views. While SQL Server automatically updates statistics as needed, proactively managing these updates can provide substantial performance gains. This guide will explore the syntax, options, and practical considerations for using UPDATE STATISTICS to keep your database running smoothly.

-- Syntax for SQL Server and Azure SQL Database
UPDATE STATISTICS table_or_indexed_view_name
[ { { index_or_statistics__name } | ( { index_or_statistics_name } [ ,...n ] ) } ]
[ WITH
    [ FULLSCAN [ [ , ] PERSIST_SAMPLE_PERCENT = { ON | OFF } ]
    | SAMPLE number { PERCENT | ROWS } [ [ , ] PERSIST_SAMPLE_PERCENT = { ON | OFF } ]
    | RESAMPLE [ ON PARTITIONS ( { <partition_number> | <range> } [, ...n] ) ]
    | <update_stats_stream_option> [ ,...n ]
    ]
[ [ , ] [ ALL | COLUMNS | INDEX ] ]
[ [ , ] NORECOMPUTE ]
[ [ , ] INCREMENTAL = { ON | OFF } ]
[ [ , ] MAXDOP = max_degree_of_parallelism ]
[ [ , ] AUTO_DROP = { ON | OFF } ]
];

<update_stats_stream_option> ::=
    [ STATS_STREAM = stats_stream ]
    [ ROWCOUNT = numeric_constant ]
    [ PAGECOUNT = numeric_constant ]
-- Syntax for Azure Synapse Analytics and Parallel Data Warehouse
UPDATE STATISTICS [ schema_name . ] table_name
[ ( { statistics_name | index_name } ) ]
[ WITH { FULLSCAN | SAMPLE number PERCENT | RESAMPLE } ]
[;]
-- Syntax for Microsoft Fabric
UPDATE STATISTICS [ schema_name . ] table_name
[ ( { statistics_name } ) ]
[ WITH { FULLSCAN | SAMPLE number PERCENT } ]
[;]

Understanding SQL Server Statistics and Query Optimizer

SQL Server’s query optimizer relies on statistics to estimate the distribution of data within tables and indexes. These statistics are crucial for the optimizer to choose the most efficient query execution plan. Outdated or inaccurate statistics can lead to suboptimal plans, resulting in slow query performance and increased resource consumption.

Think of statistics as a roadmap for the query optimizer. Just as an outdated map can lead you down inefficient routes, stale statistics can misguide the optimizer, causing it to select less-than-ideal paths for retrieving data. Regularly updating these statistics is like updating your roadmap, ensuring the query optimizer has the most accurate information to navigate your database effectively.

When to Update Statistics in SQL Server

SQL Server automatically updates statistics through the AUTO_UPDATE_STATISTICS database option. However, there are scenarios where manual intervention with UPDATE STATISTICS is beneficial or even necessary:

  • Significant Data Modifications: After large data loads, bulk inserts, or substantial changes due to ETL processes, the existing statistics may no longer accurately reflect the current data distribution. Manually updating statistics ensures the query optimizer is aware of these changes.
  • After Index Maintenance: Operations like rebuilding or reorganizing indexes can alter data distribution. Updating statistics post-index maintenance ensures statistics are aligned with the new index structure.
  • Performance Tuning: If you identify slow-running queries and suspect outdated statistics are the culprit, manually updating statistics for the tables involved can be a quick and effective troubleshooting step.
  • Scheduled Maintenance: Incorporating UPDATE STATISTICS into your regular database maintenance routines ensures consistent and proactive statistics management.

While frequent updates might seem beneficial, remember that updating statistics does come with a cost. It requires scanning data and can lead to query recompilations. Therefore, striking a balance and updating statistics strategically is key. Avoid overly frequent updates that might introduce unnecessary overhead.

Exploring UPDATE STATISTICS Options

The UPDATE STATISTICS command offers various options to fine-tune how statistics are updated. Let’s examine the key options:

1. Target Specification: Table, Index, or Statistics Name

You can target your update at different levels:

  • Table Level (UPDATE STATISTICS table_name): Updates all statistics associated with the specified table, including statistics on indexes, single-column statistics, and statistics created using CREATE STATISTICS.
  • Specific Statistics or Index (UPDATE STATISTICS table_name index_or_statistics_name): Allows you to update statistics for a particular index or named statistics object. This provides granular control when you know specific statistics are likely to be outdated.

2. Sampling Methods: FULLSCAN, SAMPLE, RESAMPLE

Choosing the right sampling method is crucial for balancing accuracy and performance:

  • FULLSCAN: Scans all rows in the table or indexed view to compute statistics. This provides the most accurate statistics but is also the most resource-intensive, especially for large tables. FULLSCAN and SAMPLE 100 PERCENT are equivalent.

    UPDATE STATISTICS Sales.SalesOrderDetail WITH FULLSCAN;
  • SAMPLE number {PERCENT | ROWS}: Updates statistics based on a sample of the data. This is faster than FULLSCAN and often provides sufficient accuracy for the query optimizer. SQL Server automatically samples data by default when statistics are updated without specifying FULLSCAN or SAMPLE.

    UPDATE STATISTICS Production.Product(Products) WITH SAMPLE 50 PERCENT;

    Using SAMPLE is generally recommended for most scenarios as it provides a good balance between performance and accuracy. The query optimizer intelligently determines the default sample size, making manual specification often unnecessary.

  • RESAMPLE: Updates each statistic using its previously used sampling rate. This can be useful when you want to maintain consistency in sampling but still refresh the statistics. Note that RESAMPLE can sometimes lead to a full table scan depending on how the statistics were initially created. In Warehouse in Microsoft Fabric, RESAMPLE is not supported.

3. PERSIST_SAMPLE_PERCENT

Introduced in later SQL Server versions, PERSIST_SAMPLE_PERCENT allows you to control whether the specified sample percentage is persisted for subsequent automatic statistics updates.

  • PERSIST_SAMPLE_PERCENT = ON: Retains the specified sampling percentage for future automatic updates.
  • PERSIST_SAMPLE_PERCENT = OFF (Default): Resets the sampling percentage to the default sampling rate for subsequent automatic updates.

This option is beneficial when you’ve determined that a specific sampling percentage is optimal for certain statistics and want to ensure that automatic updates adhere to this setting.

4. NORECOMPUTE

The NORECOMPUTE option disables automatic statistics updates (AUTO_UPDATE_STATISTICS) for the specified statistics.

UPDATE STATISTICS Production.Product(Products) WITH NORECOMPUTE;

Caution: Using NORECOMPUTE is strongly discouraged in most situations. Disabling automatic updates can lead to severely outdated statistics and significantly degrade query performance over time. This option should only be used in very specific, controlled scenarios by experienced database administrators who fully understand the implications. Re-enable automatic updates by running UPDATE STATISTICS without NORECOMPUTE or using sp_autostats.

5. INCREMENTAL

The INCREMENTAL option is relevant for partitioned tables. When INCREMENTAL = ON, statistics are updated and stored at the partition level. When INCREMENTAL = OFF (default), existing partition statistics are dropped, and SQL Server recomputes statistics for the entire table as a single unit. Incremental statistics are generally recommended for partitioned tables as they can improve the efficiency of statistics maintenance. However, incremental statistics have limitations and are not supported for all types of statistics (e.g., statistics on non-partition-aligned indexes, filtered indexes, views, etc.).

6. MAXDOP

The MAXDOP option, available in newer SQL Server versions, allows you to control the maximum degree of parallelism for the statistics update operation.

UPDATE STATISTICS Sales.SalesOrderDetail WITH MAXDOP = 4;

Limiting parallelism can be useful in environments with high CPU utilization or when you want to throttle the resource consumption of statistics updates.

7. AUTO_DROP

Introduced in SQL Server 2022, AUTO_DROP = ON allows statistics objects created by third-party tools to be automatically dropped if they would block schema changes. This behavior mirrors auto-created statistics and helps prevent schema modification blocking issues.

UPDATE STATISTICS Customer (CustomerStats1) WITH AUTO_DROP = ON

Practical Examples of UPDATE STATISTICS

Let’s look at some practical examples of using UPDATE STATISTICS:

Example 1: Update all statistics on the SalesOrderDetail table.

USE AdventureWorks2022;
GO
UPDATE STATISTICS Sales.SalesOrderDetail;
GO

Example 2: Update statistics for the AK_SalesOrderDetail_rowguid index.

USE AdventureWorks2022;
GO
UPDATE STATISTICS Sales.SalesOrderDetail AK_SalesOrderDetail_rowguid;
GO

Example 3: Update statistics using 50% sampling for named statistics Products.

USE AdventureWorks2022;
GO
CREATE STATISTICS Products ON Production.Product ([Name], ProductNumber) WITH SAMPLE 50 PERCENT
-- Time passes. The UPDATE STATISTICS statement is then executed.
UPDATE STATISTICS Production.Product(Products) WITH SAMPLE 50 PERCENT;

Example 4: Update statistics with FULLSCAN and disable automatic updates (NORECOMPUTE). (Use with caution!)

USE AdventureWorks2022;
GO
UPDATE STATISTICS Production.Product(Products) WITH FULLSCAN, NORECOMPUTE;
GO

Example 5: Update statistics on Azure Synapse Analytics using FULLSCAN.

UPDATE STATISTICS Customer (CustomerStats1) WITH FULLSCAN;

Best Practices for Managing SQL Server Statistics

  • Regularly Update Statistics: Implement a proactive strategy for updating statistics, especially after significant data modifications or index maintenance.
  • Use Default Sampling for Most Cases: In most scenarios, rely on SQL Server’s default sampling for statistics updates. It provides a good balance of accuracy and performance.
  • Consider FULLSCAN for Critical Tables: For tables involved in performance-critical queries or those with highly skewed data distributions, consider using FULLSCAN periodically, especially during off-peak hours.
  • Monitor Statistics Age: Use STATS_DATE() function to monitor the last update time of statistics to identify potentially outdated statistics.
  • Automate Statistics Maintenance: Utilize SQL Server Agent jobs or maintenance solutions like Adaptive Index Defrag to automate statistics updates and index maintenance.
  • Avoid NORECOMPUTE in General: Reserve the NORECOMPUTE option for very specific and well-justified scenarios. Disabling automatic updates can lead to severe performance problems.
  • Understand Incremental Statistics for Partitioned Tables: Leverage incremental statistics for partitioned tables to improve the efficiency of statistics management.
  • Test and Monitor: After implementing any statistics update strategy, thoroughly test query performance and monitor the impact to ensure the changes are beneficial.

Conclusion

Effectively managing SQL Server statistics is a cornerstone of maintaining optimal database performance. The UPDATE STATISTICS command provides the necessary tools to refresh and fine-tune these crucial components of the query optimizer. By understanding the various options, sampling methods, and best practices outlined in this guide, you can proactively manage your SQL Server statistics and ensure your queries execute efficiently, delivering a responsive and performant database environment. Regularly updating statistics, especially after significant data changes, is a simple yet powerful step towards optimizing your SQL Server database for peak performance.


Note: This article is for informational purposes and based on the provided source material. Always refer to the official Microsoft SQL Server documentation for the most up-to-date and accurate information.


Here is the image of the syntax section.

Here is the image to represent the importance of statistics.

Comments

No comments yet. Why don’t you start the discussion?

Leave a Reply

Your email address will not be published. Required fields are marked *