Tag Archives: rowmodctr

Statistics used in the cached execution plans

Statistics used in the cached execution plans – Stored Procedures


Summary

Query optimisation process sometimes requires understanding on how Sql Server’s Query engine compiles, re-compiles and executes sql batches. Some of the most important elements used by Query optimiser when constructing a good plan are the “Interesting statistics”. These are statistical information used by Query optimiser  when constructing a good enough query execution plan.
This blog attempts to explain what are the “interesting statistics”, when they are updated and how the statistical information relates to the query recompilation process. The topic is related to Temporary tables statistics when used in stored procedures.

Batch compilation and recompilation

To begin with, let’s analyse the batch compilation/recompilation diagram (By Arun Marathe, Jul 2004, Batch Compilation, Recompilation and Plan Caching Issues in Sql Server 2005). The idea is to create a set of  experiments that will capture the behavior of a stored procedure  through the different phases of the query compilation/recompilation process, particularly those related to the statistics that are used to generate the execution plan.


Figure 1, Batch Compilation/Recompilation diagram

I’ve used AdwentureWorks database to set up the test environment and MS Profiler to capture various Events relevant for the experiments.

MS Profiler events

    1. Attention (Errors and Warnings)
    2. Auto Stats (Performance)
    3. SP:CacheHit (Stored Procedures)
    4. SP:CacheInsert  (Stored Procedures)
    5. SP:CacheMiss  (Stored Procedures)
    6. SP:CacheRemove  (Stored Procedures)
    7. SP:Completed  (Stored Procedures)
    8. SP:Recompile  (Stored Procedures)
    9. SP:Starting  (Stored Procedures)
    10. RPC: Starting (Stored Procedures)*
    11. RPC:Completed (Stored Procedures)*
    12. SP:StmtStarting  (Stored Procedures)
  1. SQL:StmtRecompile (TSQL)
  2. SQL:StmtStarting  (TSQL)

Database Objects
Set AdventureWorks DB compatibility level to 140 – Sql Server 2017. The version provides easy access to the information about the interesting statistics saved with the query plan (SSMS – SELECT Plan Operator, Properties,OptimizerStatsUsage).

Below is the set of Sql Server object definitions used for the testing.

Information about the statistics/indexes on the tables can be retrieved using the queries below.

The following examples assume the default settings for the Sql  Server’s options related to the statistics:
 AUTO_CREATE_STATISTICS ON
– AUTO_UPDATE_STATISTICS ON
AUTO_UPDATE_STATISTICS_ASYNC OFF 

A bit of theory first before proceeding with the tests. : )

colmodctr

colmodctr is an ever increasing counter that tracks the changes made on tables (a counter per column excluding the non-persistent computed columns). colmodctr is not transactionally consistent which means that is not affected by the rolled back changes i.e if a transaction inserts 10 rows in a table and then roll-back, the counter will still report 10 changes.
Sql Server Statistics (automatically/manually created/updated) on a column(s) will store the snapshot value of the colmodctr for the leftmost column in the stats-blob.
The counter is a very important since it’s one of the elements that drives the query recompilation decisions related to the statistics changed reasons. 

colmodctr counter can be accessed through the following system views.


Figure 2, colmodctr, system views – standard and hidden

One way  to access the hidden tables is to; Open a separate SSMS instance, close the object explorer, create a single connection using Server name: i.e ADMIN:(local)
NOTE: The structure of the hidden tables and the tables’ accessibility is not documented and may be changed in the future versions.

Recompile thresholds (RT)

RT concept defines the number of changes on a table column needed to be done in order to indicate the statistical information of that column as stale. 
The changes includes the column values changes through the DML operations such as INSERT, UPDATE, DELETE… i.e Inserting 10 new rows in a table is considered as 10 changes(identified by the colmodctr counters mentioned before).
If the table does not have statistical information i. e HEAP table with no indexes and no manually created statistics, and the query plans that references the table does not load/automatically create interesting statistics, the only relevant change when performing the RT crossing test will be the change in the number of rows inserted and/or deleted.

colmodctr(current) – colmodctr(snapshot) |  >= RT

or

 | cardinality(current) – cardinality(snapshot) |  >= RT

current     – refers to the current value of the modification counter
snapshot – refers to the value of the mod. counter captured during the last plan compilation(recopilation).
cardinality* – the number of rows in the table.

*cardinality has different meaning in the different contexts: Cardinality may represent the uniqueness of data values in a particular column – the lower the cardinality the more duplicated values in the column.
Cardinality is also a way to define the relationship between two entities in a data model. It is also known as the degree of relationship i 1-1, 1-m, m-n.

The Threshold Crossing  Test evaluates to TRUE if the number of changes is greater than the predefined RT value (see Figure 3)

Recompilation thresholds(RT) for all the tables referenced in the query are stored along with the query plan.

RT depends on the table type(permanent vs temporary) and the number of rows in the table.


Figure 3, Recompile thresholds

Special case. RT = 1 if the table has 0 rows (with or without statistics)

NOTE: Starting from SQL Server 2008 R2 SP1, Microsoft introduced TF2371. The trace flag activates the dynamic recompile threshold calculation. The higher number of rows in a table, the lower the RT. The functionality is implemented to allow automatic statistics updates to kick off more frequently for the big tables. i.e RT for a 10,000 row table is 500 + 0.20*10,000 = 2,500 – the number of changes required to trigger query recompile. For a table with 100M rows, the RT is 20,000,500. For some applications the RT may be too high, resulting in the sub-optimal plans due to the lack of query recompilation. Hence the TF2371.
Starting from SQL Server 2016, the TF2371 is turned on by default.

Here is a couple of examples to illustrate Figure3.
If there is a table A that contains 230 rows, RT for the table will be set to 500. This means that if we i.e insert 500 rows, the total number of rows (c)  will change to 730 (c>=230+500) which is enough changes to make the table’s statistics stale.
The change itself does not mean much if there are no queries that references the table : )
The query plans may or may not initiate the auto-statistics creation on the specific table columns. Also, the referenced tables may not have any statistical information i.e HEAP table with no non-clustered indexes.

Experiments

Experiment 1 (stats change before query execution)

In this experiment we will make “enough” changes to the ListPrice column (dbo.Products table) BEFORE running the stored procedure for the first time, 
The column is a key column in NCI_Products_ListPrice, the non-clustered index and has statistical information attached to it (the stats object name is the same as the NCI)

Lets begin the experiment by creating the test objects and checking the statistical information on the tables.

Step 1, Check the initial stats/rowmodctr information

Figure 4, Initial rowmodctr information

Step 2, Check stats BLOB and make changes on dbo.Products table

Run the DBCC  command below before and after the UPDATE to confirm that there were no changes in the stats BLOB information.

NOTE: rowmodctr is not transactionally consistent.

Figure 5, stats BLOB information

Figure 6, rowmodctr after the initial dbo.Products update

The changes are detected and available through Sql Server’s metadata.

Step 3, Run the stored procedure and observe the captured events by the Profiler.

Figure 7, Statistics refresh

Following the batch compilation diagram we can identify the following steps.

  1. Cache Lookup step resulted in the SP:CasheMiss event. dbo.TestQueryExecution stored proc. does not exist in the cache.
  2. Query Compilation Begins. SQL Server engine is about to load all of the  “interesting statistics”. The loaded statistics can be retrieved from the Actual Execution Plan, the SELECT physical  operator – OptimiserStatsUsage property.
  3. Query engine checks if any of the loaded  interesting statistics are stale. If yes, the system stops the batch compilation process and refreshes the statistics. In our case the system has 
    • Identified the number of changes made on the ListPrice column. From the stats/index information gathered after the initial update, the number of changes (rowmodctr/Modifications) is 610
    • Performed RT crossing test.  The test passed since the number of changes(610) exceeded the RT for tables with the number of rows greater than 500. RT = 500 + 0.20 * 504 ~ 601, 601 < 610
    • Executed StatMan, an internal process which automatically maintains statistics. The process updated the stale statistics NCI_Products_ListPrice on dbo.Product table

      If we check the stats blob from the Step 2, we will see that the Updated column changed its value to the current date – the stats blob has been updated.
      The AutoStats event reported the UPDATE of the statistics with EventSubClass = 1 – Other. More on the event can be found here.

  4. Query Optimiser starts to generate the query plan – a plan for each query statement.
    • The second query in the batch has a predicate on the Name column of the dbo.Products table. In an attempt to make better cardinality estimates on the rows that needs to be processed, Query Optimiser decided to automatically create statistical information on the column.
      The system stops the batch compilation process and again executes the StatsMan process to create the new statistics.

      After creating the stats, QO decided not to use it  : (
      Below is the list of the “interesting statistics” loaded during the Query compilation process. The list does not include automatically created stats on the Name column.

      As a result of the updated statistics on the ListPrice column , the rowmodctr for the column was reset. 

    • QO sets the new recompilation thresholds(RT) for all tables used in the queries.
      1. RT(dbo. SalesOrderDetail) = 500 + 0.20(121317) =24763.4 (~24764)
      2. RT(dbo.Products) = 500 + 0.20(504)= 600.8(~601)
        This meas that QO will initiate query recompile due to “Statistics changed” reason if
        1. dbo. SalesOrderDetail
          1. 24764 or more inserted/deleted rows
          2. 24764 or more changes on: SalesOrderDetailID, ProductID columns
        2. dbo.Products
          1. 601 or more inserted rows
          2. 601 or more changes on: ProductID, ListPrice, Name columns
  5. The query execution starts. The query plans are constructed and cached. SP:CacheInsert event reported that the stored procedure has been cached.

Experiment 2 (stats change during the query execution)

In this experiment we will make “enough” changes to the Name column (dbo.Products table) HALFWAY THROUGH the stored procedure execution.

Step 1 Set up the environment

  • Run the script to reset the test environment
  • Add a WAITFOR statement between the two queries in the stored procedure
  • Use PowerShell to execute the stored procedure. Add HostName property. Use the HostName to capture only the events related to the PS call. This will prevent MS Profiler from capturing events related to the UPDATE statement that will run in parallel.
  • Add an ApplicationName filter to the Profiler trace (ApplicationName LIKE experiment)

Step 2, Run the PowerShell cmdlet, switch to SSMS and run the UPDATE queries below. The queries will generate enough changes to make the automatically created statistics on the Name column stale.

Step 3. Analyse the captured MS Profiler trace.
Figure 8, Query recompile

  • The first thing that is different from the last run is the SP:CacheHit event. The event shows that our stored procedure was found in the Plan cache. The previously set RTs and the interesting statistics are part of the cached information.
    NOTE: Auto created statistic on the Name column was not used during the initial query compilation – the stats are not part of the interesting stats.
  • This time there were no changes on the columns that would initiate statistics updates, no new auto created stats and the existing cached query plan does not need to be recompiled due to “statistic changed” reasons. The process proceeds with the query execution.
  • The first query is successfully executed following by the  WAITFOR statement. During the statement execution (6 seconds delay) a separate query has made enough changes on the Name column(dbo.Products) to pass the RT crossing test for the table and flag the auto created statistics on the column as stale. Even if not used by QO during the plan generation, the stats are marked as stale.
  • (1) The query execution stops at the  “Any stats stale?”  step . The System initiates the query recompile process – SP: Recompile due to 2 – Statistics changed reason. The event is followed by the statement level SQL:StmtRecompile event which indicates that only the second query needs to be recompiled.
  • (2) Again, the StatsMan process kicks in and updates the stale statistics. The RTs are set (in this case the number of row  has not changed ,hence the RTs stayed the same).Rowmodctr value for the Name column is reset. to 0 
  • (3) The AutoStats event reported Statistics Update  having EventSubClass = 1 – Other
  • (4) The SP:StmtStarting event reports that the second query has been recompiled and the batch execution continues.

Experiment 3 (tables with no stats on columns)

The experiment demonstrates how queries get recompiled when referencing tables with no statistics. The recompiles due to the “statistics changed” reasons are initiated by the RT-table cardinality crossing test results only.
As previously mentioned, the cardinality based RT crossing test is defined as

 | cardinality(current) – cardinality(snapshot) |  >= RT

Lets create a test table and a stored procedure to perform the above experiment.

Step 1, set up the test environment

Add some data to the table..

The initial statistical information looks like (find how to retrieve the metadata related to the statistical information at the beginning of the post)


Figure 9, rowmodctr with no statistical information

Step Run the stored proc for the first time. The RT is set to 500.

Step 3 Make enough changes to the table to pass the cardinality crossing test. Insert 500 rows. Do not use explicit transaction yet.

Step 3 Run the stored procedure again and observe the query execution behavior in Profiler.

Figure 10, Query recompile, table cardinality change – no stats

  • The new rowmodctr information looks like

    The new number of rows (rowcnt) is recorded along with the number of changes, rowmodctr=730. In this case the rowmodctr value is not relevant since the RT crossing test depends only on changes in the table cardinality. This will be more visible if we ROLLBACK the row insertion operation which will be covered later.
  • The second execution followed the “Cashe lookup = Success” path (see the batch compilation diagram) and failed to pass the very last step  “Any stats stale?“.
  • At this stage, the system has detected that the RT cardinality crossing test has passed due to the number of changes(new rows) inserted in the table.
  • The system stopped the execution process and  initiated the stored proc/statement recompile – SP:Recompile, SQL:StmtRecompile.  As in the previous examples, the reason for the recompile was 2 – Statistics changed.
    NOTE: The recompile process is not followed by the StatMan process since the query does not have any statsBlob information to be refreshed/created.

Experiment 3.1 (rowmodcnt not in use)

The next example shows that the RT cardinality crossing test is not related to rowmodctr as it may seem from the previous example where the number of changes followed table cardinality changes.

  • Follow the steps from the previous example.
  • Execute the INSERT query  from the Step 3 within an explicit transaction
  • Observe that there are no query recompiles due to “statistic change since there were no table cardinality changes – the ROLLBACK “canceled” row insertions.
  • The statistical information shows that the rowmodctr= 720.

Conclusion

Query compilation, execution and recompilation sequence among other steps includes; loading interesting statistics – the statistical information on different table columns that Query Optimiser may find useful when creating a good plan and auto-creating statistical information on the columns that participate in i.e WHERE filter, GROUP BY ..etc. 
Sql Server query engine also checks the validity of the loaded statistical information during the initial stored procedure compilation and again during the stored procedure execution phase. If the loaded statistics are found to be stale, the former pauses stored procedure compilation, refreshes(re-samples/refreshes) the loaded statistical information and continues compilation process. If Query engine detects stale loaded statistics during the execution phase,  the process stops, refreshes(re-samples/updates) statistics and restarts compilation process – query recompilation. The re-compiles are done per query not per batch.
The examples in this blog showed that the statistical information can be automatically maintained by the queries that use them. Statistics can be also maintained manually.
To mark statistics as “Stale”, QO uses the Recompile Threshold(RT) crossing test. The test tracks the number of changes on the significant(leftmost) columns within the statistic BLOBs. The information is stored in an ever-increasing, non transactionally consistent counter – “rowmodctr”.  The RTs are stored per table and within the compiled query.
The RT crossing test can be based only on the changes in the number of rows in a table.

 

Thanks for reading.

Dean Mincic