
Pivoting with Python in Sql Server


Summary

In SQL Server 2016, Microsoft introduced a new system stored procedure, sys.sp_execute_external_script. The idea was to extend the capabilities of the SQL Server engine to be able to execute external code, i.e. code written in R, Java, or Python. SQL 2017 supports R and Python. The new functionality is a part of Sql Server’s Machine Learning Services. The purpose of this blog is to “tickle devs’ imagination” on how to use Python for Pivoting and more..

From a super high-level point of view, the process goes like this: we call sys.sp_execute_external_script indicating that we want to use e.g. the Python language, and pass in our Python code. We also define a data set (an SQL query) that the code will use as an input data source. The code performs analytical tasks over the input data source and returns a result-set in the form of a pandas DataFrame. We use Python’s methods to “tweak” the data frame to match the final shape of the output SQL dataset. Optionally, we describe the output (define column names and their data types) by using the WITH RESULT SETS stored procedure option.
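A minimal sketch of that pattern is shown below. The input query and the result-set definition are illustrative only, and the default DataFrame names InputDataSet and OutputDataSet are used instead of custom ones; the Python script simply passes the input data through.

EXEC sys.sp_execute_external_script
     @language = N'Python'
    ,@script = N'
# InputDataSet/OutputDataSet are the default pandas DataFrame names
OutputDataSet = InputDataSet
'
    ,@input_data_1 = N'SELECT TOP (5) [name] FROM sys.objects;'
WITH RESULT SETS ((ObjectName NVARCHAR(128)));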

So, I thought it would be cool to try to do pivoting/multi-pivoting using Python code. What I discovered are the amazing things you can do with Python in SQL Server.

NOTE: More information about how Sql Server engine executes external code can be found here.

Prepare the environment for Python

First thing, we need to install Sql Server Machine Learning Services and Language Extensions.
Figure 8, Sql Server ML Services

Make sure that the SQL Server Launchpad service is up and running.
The next step is to allow Sql Server to execute the external scripts and we are good to go.

sp_configure 
     @configname = 'show advanced options'
    ,@configvalue = 1
GO
RECONFIGURE WITH OVERRIDE;
GO

sp_configure 
     @configname = 'external scripts enabled'
    ,@configvalue = 1            
GO
RECONFIGURE WITH OVERRIDE;
GO

Python’s Pivot

Let us present the sum of freight (shipping cost) values per order year for each country that ordered our products, but this time using Python instead of tSQL’s PIVOT operator – you can find the tSQL example here.
Set up the dbo.Orders_TestPivot test table and run the Python script.


Figure 1, Python’s pivot result

Note: During the testing, I found it difficult to use only SSMS to write Python code (similar to working with dynamic SQL) with no debugger, IntelliSense, etc. I used the Visual Studio Code tool with Python 3.8. Here is the code I used for testing.

The system stored procedure sp_execute_external_script is similar to sp_executesql, but along with the code to be executed, parameter definitions, and parameter  values, we also pass the following values(from our pivot script):

@input_data_1 – There are a couple of interesting things with the query used as a base for Python Pivoting.

  1. Python explicitly defines the Pivot grouping element; therefore, we don’t need a table expression that implicitly defines the Pivot elements, where the grouping element is everything other than the spreading and aggregate elements – see pivot operation directly on a table.
  2. The query result-set (in our case named df) is internally transformed into a DataFrame object – a table-like structure defined within pandas. Pandas is an open-source data analysis library built on top of the Python language. DataFrame does not support all Sql Server data types, e.g. MONEY and DECIMAL are not supported, and that’s why the two columns Freight and OrderValue need to be converted to FLOAT (a sketch of such an input query follows this list).
    Supported types: bit, tinyint, smallint, int, bigint, uniqueidentifier, real, float, char, varchar, nchar, nvarchar, varbinary, date, datetime, smalldatetime.
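For illustration, a query of this shape could be passed in via @input_data_1. This is a sketch only; it assumes the dbo.Orders_TestPivot test table and an Orderdate column from the linked tSQL example.

SELECT  Shipcountry
       ,OrderYear  = YEAR(Orderdate)
       ,Freight    = CAST(Freight AS FLOAT)     -- MONEY is not supported by the DataFrame
       ,OrderValue = CAST(OrderValue AS FLOAT)  -- neither is DECIMAL
FROM dbo.Orders_TestPivot;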

How it works

As mentioned before, after passing the input query, the query gets executed and the result set, natively in the form of a table expression, gets transformed into a DataFrame object named df. The code below runs the pivot_table method (far more powerful than tSQL’s PIVOT operator 🙂 ) on the DataFrame object. The final result is then stored in the dfpivot_out variable of type DataFrame, previously defined as the output dataset name.

       ....    
        ,@script   = N'

dfpivot_out = df.pivot_table(index = ["Shipcountry"], \
                              columns =["OrderYear"], \
                              values = ["Freight"], \
                              aggfunc={"Freight":sum}, \
                              fill_value=None).reset_index(level="Shipcountry")

## dfpivot_out =dfpivot_out.reset_index(level="Shipcountry") ##we can reshape the data frame in a separate statement.
print(dfpivot_out)
'

....

Note: Python code above starts with no indentation 🙂

Pivot_table

In our example, we are passing four parameters to the pivot_table method.

index – This parameter explicitly defines a list of grouping element(s). Due to the difference between the DataFrame and SQL’s table expression structures, the index column will not be visible in the final output (see the reset_index method)
columns – defines a list of spreading elements.
values – defines a list of columns whose values will be aggregated.
aggfunc – defines a list of pairs (value column: aggregate function). Basically, we can apply different aggregate functions on different aggregate columns defined in the values list.

Before explaining the reset_index() method, remove the method from the code and comment out WITH RESULT SET option.

...
values = ["Freight"], \
aggfunc={"Freight":sum}) ##.reset_index(level="Shipcountry")
...
        ,@output_data_1_name =N'dfpivot_out'
    WITH --RESULT SETS  ((Shipcountry NVARCHAR(15),[2018] MONEY, [2019]  MONEY,[2020] MONEY));

After running the code, have a look at the result of the print statement under the Messages pane in SSMS. This is how the DataFrame looks:

Figure 2, panda’s DataFrame shape

The index values are not presented as a DataFrame column. There are many ways to manipulate the DataFrame output to match the sql result-set shape. One way is to use reset_index(level=”Shipcountry”) method on the DataFrame. This will “convert” the index into a column. The new, default index will be created with the unique, ever-increasing integer values, starting from 0.
Run the code in its original form and compare the print output.

Multi aggregate Pivot with Python

This time we want to calculate the total Freight and the average order value in different countries per year. Again, compare the tSQL example with the Python code – just a few “tweaks” to the code above and there you go 🙂

...
        ,@script   = N'
import numpy as np
dfpivot_out = df.pivot_table(index = ["Shipcountry"], \
                              columns =["OrderYear"], \
                              values = ["Freight","OrderValue"], \
                              aggfunc={"Freight":sum,"OrderValue":np.average}).reset_index(level="Shipcountry")
print(dfpivot_out)
' 
...
   WITH RESULT SETS  ((Shipcountry  varchar(2000),[2018] FLOAT, [2019] FLOAT,[2020] FLOAT
                                              ,[Avg Order Value(2018)] FLOAT
                                              ,[Avg Order Value(2019)] FLOAT
                                              ,[Avg Order Value(2020)] FLOAT))

Note: For this example, I’ve imported another Python library, numpy, to be able to use its average aggregate function.

… and here is another one.
Find total Freight and average order value in different countries and different regions per year. The code can be found here.

Conclusion

Playing with pivot operations is just the tip of the iceberg. There are many different functions available in Python that we can use in SQL Server for all sorts of data analysis. The good thing is that the data does not need to be moved away from SQL Server. However, it is still important to completely understand how Python code executes in the SQL environment, i.e. the performance impact on the existing workload etc. Nevertheless, I found Python very intuitive and easy to work with, so, sorry C#, but I seem to have found a new second-best friend 🙂

Thanks for reading.

Dean Mincic

Sysname datatype and NULL column constraint



Sysname is a system-supplied, user-defined data type that is functionally equivalent to NVARCHAR(128). This Unicode data type is used to reference database object names. The maximum length of Sql Server object names is 128 Unicode characters, with the exception of local temp tables*.

*The maximum length of a local temporary table name is 116 (N’#’ + 115) Unicode characters. The “missing” 12 characters belong to the hexadecimal value appended to the name. The value makes temp table names unique across a Sql Server instance.
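A quick metadata check (a sketch, using the sys.types catalog view) shows that SYSNAME is indeed stored as a 128-character Unicode type:

SELECT [name]
      ,max_length      -- 256 bytes = 128 unicode characters
      ,collation_name
FROM sys.types
WHERE [name] = N'sysname';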

When we create a table, by default, the columns can store NULL values.
The dynamic query below creates a table with 34 columns, each of a different Sql Server data type.

USE tempdb
GO

DROP TABLE IF EXISTS dbo.DefaultColumnConstraints
GO
                                      
 --create dynamic sql 
 DECLARE @tsql NVARCHAR(MAX);

 SET @tsql = N'CREATE TABLE dbo.DefaultColumnConstraints(';

 SELECT @tsql += CHAR(13) + CHAR(9) + N'COL' + 
                 CAST(ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS NVARCHAR(2)) + N'_'+[name] +
                 CHAR(9)+ [name] + N',' 
 FROM sys.systypes 

 SET @tsql = STUFF(@tsql,LEN(@tsql),1,'') + N')' --remove the last ','
 
PRINT @tsql
EXEC( @tsql)


Figure 1, Create table script

We didn’t explicitly specify the NOT NULL column constraints. This means, by default, that it’s possible to store NULL values in the columns (the columns are nullable). However, column COL26_sysname refuses to follow the rule 🙂

 INSERT INTO dbo.DefaultColumnConstraints 
    DEFAULT VALUES;

Msg 515, Level 16, State 2, Line 108
Cannot insert the value NULL into column ‘COL26_sysname’, table ‘tempdb.dbo.DefaultColumnConstraints’; column does not allow nulls. INSERT fails.

The statement has been terminated.

A Column of a SYSNAME data type is not nullable by default. To allow NULL values to be stored in the column, we need to explicitly specify the nullability of the column.

 ALTER TABLE dbo.DefaultColumnConstraints
    ALTER COLUMN COL26_sysname SYSNAME NULL;

After the change the insert query executes with no errors.

Figure 2, Nullable Columns

Timestamp is another data type that does not allow NULL values by default.

The query below selects all non-nullable columns (by default).

    SELECT [name]
           ,is_nullable
           ,max_length
           ,collation_name
    FROM sys.types
    WHERE is_nullable = 0


Figure 3, Not nullable datatypes by default

Side note: 
TIMESTAMP is the data type synonym for the ROWVERSION data type (Sql Server 2012+). A ROWVERSION is an automatically generated, unique 8-byte binary number within a database. It is simply an incrementing number generally used to indicate a row version.

Columns of the ROWVERSION data type are automatically assigned a new, unique binary number. It is not possible to insert any value except NULL (which then gets replaced with a proper binary number). Column NULL constraints have no effect on the ROWVERSION data type.

USE tempdb
GO

DROP TABLE IF EXISTS dbo.TestRowversionDataType
GO
    CREATE TABLE dbo.TestRowversionDataType(
        RowId TIMESTAMP NULL --ROWVERSION
)
GO

--fails
INSERT INTO dbo.TestRowversionDataType(RowId)
    SELECT 0x0000000000000111
/*
Msg 273, Level 16, State 1, Line 126
Cannot insert an explicit value into a timestamp column. Use INSERT with a column list to exclude the timestamp column, or insert a DEFAULT into the timestamp column.
*/
GO
--success (a new rowversion value is added)
INSERT INTO dbo.TestRowversionDataType(RowId)
    SELECT NULL;
GO
--success (a new rowversion value is added)
INSERT INTO dbo.TestRowversionDataType(RowId)
    DEFAULT VALUES;
GO

SELECT * FROM dbo.TestRowversionDataType


Figure 4, Rowversion data type

A question for the audience would be: why is rowversion the only Sql Server system data type that is represented by its synonym, timestamp, in SQL Server’s metadata?

SELECT *
FROM sys.types
WHERE [name] IN ('rowversion','timestamp')

Figure 5, timestamp vs rowversion

Conclusion

By default, all Sql Server system data types are nullable when assigned to table columns – except the SYSNAME and TIMESTAMP (ROWVERSION) data types. In the case of SYSNAME, the default behavior does not allow NULL values. It is good practice to explicitly include NULL constraints (nullability) when creating a table.
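For example, a minimal sketch of spelling out the nullability explicitly (the table and column names are hypothetical):

CREATE TABLE dbo.ExplicitNullability(
     Id       INT           NOT NULL
    ,ObjName  SYSNAME       NULL      -- without the explicit NULL, sysname would be NOT NULL by default
    ,Note     NVARCHAR(100) NULL
);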

Thanks for reading.

 

Temporary tables statistics when used in stored procedures



Summary

Sometimes when we design solutions that implement complex business rules we tend to use temporary objects, temporary tables in particular. Decoupling complex queries into smaller “intermediate” results may help the optimiser to come up with a better plan since it needs to resolve simpler queries. It can also make code more readable and maintainable. This approach, of course, needs to be carefully planned since excessive use of temporary objects may degrade query performance, deviate from set-based design principles, and do more damage than good. Some of the common patterns of “when to break down big queries” can be found here.
Depending on the task, we might decide to use temporary tables over temp variables. This may be due to the different scope and/or due to the fact that temp tables are more “robust” and support statistics.

This blog explores specific case scenarios, including temp tables used in stored procedures and the unique behavior of related statistical information that can lead to suboptimal query plans.

Sample data

Let’s create a couple of test tables and a stored procedure that we’ll use throughout the article.
Platform: Microsoft SQL Server 2017 (RTM) Developer Ed.

DROP TABLE IF EXISTS dbo.Transactions --fk constr.
GO
DROP TABLE IF EXISTS  dbo.Products
GO
--Products table
CREATE TABLE dbo.Products(
     ProductID INTEGER IDENTITY(1,1) NOT NULL
        CONSTRAINT PK_ProductID PRIMARY KEY CLUSTERED
    ,[Name] NVARCHAR(256) NOT NULL
    ,ListPrice DECIMAL(10,2) NOT NULL
    ,ModifiedDate DATETIME2
        INDEX NCI_Name([Name])
)
GO
--Populate Products table with 500 distinct products
;WITH getRowNums AS
(
    SELECT TOP 500
         RowNo = ROW_NUMBER() OVER(ORDER BY(SELECT NULL) )   
        ,[Name]
    FROM sys.columns c
),getDistribution AS 
(
    SELECT  [Name] = CASE 
                         WHEN rws.RowNo % 5 =0   THEN N'A' -- 100 
                         WHEN rws.RowNo % 99 = 0 THEN N'B' -- 4 
                         WHEN rws.RowNo % 36 = 0 THEN N'C' -- 10 
                         WHEN rws.RowNo % 14 = 0 THEN N'E' -- 27 
                         WHEN rws.RowNo % 499 = 0 THEN 'Z' -- 1 
                     ELSE N'F' --358 products with a name that starts with letter 'F'
                 END + rws.[Name] + CAST(rws.RowNo AS NVARCHAR(3))
            ,Price = ABS(CHECKSUM(NEWID())) % 44.23 
            ,ModifiedDate = DATEADD(MINUTE,RowNo,SYSDATETIME()) 
    FROM getRowNums rws
)
INSERT INTO Products([Name],ListPrice,ModifiedDate)
    SELECT [Name],Price,ModifiedDate
    FROM getDistribution

-- check the product names distribution
/* 
SELECT [Name Starts With..] = LEFT([Name],1)
      ,[Distribution] = COUNT([Name])
FROM dbo.Products
GROUP BY LEFT([Name],1)
ORDER BY [Distribution] ASC
*/
-- Order Transactions table
DROP TABLE IF EXISTS dbo.Transactions
GO
CREATE TABLE dbo.Transactions(
    OrderId INT IDENTITY(1000,1) NOT NULL
        CONSTRAINT PK_OrderId PRIMARY KEY CLUSTERED
    ,ProductId INT
        CONSTRAINT FK_ProductId_ProductsProductId 
            REFERENCES dbo.Products(ProductID)
    ,OrderQuantity INT
    ,ModifiedDate DATETIME2
  ,INDEX NCI_ProductId(ProductId)
)
GO
--each product was ordered 500 times
INSERT INTO dbo.Transactions(ProductID
                            ,OrderQuantity
                            ,ModifiedDate
)
    SELECT ProductID
          ,Quantity = ABS(CAST(NEWID() AS BINARY(6)) % 10) + 1 --random int between 1 and 10
          ,ModifiedDate = SYSDATETIME()
    FROM dbo.Products p
     CROSS JOIN (SELECT TOP 500 r = 1
                 FROM sys.columns)  x
GO

--stored proc: 
--Show the number of orders for products whose names begin with certain letters.
DROP PROCEDURE IF EXISTS dbo.testTempTableStats
GO
CREATE PROCEDURE dbo.testTempTableStats(
    @StartsWith NVARCHAR(5)
)
AS
BEGIN
    CREATE TABLE #temp( 
            ProductId INTEGER
            ,ProductName NVARCHAR(256)
    )
    
    INSERT INTO #temp
        SELECT pr.ProductID
               ,pr.[Name]
        FROM dbo.Products pr
        WHERE pr.[Name] LIKE @StartsWith + '%'                 

        SELECT t.ProductName
              ,NoOfOrders = COUNT(DISTINCT tr.OrderId) 
        FROM dbo.Transactions tr
            INNER JOIN #temp t
                ON tr.ProductId = t.ProductId
        GROUP BY t.ProductName

    DROP TABLE #temp;
END

The code above is a “complex query” that we decided to decouple by using a temp table. First, we store all relevant products in the temp table and then join the table with a big table in order to calculate the required aggregates. This is one of the common temp table use cases. We expect our stored procedure to be called with widely varying parameters, i.e. for all product names that start with ‘B’ (4 products) we expect 2000 matching rows that need to be aggregated (500 orders per product), and for those products whose name starts with ‘A’ (100 products), we expect 25x more – 50,000.
Since temp tables support statistics and our DB is set up to auto-create/auto-update statistics, we also expect to get good plans for the varying input parameters. In other words, we expect the query plan to be recompiled if needed. In addition, we also expect the query to recompile due to correctness-related reasons (the temp table is created and destroyed every time we run the query).

Test scenario

Run the code from Session1.

DBCC FREEPROCCACHE
GO
EXEC testTempTableStats
        @StartsWith = N'C';
GO


Figure 1, The plan created for the first input parameter

As we can see, the plan is good. The temp table cardinality estimates are good, the Nested Loops physical operator estimates are aligned with the actual number of rows, and the Sort operator was granted enough memory to perform the distinct sort.
Now, say another client runs the same stored proc in the context of a different session with a different parameter.
Run the code from Session2

EXEC testTempTableStats
        @StartsWith = N'A';


Figure 2, The original plan did not change for the new parameter

Session2’s client gets a horrible plan. Starting from the temp table cardinality, everything seems to be off. The Nested Loops physical operator estimates are now totally wrong, and the Sort operator does not have enough granted memory to perform the sort operation and needs to spill 231 pages (~2MB) to the disk.
If we execute the same stored proc in the context of a session passing @StartsWith parameter value N’F’, we’ll get the same plan but again with wrong estimates and even worse Sort operator spills.

Analysis

From the previous example, we can see that the first passed parameter dictates the query plan structure. The plan “never changes” for consecutive calls. This looks like a parameter sniffing problem, but our query predicates do not include any parameters.
To investigate and explain this behavior let’s start by monitoring the following parameters:

  • Query compilation/recompilation  – Using MS Profiler
  • Temp table’s automatically created statistics – Using database console command SHOW_STATISTICS.

Set up Profiler
Include the following events in Profiler’s trace


Figure 3, Profiler – events and properties

Add temp table statistic tracking to the query

ALTER PROCEDURE dbo.testTempTableStats
   ...
   ..         
        SELECT t.ProductName
              ,NoOfOrders = COUNT(DISTINCT tr.OrderId) 
        FROM dbo.Transactions tr
            INNER JOIN #temp t
                ON tr.ProductId = t.ProductId
        GROUP BY t.ProductName

    -- Show automatically created statistics for the column "ProductName"
    DBCC SHOW_STATISTICS ('tempdb.dbo.#Temp', ProductName) 
         WITH STAT_HEADER
             ,HISTOGRAM;

    DROP TABLE #temp;
END

Start the trace and run the stored procedure again.
Note: Altering stored procedure automatically removes existing, cached batch.

EXEC testTempTableStats @StartsWith = N'C';

The query result shows auto-created statistics on the temp table’s ProductName column.
Figure 4, First parameter statistics

The plan created is the same as on Figure1

The Profiler trace shows the following sequence of events:

Figure 5, Profiler -The first plan compilation

This plan is exactly what we expected to see.
The unit of compilation and caching is a batch – in this case, the content of our stored procedure. During batch compilation, the individual statements are compiled one after another.

Short detour: What happens when we create a stored proc? After hitting F5(executing CREATE PROCEDURE …), SQL Server parses the procedure definition for syntactical accuracy. If the statements are syntactically correct, the text of the stored procedure is stored in an internal table accessible through the sys.sql_modules system view. At this stage, it is possible to reference non-existing objects within the batch i.e it is possible to include a query that inserts data in a non-existing table, or to select data from a non-existing function. Later, when we execute the stored procedure for the first time, the process known as “Deferred Name Resolution” will check the names of the referenced objects and consequently initiate the Recompile of the query segment which references the objects.
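A minimal sketch of that behaviour (the procedure and table names below are made up for illustration):

-- The CREATE succeeds even though the referenced table does not exist - only the syntax is checked.
CREATE OR ALTER PROCEDURE dbo.testDeferredNameResolution
AS
BEGIN
    SELECT col1 FROM dbo.ThisTableDoesNotExistYet;
END
GO
-- The first execution triggers deferred name resolution and fails:
-- Msg 208: Invalid object name 'dbo.ThisTableDoesNotExistYet'.
EXEC dbo.testDeferredNameResolution;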

Follow steps 1- 6 on the trace above.

  1. (not captured by the trace). The execution environment has performed the initial batch compilation. Each query within the batch is compiled separately(Starting from SQL Server Yukon :). The initial compilation was incomplete and only a skeleton plan was created – CREATE TABLE query is the only one fully compiled. The second, INSERT query and the following SELECT query could not be compiled since the temp table did not exist before the very first execution. The skipped compilations are deferred until the execution time.
  2. Batch execution started and the first query was executed (a brand new temp table was created)
  3. The second query (INSERT INTO..#Temp) has to be recompiled due to Correctness related reasons. The EventSubClass property describes the cause of Recompilations
        • A new object(temp table) has been created. The query recompile has been initiated.
        • Query Optimiser loads all “interesting” statistics that can help build a good plan. The statistic belongs to the tables included in the query. If needed, QO will pause the process to automatically create additional statistics on the columns that can help build the plan*. For this query, QO does not need to auto-create stats.
  4. The INSERT INTO …#Temp.. query is now recompiled and executed.
  5. The third query(the one we are analysing) has to be recompiled due to similar reasons as the previous query. After loading all “interesting” stats – which belong to dbo.Transactions, QO decided to stop the execution process and to auto-create statistics on #temp table columns: ProductId and ProductName
  6. StatMan – an internal process, that creates auto-stats on the ProductId column(#temp table). The ProductId column is used in the JOIN predicate.
  7. StatMan creates auto-stats on the ProductName column(#temp table). The ProductName column is used for the grouping. The stat header and histogram are selected in the query output.
    Auto-stat naming convention: the _WA_Sys_ prefix*. In our example: _WA_Sys_00000002_AA7DA731, where 00000002 is the column ordinal position and AA7DA731 is the hexadecimal #Temp table object id: SELECT CAST(0xAA7DA731 as INTEGER) /* Object_id = -1434605775 */. A query to inspect these statistics follows below.
  8. The SELECT query is now recompiled and executed.
  9. (Batch execution successfully completed)

*NOTE: _WA – Stands for Waterloo, Canada(Sybase, the Father of Sql Server was developed in Waterloo) . This is mentioned in 70-462.
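The sketch below (run from the session that owns #temp, e.g. temporarily added to the stored procedure) lists the auto-created statistics and the columns they cover:

SELECT s.[name]                      -- e.g. _WA_Sys_00000002_AA7DA731
      ,s.auto_created
      ,ColumnName = c.[name]
FROM tempdb.sys.stats AS s
    INNER JOIN tempdb.sys.stats_columns AS sc
        ON  sc.[object_id] = s.[object_id]
        AND sc.stats_id    = s.stats_id
    INNER JOIN tempdb.sys.columns AS c
        ON  c.[object_id] = sc.[object_id]
        AND c.column_id   = sc.column_id
WHERE s.[object_id] = OBJECT_ID(N'tempdb.dbo.#temp', 'U');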

Clear Profiler’s trace and execute the same stored procedure, this time with parameter  ‘A’

EXEC testTempTableStats @StartsWith = N'A';

Figure 6, Stale statistics

Strangely enough, the statistical information (header and histogram) is totally wrong, i.e. the table cardinality should be 100, not 10, and the histogram steps should show product names that start with ‘A’, not ‘C’.

Figure 6 – Profiler, No query recompiles

The trace shows no recompiles due to schema changes (CREATE TABLE #Temp) and no statistics updates. From the optimality point of view, the Query engine executed the first query (input param ‘C’). The result set is correct, but it seems that everything else is wrong. The first question is why we are missing recompiles due to schema changes (a brand new temp table was created and populated with new values). The second question is why the auto-created statistics on the temp table were never updated. Is this why our query never gets recompiled due to plan optimality-related reasons?

Temporary tables caching

The temp table caching mechanism may explain the lack of schema-related recompiles in the previous example. This may also explain the stale statistics.
The caching feature was introduced in SQL Server 2005. This functionality provides the caching of temporary objects(temp tables, table variables, and TVFs) across repeated calls of routines(Stored procedures, triggers, and UDFs).
In short, when a stored procedure execution ends, SQL Server truncates* and renames the table, keeping only one IAM page and one data page. The structure will be used by subsequent calls instead of allocating new pages from scratch when the object is created again.

NOTE: For temp objects smaller than 8MB, the truncation happens immediately after module execution ends. For larger temp objects, SQL Server performs a “deferred drop” and immediately returns control to the application.

The caching mechanism works for temp tables that are created by using CREATE TABLE or SELECT INTO statements. The caching is not possible when there is explicit DDL on a temp table after it is created, i.e. ALTER #table ADD Constraint, CREATE STATISTICS** on table columns, or when there is a named constraint on a temp table (this is generally a bad idea since it can lead to constraint name conflicts). Temporary tables are also not cached if they are part of dynamic SQL or an ad-hoc batch.

NOTE: Statistics created using an explicit CREATE STATISTICS statement are not linked to a cached temporary object. Auto-Created statistics(like the ones presented in the example) are linked to a cached temp object. UPDATE STATISTICS does not prevent temp table caching.

The following query shows that there is one cached temp object currently not in use

SELECT  mcc.[name],
        mcc.[type],
        mcc.pages_kb,
        mcc.pages_in_use_kb,
        mcc.entries_count,
        mcc.entries_in_use_count
FROM sys.dm_os_memory_cache_counters AS mcc
WHERE mcc.[type] = N'CACHESTORE_TEMPTABLES'
name                      type                   pages_kb  pages_in_use_kb  entries_count  entries_in_use_count
------------------------  ---------------------  --------  ---------------  -------------  --------------------
Temp Tables & Table Vars  CACHESTORE_TEMPTABLES  16        0                1              0

We can track the temp table name by adding SELECT OBJECT_ID(‘tempdb.dbo.#temp’) to the example stored proc. This will show that the temp table object_id never changes. When the stored procedure completes execution, an internal process changes the name of the temp table to a hexadecimal form. This will be the case even if we explicitly drop the temp table at the end of the sproc batch.
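A quick way to observe this (a sketch; the hexadecimal name will differ on every system) is to list the temp objects left behind in tempdb after the stored procedure completes:

-- The cached temp table shows up under a hexadecimal name such as #AA7DA731
SELECT t.[name]
      ,t.[object_id]
      ,t.create_date
FROM tempdb.sys.tables AS t
WHERE t.[name] LIKE N'#%';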

If a stored procedure is executed concurrently, multiple separate cached objects may be created in tempdb.  There is a cached temp object per execution context.

This feature explains why our plan has not been recompiled due to schema-change related reasons. But what about statistics? As previously mentioned, it is expected that the query processor is able to detect the statistical changes that can affect the cached plan validity and to adapt to them. But it didn’t happen. Why?

Recompilation Reasons

There are many different reasons that can cause a query to be recompiled(EventSubClass Figure 5). The list of all possible recompilation reasons (SP:Recompile event) can be retrieved using the query below.

SELECT xemv.map_key,
       xemv.map_value
FROM sys.dm_xe_map_values AS xemv
WHERE xemv.name = N'statement_recompile_cause'
ORDER BY xemv.map_key;

For the purpose of this analysis, we are interested only in Schema changed and Statistic changed reasons. Temp table caching can explain the absence of the former reason.
Updating statistics (both manual and auto-update) may trigger the Optimality(data) related recompilation of the plans that use these statistics.

RT – Recompile thresholds

SQL Server query processor is designed to adapt the cached query plans to the data changes. The changes are tracked by statistical information(histograms) linked to the “interesting” table columns. Statistical information changes may cause query recompilation due to plan optimality-related reasons.
During the query optimisation phase, the Query processor identifies and loads the “interesting” statistics. This helps QP to create good enough plans. In our case, QP loads the following, existing statistics.

  • NCI_Name (Statistics on dbo.Products Name column created as a part of the NCI index on the column)
  • NCI_ProductId(Statistics on dbo.Transactions ProductId column also created with the NCI index on the column)
  • (Auto Created)#temp table auto-created statistics on #temp.ProductName and #temp.ProductId

Use the “undocumented”  trace flag to find out the interesting statistics used in our example:

-- use execution plan XML output to find 
-- <ModTrackingInfo> tag
DBCC FREEPROCCACHE
GO
DBCC TRACEON(8666)
    EXEC dbo.testTempTableStats
        @StartsWith = 'C'
    GO
DBCC TRACEOFF(8666)
GO

The table contents are constantly changing (INSERT, UPDATE, DELETE, MERGE). SQL Server query processor tracks those changes in order to estimate their impact  on the existing cached plans.
For each “interesting statistics” loaded, Query processor creates a snapshot of a counter that counts the number of table modifications. Generally, every table referenced in a query will have such counters loaded.
The table contents are tracked:

  • Directly     – using table cardinality metadata information
  • Indirectly – using statistics on table columns.
 Each column has an RT (recompilation threshold) associated with it. RT is a function of table cardinality.

 RT = f(n), where n is the number of rows in a table (table cardinality)

Essentially, RT for a table defines the frequency with which queries that refer to the table recompile.

During the first query compilation, for each interesting statistic, a snapshot value of a counter (colmodctr) that counts the number of table modifications is stored – one counter per column that participates in the interesting statistics.

Along with colmodctr snapshot values, Query processor sets the Recompile thresholds for every colmodctr created.
Before using an existing cached plan, the Query optimiser performs the threshold crossing test defined by the formula below.

 | colmodctr(current) – colmodctr(snapshot) |  >= RT

current     – refers to the current value of the modification counter
snapshot – refers to the value of the mod. counter captured during the last plan compilation(recompilation).
If the threshold crossing test evaluates to TRUE (the number of changes exceeds the pre-defined RT), for any of the interesting statistics, the query plan is recompiled.
The threshold crossing test will be performed even if the query processor does not load any interesting statistics. In that case, the test will be based simply on table cardinalities.

 | cardinality(current) – cardinality(snapshot) |  >= RT

The following table shows how RT is calculated for permanent and temporary tables

Temporary tables
Table cardinality (n)    Recompile Threshold (RT)    Cardinality that passes the RT test
----------------------   -------------------------   -----------------------------------
n < 6                    6                           >= (n + 6)
6 <= n <= 500            500                         >= (n + 500)
n > 500                  500 + 0.20*n                >= (n + 500 + 0.20*n)

Permanent tables
Table cardinality (n)    Recompile Threshold (RT)    Cardinality that passes the RT test
----------------------   -------------------------   -----------------------------------
6 <= n <= 500            500                         >= (n + 500)
n > 500                  500 + 0.20*n                >= (n + 500 + 0.20*n)


colmodctr

colmodctr is the name of a counter which counts the number of modifications that a table has undergone. The counter is per column and is non-transactional (if we insert 10 rows and roll back the transaction, the colmodctr value will increase by 20). The counter is also ever-increasing.
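A hedged sketch of that behaviour, using the sample dbo.Products table and the modification_counter column exposed by sys.dm_db_stats_properties (which surfaces the colmodctr value):

BEGIN TRANSACTION;
    -- 10 inserts ...
    INSERT INTO dbo.Products([Name], ListPrice, ModifiedDate)
        SELECT TOP (10) N'X' + [Name], ListPrice, SYSDATETIME()
        FROM dbo.Products;
-- ... and 10 rolled-back rows: the counter should grow by ~20 rather than return to its previous value
ROLLBACK TRANSACTION;

SELECT s.[name]
      ,sp.modification_counter
FROM sys.stats AS s
    CROSS APPLY sys.dm_db_stats_properties(s.[object_id], s.stats_id) AS sp
WHERE s.[object_id] = OBJECT_ID(N'dbo.Products', 'U');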

So far, temp table caching can explain why our query does not get recompiled due to schema changes. Now, let’s find out why our query uses stale statistics and why it does not perform optimality-related query recompilation.

Automatically created statistics are cached along with the cached temp tables. In this case, auto-created statistics are NOT reset at the end of the stored procedure execution cycle nor at the beginning of a consecutive stored proc execution.
This is why statistics on a cached temporary table may belong to one of the previous stored procedure calls and are absolutely not related to the context of the current temp table.
If we used a “normal”(dbo.Temp) table instead of the #temp table, the table would be dropped and recreated on each stored procedure call. Consequently, auto-created statistics would be repeatedly dropped and created which would make them always up-to-date.

The modification counters which count changes on the interesting columns (in our example, the ProductName and ProductId columns) will not cause query recompiles even if the number of changes passes the RT threshold test. This behavior is unique to temp tables (cached or non-cached*) when used within stored procedures.

Note*: To prevent temp table caching, perform a DDL operation on the newly created temp table. e.g

...
    CREATE TABLE #temp( 
             ProductId INTEGER)
    ALTER TABLE #temp
        ADD ProductName NVARCHAR(256)
..
/* -- this change will cause Err 2767, Could not locate statistics 'ProductName' in the system catalogs
   -- starting from the second execution
 DBCC SHOW_STATISTICS ('tempdb.dbo.#Temp', ProductName) 
             WITH STAT_HEADER
                 ,HISTOGRAM;

*/

Err 2767: The statistics will be internally present, but not available through sp_helpstats, DBCC SHOW_STATISTICS, and tempdb.sys.stats. My assumption would be that the auto-created stats BLOB entry gets “disconnected” from the temp table’s object_id -> for non-cached temp tables, the object_id changes on every sp run.
What is also interesting with the non-cached table is that it does not force our query to be recompiled due to schema change reasons as we might expect.

Back to the original example, to be able to track the modification counters, expand dbo.testTempTableStats stored procedure with the dynamic query below.

--tSQL 2017 syntax
CREATE OR ALTER PROCEDURE dbo.testTempTableStats(
    @StartsWith NVARCHAR(5)
)
AS
BEGIN
    CREATE TABLE #temp( 
            ProductId INTEGER
            ,ProductName NVARCHAR(256)
    )
    
    INSERT INTO #temp
        SELECT pr.ProductID
               ,pr.[Name]
        FROM dbo.Products pr
        WHERE pr.[Name] LIKE @StartsWith + '%'                 
 
        SELECT t.ProductName
              ,NoOfOrders = COUNT(DISTINCT tr.OrderId) 
        FROM dbo.Transactions tr
            INNER JOIN #temp t
                ON tr.ProductId = t.ProductId
        GROUP BY t.ProductName

    ---sys.dm_db_stats_properties works only in the current db context :(
    EXECUTE( N'USE tempdb; 
                  SELECT  [Object] = object_name(s.object_id)
                            ,s.[name]
                            ,sp.last_updated
                            ,sp.[rows]
                            ,s.auto_created
                            ,sp.modification_counter 
                  FROM sys.stats s 
                  CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) sp
                  WHERE s.object_id = object_id(''#temp'',''U'');'
    );

    -- Show automatically created statistics for the column "ProductName"
    DBCC SHOW_STATISTICS ('tempdb.dbo.#Temp', ProductName) 
         WITH STAT_HEADER
             ,HISTOGRAM;
 
    DROP TABLE #temp;
END

Now execute the query with the parameter ‘C’

DBCC FREEPROCCACHE
GO
EXEC testTempTableStats
        @StartsWith = N'C';
GO

Figure 7, colmodctr – modification counters

On the first query execution, the query optimiser sets up the RT values for the two auto-created statistics columns. The recompilation thresholds are based on the temp table’s cardinality (NoOfRows = 10). Based on the previously mentioned formula (6 <= n <= 500, RT = 500), in order to pass the RT crossing test we’ll need at least 510 (500+10) changes on either of the two columns to initiate query recompilation. This means that, if we execute the stored proc 6 times passing parameter ‘A‘ (which populates the temp table with 100 rows), we will reach 6*100 = 600 changes, which is more than we need to pass the test.

EXEC dbo.testTempTableStats 
             @StartsWith = 'A'
GO 6

The 6th execution shows the following results, but not the expected recompile.


Figure 8, colmodctr – modification counters do not initiate query recompiles

The table below shows colmodctr counter’s progress:

Figure 9, modification counter progress

As we can see, the query was not recompiled due to optimality reasons. Non-cached temp tables will show the same behavior. The colmodctr simply does not work for temp tables within stored procedures.
If we repeat the experiment, but this time pass parameter ‘B’ (param ‘B’ results in 4 rows in the temp table) for the very first stored procedure call, we will get a similar output as in Figure 7. The RT value is now set to 6 (n < 6, RT = 6), which means that we need at least 10 (6+4) changes on the interesting temp table columns to initiate query recompilation.
If we execute the same stored procedure (with param ‘B’), say, two more times, the modification counters will show 16 changes and, as expected, no recompile.
But if we execute the stored procedure with a parameter that generates 10 or more rows in the temp table, i.e ‘C’  or ‘E’ (10 and 27 rows) our query will recompile.

DBCC FREEPROCCACHE
GO
-- this will set up RT = 6 (the RT crossing test passes at cardinality 10)
EXEC testTempTableStats 
        @StartsWith = 'B'
GO
-- modification counter does not initiate recompiles
EXEC testTempTableStats 
    @StartsWith = 'B'
GO 2 -- no recompiles

DBCC FREEPROCCACHE
GO
-- this will set up RT = 6 (the RT crossing test passes at cardinality 10)
EXEC testTempTableStats 
        @StartsWith = 'B'
GO
--  a parameter that generates 10 or more rows in temp table do initiate query recompile
EXEC testTempTableStats 
    @StartsWith = 'C' -- 'E', [A-Z],'A' ...

The experiment shows that the Query processor does track temp table cardinality changes.
The threshold crossing test will be based on temp table cardinality.

 | cardinality(current) – cardinality(snapshot) |  >= RT

In our case RT =6, the expected current table cardinality is 10, | 10 – 4 | >=6, and the  RT crossing test evaluates to TRUE.

The interesting thing about this is that if our code uses a temp table that has a relatively stable number of rows, i.e. ~500, it is likely that we’ll end up with one plan that will never change due to optimality reasons. If our sp creates a query plan for the @StartsWith = ‘C’ parameter, which sets up RT=500, the query optimizer will expect the temp table’s cardinality to increase to at least 510 rows, which will never happen if our table has a maximum of 500 rows. The difference will be even more visible in the case when the Query optimizer sets RT to 500 + 20%*n (total number of rows – see the RT tables above). The expected cardinality that will initiate a recompile will be n + 500 + 0.2*n.

Possible solutions

WITH RECOMPILE

One of the obvious solutions for this problem would be to create a stored procedure WITH RECOMPILE. This will

  • Prevent temp table caching
  • Prevent stored procedure query plan caching
  • Recompile all queries in the stored procedure on each execution.
--tSQL 2017 syntax
CREATE OR ALTER PROCEDURE dbo.testTempTableStats(
    @StartsWith NVARCHAR(5)
)
WITH RECOMPILE
AS
BEGIN
..

This method may work out of the box, but may also be expensive since the whole batch that can contain quite a few complex queries must compile on each call, and then, during the execution, separate queries may be recompiled due to previously mentioned reasons.

EXECUTE .. WITH RECOMPILE

In situations where we know that the cached query plan works for the most common range of input parameters, and we are able to identify those parameters for which the plan would be sub-optimal, we can create a wrapper stored procedure that controls the main stored procedure’s execution properties.

--wrapper stored procedure
CREATE OR ALTER PROCEDURE testTempTableStatsWrapper(
      @StartsWith NVARCHAR(5)
)
AS
BEGIN
    SET NOCOUNT ON;

    IF @StartsWith IN ('Z','B','C','E')
        EXECUTE testTempTableStats
                @StartsWith
    ELSE  
        EXECUTE testTempTableStats
                @StartsWith
            WITH RECOMPILE
    RETURN;
END

A stable plan will be cached for the parameters Z, B, C, and E, and a second, non-cached plan will be constructed each time we pass any other parameter value than the four mentioned above.

OPTION RECOMPILE and UPDATE STATISTICS

To overcome the problem of stale statistics, we can add OPTION RECOMPILE to our original query. This will force our query to recompile.

--tSQL 2017 syntax
CREATE OR ALTER PROCEDURE dbo.testTempTableStats(
    @StartsWith NVARCHAR(5)
)
AS
BEGIN
    ..
        SELECT t.ProductName
              ,NoOfOrders = COUNT(DISTINCT tr.OrderId) 
        FROM dbo.Transactions tr
            INNER JOIN #temp t
                ON tr.ProductId = t.ProductId
        GROUP BY t.ProductName
        OPTION (RECOMPILE);
...

 

If we repeat the first test and pass parameters ‘C’ and ‘F’ respectively, we can see that the ‘F’ param‘s OPTION (RECOMPILE) updates only the temp table’s cardinality, from 10 rows to 358 rows. The process skips updating the auto-created statistics on ProductName and ProductId.

DBCC FREEPROCCACHE; 
GO
EXEC dbo.testTempTableStats @StartsWith = 'C'
GO
EXEC dbo.testTempTableStats @StartsWith = 'F'
GO


Figure 10, Stale statistics, correct cardinality

OPTION(RECOMPILE) forces the query to recompile, but it does not detect stale statistics. It also “activates” rowmodcnt (modification_counter). In this case, after the first execution (param ‘C’), RT is set to 500. QO needs to detect at least 510 changes on the significant columns to initiate the recompile. In our case, when the modification counter reaches 510+ changes (two consecutive calls passing param ‘F’), including temp table truncates, QO will update statistics as a part of the query recompilation.

1st Execution Param “C” : RowNo=10: RT=500; modification_counter = 0
2nd Execution Param “F” : RowNo=358: RT=500; modification_counter = 10(truncate) + 358 = 368
3rd Execution Param “F” : RowNo=358: RT=500; modification_counter = 358(truncate) + 368 = 726 -> RT crossing test evaluates to TRUE -> Update Statistics.

For non-cached temp tables, OPTION RECOMPILE does update auto-created statistics as well.

We can force the query optimizer to update auto-created statistics using UPDATE STATISTICS  #temp. The complete solution would be …

--tSQL 2017 syntax
CREATE OR ALTER PROCEDURE dbo.testTempTableStats(
    @StartsWith NVARCHAR(5)
)
AS
BEGIN
    CREATE TABLE #temp( 
            ProductId INTEGER
            ,ProductName NVARCHAR(256)
    )
    
    INSERT INTO #temp
        SELECT pr.ProductID
               ,pr.[Name]
        FROM dbo.Products pr
        WHERE pr.[Name] LIKE @StartsWith + '%'                 
 
        UPDATE STATISTICS #temp;        

        SELECT t.ProductName
              ,NoOfOrders = COUNT(DISTINCT tr.OrderId) 
        FROM dbo.Transactions tr
            INNER JOIN #temp t
                ON tr.ProductId = t.ProductId
        GROUP BY t.ProductName
        OPTION(RECOMPILE);
 
    ---sys.dm_db_stats_properties works only off the current db context :(
    EXECUTE( N'USE tempdb; 
                  SELECT  [Object] = object_name(s.object_id)
                            ,s.[name]
                            ,sp.last_updated
                            ,sp.[rows]
                            ,s.auto_created
                            ,sp.modification_counter 
                  FROM sys.stats s 
                  CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) sp
                  WHERE s.object_id = object_id(''#temp'',''U'');'
    );
 
    -- Show automatically created statistics for the column "ProductName"
    DBCC SHOW_STATISTICS ('tempdb.dbo.#Temp', ProductName) 
         WITH STAT_HEADER
             ,HISTOGRAM;
 
    DROP TABLE #temp;
END

It is a bit of an unusual place to put the UPDATE STATISTICS statement since we expect auto-created statistics to be constructed during our query compilation/recompilation process, not before. But then, if we put the statement after the query, UPDATE STATISTICS would not be able to have an effect on the preceding query.

BATCH MODE MEMORY GRANT FEEDBACK (SQL Server 2017+)

This solution is just an experimental one. In SQL Server 2017 Microsoft introduced Adaptive Query Processing, a few new query processing features set to improve query performance.  The new features (v1.0) are

  • Batch mode memory grant feedback
  • Batch mode adaptive joins and
  • Interleaved execution

More about this interesting set of features can be found here.
The first two features are available in queries that reference one or more objects with columnstore indexes. As opposed to the row-mode execution style, where rows are passed/processed between iterators (operators) one at a time, batch-mode processing operates on a 64KB structure (8 pages). The allocated memory space can host between 64 and 900 rows, depending on the number of columns selected.
To activate the features (and the batch processing) we can create a “dummy” columnstore index with an int column and LEFT JOIN it to our query. The new join will not change the query logic nor the result-set.

--Create a dummy columnstore index 
DROP TABLE IF EXISTS dbo.dummyColumnstore;
GO
CREATE TABLE dbo.dummyColumnstore(i int)
GO
CREATE NONCLUSTERED COLUMNSTORE INDEX nc_csi_id 
    ON dbo.dummyColumnstore(i)
GO

--add an extra LEFT JOIN to the original query
CREATE OR ALTER PROCEDURE dbo.testTempTableStats(
    @StartsWith NVARCHAR(5)
)
AS
BEGIN

    CREATE TABLE #temp( 
            ProductId INTEGER
            ,ProductName NVARCHAR(256)
    )
    
    INSERT INTO #temp
        SELECT pr.ProductID
               ,pr.[Name]
        FROM dbo.Products pr
        WHERE pr.[Name] LIKE @StartsWith + '%'                 

        SELECT t.ProductName
              ,NoOfOrders = COUNT(DISTINCT tr.OrderId) 
        FROM dbo.Transactions tr
            INNER JOIN #temp t
                ON tr.ProductId = t.ProductId
            LEFT OUTER JOIN dbo.dummyColumnstore dcc -- columnstore idx
                ON tr.ProductId = dcc.i
        GROUP BY t.ProductName;

    DROP TABLE #temp;

    RETURN;
END

If we run the original test again, first with param ‘C’ and then with param ‘F’, we’ll notice a couple of interesting changes in the query execution plan.

DBCC FREEPROCCACHE; 
GO
EXEC dbo.testTempTableStats @StartsWith = 'C'
GO
EXEC dbo.testTempTableStats @StartsWith = 'F'
GO

The first execution creates a new plan

Figure 11, Adaptive Query Processing features

We can notice a new Adaptive Join operator and a new, batch execution mode for Hash Match Aggregate and Compute scalar operators.
The next execution (param ‘F’) will result in underestimated memory grants for the Hash Match Aggregate operator. This is due to stale statistics, as it was shown earlier.
Now, if we execute the same query again (param ‘F’), the Batch mode memory grant feedback feature will FIX the problem by allocating enough memory for the Hash Match Aggregate operator, preventing the data spills. However, the original problem with the stale statistics, etc. will remain.

Conclusion

Temporary tables, when used in stored procedures, may have totally wrong statistics. This can lead to performance degradation due to sub-optimal plans, including inaccurate memory grants, wrong temp table cardinality estimates, etc. This is partially due to the temp table caching mechanism. The feature is extremely useful since it provides more efficient temp table reuse across frequent stored procedure calls, particularly in OLTP workloads. Between stored proc calls, the column modification counters on interesting temp table columns do not initiate query recompiles (even if the RT crossing test evaluates to true). However, the RT crossing test that depends on the temp table’s cardinality changes, if evaluated to true, will initiate query recompiles.
Forcing the Query optimizer to perform a statement recompile does not resolve the problem. For cached temp tables, it fixes cardinality estimates but does not update the relevant statistics. For non-cached tables, it resolves the problem (everything is up to date). Combining UPDATE STATISTICS #tempTable and OPTION (RECOMPILE) is one way to work around the problem.
The example demonstrates a common scenario when a temp table is used to reduce query plan complexity.  The table stores a relatively small number of rows and drives a more complex query. If the cached plan is sensitive to the cardinality and statistical distribution of the values in the temp table, the plan reuse may result in stored procedure performance degradation.
For the larger temp tables with widely differing numbers of rows on every execution, it is likely that the main query recompiles due to temp table cardinality changes.
For query plans that are less dependent on the temp table statistics and cardinality, we might consider using table variables instead (lightweight, no statistics, cardinality presented as 1 row, etc.)

 

Thanks for reading.

Dean Mincic

Adding non-NULL value columns ONLINE

Adding non-NULL columns with values to a busy table – ONLINE

Adding a non-null column to a big table on a busy database can be tricky business. It might seem an easy task of altering the table and adding a column (an integer data type column – 4 bytes more or less, who cares), but it’s not. DBAs are always extra careful when performing those sorts of actions since they can cause serious implications for the db applications. For example, adding a non-null column to a busy table which contains 10M rows means that every row must be updated with a value. The table may be accessed by many different applications many times in a short period of time. These requests can be seriously blocked by the alter table/add a not null column/update column with a value sequence, causing app timeouts, deadlocks and all sorts of headaches.

One of the common approaches is to add a NULL column, then update the column values across all rows using a batch update scenario (update N rows per transaction), and then alter the column as non-null. This operation is usually performed “offline” or when the table is “quiet”.
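A hedged sketch of that classic pattern (the table name, column name and batch size below are illustrative only):

-- Step 1: metadata-only change - add the column as NULLable
ALTER TABLE dbo.BusyTable ADD NewCol INT NULL;
GO
-- Step 2: backfill the values in small batches to keep blocking short
DECLARE @rows INT = 1;
WHILE @rows > 0
BEGIN
    UPDATE TOP (10000) dbo.BusyTable
        SET NewCol = 46
    WHERE NewCol IS NULL;

    SET @rows = @@ROWCOUNT;
END
GO
-- Step 3: once every row has a value, make the column NOT NULL (usually during a quiet period)
ALTER TABLE dbo.BusyTable ALTER COLUMN NewCol INT NOT NULL;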

Starting with SQL 2012+ (Enterprise Edition), the operation is not “painful” any more. Adding column(s) to a busy table is now instantaneous and has less impact on the OLTP operations. In fact, the operation can be performed online.
This still requires taking into consideration other factors that can impact the operation, such as the data type of the new column, row length etc.

The task for today will be to add a new, non-null column of integer data type with the default value 46 to a busy table that has 1M rows 🙂
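Once the test table below is in place, the statement we will ultimately want to run looks something like this (a sketch; the column and constraint names are mine):

-- On Enterprise Edition 2012+, adding a NOT NULL column with a runtime constant default
-- is a metadata-only operation - existing rows are not physically updated.
ALTER TABLE dbo.addANonNullColOnline
    ADD NewCol INT NOT NULL
        CONSTRAINT DF_addANonNullColOnline_NewCol DEFAULT (46);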

Test table

IF OBJECT_ID('dbo.addANonNullColOnline','U') IS NOT NULL
    DROP TABLE dbo.addANonNullColOnline
GO
--create a test table
SELECT TOP 1000000 id = IDENTITY(INTEGER,1,1)
                   ,val = CAST(c2.[name] +'AAAA' AS VARCHAR(4)) --fill all 4 bytes
                            COLLATE DATABASE_DEFAULT
    INTO dbo.addANonNullColOnline
FROM sys.columns c1
    CROSS JOIN sys.columns c2
        CROSS JOIN sys.columns c3
GO
--set one value to NULL
UPDATE dbo.addANonNullColOnline
    SET val = NULL
WHERE id = 2
id          val
----------- -----
1           bitp
2           NULL
3           colg
4           hbco
5           maxi
6           null
7           offs
...
...
10000000    runt

The table contains 1M rows. One row, id=2 has a NULL value for column “val”. The rest of the column values are taking the maximum available space of 4bytes.
NOTE: Varchar stores non-unicode characters e.g VARCHAR(4) =’AAAA’ takes 4bytes. (1 byte per non-unicode character).

The query below provides basic information about the table rows:

SELECT index_level
       ,page_count
       ,record_count
       ,min_record_size_in_bytes
       ,max_record_size_in_bytes
       ,avg_record_size_in_bytes
       ,SizeInMB = ROUND( ((record_count * avg_record_size_in_bytes) /1024.00 )/1024.00,0)
FROM sys.dm_db_index_physical_stats(db_id('test'),object_id('dbo.addANonNullColOnline','U'),null,null,'detailed')

The metadata information shows that the “shortest” row (id = 2) takes 11bytes of disk space, and the “longest” row(all other rows) takes 19bytes. The whole table(all rows) takes around 18MB of the disk space.
Before we add a new, not null column to the table, let’s show how the rows are physically stored.

Rows physical storage

In this case, rows are stored using the FixedVar storage format. The name comes from the order in which the Sql Server storage engine stores different data types – fixed data types (integer, decimal, char..) first and then variable data types (nvarchar, varchar, varbinary..). The format has been used since Sql Server 7.0 (Sphinx), released in 1998 (smile).
The FixedVar format internally adds a number of additional bytes to every row on a page. The extra space per row is used to provide information about the row itself, like whether the row has been deleted (ghost), the number of columns, columns with NULL values, the position of the variable columns if any, the timestamp and a pointer to the version store if optimistic locking is enabled, etc.
For the fixed columns, storage engine will add:

6bytes + CEILING(TotalNoOfColumns / 8.00bytes)

..and for the variable part of the row

2bytes(if at least one of the varColumns has non-null value) + 2bytes * NoOfVariableNonNULLColumns.

In our example, the minimum row size(11bytes , row id=2) and the maximum row size( all other rows 19 bytes) will be..

--min_record_size_in_bytes
SELECT [FixedPart: 6 + CEILING(2/8.00)] = 6 + CEILING(2/8.00)
      ,[FixedColumnValue(INT) bytes] = 4
      ,'---'='---'
      ,[VariablePart: 0 + 2*0] = 0 +2* 0 --no non-null variableColumns
      ,[VariableColValue(VARCHAR(4)] = NULL
      ,TotalRowSize = 6 + CEILING(2/8.00) + 4 + 0
GO
--max_record_size_in_bytes
SELECT  [FixedPart: 6 + CEILING(2/8.00)] = 6 + CEILING(2/8.00)
       ,[FixedColumnValue(INT) bytes] = 4
       ,'---'='---'
       ,[VariablePart: 2 + 2*1] = 2 +2* 1
       ,[VariableColValue(VARCHAR(4) bytes] = 4
       ,TotalRowSize = 6 + CEILING(2/8.00) + 4 + 4 + 4
GO

The image below shows the structure of a FixedVar row (Pro Sql Server Internals by Dimitri Korotkevitch)

Now, down to the bits and bytes, let’s find the first data page (out of 2598) that is allocated to the table and try to reconcile the min and max rows. The undocumented internal column/function %%Lockres%% gives us FileId:PageId:SlotId for the selected rows.

SELECT TOP(5) f.*
              ,%%lockres%% as [FileId:PageId:SlotId]
FROM dbo.addANonNullColOnline f 
ORDER BY id

The query selects first few rows from the test table.

id          val  FileId:PageId:SlotId
----------- ---- --------------------------------
1           bitp 1:16576:0
2           NULL 1:16576:1
3           colg 1:16576:2
4           hbco 1:16576:3
5           maxi 1:16576:4

(5 row(s) affected)

With the PageID of the page we want to examine, the next step is to allow sql server to redirect the page content output to the screen.  To achieve this we can use DBCC(Database Console Command). The system change will affect only the current session.

--check the current state of the flag
DBCC TRACESTATUS(3604)
GO
DBCC TRACEON(3604,-1) --sets up traceflag 3604 value to true on a session level
GO
--displays the first page
DBCC PAGE (0,1,16576,3) --WITH TABLERESULTS
GO

The page snapshot below shows the first page’s header (the first 96 bytes of the total page size of 8192 bytes) and the first couple of rows. The relevant information for this exercise is highlighted.

The page is read from the disk and put into the buffer pool. The relevant information on the page:

  1.  File No : Page number (1:16576)
  2.  m_lsn = (525:62471:2) – Log Sequence Number.
  3. The position of the first row.
    Slot(0) – the first row on the page
    Offset 0x60 – the physical position from the beginning of the page (Hex 0x60 = 96bytes)
    Length = 19* Happens to be the maximum record size. All rows except row no 2 (id = 2) are 19 bytes.
  4. The first row -hexadecimal(byte swapped values)

      30000800 01000000 02000001 00130062 697470

       30 – TagA (Status Bits A) 1 byte
       00 – TagB (Status Bits B) 1 byte
       0008 – Fsize 2 bytes – location of the end of the last fixed data type column.
                       8 bytes = 1b(TagA) + 1b(TagB) + 2b(Fsize itself) + 4b(INT column)
      00000001 – The fixed data type column values. One INT column with the value 1.
      0002 – Total number of columns (2 bytes) = decimal 2
      00 – NullBitmap 1 byte = CEILING(TotalNoOfColumns / 8.00 bytes) = CEILING(2/8.00)

     TOTAL fixed part of the row: 1+1+2+4+2+1 = 11 bytes

     0001 – 2 bytes – No. of variable data type columns = decimal 1 (if all of the variable data type columns are NULL, this takes 0 bytes)
     0013 – 2 bytes – Every two bytes store the location of the end of a variable, non-NULL data type column (2 bytes * NoOfVariableNonNULLColumns). In this case there is only one variable column = decimal 19 – the variable column ends on the 19th byte.
     62697470 – 4 bytes – The actual variable column value ‘bitp’

SELECT CONVERT(VARBINARY(4), 'bitp')
Column 2 value
--------------
0x62697470

(1 row(s) affected)

    TOTAL variable part of the row: 2+2+4 = 8 bytes

    Total row1 size = 11bytes(Fixed part) + 8bytes(variable part) = 19bytes

  1. Column “id” value, first row
  2. Column “val” value, first row
  3. The position of the second row.
    Slot(1) – second row on the page
    Offset 0x73 – the physical position from the beginning of the page (Hex 0x73 = 115 bytes). This is the header (96 bytes) + the first row (19 bytes)
    Length = 11*  Happens to be the minimum record size. All other rows are 19 bytes.
  4. Second row -hexadecimal(byte swapped values)

      10000800 02000000 020002

      10 – TagA (Status Bits A) 1 byte
      00 – TagB (Status Bits B) 1 byte
      0008 – Fsize 2 bytes – location of the end of the last fixed data type column.
                      8 bytes = 1b(TagA) + 1b(TagB) + 2b(Fsize itself) + 4b(INT column)
      00000002 – The fixed data type column values. One INT column with the value 2.
      0002 – Total number of columns (2 bytes) = decimal 2
      02 – NullBitmap 1 byte = CEILING(TotalNoOfColumns / 8.00 bytes) = CEILING(2/8.00). 02 hex = 00000010 bin, which indicates that the second column in the row (second bit is on) contains a NULL value.

      TOTAL Fixed part of the row: 1+1+2+4+2+1 = 11bytes

      Since the only variable column contains a NULL value, the variable part does not take any space.

      Total row2 size = 11 bytes (fixed part)

    1. Column “id” value, second row
    2. Column “val” value, second row
    3. Slot 1 Column 2 Offset 0x0 Length 0 Length (physical) 0
      This shows that the position of column 2 in row 2 is 0 and the physical length of the column is also 0. This means that the value is not physically stored on the page. During the materialisation of the column value, the NULL comes from the NullBitmap byte (02).
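
For convenience, the DBCC PAGE output can also be captured into a table and filtered per slot. The sketch below is only meant for exploration; the capture-table columns follow the TABLERESULTS output format, and the page id (16576) is the one from the example above (exact ParentObject/Object text may vary between versions).

IF OBJECT_ID('tempdb.dbo.#pageOutput','U') IS NOT NULL
    DROP TABLE #pageOutput
GO
CREATE TABLE #pageOutput(ParentObject VARCHAR(255)
                        ,[Object]     VARCHAR(255)
                        ,Field        VARCHAR(255)
                        ,[VALUE]      VARCHAR(8000));
--WITH TABLERESULTS returns a rowset, so no trace flag is required here
INSERT INTO #pageOutput
    EXEC ('DBCC PAGE (0,1,16576,3) WITH TABLERESULTS');

SELECT ParentObject, [Object], Field, [VALUE]
FROM #pageOutput
WHERE ParentObject LIKE 'Slot 1 %'; --the second row (id = 2)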

Add a new column

Finally, we will add a column to our “hot” table. The column will be a non-null integer with the default value of 46.

ALTER TABLE dbo.addANonNullColOnline
    ADD NewCol INT NOT NULL
        CONSTRAINT DF_NewCol DEFAULT(46)
GO
SELECT *
FROM addANonNullColOnline

The action will “add” a new column and 1M 46-es in a few milliseconds. We can confirm that the new column, with its values, is real (materialised).

id         val   NewCol
----------- ----- -----------
1           bitp 46
2           NULL 46
3           colg 46
4           hbco 46
5           maxi 46
6           null 46
7           offs 46
...
...
10000000    runt 46

How it works

After adding a new integer column, we expect to see an increase in row size (by 4 bytes). We also expect that the SQL Server storage engine somehow managed to add 1M 4-byte integer values to the table in a fraction of a second.

Let’s check rows metadata again.

SELECT index_level
       ,page_count
       ,record_count
       ,min_record_size_in_bytes
       ,max_record_size_in_bytes
       ,avg_record_size_in_bytes
       ,SizeInMB = ROUND( ((record_count * avg_record_size_in_bytes) /1024.00 )/1024.00,0)
FROM sys.dm_db_index_physical_stats(db_id('test'),object_id('dbo.addANonNullColOnline','U'),null,null,'detailed')

The query result shows that there were no changes in the row sizes.
So, what actually happened? Knowing that table rows are stored on data pages (RowStore – FixedVar storage format), we would expect an increase in the min and max row sizes. Using the formulas above (the table now has 3 columns)…

min_record_size_in_bytes (row id=2)  = 6 + CEILING(3/8.00) + 4 + 4    +  0 + 2*0  =   6 + 1 + 8  =  15
max_record_size_in_bytes  = 6 + CEILING(3/8.00) + 4 + 4    +  2 + 2*1  + 4  =  6 + 1 + 8  + 4  + 4  =  23

The formulas are correct but the result is not 🙂

If we check the first page again, we’ll notice that:
– the m_lsn (log sequence number) has not changed
– the new column (NewCol) is not stored on the page (the offset from the beginning of the rows is 0 bytes)

The unchanged m_lsn shows us that there were no changes on the page.

More about m_lsn

m_lsn – represents the Log Sequence Number related to the last transaction that changed the page. The set of log records that describe the change (old and new versions of the data, the transaction, the affected row, etc.) starts with the same LSN. The log information allows SQL Server to recover databases to a transactionally consistent state in case of a system crash or any unexpected shutdown.
SQL Server uses WAL(Write-Ahead-Log) model which guarantees that no data modifications will be written to disk before the associated log record is written to the disk.
Example: A transaction T1  updates a few columns in a row. The high level sequence of events will be:

  1. A data access operator requests the data page that contains the rows to be updated from the buffer pool manager (BPM). BPM checks if the page already exists in the buffer pool. If not, BPM issues the I/O operations required to retrieve the page from disk.
  2. The update operation (transaction T1) updates the relevant row on the page in the buffer pool. The page header information m_lsn is updated. The operation creates a set of new log records (starting with the same m_lsn) that describes the changes made to the page. The log records are created in the log buffer.
  3. Transaction T1 commits the changes. The log buffer manager flushes the content of the log buffer (~64KB) to the disk (log file).
  4. Client application receives confirmation that the transaction T1 has been successfully committed.
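
To peek at the log records behind such a change, we can query the undocumented fn_dblog function. This is a minimal sketch for exploration only (fn_dblog is unsupported and its output format may change between versions); it assumes the test database and table used above.

USE test;
GO
BEGIN TRANSACTION;
    UPDATE dbo.addANonNullColOnline
        SET val = 'upd'
    WHERE id = 1;

--the most recent log records, still uncommitted at this point
SELECT TOP (10)
       [Current LSN]
      ,Operation          --e.g. LOP_MODIFY_ROW
      ,Context            --e.g. LCX_CLUSTERED
      ,[Transaction ID]
      ,AllocUnitName
FROM sys.fn_dblog(NULL, NULL)
ORDER BY [Current LSN] DESC;

ROLLBACK TRANSACTION;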

New column metadata

After creating a new, non-null column, instead of updating all 1M rows with some value, Sql Server 2012+(Enterprise Ed) stores the information as metadata.

SELECT TableName = object_name(p.object_id)
      ,ColumnType = type_name(pc.system_type_id)
      ,pc.max_length
      ,pc.[precision]
      ,pc.scale
      ,pc.is_nullable
      ,has_default
      ,default_value
FROM sys.system_internals_partitions p
    INNER JOIN sys.system_internals_partition_columns pc
        ON p.partition_id = pc.partition_id
WHERE p.object_id = object_id('dbo.addANonNullColOnline');

The two metadata columns hold the information about the added column. This is where 46 is materialised from, not from the page, but from the metadata.
This new behavior occurs automatically and does not impact the system.

default_value has_default

The default_value / has_default columns, which belong to the sys.system_internals_partition_columns system view, are not related to the DF_NewCol default constraint defined during the table alteration.
The metadata that describes the default constraint can be found using the query below:

SELECT parentTable = object_name(dc.parent_object_id)
      ,dc.name
      ,dc.type_desc
      ,dc.[definition]
FROM  sys.default_constraints dc
WHERE dc.name = 'DF_NewCol'

If we drop the constraint, has_default/ default_value will stay intact.

row UPDATE

An interesting thing will happen when we update a row. Let’s update row id=2 and replace the NULL value with something.

UPDATE dbo.addANonNullColOnline
    SET val = 'Hi 5'
WHERE id = 2;
id         val   NewCol
----------- ----- -----------
1           bitp 46
2           Hi 5 46
3           colg 46
4           hbco 46
5           maxi 46
6           null 46
7           offs 46
...
...
10000000    runt 46

…and now the max,min rowsizes are changed.

Whenever we update one or more values in a row, the default value (NewCol = 46) will be written on the data page. The rows that are not updated will still materialise value 46 from the metadata.

If you inspect the first page, you’ll find that the page has a new m_lsn and that all the column values of the row are now written on the page.
Before the change, the shortest row was row 2 (id = 2), size = 11 bytes. Now, after the update, the row 2 size increased to 23 bytes. The NewCol default value (46) has become a physical part of the row.
Using the formulas mentioned before, the new min/max row sizes are:

max_record_size_in_bytes (row id=2)  = 6 + CEILING(3/8.00) + 4 + 4  +  2 + 2*1  + 4 =  {6 + 1 + 8 } + {4  + 4 }= 23
**4 bytes for the NewCol value of 46 and 4 bytes for the ‘Hi 5’ value that replaced the NULL value

min_record_size_in_bytes (all rows except row id=2)  = 6 + CEILING(3/8.00) + 4  +  2 + 2*1  + 4  =  6 + 1 + 4  + 4  + 4  =  19

Table row source(s)

The figure below shows that the test table materialises column values from two different sources – data pages and metadata.

We see that only the updated row (id=2) has all the columns physically stored on the data page.
E.g. row 1 is physically stored in Slot 0, but the new column is not a part of the row:
Slot 0 Column 3 Offset 0x00 Length 4 Length (physical) 0
NewCol = 46

Over the production life of the table, all the rows may gradually be updated. The has_default / default_value metadata values (1 and 46, respectively) remain, but are no longer in use.
The metadata is removed if we REBUILD the table:

ALTER TABLE dbo.addANonNullColOnline
    REBUILD;

Restrictions

There are a few data types that cannot be added online.

  • Varchar(max)
  • NVarchar(max)
  • Varbinary(max)
  • XML
  • hierarchyid
  • geometry
  • geography

Default expressions that are not runtime constants (expressions that produce a different value for each row) cannot be added online, e.g. NEWID(), NEWSEQUENTIALID().

The row-overflow storage format is not supported. New columns must not increase the maximum row size over the 8060-byte limit.

Conclusion

Starting from SQL Server 2012 (Enterprise Edition), it is possible to add non-NULL columns with default values to a busy table ONLINE. The SQL Server storage engine uses metadata to store the default values for the added columns. When requested, the table columns are materialised from data pages and metadata. There are restrictions on the data types that may be used, e.g. columns with BLOB data types cannot be added online. Only runtime constant expressions can be used as default constraints.

Thank you for reading

Dean Mincic

ORDER BY – Changes everything

SQL programming has a few unique aspects such as three-valued logic (3VL), set-based programming logic, and the logical processing order. Deviation from the programming patterns based on the unique nature of the language usually leads to poorly performing SQL code.

RDBMS fundamentals

SQL Server, as an RDBMS (Relational Database Management System), and its SQL dialect TSQL (Transact-Structured Query Language) have a strong foundation in mathematics. TSQL, as SQL Server’s language for data management and manipulation, is based on two cornerstones of mathematics – set theory and predicate logic.

Edgar F. Codd was a computer scientist who invented the relational model for database management systems. He used mathematical theories (Set theory and predicate logic) to create a general data management theory.
IBM’s System R is the very first database system built on Codd’s relational model. In order to “force” DB vendors to follow the relational principles when designing DB systems, Codd published his 12 rules which describe a relational database.
Various database vendors(Oracle, Sybase, SQL server..) implemented the Relational principles similarly but differently. This means that the DB systems “slightly” deviate from Codd’s relational theory, and the theory itself “slightly” deviates* from the original set theory and predicate logic – but that’s ok 🙂

*e.g. One of the differences between predicate logic and the relational model is that the former supports two-valued logic (TRUE/FALSE), while the relational model supports three-valued logic (TRUE, FALSE, and UNKNOWN).

Although TSQL querying design patterns are derived from relational algebra, TSQL does not strictly follow relational rules. One of the differences, also relevant to this article, is that a relation is not sorted → the order of the elements in a set has no relevance. However, TSQL allows us to sort (ORDER BY) the final result set (the one that is returned to the client). We are allowed to replace the relational result (the “final product” of the relational operations in a relational database) with a non-relational result – a CURSOR. That is why ORDER BY “changes everything” 🙂

The logical query processing sequence

To understand the ORDER BY operation, it is important to understand the logical query processing sequence. The concept is unique to the SQL world and defines the order of execution of the different query segments. It’s also known as the theoretical execution order because the sequence may be changed by the SQL Server optimizer for many different reasons, e.g. data access patterns, query simplification, etc. The query sequence:

  1. FROM – The FROM phase identifies the query’s source tables. This phase processes CROSS JOIN, ON Filter, and in the case of OUTER JOINs it adds the non-matched rows back to the result set.
  2. WHERE Filter. Removes all those rows from the previous stage for which the predicate does not evaluate to TRUE (i.e. it evaluates to FALSE or UNKNOWN).
  3. GROUP BY – Performs grouping and aggregation calculations. This phase generates one row per group.
  4. HAVING – This is the last of the three query filters. It filters the results of aggregate functions.
  5. SELECT – Processes the elements in the Select clause i.e Evaluates expressions within the Select statement and/or removes duplicates(DISTINCT) and/or filters the specified percentage/number of rows(TOP) defined by the ORDER BY clause. The phase returns TABLE.
  6. ORDER BY – Sorts the rows according to the list of columns specified and returns CURSOR.
  7. FOR XML/FOR JSON The phase converts tabular results from the SELECT statement into XML/JSON output.

For queries that include SET operations, the logical sequence is:

  1. Query 1
  2. SET operation (union, except, intersect) or multi-set Union All
  3. Query 2
  4. Order By

Order by and …

The ORDER BY operation guarantees an ordered result set – a cursor. TSQL deviates from the ANSI standard and allows us to sort the result using columns and/or expressions based on columns that are not part of the SELECT statement (this is not true if the query uses the DISTINCT clause). The ANSI standard allows sorting only by the columns mentioned in the SELECT statement. ORDER BY can also be used as a logical sorting operator when operating as a part of TOP/OFFSET-FETCH.
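
A quick sketch of both behaviors, using a hypothetical #products temp table (run the two SELECT batches separately; the DISTINCT batch fails at compile time):

IF OBJECT_ID('tempdb.dbo.#products','U') IS NOT NULL
    DROP TABLE #products
GO
CREATE TABLE #products (ProductName VARCHAR(50), ProductPrice MONEY);
INSERT INTO #products
    VALUES ('SSD Drive', 250.85), ('Hard Drive', 80.99), ('Flash Memory', 8.11);
GO
--sorting by a column that is not in the SELECT list - allowed
SELECT ProductName
FROM #products
ORDER BY ProductPrice;
GO
--..but not when DISTINCT is used; the batch fails with
--Msg 145 - ORDER BY items must appear in the select list if SELECT DISTINCT is specified.
SELECT DISTINCT ProductName
FROM #products
ORDER BY ProductPrice;
GO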

Because ORDER BY converts query result-set into a CURSOR, it cannot be used to define table expressions;

  • Views
  • ITVF(Inline table-valued functions aka parameterized views)
  • Derived tables
  • Subqueries
  • CTE(common table expressions)

… unless used as a logical sort together with TOP, OFFSET-FETCH, or FOR XML/JSON.

The CURSOR must be returned to the client application that is able to “consume” cursor records, one at a time and in order.

ORDER BY treats NULL values as equal. NULLs are sorted FIRST in ASC order.
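
A small sketch (hypothetical table variable) showing that NULLs sort first in ascending order, together with a common workaround that pushes them to the end:

DECLARE @t TABLE (val VARCHAR(10) NULL);
INSERT INTO @t VALUES ('B'), (NULL), ('A');

SELECT val FROM @t ORDER BY val;                           --NULL, A, B
SELECT val FROM @t ORDER BY IIF(val IS NULL, 1, 0), val;   --A, B, NULL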

SET operations (Union, Except, Intersect) and multi-set Union All

When using SET-based operations (UNION, EXCEPT, INTERSECT) and/or multi-set operation UNION ALL  we cannot use the ORDER BY operator in the individual statements. The following query will fail the parsing phase.

DECLARE @t1 TABLE (id INT, val VARCHAR(10));
DECLARE @t2 TABLE (id1 INT, val1 VARCHAR(10));
INSERT INTO @t1
    VALUES (1,'A'),(1,'B'),(2,'S')
INSERT INTO @t2
   VALUES (1,'A'),(1,'B1'),(2,'S'),(3,'M')

SELECT * FROM @t1 ORDER BY val DESC
INTERSECT 
SELECT * FROM @t2

Msg 156, Level 15, State 1, Line 24
Incorrect syntax near the keyword ‘INTERSECT’

SQL Server’s SET operations correspond to operators defined in mathematical set theory. Therefore, the operands must be sets – represented as Tables (Set → Relation → Table)
ORDER BY changes query output to something that ANSI SQL defined as CURSOR. Set operations do not support ordered structures.

We can use the ORDER BY clause to format the FINAL result set for presentation purposes. The ORDER BY in the example below logically does not belong to the last query. It operates on (Set1 ∩ Set2).

SELECT * FROM @t1 --Set1
INTERSECT --∩
SELECT * FROM @t2 --Set2
ORDER BY val DESC

TOP/OFFSET-FETCH

The TOP option is unique to tsql and does not follow ANSI guidelines. The directive simply allows us to select a number of rows or a percentage of rows to return. The option is supposed to be paired up with ORDER BY – always.
This time ORDER BY does not serve a presentation purpose. This time it serves as a logical “TOP sorting” operator. It is supposed to answer the question “TOP according to what order?”. The result of a TOP/ORDER BY operation is a table – not a cursor.
Unfortunately, tsql allows us to use TOP without ORDER BY, and in that case, the general query result will be random and non-deterministic TOP(n) rows / TOP(n) PERCENT result set.

It is a good practice to specify ORDER BY(SELECT NULL) if our intention is to select random and/or non-deterministic TOP(n) rows. Query Optimiser will remove the logical ordering during the simplification phase. The purpose of this is to let other developers know that this was done intentionally.

Test data:

IF OBJECT_ID('tempdb.dbo.#testTopOrderBy','U') IS NOT NULL
DROP TABLE #testTopOrderBy
GO
CREATE TABLE #testTopOrderBy(CustomerId INT UNIQUE NOT NULL
,OrderId INT
,OrderDate DATE)
GO
INSERT INTO #testTopOrderBy
   SELECT t.*
   FROM (VALUES (100,1001,'20170101')
                ,(106,1005,'20170210')
                ,(105,1007,'20170101')
                 
                ,(103,1008,'20170410')
                ,(104,1011,'20170511')
                ,(102,1022,'20170101')
                ,(101,1223,'20170210') ) t(custId,OrdId,odate)
CustomerId  OrderId     OrderDate
----------- ----------- ----------
100         1001        2017-01-01
106         1005        2017-02-10
105         1007        2017-01-01
103         1008        2017-04-10
104         1011        2017-05-11
102         1022        2017-01-01
101         1223        2017-02-10

(7 rows affected)

Test 1: The query returns two randomly chosen rows(TOP without ORDER BY)

SELECT TOP(2) *
FROM #testTopOrderBy
ORDER BY (SELECT NULL) --include this line to let other developers know about your intentions
CustomerId  OrderId     OrderDate
----------- ----------- ----------
100         1001        2017-01-01
106         1005        2017-02-10

Test 2: The query returns a non-deterministic result – “non-unique” ORDER BY is used aka. non-deterministic Order By.

SELECT TOP(2) *
FROM #testTopOrderBy
ORDER BY OrderDate
CustomerId  OrderId     OrderDate
----------- ----------- ----------
100         1001        2017-01-01
105         1007        2017-01-01
(2 rows affected)
--row not included
102         1022        2017-01-01

The query “decided” to select Customers 100 and 105 but not 102. In this case, if we had a clustered index on customerId, the query would probably select customers 100 and 102.

Test 3: Two ways to make the “TOP(n)” query return a deterministic result.

--Deterministic ORDER BY
SELECT TOP(2) *
FROM #testTopOrderBy
ORDER BY CustomerId,OrderDate
CustomerId OrderId OrderDate
----------- ----------- ----------
100 1001 2017-01-01
101 1223 2017-02-10

(1) The deterministic ORDER BY provides more specific information about which TOP(2) order dates to select. Now the query knows exactly which TOP(2) rows to select. A combination of ORDER BY columns that provides a deterministic result set is called a tiebreaker.

--TOP with TIES
SELECT TOP(2) WITH TIES *
FROM #testTopOrderBy
ORDER BY OrderDate
CustomerId OrderId OrderDate
----------- ----------- ----------
100 1001 2017-01-01
105 1007 2017-01-01
102 1022 2017-01-01

(2) The “WITH TIES” guarantees deterministic results based on Order by column(s). WITH TIES must have ORDER BY defined. In the case of non-deterministic Order By, WITH TIES will add all relevant rows to the result set in order to provide determinism. In the example above it added an extra row.

OFFSET-FETCH (tsql 2012+)

OFFSET-FETCH is an ANSI SQL-supported feature. Basically, it is an advanced version of the TOP option, which allows us to select (fetch) only a window or a page of results from the result set.
As opposed to the similar TOP option, OFFSET-FETCH cannot operate without ORDER BY. On the other hand, OFFSET-FETCH does not support WITH TIES and therefore cannot guarantee a deterministic result set.
The following query fails because offset-fetch always operates with ORDER BY:

SELECT *
FROM #testTopOrderBy
OFFSET 0 ROWS FETCH FIRST 2 ROWS ONLY

Msg 102, Level 15, State 1, Line 25
Incorrect syntax near ‘0’.
Msg 153, Level 15, State 2, Line 25
Invalid usage of the option FIRST in the FETCH statement.

If we want to select 2 random rows from a table, we can use:

SELECT *
FROM #testTopOrderBy
ORDER BY (SELECT NULL)
OFFSET 0 ROWS FETCH FIRST 2 ROWS ONLY

The query below will return the same result as the query from Test 3.

SELECT *
FROM #testTopOrderBy
ORDER BY CustomerId,OrderDate
OFFSET 0 ROWS FETCH FIRST 2 ROWS ONLY
CustomerId OrderId OrderDate
----------- ----------- ----------
100 1001 2017-01-01
101 1223 2017-02-10
(2 rows affected)

Same as TOP, OFFSET-FETCH returns a table, therefore it can be used in other table expressions, e.g. (the query does not make much sense – it’s only for testing purposes 🙂):

SELECT a.*
FROM #testTopOrderBy a
INNER JOIN (
            SELECT CustomerId
            FROM #testTopOrderBy
            ORDER BY CustomerId,OrderDate
            OFFSET 0 ROWS FETCH FIRST 2 ROWS ONLY
           ) x --the result is a table expression
ON a.CustomerId = x.CustomerId

..and this will fail. Table operators (JOIN and non-ANSI APPLY, PIVOT, and UNPIVOT) operate only between tables(table expressions).

SELECT a.*
FROM #testTopOrderBy a
INNER JOIN (
            SELECT CustomerId
            FROM #testTopOrderBy
            ORDER BY CustomerId,OrderDate
           ) x --the result is a cursor
ON a.CustomerId = x.CustomerId

Msg 1033, Level 15, State 1, Line 38
The ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions unless TOP, OFFSET, or FOR XML is also specified.

VIEWS

A view is a named virtual table that is defined by a query and used as a table.
A view represents a table, and a table is a logical entity that has no order to its rows. This is the reason why it’s not allowed to use presentation ORDER BY in a view. VIEWS ARE NOT ORDERED!
The view definition below will fail

Test 4. Views are not ordered

CREATE VIEW dbo.TestViews
AS
SELECT v.CustId
      ,v.OrderId
FROM (VALUES(1,1000),(2,10001),(2,2003)) v(CustId,OrderId)
ORDER BY CustId

Msg 1033, Level 15, State 1, Procedure TestViews, Line 7 [Batch Start Line 1]
The ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions unless TOP, OFFSET, or FOR XML is also specified.

Now, some programmers might think that they somehow managed to override RDBMS fundamentals by defining a view with a TOP clause specifying all 100% rows and ORDER BY.

Test 5, Views with TOP 100% ordered rows

CREATE VIEW testViews
AS
    SELECT TOP(100) PERCENT
            [name]
           ,[object_id]
           ,create_date
    FROM sys.objects
    ORDER BY create_date DESC

This will pass the parsing stage and will be stored as a valid view definition. In reality, the view definition does not make much sense (see the TOP/ORDER BY example above). During the simplification phase, the query optimizer will remove the TOP(100) PERCENT / ORDER BY as unnecessary logic.

SELECT * FROM  testViews
name                     object_id   create_date
------------------------ ----------- -----------------------
sysrscols                3           2014-02-20 20:48:35.270
sysrowsets               5           2009-04-13 12:59:11.093
sysclones                6           2014-02-20 20:48:35.477
sysallocunits            7           2009-04-13 12:59:11.077
sysfiles1                8           2003-04-08 09:13:38.093
sysseobjvalues           9           2014-02-20 20:48:35.610
syspriorities            17          2014-02-20 20:48:35.367
sysdbfrag                18          2014-02-20 20:48:35.563
sysfgfrag                19          2014-02-20 20:48:35.243
... etc...

The result is not ordered by create_date DESC.
In some cases, e.g. when we use TOP and ORDER BY on the clustered index key columns, the result may appear to be sorted – QO may decide to use a sorted table scan. Again, there is no guarantee that the returned rows will be ordered.
e.g. During the final SELECT phase, QO may decide to use SQL Server’s storage engine advanced scan functionality also known as merry-go-round scan or scan piggybacking to return the same rows to two or more concurrent sessions. Without diving into the details of how the advanced scan works(will cover that interesting optimization mechanism in one of the next posts), the different sessions will have the same result sets but in a different order.

The presentation ORDER BY will convert the un-ordered view result into an ordered CURSOR.

SELECT *
FROM testViews
ORDER BY create_date DESC

Window Functions

Window functions were introduced in the ANSI SQL:2003 standard. Microsoft partially implemented the functionality in TSQL 2005 and expanded it to the full range of window functions in TSQL 2012.
A window function operates on a number of rows (a window) in a set (table). Window functions use ORDER BY as a logical operator which defines the behavior of the aggregate operations performed by the functions. In this scenario, ORDER BY, as a part of the function, does not perform the final sort.

It is important to understand where the Window function sits in the logical query execution sequence.

(5) SELECT
(5.1) Evaluate expressions in the SELECT statement
(5.2) WINDOW functions can be executed from here…
(5.3) DISTINCT
(5.4) TOP
(5.5) ….to here

(6) ORDER BY (presentation)

The example below demonstrates how window functions are positioned in the logical query exec. sequence.

Test Data.

IF OBJECT_ID('tempdb.dbo.#testWindowFnLogicalSeq','U') IS NOT NULL
    DROP TABLE #testWindowFnLogicalSeq
GO
CREATE TABLE #testWindowFnLogicalSeq( 
        OrderPackDesc VARCHAR(50) 
        ,OrderValue INT)
GO
INSERT INTO #testWindowFnLogicalSeq
    SELECT *
    FROM (VALUES ('Media Pack',100)
                ,('Media Pack',90)
                ,(NULL,10)
                ,('MS Surfice accessories',200)
                ,('MS Surfice accessories',50)
        ) tab(ordrPck,val)
OrderPackDesc                                      OrderValue
-------------------------------------------------- -----------
Media Pack                                         100
Media Pack                                         90
NULL                                               10
MS Surfice accessories                             200
MS Surfice accessories                             50

(5 rows affected)

The query below sums the OrderValues per OrderPackDescription. OrderPackDesc = NULL will be summarised as a distinct value, resulting in OrderValue = 10. ISNULL(OrderPackDesc,’Media Pack’) evaluates the NULL value as “Media Pack”.

SELECT DISTINCT OrderPackDescEXPR = ISNULL(OrderPackDesc,'Media Pack')
               ,OrdVal = SUM(OrderValue) OVER(PARTITION BY OrderPackDesc
                                              ORDER BY( SELECT NULL))
FROM #testWindowFnLogicalSeq
OrderPackDescEXPR                                  OrdVal
-------------------------------------------------- -----------
Media Pack                                         10
Media Pack                                         190
MS Surfice accessories                             250

(3 rows affected)

This could lead to a logical error. We want to find out the sums of order values per order pack. We also want to treat the missing OrderPacks as ‘Media Pack’. The previous query sums the “missing order packs” separately. This is a typical logical error. The error would be more difficult to find if we had OrderValue = 190 for the unknown orderPackDesc. In that case, we will have a total of 190 instead of 380 for the ‘Media Pack’.
Taking into consideration the query execution sequence we can modify our query as:

SELECT DISTINCT  OrderPackDescEXPR = ISNULL(OrderPackDesc,'Media Pack')
                ,OrdVal = SUM(OrderValue) OVER(PARTITION BY ISNULL(OrderPackDesc,'Media Pack') ORDER BY( SELECT NULL))
FROM #testWindowFnLogicalSeq
OrderPackDescEXPR                                  OrdVal
-------------------------------------------------- -----------
Media Pack                                         200
MS Surfice accessories                             250

(2 rows affected)

…OVER(PARTITION BY ISNULL(OrderPackDesc,’Media Pack’) … is evaluated BEFORE the window function.
Now our window function partition knows how to treat the missing values before the aggregation part, and the execution sequence works in your favor 🙂
The example also shows that DISTINCT executes after the window functions.
NOTE: It is not possible to use the alias OrderPackDescEXPR in other expressions within the same query, like  … OVER(PARTITION BY OrderPackDescEXPR ..). The limitation is another unique aspect of the SQL language – the all-at-once operation.
E.g. in most programming languages, to swap values between variables we use a third, temp variable: A→temp, B→A, temp→B. In SQL we can swap table column values without using temp storage:

DECLARE @test TABLE(Age1 tinyint, Age2 tinyint)
    INSERT INTO @test
        VALUES (65,25)
SELECT * FROM @test
    UPDATE @test
        SET Age1 = Age2
           ,Age2 = Age1
SELECT * FROM @test

Window aggregate functions

Starting from SQL 2012, the ORDER BY clause can be used along with the window aggregate functions. ORDER BY dictates how the aggregate functions operate within the window partitions. It is important to understand ORDER BY in this context to be able to avoid logical errors.

Create some test data

IF OBJECT_ID('tempdb.dbo.#testWindowFnLogicalSeq','U') IS NOT NULL
    DROP TABLE #testWindowFnLogicalSeq
GO
CREATE TABLE #testWindowFnLogicalSeq(
     CustomerId INT
    ,OrderId INT
    ,OrderValue INT)
GO
;WITH createNumbers as
(
    SELECT TOP 15 ROW_NUMBER()OVER(ORDER BY(SELECT NULL)) + 1000 rn
    FROM sys.columns c1
    CROSS JOIN sys.columns c2
    ORDER BY (SELECT NULL)
)
INSERT INTO #testWindowFnLogicalSeq
    SELECT CustomerId = (rn % 3)+ 1000 -- 3 customers
          ,OrderId = rn + 1000 --some unique orders
          ,OrderValue = ABS(CAST(CHECKSUM(NewId()) AS BIGINT)) % 1500 -- "random" numbers between 0 - 1499
FROM createNumbers
GO
SELECT * FROM #testWindowFnLogicalSeq ORDER BY CustomerId,OrderId
CustomerId  OrderId     OrderValue
----------- ----------- -----------
1000        2002        870
1000        2005        169
1000        2008        859
1000        2011        516
1000        2014        11
1001        2003        692
1001        2006        683
1001        2009        986
1001        2012        403
1001        2015        71
1002        2001        419
1002        2004        213
1002        2007        609
1002        2010        289
1002        2013        38

(15 row(s) affected)
** The OrderValues column contain random values.

The following query shows how ORDER BY affects the SUM window aggregate function:

SELECT CustomerId
       ,OrderId
       ,OrderValue
       ,SUM(OrderValue) OVER() AS [GrandTotal OrderValue sum]
       ,SUM(OrderValue) OVER(PARTITION BY CustomerId) [Total OrderValue per cust.]
       --cumulative sums
       ,SUM(OrderValue) OVER(ORDER BY CustomerId) [Cumulative OrderValue per cust.]
       ,SUM(OrderValue) OVER(PARTITION BY CustomerId
                             ORDER BY OrderId) [Cumulative OrderValue per cust. & ord.]
FROM #testWindowFnLogicalSeq
ORDER BY CustomerID,OrderId --presentation purpose
CustomerId  OrderId     OrderValue  GrandTotal OrderValue sum Total OrderValue per cust. Cumulative OrderValue per cust. Cumulative OrderValue per cust. & ord.
----------- ----------- ----------- ------------------------- -------------------------- ------------------------------- --------------------------------------
1000        2002        870         6828                      2425                       2425                            870
1000        2005        169         6828                      2425                       2425                            1039
1000        2008        859         6828                      2425                       2425                            1898
1000        2011        516         6828                      2425                       2425                            2414
1000        2014        11          6828                      2425                       2425                            2425
1001        2003        692         6828                      2835                       5260                            692
1001        2006        683         6828                      2835                       5260                            1375
1001        2009        986         6828                      2835                       5260                            2361
1001        2012        403         6828                      2835                       5260                            2764
1001        2015        71          6828                      2835                       5260                            2835
1002        2001        419         6828                      1568                       6828                            419
1002        2004        213         6828                      1568                       6828                            632
1002        2007        609         6828                      1568                       6828                            1241
1002        2010        289         6828                      1568                       6828                            1530
1002        2013        38          6828                      1568                       6828                            1568

ORDER BY activates cumulative aggregations on the ordered column over a defined partition. The same applies to the other aggregate functions: MAX, MIN, AVG, and COUNT.
Failing to understand this behavior can lead to logical errors 🙂
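
The reason the “Cumulative OrderValue per cust.” column repeats the same value within a customer is the default window frame: with ORDER BY alone, the frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which treats ties (rows with the same CustomerId) as one group. A minimal sketch of making the frame explicit with ROWS, which gives a row-by-row running total:

SELECT CustomerId
      ,OrderId
      ,OrderValue
      --default frame (RANGE): ties on CustomerId are summed as one group
      ,SUM(OrderValue) OVER(ORDER BY CustomerId) AS [RANGE running total]
      --explicit ROWS frame: a row-by-row running total
      --(the order of accumulation within ties on CustomerId is not deterministic)
      ,SUM(OrderValue) OVER(ORDER BY CustomerId
                            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS [ROWS running total]
FROM #testWindowFnLogicalSeq
ORDER BY CustomerId, OrderId; --presentation purpose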

CONCLUSION

The main purpose of ORDER BY is to force a query to return an ordered result – CURSOR. This changes the relational nature of SQL language – query result sets are NEVER ordered. The presentation ORDER BY is an expensive operation (it needs to collect the entire result-set and put it in the requested order). The same syntax (ORDER BY) is used as a logical operator for TOP/OFFSET FETCH/WINDOW Functions/FOR XML formatter. In those cases, the purpose of the operator is completely different and DOES NOT guarantee the ordered output.

Thank you for reading.

Read Only Routing in Sql Server

In this blog, I wanted to share a recent experience of implementing the Read-Only Routing functionality on an existing HA/AG SQL Server cluster. The technology significantly offloaded the primary production server and enhanced read-query performance.

The Sql Server HA/AG environment
Windows Failover Cluster

The main business application uses Sql Server set of databases hosted by a WSFC(Windows Failover Cluster). The cluster has two nodes, NODE1 and NODE2. Both nodes have Windows 2012 R2 Server OS and Sql Server 2014 SP1 database platform. The failover cluster has VNN(Virtual Network Name) CLUST1.

Availability group

The system has an Availability Group AG1 (and consequently a WSFC resource group). The AG1 group members are the main business application’s backend databases.
The two availability replicas host the AG1 availability group databases. Each replica exists on a different node, primary on the NODE1, and secondary on the NODE2.
The primary replica is read/write. The secondary replica, in addition to providing HA/DR for AG1, is set to allow read-only access for the clients. The readable secondary replica option is set to “YES”, which allows read-only workloads even without the ApplicationIntent parameter in the connection string definition.

The AG1 availability mode is set to : Synchronous-commit mode
The AG1 failover type is : Automatic failover
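
For reference, the readable-secondary setting is a replica option. A sketch (replica/AG names as used in this post) of how the secondary role can be set to allow read-only workloads with or without the ApplicationIntent parameter:

--ALL: allow read-only workloads even without ApplicationIntent=ReadOnly
--READ_ONLY: allow only connections that specify ApplicationIntent=ReadOnly
ALTER AVAILABILITY GROUP AG1
    MODIFY REPLICA ON N'NODE2'
    WITH (SECONDARY_ROLE(ALLOW_CONNECTIONS = ALL));
GO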

The business databases usage pattern

Because of the way the cluster was set up, the clients were connecting to the production databases through the WSFC – CLUST1. All the requests were directed to the current primary replica on NODE1. The secondary replica on NODE2, even though set up to allow read-only connections, was never used for that purpose. It was possible, though, to connect to the current secondary replica directly through NODE2.
The clients who use the databases are:

  1. ~200 concurrent business application users. They perform read/write queries through the app’s UI

  2. ~10 scheduled jobs that run regularly during the business hours. The tasks mostly extract data from the business databases (read-only operations) in order to provide the required business information for a number of different applications on a number of different platforms, e.g. Windows Active Directory, AWS, Azure, Linux, etc.

  3. An “unknown”, or should I say, NULL 🙂  number of users who regularly run the ad-hoc, read-only queries through a number of different applications e.g Excel Power Queries, Tableau, Crystal Reports and who knows what..

  4. A number of custom, in-house built, LAMP-model applications (only in this case M stands for Ms Sql Server..) that also perform read-only operations.

Even though the high-end hardware keeps the cluster’s performance at an acceptable level, there is a lot of room for improvement. One of these improvements, Read-Only Routing, is described in this blog post.

Read Only Routing

Simply speaking, with Read-Only Routing in place, we can specify the client application’s connection intent within the connection string. The intent decides to which server the request will be routed. The intent can be:

  • Read only – Client connection needs only to read data
  • Read and write (Default value) – Client connection needs to read and write data.

This means that if we know that an application needs only to read data, we can use the connection string property ApplicationIntent with the value of READONLY to “tell the cluster” to redirect the requests to the secondary, read-only replica. This will offload the NODE1 resources (assuming that the current server role is the primary replica) and will “employ” the NODE2 resources to execute the queries.
The connection string property ApplicationIntent is supported by all SQL Server 2014 clients such as ADO.NET (SqlClient), the SQL Server Native Client 11.0 ODBC driver, and the SQLNCLI11 OLEDB driver.
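
A hypothetical SqlClient connection string (server, database and security options are placeholders based on this post’s environment) that requests read-only routing would look something like:

Server=tcp:AG1-Listener,1433;Database=prod_db;Integrated Security=SSPI;ApplicationIntent=ReadOnly;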
To make the cluster able to recognise and redirect such connections, we need to set up:

  • Availability group Listener for the same subnet ( in this case both replicas are on the same subnet)
  • Read-Only routing list
Availability group listener

The availability group listener is a WSFC resource that is logically bound to the AG through the corresponding WSFC resource group. It’s a VNN (virtual network name) with one or many IP addresses attached to it. The listener runs on a single node (always on the current primary replica) and just handles the connection forwarding. The listener endpoint is always the primary replica.

It is possible to create multiple listeners for an availability group. Only the first listener can be created through tsql. The others can be created using WSFC Failover Cluster Manager or PowerShell script.
Since I was setting up the first listener for AG1, I was able to use tsql.

To set up an AG Listener we need to:

  1. Set up a DNS entry, preferably using a static IP address. In this case the DNS name is AG1-Listener. *
  2. Add the listener to the existing availability group, AG1.

*NOTE: If you decide to use SSMS to create a new listener, SSMS (through WSFC) will create a DNS entry for you. In this case you’ll need to provide the required AD credentials to the WSFC to be able to create the entry – Create Computer Objects permission.

Add the listener to the existing availability group

USE [master]
GO
ALTER AVAILABILITY GROUP AG1
    ADD LISTENER 'AG1-Listener' (
        WITH IP(('10.XX.XX.XX','255.XX.XX.XX'))
       ,PORT = 1433)

Check the listener’s metadata:

SELECT gl.dns_name
      ,gl.port
      --,gl.is_conformant
      ,gl.ip_configuration_string_from_cluster
      ,lip.ip_address
      ,lip.ip_subnet_mask
      ,lip.is_dhcp
      ,lip.network_subnet_ip
      ,lip.network_subnet_prefix_length
      ,lip.network_subnet_ipv4_mask
      ,lip.state_desc
FROM sys.availability_group_listeners gl
INNER JOIN sys.availability_group_listener_ip_addresses lip
    ON gl.listener_id = lip.listener_id

Set up Read-only routing lists

A Read-Only routing URL is a pointer that determines which replica will respond to the client driver’s (provider’s) read-only request.

The Read-Only routing mechanism performs the following sequence of events when allocating a secondary replica to route the client’s requests to.

  1. Client requests a connection to AG1-listener specifying ApplicationIntent=ReadOnly, Server=’AG1-Listener’, Database=’prod_db’.  The request is routed to the NODE1(primary replica).
  2. NODE1 checks the incoming connection, ApplicationIntent parameter value and checks if the specified database is in the availability group.
  3. The server enumerates and checks the replicas defined in the Read-Only routing list.
  4. The routing target is the first readable secondary replica that accepts ALL or Read-Only connections from the list.
  5. The server sends the secondary replica’s Read-only routing URL to the client.
  6. The client then redirects the connection request to the secondary replica following the provided URL.

To set up the Read-only routing list we need to assign the Read-only routing URLs for both replicas, since the replicas may switch roles during a failover. The URLs are used to tell the client where to redirect the read-only requests.
The URL contains network protocol, system address and the port number.

/*
Define Read-Only URL on the current primary replica.
The URL will be used when the replica switch role and become secondary rep.
*/
ALTER AVAILABILITY GROUP AG1
	MODIFY REPLICA ON N'NODE1'
	WITH (SECONDARY_ROLE(READ_ONLY_ROUTING_URL=N'tcp://NODE1.myDomain.au:1433'))
GO

/*
Define Read-Only URL on the current secondary readable replica.
*/
ALTER AVAILABILITY GROUP AG1
	MODIFY REPLICA ON N'NODE2'
	WITH (SECONDARY_ROLE(READ_ONLY_ROUTING_URL=N'tcp://NODE2.myDomain.au:1433'))
GO

Check the Read-Only URLs

SELECT replica_server_name 
      ,[endpoint_url]           AS WSFC_NodeEndpointURL
      ,availability_mode_desc   AS AvailabilityMode
      ,failover_mode_desc       AS FailoverType
      ,primary_role_allow_connections_desc AS AllowedConnectionsWhenPrimary
      ,secondary_role_allow_connections_desc AS AllowedConnectonsWhenSecondary
      ,read_only_routing_url 
FROM sys.availability_replicas
WHERE read_only_routing_url IS NOT NULL

..and finally create the Read-Only routing list

--when a client read-only request hits the primary replica, the server will respond with 
--the secondary replica (NODE2) URL since that replica is the first on the list
ALTER AVAILABILITY GROUP AG1
    MODIFY REPLICA ON N'NODE1'
    WITH (PRIMARY_ROLE(READ_ONLY_ROUTING_LIST=(N'NODE2',N'NODE1')))
GO

-- and if the replicas switch places, the system will respond with the 
-- current secondary replica (now NODE1) URL
ALTER AVAILABILITY GROUP AG1
    MODIFY REPLICA ON N'NODE2'
    WITH (PRIMARY_ROLE(READ_ONLY_ROUTING_LIST=(N'NODE1',N'NODE2')))
GO

Check the read-only routing lists

-- https://blogs.msdn.microsoft.com/alwaysonpro/2014/01/22/modifying-alwayson-read-only-routing-lists/
SELECT  ag.name AS [Availability Group]
       ,ar.replica_server_name AS [When Primary Replica Is]
       ,rl.routing_priority AS [Routing Priority]
       ,CASE 
            WHEN ar.replica_server_name = ar2.replica_server_name 
                THEN 'Read-Only on primary' + '('+ ar.replica_server_name + ')'
            ELSE ar2.replica_server_name
        END AS [Read-Only Routed To]
       ,ar.secondary_role_allow_connections_desc
       ,ar2.read_only_routing_url
FROM    sys.availability_read_only_routing_lists rl
        INNER JOIN sys.availability_replicas ar 
            ON rl.replica_id = ar.replica_id
        INNER JOIN sys.availability_replicas ar2 
            ON rl.read_only_replica_id = ar2.replica_id
        INNER JOIN sys.availability_groups ag 
            ON ar.group_id = ag.group_id
ORDER BY ag.name
       ,ar.replica_server_name
       ,rl.routing_priority
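
Once the routing list is in place, a quick way to verify it (a sketch; the listener and database names are the ones used above) is to connect through the listener with and without the read-only intent and check which node answers:

:: read-write intent (default) - should answer from the current primary replica (NODE1)
sqlcmd -S AG1-Listener -d prod_db -E -Q "SELECT @@SERVERNAME"

:: read-only intent - should be routed to the readable secondary (NODE2)
sqlcmd -S AG1-Listener -d prod_db -E -K ReadOnly -Q "SELECT @@SERVERNAME"
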
A few more things about the readable, secondary replicas

There is a certain data latency between primary and secondary replica. The data movement process usually takes a few seconds and follows the pseudo-sequence below.

  • A user makes a change to the AG databases (Begin transaction)
  • The primary replica sends the log information which describes the actions to be performed on the secondary replica(s).
  • The secondary replica(s) run a dedicated “Redo” process that applies the changes on the secondary replica. At this point in time, the changes are not visible to the read-only queries that run on the secondary. The readers are not blocked by the redo actions* as they read the previous version of the data that’s being changed.
  • The change is committed on the primary replica (Commit transaction ). The changes are visible on both replicas.

*All read-only queries that run on the secondary databases are automatically executed under the SNAPSHOT transaction isolation level. All locking hints, e.g. TABLOCK, XLOCK, etc., are ignored.

The DDL actions on the primary replica (applied through the redo of the log records) and, conversely, the READ actions on the secondary replica can block each other. The secondary replica’s SELECT queries require Sch-S locks that guarantee the schema stability of the objects in use, e.g. during the execution of SELECT * FROM tab1, the lock manager has to prevent other sessions from changing the tab1 table structure during the select operation. On the other hand, the redone DDL actions require Sch-M (schema modification) locks that prevent concurrent access to the locked resource, e.g. TRUNCATE TABLE tab1. Schema locks are acquired regardless of the transaction isolation level. Sch-S is compatible with all other locks except Sch-M (and Sch-M is incompatible with all lock modes). It’s probably not a good idea to perform read operations on the secondary replica if the main system performs frequent DDLs.

Conclusion

Always On Availability Group active secondary replicas support read-only access. Client requests for read-only access can be automatically redirected to the secondary replica. This helps conserve the primary replica’s resources for the mission-critical workloads. The read-only queries may also run faster since the workloads, by default, use row versioning (snapshot isolation) to remove read/write blocking. The read-only access comes with some data latency, and that needs to be considered when deciding to go with this technology.

Thanks for reading.

Dean Mincic

 

XACT_ABORT and Query Timeout in Sql Server

In this blog, I have explored how the XACT_ABORT setting controls Sql Server behavior when it interacts with Desktop and Web applications, in the situation of a query timeout runtime error. It is particularly interesting to see how SQL Server works with web applications that use the connection pooling mechanism in those conditions.  I also briefly covered MARS(Multiple Active Result Sets) technology and how it relates to XACT_ABORT.

The performed tests demonstrate the behavior of a typical stored procedure when executed in a session with XACT_ABORT set to ON/OFF in a query timeout situation. The tests cover a desktop application (SSMS) and a web application.

Here is a bit of theory we need to know in order to understand the test results.

XACT_ABORT

It is one of the parameters that define the current session’s behavior. The parameter belongs to the group of transaction statements, along with IMPLICIT_TRANSACTIONS, TRANSACTION ISOLATION LEVEL, and REMOTE_PROC_TRANSACTIONS (deprecated in SQL Server 2014). More about the parameter can be found here.
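
For completeness, a small sketch of setting and inspecting the option in the current session; the 16384 bit of @@OPTIONS is the same check used in the stored procedure later in this post.

--when ON, a run-time error (or a client Attention signal received during a transaction)
--aborts and rolls back the current transaction
SET XACT_ABORT ON;
SELECT IIF((@@OPTIONS & 16384) = 0, 'XACT_ABORT IS OFF', 'XACT_ABORT IS ON') AS CurrentSetting;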

Query Timeout

The queries executed on the client side may be canceled during the execution after a certain period of time. One of the reasons can be the “query timeout”, a scenario in which the app. code decides to cancel the ongoing action if the action takes more time to finish than expected(defined).

Query timeout is a client-side concept. After a certain period of time, a client, more specifically the db provider will raise an Attention event which will signal the query timeout to the db server.  At the next available opportunity, Sql Server will stop/abort the currently executing query in the context of the current connection. The sudden stop will not raise any Sql server errors.

XACT_ABORT defines how Sql Server handles the ongoing transactions in these situations.
It is worth mentioning the remote query timeout setting (EXEC sp_configure 'remote query timeout'). This server-scoped setting relates only to the queries executed by a remote source, e.g. a linked server.

When executing Update/Insert/Delete actions within an explicit transaction in SQL Server, there’s a risk of the transaction remaining open (abandoned) if it’s instructed to stop query execution. In such cases, various SQL Server resources may remain locked, making them unavailable to other queries. By default, XACT_ABORT is set to OFF, which means that, in case of a timeout, SQL Server leaves the handling of uncommitted transactions to the caller (the application code).

The application code can be designed to handle exceptions by either rolling back interrupted transactions and closing the connection or by simply closing the connection, allowing any open transactions to be automatically rolled back*.

*A brief note: In SQL Server, a transaction’s scope is defined differently from the server’s and client’s perspectives. From the server’s point of view, a transaction’s scope is a session. However, from the client’s perspective, the scope is a connection.

In SQL Server, a session can have zero, one, or more* related connections. Internal SQL Server engine processes (with SPID <= 50) don’t have related external connections. When using MARS (Multiple Active Result Sets), a session is associated with multiple hierarchical connections. The client’s connection serves as the parent, and together with the child connections, is logically scoped as ‘one per batch.’ Consequently, the scope of a transaction within the MARS-enabled connection is a batch also known as the  Batch scoped transactions.  This is particularly relevant when the MARS session is in the ‘Local Transaction Active‘ state, where all statements executed in a session run under an explicit or implicit transaction.

To examine the hierarchical organization of transactions, you can use dynamic views like sys.dm_exec_connections, focusing on columns such as net_transport, parent_connection_id, and connection_id.
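
A sketch of such a query (run from a MARS-enabled session to see the parent/child rows):

SELECT session_id
      ,connection_id
      ,parent_connection_id   --NULL for the parent (physical) connection
      ,net_transport          --shows 'Session' for MARS logical child connections
FROM sys.dm_exec_connections
ORDER BY session_id, parent_connection_id;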

MARS technology is a super interesting subject for research and certainly cannot fit into a “brief note”  🙂 

Session IDs are recyclable and are related to unique transaction IDs.

SELECT *
FROM sys.dm_exec_sessions
WHERE is_user_process = 1; 
--the predicate filters out the internal processes.

Connection Pooling

Connecting to the database is a resource-intensive operation. Each time a client interacts with a database, it establishes a new connection, which involves creating a new physical channel/socket, parsing the connection string, performing database authentication, and more. Because of the disconnected nature of the web-based applications, the connections are closed as soon as the client is finished with the db request. Constant opening and closing connections wouldn’t be a problem if there were a small number of clients who don’t interact with the db often. In reality, there is a large number of concurrent clients fighting for server resources.
The idea behind the Connection Pool(CP) mechanism is to allow already existing connections to be reused by the clients. Essentially, CP is a container object that contains a list of connections defined by the same connection string. A connection in this context is a pointer to the memory location where the connection object is stored.

Connection Pools are controlled and maintained by the database providers, like the .NET Framework Data Provider for Sql Server or the ADO.NET set of data access components based on the provider. Connection Pooling is active by default(even if not specified in the connection string). The Pool properties are a part of the connection string properties(key-value pairs).
More about connection pooling and pool fragmentation can be found here.

Another thing that would be interesting to explore is the sequence of events between a web application and Sql server during the connection pooling. The following is one of the typical scenarios.

  1. A client requests data from SQL Server. There is no CP container yet defined based on the connection string that will be used to define the connection. The client calls the Open() method of a connection object (e.g. based on the SqlConnection class if using the .NET Data Provider for SQL Server).
  2. SqlServer authenticates the user and stores the connection/sql session data. The Audit Login event shows;  sessionid, EventSubClass =”1- Nonpooled”, Connection type  = “1- Non DAC” …etc. The requested connection is not part of the connection pool yet.
  3. The client executes the T-SQL batch and closes the connection (Close() method of the connection object). Even if enclosed e.g. in the “using” structure, the connection object won’t be destroyed but only referenced by a pointer now stored in a newly created connection pool object container. The object lives in the app. server memory. At this point, the Sql server is not aware that the client closed (returned) connection to the pool. From sql server point of view, the connection is still open. This also means that the abandoned(not committed) transactions, if any, will remain open – This will be a part of the following tests :)
  4. The same or a different client requests to open() the same connection. The code will not create(instantiate) a new connection object. Instead, the pool manager will provide the memory location of the existing connection object. Again, Sql is not aware of what’s happening on the client side and still keeps the information about the original connection.
  5. The client now executes a T-SQL batch e.g. using Sqlcommand /ExecuteNonQuery object/method in the context of the connection. It is only at this point that Sql Server receives the information about the connection being reused by someone.
    • a) The event Audit Logout is fired indicating that the existing connection is pooled. The EventSubClass = “2 – Pooled”.
    • b) The data access API layer(in this case SqlClient) executes system stored procedure sp_reset_connection (the sp is not available through T-SQL) to clean* the existing context of the connection and re-validate the authorization of the user-principal through Audit Login and Audit Logout events.  In this context, cleaning a connection context means cleaning up SqlServer’s session context i.e rollback any active transaction that may have been left open through this session, dropping any local temporary table that has been left behind by the previous owner of the connection, closes any open cursors and deallocates the resources it has been using, resets the CONTEXT_INFO, etc);
    • c) The Audit Login event fires up, again indicating the nature of the connection (EventSubClass = “2- Pooled”)
  6. SQL Server executes the T-SQL batch.
  7. The client calls the Close() method and the connection is again returned to the connection pool. (the pointer to the connection object is now available for the next client)
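
A quick way to watch this from the SQL Server side is to compare when a physical connection was established with when its session last did any work. The query below is only a minimal sketch using standard DMVs (sys.dm_exec_connections and sys.dm_exec_sessions); a connection that is much older than its last request has most likely been sitting in, and reused from, a connection pool.

-- Physical connection age vs. the session's last activity
SELECT  c.connection_id
       ,c.connect_time                 -- when the physical connection was opened
       ,s.program_name                 -- "Application Name" from the connection string
       ,s.session_id
       ,s.login_time
       ,s.last_request_start_time
       ,s.last_request_end_time
FROM sys.dm_exec_connections AS c
    INNER JOIN sys.dm_exec_sessions AS s
        ON s.session_id = c.session_id
WHERE s.is_user_process = 1
ORDER BY c.connect_time;
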
Test case scenarios:

I’ve performed the following tests to investigate how Sql Server reacts with different XACT_ABORT settings. The client applications will initiate a query timeout in the middle of an explicit, ongoing transaction encapsulated in a stored procedure.

Test 1: Desktop application query timeout. (MARS test included)
Test 2: ASP.NET application query timeout with connection pooling ON

Desktop-based client application query timeout

XACT_ABORT OFF (Default)

For this test, I used SSMS as a Windows-based client app. To prepare for the test, create a test environment using the script below:

--create test database
CREATE DATABASE XACT_ABORT_TEST
GO
USE XACT_ABORT_TEST
GO

-- Create a test table
DROP TABLE IF EXISTS dbo.Products; --tsql2016
GO

CREATE TABLE dbo.Products(ProductId INT 
                            CONSTRAINT PK_ProductID 
                                PRIMARY KEY CLUSTERED
                          ,ProductName VARCHAR(1000) NOT NULL
                          ,ProductPrice MONEY NULL );                      
GO
--Insert sample data
INSERT INTO dbo.Products(ProductId
                         ,ProductName
                         ,ProductPrice)     
 SELECT Tab.*
 FROM (VALUES (100,'Hard Drive',80.99)
             ,(200, 'SSD Drive',250.85)
             ,(300, 'Flash Memory 32GB',8.11)
             ,(400, 'Surface Book',4120.00) ) AS Tab(prodId,name,price);
GO
--Check the table
SELECT * FROM dbo.Products;

Stored procedures:

--the procedure updates a product price by the percent
CREATE PROCEDURE spuUpdateProductPrice
     @ProductId INT
    ,@ByPercent DECIMAL(5,2)
    ,@SetXactAbort BIT = 0 --Off by default
AS
BEGIN
    SET NOCOUNT ON;

    --set xact abort according to the input parameter
    IF @SetXactAbort = 1
        SET XACT_ABORT ON;
    ELSE
        SET XACT_ABORT OFF;
    
    --output xact setting
    SELECT IIF( (@@OPTIONS & 16384)=0,'XACT_ABORT IS OFF','XACT_ABORT IS ON')
      
    BEGIN TRANSACTION xactTest;
    BEGIN TRY

        --update price by percent
        UPDATE dbo.Products
            SET ProductPrice += (ProductPrice * (@ByPercent * 0.01))
        WHERE ProductId = @ProductId;

        --this simulates a slow query
        WAITFOR DELAY '00:00:10';

       COMMIT TRANSACTION xactTest;
    END TRY

    BEGIN CATCH
        IF @@TRANCOUNT>0
            ROLLBACK TRANSACTION xactTest;

            ;THROW
    END CATCH
  
    RETURN;
END
GO
--select product information
CREATE PROCEDURE spsGetProductDetails
    @ProductId INT = NULL
AS
BEGIN
    SET NOCOUNT ON;

    SELECT ProductId
          ,ProductName
          ,ProductPrice
    FROM dbo.Products
    WHERE ProductId = @ProductId
        OR @productId IS NULL;

    RETURN;
END

SQL Server Profiler:
I’ll use SQL Server Profiler to capture the Attention event sent by the client application. Add the Errors and Warnings/Attention event to the Standard (default) set of traced events.
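
For newer environments where SQL Trace/Profiler is being phased out, the same Attention event can be captured with Extended Events. A minimal sketch (the session name CaptureAttention is arbitrary):

-- Capture client Attention signals (query cancellations / command timeouts)
CREATE EVENT SESSION CaptureAttention ON SERVER
ADD EVENT sqlserver.attention(
    ACTION (sqlserver.session_id, sqlserver.client_app_name, sqlserver.sql_text))
ADD TARGET package0.ring_buffer;
GO
ALTER EVENT SESSION CaptureAttention ON SERVER STATE = START;
GO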

SSMS:
SSMS will act as a Desktop client which will cause the query timeout. To simulate the scenario open(CTRL+N) three sessions.

  • Session 1 – Change the Execution timeout from 0 (unlimited) to 3s. The session will be used to run the sp that will cause the query timeout (SSMS: Right click/Connection/Change Connection).

    Figure: SSMS Execution time-out setting

The code below will be terminated by the timeout.

USE XACT_ABORT_TEST
GO
-- decrease SSD disk price by 10%
EXEC dbo.spuUpdateProductPrice 
    @ProductId = 200 
    ,@ByPercent = -10
  • Session 2– The session will be used to run the second sp that will be reading from the same table used by the procedure initiated in session 1.
USE XACT_ABORT_TEST
GO
EXEC dbo.spsGetProductDetails
  • Session 3– Add the code below to monitor the transaction state and the locked resources.
------------------------------
    --check the open transactions
    ------------------------------
    SELECT sTran.transaction_id 
       ,sTran.is_user_transaction AS isUsrTran
       ,sTran.open_transaction_count [NoOfTranPerSession]
       ,cn.parent_connection_id
       ,cn.connection_id
       ,aTran.[name]
       ,aTran.transaction_begin_time
       -- lookup table can be found on 
       -- https://msdn.microsoft.com/en-au/library/ms174302.aspx
       ,CASE aTran.transaction_type
          WHEN 1 THEN 'Read/write'
          WHEN 2 THEN 'Read-only'
          WHEN 3 THEN 'System'
          WHEN 4 THEN 'Distributed'
        END AS [Transaction Type]
        ,CASE aTran.transaction_state
          WHEN 0 THEN 'Not fully initialized'
          WHEN 1 THEN 'Initialized, not started'
          WHEN 2 THEN 'Active'
          WHEN 3 THEN 'Ended' 
          WHEN 4 THEN 'Commit initiated'
          WHEN 5 THEN 'Prepared, awaiting resolution'
          WHEN 6 THEN 'Committed'
          WHEN 7 THEN 'Rolling back'
          WHEN 8 THEN 'Rolled back'
        END AS [Transaction State]
        ,cn.session_id
        ----------- connection params ------------------
        ,cn.net_transport [Conection protocol]
        ,cn.connect_time [Connected At]
        ,cn.client_net_address [Client network address]
        ,cn.client_tcp_port [Client TCP port]
        ,cn.last_read
        ,cn.last_write   
    FROM sys.dm_tran_active_transactions aTran
        INNER JOIN sys.dm_tran_session_transactions sTran
            ON aTran.transaction_id = sTran.transaction_id
                right outer JOIN sys.dm_exec_connections cn
                    ON sTran.session_id = cn.session_id
WHERE sTran.transaction_id IS NOT null

--------------------------------------
--check locks, transactions, sessions
--------------------------------------
SELECT lck.resource_type
      ,DB_NAME(lck.resource_database_id) AS DBName
      ,lck.resource_description AS resDesc
      ,lck.request_mode
      ,lck.request_type
      ,lck.request_status
      ,lck.request_session_id
      ,lck.request_owner_type
      ,ts.transaction_id
      ,tat.name
      ,con.connection_id
      ,tat.transaction_type
FROM sys.dm_tran_locks lck
  INNER JOIN sys.dm_tran_session_transactions ts
    ON lck.request_session_id = ts.session_id
  LEFT OUTER JOIN sys.dm_tran_active_transactions tat
    ON ts.transaction_id = tat.transaction_id
  LEFT OUTER JOIN sys.dm_exec_connections con
    ON ts.session_id = con.session_id

---------------------------------------------
--list of the waiting tasks
SELECT wt.* 
FROM sys.dm_os_waiting_tasks wt
    INNER JOIN sys.dm_exec_sessions s
        ON s.session_id = wt.session_id
WHERE s.is_user_process = 1
---------------------------------------------

To begin the test, enable the Profiler trace and execute the code in session1 and then in session2(sp that just selects data) and finally execute the code in session3.

test1a

Analysis:

The sp (Session 1) takes longer to execute (~10s) than the Client's execution timeout allows (3s). After 3s, the client sends an Attention event to the Sql Server, which causes Sequel to immediately stop the sp execution, with no error. Because XACT_ABORT is set to OFF (the default), Sql Server will leave the transaction open and let the Client handle the commit/rollback.

The client’s event handler will send the following error message:

Msg -2, Level 11, State 0, Line 3
Timeout expired.  The timeout period elapsed prior to completion of the operation or the server is not responding.

Profiler – spids may be different (spid 55 is spid 60 on the previous image)

profiler1

Now, the sp that runs in Session 2( reads all rows from dbo.Products) has to wait since Session 1’s active transaction(xactTest) holds X(Exclusive) lock on the KEY(the table is a clustered index) and  IX(Intent Exclusive) on the table(object) which is not compatible with the S(Shared Locks) requested by session 2.
The “SELECT” stored procedure will just wait indefinitely(the default execution time for that connection is unlimited).
Session 3 can be used to analyze the open transactions, locked resources, etc.
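
As a quick complement to the Session 3 queries, DBCC OPENTRAN reports the oldest active transaction in the current database, which is enough to spot the transaction abandoned by Session 1 (a minimal sketch):

USE XACT_ABORT_TEST;
GO
-- Shows the SPID, transaction name and start time of the oldest active transaction
DBCC OPENTRAN;
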
If we now disconnect Session 1 (right click/Disconnect), Sql Server will automatically ROLLBACK all ongoing transactions within the closed connection scope, leaving the DB in a consistent state – the SSD price will not decrease by 10%   :). The stored procedure in Session 2 will finish and if we re-run the code in Session 3 we'll see that there are no abandoned transactions. All relevant resources will be freed.

Programmers usually use the “using” statement to “wrap” the objects used for the communication with the Database. The structure ensures that all of the objects instantiated within the block will be removed, closed, flushed, disposed, etc. (C devs won't like this kind of black box 🙂)

using (SqlConnection conn = new SqlConnection(Properties.Settings.Default.ConnectionStr))
{
    using (SqlCommand cmd = new SqlCommand("spuUpdateProductPrice",conn))
    {
        cmd.CommandType = CommandType.StoredProcedure;
        cmd.Parameters.Add("@ProductId", System.Data.SqlDbType.Int).Value = 400; //hardcoded
        cmd.Parameters.Add("@ByPercent", System.Data.SqlDbType.Decimal).Value = -10; //hardcoded
        cmd.Parameters.Add("@SetXactAbort", System.Data.SqlDbType.Bit).Value = 0;//0 - OFF, 1 - ON

        cmd.CommandTimeout = 5; //default is 30s

        conn.Open();
                    
        Console.WriteLine("...about to run dbo.spuUpdateProductPrice");
        try
        {
            cmd.ExecuteNonQuery(); //will take ~10s to execute
        }
        catch (SqlException e)
        {
            //SET UP BREAKPOINT HERE
            Console.WriteLine("Got expected SqlException due to command timeout ");  
            Console.WriteLine(e);
        }
    }    
}

The error handler will handle the timeout and the code will exit the structure, closing the connection. Sql Server will then automatically roll back all open transactions (within the scope of the closed connection), leaving the DB in a consistent state. This will happen regardless of the XACT_ABORT setting.

To show the difference between XACT_ABORT set to ON/OFF we can set a break-point on any line inside the catch block (see the //SET UP BREAKPOINT HERE comment). With xact_abort set to OFF the uncommitted transaction will remain open until the code exits the structure. With ON, Sql Server will roll back the open transactions immediately after receiving the Attention signal and before the break-point. This can be monitored through the Session 3 code as mentioned earlier.

XACT_ABORT ON

To test the same scenario with the xact_abort on, reconnect Session 1(Right click/connect – now, the default execution timeout is 3s) and include the third input parameter

USE XACT_ABORT_TEST
GO
-- decrease SSD disk price by 10%
EXEC dbo.spuUpdateProductPrice 
     @ProductId = 200 
    ,@ByPercent = -10
    ,@SetXactAbort = 1 --xact_abort ON

This time Sql server is set up to handle the uncommitted transactions, and when a run-time error occurs – query-timeout in this case, it will rollback the transactions. Session 2 will be able to execute its code. Session 3 will show the “empty resultsets” – no locked resources 🙂

MARS Enabled

Xact_Abort ON/OFF  setting has a different effect on the MARS-enabled connection in the example above.
To test this scenario change the Session 1 connection properties and add an additional connection parameter: MultipleActiveResultSets=True

If we run the test again the results will show that the XACT_ABORT setting does not affect the sql Server behavior* in case of the query timeout run time error. The explicit transaction will be automatically rolled back. Session 2 shows us the MARS transactions hierarchy within the same session.
*The behavior would be different if we executed the stored procedure logic as a script and without explicit transaction. 

ASP.NET application query timeout with connection pooling ON/OFF

The following test demonstrates Sql server’s behavior in case of application timeout when the XACT_ABORT is ON/OFF in combination with the Connection Pooling TRUE/FALSE

The test environment:

  1. A simple web application that can perform a) select and b) update actions on a database using one of two available connections – one connection authenticates using the Integrated* security and the other one uses Sql Server authentication.
  2. Windows Performance Monitor (perfmon.msc) – to examine connection pool counters
  3. Sql Profiler – for tracking events triggered by the app and data provider.
  4. Sql Server T-SQL code to track active connections, locks, etc. – Session 3 code from the previous example.

Note: The integrated security will use ApplicationPoolIdentity(AP) to connect to the DB. Execute the script below to create a Login for the AP identity and a Login for an sql server user:

-- windows login - application pool
USE master
GO
--create db server login
CREATE LOGIN  [IIS APPPOOL\DefaultAppPool] FROM WINDOWS
GO

USE XACT_ABORT_TEST
GO
--create db user
CREATE USER  [IIS APPPOOL\DefaultAppPool] 
GO
--add securables to the db user
GRANT EXECUTE ON OBJECT::dbo.spsGetProductDetails TO [IIS APPPOOL\DefaultAppPool]
GO
GRANT EXECUTE ON OBJECT::dbo.spuUpdateProductPrice TO [IIS APPPOOL\DefaultAppPool]
GO

-- Sql Server login
USE master
GO
--
CREATE LOGIN testXact WITH 
    PASSWORD = 'testXact'
    ,CHECK_POLICY =OFF
GO

USE XACT_ABORT_TEST
GO

CREATE USER testXact FOR LOGIN testXact
GO
GRANT EXECUTE ON OBJECT::dbo.spsGetProductDetails TO testXact
GO
GRANT EXECUTE ON OBJECT::dbo.spuUpdateProductPrice TO testXact
GO

The Web app.

The interface allows you to use two different connections to select and update the existing data. The update will cause the timeout run-time error.

The results of the select statements will be displayed in the Grid view

The error message will show under the grid view control.

The code is similar to the code used in the previous example.

using System;
using System.Data.SqlClient;
using System.Data;

public partial class _Default : System.Web.UI.Page
{
    protected void Page_Load(object sender, EventArgs e){}
    public string ConnString(string auth)
    {
        if (auth == "IntegratedSecurity")
            return @"Data Source=tcp:(local);
                     Initial Catalog=XACT_ABORT_TEST;
                     Integrated Security=True;
                     Pooling=True;
                     Application Name=WINXactAbort";
        else
            return @"Data Source=tcp:(local);
                     Initial Catalog=XACT_ABORT_TEST;
                     user Id=testXact;
                     password=testXact;
                     Pooling=True;
                     Application Name=SQLXactAbort";
    }

    //show products
    protected void btnShowProducts_Click(object sender, EventArgs e)
    {
        //show products
        //NOTE: Use the default CommandTimeout: 30s
        using (SqlConnection con = new SqlConnection(ConnString(rBtnLst.SelectedValue)))
        {
            con.Open();

            using (SqlCommand cmd1 = new SqlCommand("spsGetProductDetails", con))
            {
                cmd1.CommandType = CommandType.StoredProcedure;
                cmd1.CommandTimeout = 30;
                try
                {
                    SqlDataReader rd = cmd1.ExecuteReader();
                    Grid1.DataSource = rd;
                    Grid1.DataBind();
                    rd.Close();
                }
                catch (SqlException e1)
                {
                    lblOutputText.Text = e1.Message;
                }
            }
        }
    }

    //Update product
    protected void btnUpdateProducts_Click(object sender, EventArgs e)
    {
        using (SqlConnection con = new SqlConnection(ConnString(rBtnLst.SelectedValue)))
        {
             con.Open();

            using (SqlCommand cmd = new SqlCommand("spuUpdateProductPrice", con))
            {
                cmd.CommandType = CommandType.StoredProcedure;
                cmd.CommandTimeout = 3; //default is 30s

                cmd.Parameters.Add("@ProductId", System.Data.SqlDbType.Int).Value = 400; //hardcoded
                cmd.Parameters.Add("@ByPercent", System.Data.SqlDbType.Decimal).Value = -10; //hardcoded
                cmd.Parameters.Add("@SetXactAbort", System.Data.SqlDbType.Bit).Value = 0;//0 - OFF, 1 - ON

                lblOutputText.Text = "...about to run dbo.spuUpdateProductPrice";
                try
                {
                    cmd.ExecuteNonQuery(); //will take ~10s to execute
                }
                catch (SqlException e2)
                {
                    lblOutputText.Text = e2.Message;
                }
            }
        }
    }
}

Test: Two clients use the app. The first runs an expensive Update which causes the time-out, and the second one tries to Select data from the same table. Both clients use the same connection string. Connection Pooling is enabled.

Test preparations:
Open two instances of the web application in two browsers. Run the T-SQL code from the previous example that selects the open transactions, Open SQL Profiler (standard trace + Errors/Attention event, Audit Login/ Audit Logout). Include Filter for ApplicationName in (WINXactAbort, SQLXactAbort). Open Performance Monitor and add .NET Data Provider for SqlServer Counters (example).

Analysis:

  1. Two different clients using the same connection type e.g Integrated Security authentication performs the following actions:
    •  1.1  Client1 updates a product price(Update Products). 1st non-pooled connection established.
    • 1.2 During the execution Client2 tries to read the product prices (Show Products). 2nd non-pooled connection established(EventSubClass =”1- Nonpooled”,). Client2 has been blocked by the Client1’s update process. Client2 waits.
    • 1.3 Client1’s update action times out (simulates a long-running query) leaving the products table locked by an uncommitted-abandoned transaction. Attention event raised. Close() method executed. A new CP container has been created. The client “returns” the connection to the newly created CP.  Sql Server thinks that the connection is still active. The CP now has one connection ready to be reused.
    • 1.4 Client2 still waits for Client1 to finish with the update and to unlock the Products table. Eventually, Client2 times out (default command timeout = 30s). The connection is now returned(referenced by a pointer) to the CP. The connection pool now has 2 connections ready to be reused. The Sql Server resources(products table) are still locked by Client1.
    • 1.5 At this point, the Connection Pool has two available (idle) connections. The first still has a non-committed transaction in its context. If a new client, Client3, tries to read the same data, the CP will allocate the last connection returned, using the LIFO (Last In, First Out) algorithm. In this case, the allocated connection will be the one that originally belonged to Client2. As expected, Client3 will get blocked by the same abandoned transaction. Moreover, no client will be able to access the Products table until the status of the transaction in question gets resolved (rolled back/committed).

This scenario is common and can cause a lot of frustration since the effects can vary i.e sometimes works, sometimes doesn’t work 🙂

To add to the confusion, sometimes the situation can resolve itself, e.g. the CP will remove idle connections from the pool after 4 to 7 minutes of inactivity, triggering the Audit Logout event which will, among other things, finally instruct SQL Server to roll back the outstanding transactions in the context of the destroyed connection (session). Another possible scenario is if the  “doomed” connection gets reused by a new Client. In that case, the sp_reset_connection stored proc (mentioned earlier) will instruct Sql Server to “clean the mess” and prepare a clean connection/session context for the new Client.
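
To spot these “doomed” pooled connections, a sketch like the one below lists the sessions that look idle to SQL Server but still own an open transaction (it relies on sys.dm_exec_sessions.open_transaction_count, available from SQL Server 2012 onwards):

-- Sleeping sessions that still hold an open transaction - typical for a connection
-- returned to the pool with an abandoned transaction left in its context
SELECT  s.session_id
       ,s.status
       ,s.program_name
       ,s.host_name
       ,s.open_transaction_count
       ,s.last_request_end_time
FROM sys.dm_exec_sessions AS s
WHERE s.is_user_process = 1
  AND s.status = 'sleeping'
  AND s.open_transaction_count > 0;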

Profiler – the trace shows the above sequence of events

Connection, session, and transaction metadata

The situation can be avoided using the following approaches:

  1. Setting XACT_ABORT to ON. This will instruct Sql Server to roll back the ongoing transactions after receiving the Attention event raised as a result of the command timeout issued by the client.
  2. More thorough data layer coding, i.e. executing the queries within properly defined transactions and properly handling the uncommitted transactions within the catch block.

Both techniques will prevent the unpredictable behavior explained above.
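
A minimal server-side template that combines both ideas might look like the sketch below (the procedure name is made up for illustration, and CREATE OR ALTER assumes SQL Server 2016 SP1 or later): XACT_ABORT ON so that an Attention event rolls the work back, plus a CATCH block that checks XACT_STATE() before deciding what to do.

-- Hypothetical template procedure, for illustration only
CREATE OR ALTER PROCEDURE dbo.spuSafeUpdateTemplate
     @ProductId INT
    ,@ByPercent DECIMAL(5,2)
AS
BEGIN
    SET NOCOUNT ON;
    SET XACT_ABORT ON;  -- a client Attention/timeout now rolls back the open transaction

    BEGIN TRY
        BEGIN TRANSACTION;

        UPDATE dbo.Products
            SET ProductPrice += (ProductPrice * (@ByPercent * 0.01))
        WHERE ProductId = @ProductId;

        COMMIT TRANSACTION;
    END TRY
    BEGIN CATCH
        -- XACT_STATE() = -1 : transaction is doomed and can only be rolled back
        -- XACT_STATE() =  1 : transaction is still active and committable
        IF XACT_STATE() <> 0
            ROLLBACK TRANSACTION;
        ;THROW
    END CATCH
END
GO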

XACT_ABORT ON

To set up the Xact_abort ON change the value of the @SetXactAbort  parameter to 1 and repeat the test.

//show products
...
  cmd.Parameters.Add("@SetXactAbort", System.Data.SqlDbType.Bit).Value = 1;//0 - OFF, 1 - ON
...

The same test shows that as soon as Client1 raises the Attention event, Sql Server stops the code execution and performs rollback on the ongoing transaction. All subsequent read requests(Client 2,3..) will be successful.

Using Connection Transactions 

To test the scenario, change the btnUpdateProducts_Click event handler to include transactions and set the @SetXactAbort parameter back to 0.

//Update product
    protected void btnUpdateProducts_Click(object sender, EventArgs e)
    {
        using (SqlConnection con = new SqlConnection(ConnString(rBtnLst.SelectedValue)))
        {
            con.Open();
            // Naming the transaction will make it easier to track using dmvs. 
            // Note that the explicit transaction defined in
            // the sp is now a nested transaction.
            SqlTransaction t1 = con.BeginTransaction("ExplicitTransactionFromCode");

            using (SqlCommand cmd = new SqlCommand("spuUpdateProductPrice", con))
            {
                cmd.Transaction = t1;
                cmd.CommandType = CommandType.StoredProcedure;
                cmd.CommandTimeout = 3; //default is 30s

                cmd.Parameters.Add("@ProductId", System.Data.SqlDbType.Int).Value = 400; //hardcoded
                cmd.Parameters.Add("@ByPercent", System.Data.SqlDbType.Decimal).Value = -10; //hardcoded
                cmd.Parameters.Add("@SetXactAbort", System.Data.SqlDbType.Bit).Value = 1;//0 - OFF, 1 - ON

                lblOutputText.Text = "...about to run dbo.spuUpdateProductPrice";
                try
                {
                    cmd.ExecuteNonQuery(); //will take ~10s to execute
                    t1.Commit(); //NOTE: The COMMIT TRANSACTION in the sp, if executed, would only
                                 //decrease the @@TRANCOUNT value by 1.
                }
                catch (SqlException e2)
                {
                    lblOutputText.Text = e2.Message;
                    if (t1.Connection != null)
                    {
                        t1.Rollback();
                    }
                }
            }
        }
    }

This time the code will roll back the transaction within the catch block and the resources will not be blocked. If we now set XACT_ABORT to ON and repeat the same test, after the Attention event SQL Server will roll back the transaction (@@TRANCOUNT = 0) and the code will skip executing the Rollback() method, because the Connection property of the transaction object t1 returns null since the transaction is no longer valid.

Conclusion

It is important to understand the XACT_ABORT setting to be able to prevent and/or explain Sql Server's behavior in case of a query timeout run-time error. When set to OFF, the default value, Sql Server will let the client handle the timeout situation and take care of the connection elements, i.e. the ongoing transactions. Query timeouts are a Client-side concept only and Sql Server does not treat them as errors. As soon as possible after receiving the Attention event, Sequel will stop the execution, reporting no errors. If XACT_ABORT is set to ON, Sql Server will roll back the unfinished transactions and then return control to the Client.
Desktop applications may use different approaches when dealing with DB connections. Some applications handle the query timeout and close the connection (dispose of the connection object). This instructs Sql Server to roll back all unfinished transactions in the scope of the session/connection. In this context, the XACT_ABORT setting will not have a direct effect on the execution. Other applications, e.g. SSMS, will keep connections open after handling query timeouts. In this case, the abandoned transactions may cause other requests to wait to acquire the required locks and the problem can escalate. In this context, setting XACT_ABORT to ON will roll back the abandoned transactions, preventing the resource locks.
Web applications usually use the Connection Pooling technology to optimize the database authentication process. The connections (connection objects) stay “live” and ready to be re-used by other clients. Every CP contains a collection of similar Connections – connections with the same connection string. The availability of the connections follows the LIFO (Last In, First Out) algorithm. For example, say we have a Connection Pool CP1 with 4 connections ready to be reused, CP1 = {c1,c2,c3,c4}, and connection c3 has left an abandoned transaction that keeps table T1 locked.

  • Scenario 1: A client wants to read data from T2. CP1 provides the client with connection c4. The client clears the connection/session context left by the previous owner and reads the data with no problems.
  • Scenario 2: A client wants to read data from T1. CP1 provides connection c4. After clearing the c4 context, the client gets blocked by the abandoned transaction left in the context of c3. This can cause another timeout.
  • Scenario 3: A client wants to read T1, gets c4, clears the previous context and ends up blocked. At the same time another client wants to read T1. Now CP1 serves c3 to that client. The client clears the connection's previous context, which includes rolling back the abandoned transaction. Both clients can now read T1.

MARS Enabled Connections may affect SQL Server behavior in the case of a query timeout since the technology changes the scope of a transaction – Batch scoped transactions. This existing technology is something I would like to blog about in one of my future posts.

There are many scenarios in which a query timeout can cause confusion and unwanted results. There are two main approaches to resolve this: a) more thorough coding on the client side, which includes proper error handling and the use of connection transactions; b) using XACT_ABORT ON on the Sequel side, which will take care of the abandoned transactions.

My personal preference is not to rely on the Client’s code and to always use XACT_ABORT set to ON when needed. During more than 10 years of database programming, I have worked with only a handful of web programmers who were interested in this matter. In most cases, their perception of a database is a simple data store, preferably “controlled” by an ORM framework. After a certain period of production life, these “Agile products” usually become slow, not responsive, and sometimes unpredictable. The Connection pooling/query timeout problem is one of the unwanted scenarios. From our(DB Devs) perspective this may not necessarily be a “bad thing” since we all get a “good dollar value” for repairing the mess afterward… but this can be an inspiration for another, more philosophical post  🙂

Thanks for reading.

Dean Mincic

Semi Joins, anti-joins and Nulls in Sql Server


Summary

SQL Joins are table operators(binary operations in Relational Algebra) used to combine columns from one or more tables. The expression(predicate) that defines the columns which are used to join the tables is called Join Predicate. The result of a join is a set (relational database implementation of a set).
ANSI standard recognizes five types of joins: Inner Join, Left Outer Join, Right Outer Join, Full Outer Join, and Cross Join.
Joins are typically used to retrieve data from the normalized tables in a relation, e.g. one-to-many, zero-one-to-many, etc., usually with an equality predicate between primary and foreign key columns.
One of the most complex tasks for the Query Optimiser is “join ordering” i.e. finding the optimal join sequence when constructing the execution plan(a query requesting data from n tables requires n-1 joins)

Semi-join is one of a few operators in relational algebra that does not have representation in Tsql language. Some of the  “missing” operators are:

  • Semi join
  • Anti-join (anti-semi-join)
  • Natural join
  • Division

Semi-join is a type of join whose result set contains only the columns from one of the “semi-joined” tables. Each row from the first table (the left table in a Left Semi Join) is returned at most once, no matter how many times it is matched in the second table. Duplicate rows from the first table are all returned, as long as each has at least one match in the second table.
Below is the pseudo-code representation of the above statement.

SemiJoinsPCode

Anti-semi-join will do the exact opposite. The join will select any rows from the first table that do not have at least one matching row in the second table.
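
In case the pseudo-code image above does not render, the semantics can also be sketched directly in T-SQL. This is only an illustration using two throw-away table variables (EXISTS behaves as a left semi join, NOT EXISTS as a left anti semi join), not how the engine physically implements the operators.

DECLARE @left  TABLE (Name VARCHAR(20));
DECLARE @right TABLE (Name VARCHAR(20));

INSERT INTO @left  VALUES ('RAM'), ('RAM'), ('SSD'), ('Floppy');
INSERT INTO @right VALUES ('RAM'), ('RAM'), ('SSD');

-- Semi join: each left row is returned at most once per its own occurrence,
-- even though 'RAM' matches two rows on the right.
SELECT l.Name
FROM @left AS l
WHERE EXISTS (SELECT 1 FROM @right AS r WHERE r.Name = l.Name);
-- returns: RAM, RAM, SSD

-- Anti semi join: left rows with no match on the right.
SELECT l.Name
FROM @left AS l
WHERE NOT EXISTS (SELECT 1 FROM @right AS r WHERE r.Name = l.Name);
-- returns: Floppy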

SQL Server engine has three physical(showplan) operators that can be used to perform semi-join/anti-semi-join logical operations when recognized by the  Query Optimiser.

The table below maps the physical operators and the semi-join algorithms that they support.
PhysicalOpSemiJoins
*The full list of the SQL Server showplan operators can be found here.

There are a number of scenarios when Query Optimiser decides to implement a semi-join algorithm to optimize query requests. Typically, the logical operations that represent semi-joins are: IN, NOT IN, EXISTS, NOT EXISTS. The EXCEPT and INTERSECT set operators may use the same physical, Semi Join operators to perform different logical operations.

The following examples illustrate a few of these scenarios.

Set up the test environment:

CREATE DATABASE testSemiJoins
GO
USE testSemiJoins
GO

IF EXISTS (SELECT 1 FROM sys.tables t WHERE name = 'tab1')
    DROP TABLE dbo.tab1;
GO
CREATE TABLE tab1  (ProductTypeId INT,ProductName VARCHAR(100))
GO

IF EXISTS (SELECT 1 FROM sys.tables t WHERE name = 'tab2')
    DROP TABLE dbo.tab2;
GO
CREATE TABLE tab2  (ProductId INT,Name VARCHAR(100),StorageSizeGB SMALLINT)
GO
--insert some rows
INSERT INTO tab1
    SELECT tab.id,tab.name
    FROM (VALUES (1,'Hard disk')
                ,(1,'Hard disk')
                ,(2,'SSD disk')
                ,(3,'Flash Memory')
                ,(4,'EPROM')
                ,(5,NULL)
                ,(6,'Floppy Disk')
                ,(7,'RAM Memory')
                ,(7,'RAM Memory')
                ) tab(id,name)
GO
INSERT INTO tab2
    SELECT tab.id,tab.name,tab.sizeGb
    FROM (VALUES (1,'Hard disk',250)
                ,(1,'SSD disk',120)
                ,(2,'SSD disk',240)
                ,(3,'Flash Memory',8)
                ,(4,NULL,NULL)
                ,(5,'EPROM',0)
                ) tab(id,name,sizeGb)
GO
Tab1AndTab2

Left Semi Join

The Left Semi Join operator returns each row from the first (top table in the execution plan) input where there is at least one matching row in the second(bottom) input.

SELECT * 
FROM tab1 t1
WHERE t1.ProductName IN (SELECT t2.Name FROM dbo.tab2 t2)
--or using a correlated query (same execution plan)
SELECT * 
FROM tab1 t1
WHERE t1.ProductName IN (SELECT t2.Name FROM dbo.tab2 t2 where t2.Name = t1.ProductName)
LeftSemiJoin

Left Anti Semi Join

The Left Anti Semi Join operator returns each row from the first (top) input when there are no matching rows in the second (bottom) input.

SELECT ProductName FROM tab1
EXCEPT
SELECT Name FROM tab2
--or
SELECT * 
FROM tab1 t1
WHERE NOT EXISTS (SELECT t2.Name FROM dbo.tab2 t2 WHERE t1.ProductName = t2.Name)
--WHERE t1.ProductName NOT IN (SELECT t2.Name FROM dbo.tab2 t2)

LeftAntiSemiJoin

Both of the queries use the same physical operator – Loop Join( Left Anti Semi Join) to perform different logical operations. The second query performs a logical Left Anti Semi Join whereas the first query performs an operation based on the Difference of Sets  (Set A – SetB) operation.
Set operators EXCEPT, INTERSECT, and UNION treat NULL values as equal(non-distinct) whereas operator EXISTS evaluates NULL=NULL as FALSE(even if the 3VL result is UNKNOWN). More about the behavior here.
Also, it is worth mentioning that all set operators (except the multi-set operator UNION ALL) remove duplicates. 

Right Semi Join

The Right Semi Join operator returns each row from the second (bottom) input when there is a matching row in the first (top) input.

SELECT * 
FROM tab1 t1
WHERE  EXISTS (SELECT t2.Name FROM dbo.tab2 t2 WHERE t1.ProductName = t2.Name)
OPTION(HASH JOIN)
--or based on the Intersection Of Sets operation
SELECT ProductName FROM tab1
INTERSECT
SELECT Name FROM tab2 
OPTION(HASH JOIN)
-- When building the hash table, the hash match join chooses the table with fewer rows. 
-- In this example, that is the tab2 table.

RightSemiJoin

NOTE: The INTERSECT set operator treats NULL values as equal and therefore a NULL value appears in the result set(as being one of the  common values in tab1 and tab2) .

Right Anti Semi Join

The Right Anti Semi Join operator outputs each row from the second (bottom) input when a matching row in the first (top) input does not exist.

SELECT * 
FROM tab1 t1
WHERE NOT EXISTS (SELECT t2.Name FROM dbo.tab2 t2 WHERE t1.ProductName = t2.Name)
OPTION(MERGE JOIN)

RightAntiSemiJoin

Anti semi Joins, NOT IN, and NULLs

Again, things get “special” when it comes to working with NULLs.
If we execute the Left Anti Semi Join query again, but this time using NOT IN instead of NOT EXISTS, we will get an empty result set. This is one of the common logical errors that can cause unexpected results.

--NOT IN (does not work)
SELECT * 
FROM tab1 t1
WHERE t1.ProductName NOT IN (SELECT t2.Name FROM dbo.tab2 t2)

--correlated NOT IN (works ok)
SELECT * 
FROM tab1 t1
WHERE t1.ProductName NOT IN (SELECT t2.Name FROM dbo.tab2 t2 WHERE t2.Name = t1.ProductName)

--hardcoded NOT IN (does not work)
SELECT * 
FROM tab1 t1
WHERE t1.ProductName NOT IN ('Hard disk','SSD disk','SSD disk','Flash Memory',NULL,'EPROM')

--NOT EXISTS (works ok)
SELECT * 
FROM tab1 t1
WHERE NOT EXISTS (SELECT t2.Name FROM dbo.tab2 t2 WHERE t1.ProductName = t2.Name)

The image below shows the query execution plans, the predicate property of the significant physical operators, and the result sets.

AntiJoinNot_IN_NULL1

Just as a reminder, the Nested Loop algorithm compares each row from the OUTER table (tab1 in the example) to each row from the INNER table (tab2). Only the rows that satisfy the join predicate will be returned.

The query result sets should contain all rows from the first (top) input for which there are no matching rows in the second (bottom) input. The NULLs are treated as distinct. The results should be the same, but they are not. Why?

The answer lies in the way the operator NOT IN was implemented.

Query1 (NOT IN)
From the Nested Loop’s Predicate property we can see that the operator uses a sequence of OR logical expressions to implement the request.

t1.ProductName IS NULL
OR t2.Name IS NULL
OR t1.ProductName = t2.Name

To make the tab1 table rows qualify for the output, the results of all the expressions in the predicate must evaluate to FALSE.

Because of the NULL value in the tab2.Name column, for every outer row there is at least one inner row for which t2.Name IS NULL evaluates to TRUE. The predicate therefore evaluates to TRUE at least once per outer row, and the anti join produces an empty result set.

This is important to know to be able to avoid possible logical errors. One way to prevent the behavior is to set NOT NULL column property on the tab2.Name column. The other way is to use  ISNULL() function in the query to prevent NULL expressions.

SELECT * 
FROM tab1 t1
WHERE t1.ProductName NOT IN (SELECT ISNULL(t2.Name,'') FROM dbo.tab2 t2)

--first replace all NULLs with known values, e.g '' and then add NOT NULL column property or a CHECK constraint
UPDATE tab2
    SET Name = ''
WHERE Name IS NULL
GO

ALTER TABLE tab2
  ALTER COLUMN Name VARCHAR(100) NOT NULL
GO
--or
ALTER TABLE tab2
    WITH CHECK 
    ADD CONSTRAINT CK_NameNotNull CHECK(Name IS NOT NULL)
GO

Query2 (“Correlated” NOT IN)
The query uses the same execution plan as NOT IN, except now the Nested Loop’s Predicate property adds an extra AND expression to the sequence of OR expressions.

t2.Name = t1.ProductName
AND t1.ProductName IS NULL
OR t2.Name IS NULL
OR t2.Name = t1.ProductName

The same logic applies as with the first query. The predicate(as a list of the logical expressions) must evaluate to FALSE for the tab1 rows to qualify for the output. As before, at least one of the expressions on the left side of the AND operator will always evaluate to TRUE (t2.Name IS NULL is TRUE for every iteration). However, the additional expression (the one on the right side of the AND operator) may evaluate to FALSE making the whole predicate FALSE and allowing the qualified rows to be returned.

The query returns the correct result.

Query3 (“Hard-coded” NOT IN)
From the lack of the Constant Scan Predicate property in the execution plan, we can conclude that the Query Optimiser has decided to return an empty result set knowing that at least one of the “hard-coded, NOT IN values” is NULL.
To get the predicate property and analyze its logic, we can run the same query, but this time without the NULL value in the NOT IN list  i. e.

SELECT * 
FROM tab1 t1
WHERE t1.ProductName NOT IN ('Hard disk','SSD disk','SSD disk','Flash Memory','EPROM')

The Constant Scan operator has the Predicate property now.

LeftAntiJoinNOTIN_HC

The predicate is constructed of n not-equal expressions (n is the number of distinct values in the NOT IN list, in this case 4) and n-1 AND operators.

t1.ProductName <> 'EPROM'
AND t1.ProductName <> 'Flash Memory'
AND t1.ProductName <> 'Hard disk'
AND t1.ProductName <> 'SSD disk'
-- AND t1.ProductName <> NULL --if we had NULL here, the predicate would evaluate to UNKNOWN for every row, resulting in an empty result set.

There are a couple of interesting points here. The first one is the result set.

The query excludes t1.ProductName  = NULL from the result set. The reason for excluding the NULL  is because of the way “hard coded NOT IN” was implemented.
The NULL values from the left table will be excluded, whereas the NULL value(s) in the NOT IN list will cause an empty result set.
In this example, the algorithm excludes tab1.ProductName NULL value by evaluating predicate as UNKNOWN i.e

NULL <> 'EPROM'  (1)
AND NULL <> 'Flash Memory'  (2)
AND NULL <> 'Hard disk'  (3)
AND NULL <> 'SSD disk'  (4)

NULL (1) AND NULL (2) AND NULL (3) AND NULL (4) is NULL

It is good to know the differences in the results to prevent possible logical errors.

The other interesting thing is that the number of expressions in the predicate directly depends on the number of hard-coded literals. The question is: is there a maximum number of literals that can be processed? The number is related to the maximum string length of a batch containing SQL statements (batch size). The Maximum Capacity Specifications for SQL Server can be found here. Another limit may be running out of stack size (stack overflow), causing the errors 8623 and/or 8632.

Query4(NOT EXISTS) 
The NOT EXISTS operator works straightforwardly here. All rows from tab1 for which the correlated predicate t1.ProductName = t2.Name never evaluates to TRUE for any tab2 row will be returned. NULLs are treated as distinct.
The query returns the correct result.

Conclusion

Semi-join is a join that has no representation in tsql. The result of the operation contains only columns from one of the tables. All returned rows from the first table must be matched at least once in the second table. The rows from the first table will be returned only once even if the second table contains more than one match. Semi-joins are usually implemented using IN or EXISTS operators. Any of the three physical operators that SQL Server uses to perform joins can be used to implement Semi Joins. The set operators EXCEPT and INTERSECT may use the same Semi Join physical operators to perform set operations.
Logical Anti semi join operation may be implemented using NOT IN or NOT EXISTS operators. The NOT IN operator may cause unexpected results when it operates on the tables that contain NULL values in the joined columns.

Thanks for reading.

Dean Mincic

Orphaned DB Users and SQL Server logins


One of the common tasks in the DB community is moving databases from one SQL server instance to another.
For simple environments, it could be just a matter of backing up the database on one server/instance and restoring it on another.

Sometimes, even if we successfully restore the database on a new server, create all necessary logins and change the application connection string to point to the new instance we still get the Database not accessible or Login failed error messages.

The common reason for the errors is the disconnection between the DB users and their logins – The orphaned DB users.
The technique we usually use to re-connect the users is to call a system-stored procedure that will do the Login-DbUser remapping.

-- Re-connects an orphaned db user "db_user1" with login "user1"
EXEC sp_change_users_login 
            @Action='update_one'
           ,@UserNamePattern='db_user1' 
           ,@LoginName=user1;
GO

Why does the problem occur in the first place, and what can we do to prevent it? The following is a little bit of theory and a couple of examples that illustrate the problem.

In SQL Server, there are eleven different database user types. The type I am focusing on today is Db User based on the SQL Server authenticated Login. The complete list can be found here.

Every principal (an entity that can request SQL Server resources) has two main properties:

  • Principal ID or ID
  • Security ID or SID

The scope of influence of a principal depends on the level that the principal operates on e.g SQL Server Level principals operate on an SQL Server Instance level while the Database level principals operate on a database level.

  • Logins are principals whose scope is SQL Server instance
  • DB Users are principals whose scope is Database
DB Users –  based on SQL Server Logins

When we create an SQL Server  Login,  SQL Server assigns ID and SID to the created principal.

  • ID – (INT) -Uniquely identifies Login as an SQL Server securable (resource managed by SQL Server). The ID is generated by SQL Server
  • SID – (VARBINARY(85)) –  Uniquely identifies the security context of the Login. The security context depends on how the identifier was created. The Login  SIDs can be created by :

    • Windows User/Group. The SID will be unique across the User/group domain and will be created by the domain.
    • SQL Server. The SID is unique within SQL Server and created by SQL Server – used in the example below.
    • Certificate or asymmetric key. (cryptography in SQL Server will be covered in one of the following posts)

The information about the Logins is stored in the master database

The following code will create a few SQL objects to illustrate the problem.

-- Create a couple of test databases
USE master
GO

CREATE DATABASE TestOrphanedUsersDB_1
GO
CREATE DATABASE TestOrphanedUsersDB_2
GO

--Create Sql Server Logins
CREATE LOGIN Login1 
    WITH PASSWORD='abc'
        ,CHECK_POLICY = OFF
GO
CREATE LOGIN User2 --the login name is User2 :)
  WITH PASSWORD ='abc'
      ,CHECK_POLICY = OFF
GO

The logins are stored in the master database and can be viewed using sys.server_principals system view:

SELECT  name         AS [Login Name]
       ,principal_id AS ID
       ,[sid]        AS [SID]
       ,[type_desc]  AS [Login Type]
FROM master.sys.server_principals
WHERE name IN ('Login1','User2')

Query Results (the identifiers may be different on different PCs)
Logins_srvPrincipals

Now we need to create a few database users.

As mentioned before, there are eleven different types of database users. For this exercise, we’ll create DB users based on  SQL server logins.

--Create database users for the Sql Logins.
--The User SIDs will match the Login SIDs
USE TestOrphanedUsersDB_1
GO
CREATE USER User1  
    FOR LOGIN Login1 --User1(SID) = Login1(SID)
GO
CREATE USER User2
    FOR LOGIN User2
GO

USE TestOrphanedUsersDB_2
GO
CREATE USER User1-- The user names are unique on a database level
    FOR LOGIN Login1

SQL Server has assigned the Principal IDs and Security IDs to the newly created users.

  • ID – (INT) -Uniquely identifies DB users as a database securable.
  • SID – (VARBINARY(85)) – Uniquely identifies the security context of the User. The security context depends on how the identifier was created – In the example the DB Users security context depends on the Logins and therefore the User SIDs will match the Login SIDs.

The information about the database users is stored on the database level and can be viewed using the sys.database_principals system view.

USE TestOrphanedUsersDB_1
GO
SELECT  name           AS [DbUser Name]
       ,principal_id   AS [Principal ID]
       ,[sid]          AS [SID]
       ,[type_desc]    AS [Db User Type]
FROM sys.database_principals
WHERE name IN ('User1','User2')
Users_DB1

..and for the second database…

USE TestOrphanedUsersDB_2
GO
SELECT  name           AS [DbUser Name]
       ,principal_id   AS [Principal ID]
       ,[sid]          AS [SID]
       ,[type_desc]    AS [Db User Type]
FROM sys.database_principals
WHERE name IN ('User1','User2')
Users_DB2

The diagram below shows the relationship between Logins and Users.

LoginsUsers

Image 1, Logins/Users mapping

Case Scenario:
An application uses two databases, TestOrphanedUsers_1 and TestOrphanedUsers_2. We decided to move the application’s backend to a new instance by backing up and restoring the two on the new server.
The restored databases contain all of the previously defined DB users since the principals are a part of the databases. The original server logins were not transferred because they belong to the original master database.
At this stage, the logins are not mapped to the users and the application is not able to access the backend.

To simulate the scenario, we’ll remove the previously created logins.

 USE master
 GO
 DROP LOGIN Login1
 GO
 DROP LOGIN User2
 GO

If we removed the logins using SSMS UI, we would get a message

DropLoginMsgSSMS

The users left in the databases are now called “Orphaned Users”. The users without the corresponding logins cannot be used to access the databases. This situation mimics the database restore on a new SQL Server instance.
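
A deprecation-free way to spot the orphaned SQL users is to compare the database user SIDs with the server login SIDs directly. The query below is only a sketch (the principal_id filter simply excludes built-in users such as dbo and guest):

USE TestOrphanedUsersDB_1;
GO
-- SQL-authenticated database users whose SID has no matching server login
SELECT  dp.name  AS [Orphaned DB User]
       ,dp.[sid] AS [User SID]
FROM sys.database_principals AS dp
    LEFT JOIN sys.server_principals AS sp
        ON sp.[sid] = dp.[sid]
WHERE dp.[type] = 'S'       -- SQL user
  AND dp.principal_id > 4   -- skip dbo, guest, INFORMATION_SCHEMA, sys
  AND sp.[sid] IS NULL;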

The next step is to create new logins. At this stage, we can do two things.

  • Create new logins (as we did before). SQL server’s engine will assign new SIDs to the logins. These identifiers will not match the existing user SIDs and consequently, we’ll have to remap the Logins to the Users (to make SIDs match). To make a match, the process will replace the old user SIDs with the new Login ones.
--Create Sql Server Logins
USE master
GO
CREATE LOGIN Login1 
    WITH PASSWORD='abc'
        ,CHECK_POLICY = OFF
GO
CREATE LOGIN User2
  WITH PASSWORD ='abc'
      ,CHECK_POLICY = OFF
GO

Using the principals’ metadata we can see the mismatch between SIDs .

Login SIDs:
Logins_srvPrincipalsNEW_notMatch

To restore the previous mapping (see Image 1) we need to remap the orphaned users as:
Logins_UsersMapping

.. using the sys.sp_change_users_login system stored procedure.

--list all orphaned users
EXEC TestOrphanedUsersDB_1.sys.sp_change_users_login 
    @Action ='report'
GO
EXEC TestOrphanedUsersDB_2.sys.sp_change_users_login 
    @Action ='report'

-- remap
EXEC TestOrphanedUsersDB_1.sys.sp_change_users_login 
            @Action='update_one'
           ,@UserNamePattern='User1' 
           ,@LoginName=Login1
GO
EXEC TestOrphanedUsersDB_1.sys.sp_change_users_login 
            @Action='update_one'
           ,@UserNamePattern='User2' 
           ,@LoginName=User2
GO

This will replace the user SIDs with the new Login SIDs
The stored procedure supports the Auto_Fix action type that can be used in a specific scenario in which we create a missing login with the same name as the orphaned user the login was created for.
More information about the procedure can be found here.

NOTE: SQL Server 2016 is the last database engine version to support the sys.sp_change_users_login procedure.
Microsoft recommends using ALTER USER instead.

--remap User1(TestOrphanedUsersDB_2) to login Login1
USE TestOrphanedUsersDB_2
GO
ALTER USER User1
	WITH LOGIN=Login1
GO
  • Create the logins explicitly specifying their SIDs to match the DB user SIDs.
USE master
GO
--Create Sql Server Logins
CREATE LOGIN Login1 
    WITH PASSWORD='abc'
        ,CHECK_POLICY = OFF
        --copied from TestOrphanedUsersDB_1.User1 or TestOrphanedUsersDB_2/User1
        ,SID =0X043C965331B69D46B3D6A813C9238090 
GO
USE master
GO
CREATE LOGIN User2 --the login name is User2 :)
  WITH PASSWORD ='abc'
      ,CHECK_POLICY = OFF
      --copied from TestOrphanedUsersDB_1.User2
      ,SID=0XC871212ABD68D04998E89480285DDE70
GO

Now we can test the mapping using the Logins to access the databases.

Conclusion:

The database users created for(or based on) SQL Server authenticated logins must have a valid link to the logins. The link is the Security identification(SID) varbinary.
Due to different scopes of the principals, DB Users, when restored on a different server, may become disconnected(orphaned) from the corresponding Logins(the new logins may have new SIDs that don’t match the original DB User SIDs). In this situation, the applications are not able to connect to the database.
To remap the principals we can use two approaches: use the system sp sys.sp_change_users_login (or ALTER USER ... WITH LOGIN) to change the DB user SIDs to match the new Login SIDs, or create the new Logins using the original DB user SIDs.
It’s worth mentioning that the latter may cause the “Supplied parameter sid is in use” error if the specified SID is already in use by an existing Login.

Thanks for reading.

Dean Mincic

Three valued logic – NULL in Sql Server


In SQL, logical expressions (predicates) can evaluate to TRUE, FALSE and UNKNOWN. The third result is unique to the “SQL world“ and is caused by the NULL value. This is called Three Valued Logic (3VL). NULL represents a missing or UNKNOWN value.

The logic could be confusing because Boolean, two valued logic expressions can evaluate only to TRUE or FALSE – a comparison between two KNOWN values gives us TRUE or FALSE.

With NULL as a mark or a placeholder, we have an UNKNOWN value, so the question is how to compare a known with an unknown value or an unknown with an unknown.

Logical expressions with NULLs can evaluate to TRUE, FALSE or UNKNOWN, e.g.

SELECT IIF(1=NULL,'true','unknown')      AS  [1 = NULL]
      ,IIF(NULL=NULL,'true','unknown')   AS  [NULL = NULL]
      ,IIF(NULL+22<=33,'true','unknown') AS  [NULL+22 <= 33]

nullExpressionResult

The result of NULL negations can be confusing too

  • The opposite of True is False (NOT True = False)
  • The opposite of False is True (NOT False = True)
  • The opposite of UNKNOWN is UNKNOWN (NOT NULL = NULL)

Conjunctive (AND) predicates with NULLs evaluate to:

  • True  AND NULL  => NULL
  • False AND NULL => False

and disjunctive (OR) predicates with NULL evaluate to (both truth tables are verified in the sketch after this list):

  • True OR NULL => True
  • False OR NULL => NULL
  • NULL OR NULL => NULL
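
The short script below verifies the truth tables above. It is only a sketch: IIF's ELSE branch fires for both FALSE and UNKNOWN, which is exactly how query filters treat an UNKNOWN result.

SELECT  IIF(1=1 AND NULL=1, 'true', 'false/unknown')   AS [True AND NULL]   -- false/unknown (UNKNOWN)
       ,IIF(1=2 AND NULL=1, 'true', 'false/unknown')   AS [False AND NULL]  -- false/unknown (FALSE)
       ,IIF(1=1 OR  NULL=1, 'true', 'false/unknown')   AS [True OR NULL]    -- true
       ,IIF(1=2 OR  NULL=1, 'true', 'false/unknown')   AS [False OR NULL]   -- false/unknown (UNKNOWN)
       ,IIF(NULL=1 OR NULL=2, 'true', 'false/unknown') AS [NULL OR NULL];   -- false/unknown (UNKNOWN)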

The logic can be confusing and it has been the source of ongoing discussions about NULLs for a long time.
The important thing is to put the logic in the context of the DB design and to understand how Sql Server treats NULLs in different scenarios. Knowing that, we can benefit from using the UNKNOWNs and avoid possible logical errors.

Sql Server is INCONSISTENT when it evaluates logical expressions with NULL values – the same expression may be treated differently (as TRUE, FALSE and sometimes UNKNOWN) depending on the context.
The following examples show how SqlServer deals with NULLs in different elements of the T-SQL language.

 QUERY FILTERS (ON,WHERE,HAVING)

Create a couple of test tables:

CREATE TABLE #Products(ProductId INT IDENTITY(1,1)
                           PRIMARY KEY CLUSTERED
                       ,Name VARCHAR(500) NOT NULL
                       ,Price DECIMAL(8,2) NULL CHECK(Price > 0.00)           
                       ,ProductCategoryId INT NULL)
GO
CREATE TABLE #ProductCategories(ProductCategoryId INT
                                   UNIQUE CLUSTERED
                               ,Name VARCHAR(500) NOT NULL)
GO

..and add some data

INSERT INTO #Products(Name,Price, ProductCategoryId)
     SELECT tab.* 
     FROM (VALUES ('Lego',65.99,200)
                 ,('iPad',750.00,300)
                 ,('Gyroscope',NULL,NULL)
                 ,('Beer',3.55,NULL)
                 ,('PC',2500.00,300)) tab(name,price,catid)
GO
INSERT INTO #ProductCategories(ProductCategoryId,Name)
    SELECT tab.*
    FROM (VALUES (200,'KidsGames')
                ,(300,'Gadgets')
                ,(NULL,'Liquor')
                ,(500,'Camping')) tab(id,name)
GO

Test1: Select all the products over $500.00.
The product with the UNKNOWN(NULL) price will be excluded from the result-set as it was treated as FALSE. Only these rows for which the predicate evaluates to TRUE will be included in the final result-set.

SELECT * 
FROM #Products 
WHERE Price  > 500

Null1

We’ll get a similar result if we try to select all products with a known (not NULL) price like this:

SELECT * 
FROM #Products 
WHERE Price <> NULL;

The result will be an empty set. Price <> NULL evaluates to UNKNOWN and is treated as FALSE.
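
The correct way to ask for the products with a known price is the IS NOT NULL predicate, which is not subject to three-valued logic:

SELECT *
FROM #Products
WHERE Price IS NOT NULL;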

The UNKNOWN result is treated as FALSE when evaluated by ON filter.

SELECT p.*,pc.* 
FROM #Products p
  INNER JOIN #ProductCategories pc
    ON p.ProductCategoryId = pc.ProductCategoryId

As we can see from the result-set, the product “Gyroscope” is excluded from the result as NULL = NULL evaluated to UNKNOWN and was treated as FALSE.

Null2

CHECK Constraint

In the case of a CHECK constraint, SQL Server treats the UNKNOWN result as TRUE.
The product “Gyroscope” has an UNKNOWN price. The CHECK constraint on the Price column rejects only the values for which the comparison with 0 evaluates to FALSE; an UNKNOWN result is allowed. In this case Sql Server evaluated:

NULL > 0 => UNKNOWN, treated as TRUE

as opposed to the previous example where

NULL > 500 => UNKNOWN, treated as FALSE

To confirm the CHECK constraint behavior we can insert a new product into #Products table without getting a Check constraint error:

INSERT INTO #Products(Name,Price, ProductCategoryId)
    SELECT 'Microsoft Surface Pro',NULL,300

This can make sense if we put the NULL logic into the context of the DB design, e.g. all products must have a price greater than $0, or NULL if the price hasn't been decided yet.

Null3

UNIQUE Constraint and Set operators

It was mentioned earlier that  NULL = NULL evaluates to UNKNOWN. This means that NULLs are distinct( no two NULLs are equal)

The UNIQUE constraint treats NULL as a distinct value with respect to all the other values in the column, i.e. ProductCategoryId can hold only one NULL value along with the distinct integers. However, by allowing only a single NULL value (inserting another NULL value will throw a Violation of UNIQUE KEY constraint error), the constraint effectively treats all UNKNOWN values (NULLs) as equal, i.e. not distinct.

INSERT INTO #ProductCategories
        SELECT NULL, 'Musical Instruments'

UniqueConstraintError

The table #ProductCategories has an UNIQUE constraint (enforced by a clustered index) on ProductCategoryId column.

The SET operators UNION, UNION ALL* , EXCEPT,  INTERSECT

Sql Server’s  Set operators treat NULLs as non distinct, in other words  as equal.
e.g. Select the rows from tables @Tab1 and @Tab2 that are common for both tables.

DECLARE @Tab1 TABLE (ID INT, Val VARCHAR(5))
DECLARE @Tab2 TABLE (ID INT, Val VARCHAR(5))

INSERT INTO @Tab1(ID,Val)
    SELECT t1.id, t1.Val
    FROM (VALUES (1,NULL),(2,NULL),(3,NULL),(4,'Dean'),(5,'Peter'),(6,'Bill')) t1(id,Val)

INSERT INTO @Tab2(ID,Val)
    SELECT t2.id, t2.Val
    FROM (VALUES (1,NULL),(2,NULL),(3,NULL),(4,'Dean'),(5,'Bahar'),(6,'Nick')) t2(id,Val)

SELECT ID AS keyValue, Val AS Value FROM @Tab1
INTERSECT 
SELECT ID,Val FROM @Tab2

NullsIntersection

As we can see from the result-set, the SET operator treats NULL values as equal, e.g. Element (1,NULL) = Element (1,NULL) and therefore belongs to the intersection (“the red zone” in the image), whereas e.g. Element (5,'Bahar') <> (5,'Peter') and was therefore left out of the result-set.

*UNION ALL is a multi set operator that combines two sets into one non distinct set (takes all elements from both sets)

 Aggregate Functions

All Aggregate functions except COUNT(*) ignore NULLs.

DECLARE @Tab3 TABLE (ID INT NULL)
INSERT INTO @Tab3
  SELECT t.* 
  FROM (VALUES(1),(2),(NULL),(NULL),(3) ) t(i)

  SELECT  AVG(ID)   AS [AVG(ID)]
         ,COUNT(ID) AS [COUNT(ID)]
         ,SUM(ID)   AS [SUM(ID)] 
         ,MAX(ID)   AS [MAX(ID)]
         ,MIN(ID)   AS [MIN(ID)]
         ,'***'AS [***],COUNT(*) AS [COUNT(*)]
          from @Tab3

NullAgg

Note: COUNT(col1) eliminates NULLs just like the other aggregate functions. However, if we try something like SELECT COUNT(NULL+1), the function returns 0, eliminating the result of the expression, which is NULL 🙂
Another interesting thing is that SELECT COUNT(NULL) throws an error – the NULL literal does not have a data type (COUNT() always does an implicit conversion to INT), while e.g. SELECT COUNT(CONVERT(varchar(max),NULL)) returns 0 – now COUNT() has a data type to work with.
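
The three behaviors mentioned in the note can be reproduced with a short sketch (the failing statement is left commented out):

SELECT COUNT(NULL + 1) AS [COUNT(NULL+1)];                          -- 0, the expression is NULL and gets eliminated
SELECT COUNT(CONVERT(VARCHAR(MAX), NULL)) AS [COUNT(CONVERT NULL)]; -- 0, the NULL now has a data type
--SELECT COUNT(NULL);  -- error: the NULL literal has no data type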

Cursors (Order by)

When a query requests sorting, the result-set becomes a cursor – sets are not ordered. Sql Server sorts NULL values before all known values when the sort direction is ascending (a descending sort order puts the NULL values last).

SELECT * 
FROM #Products
ORDER by ProductCategoryId

NullOrder

*Just a small digression. One way to show NULL values last when  sorting in Ascending direction is to use tsql’s ability to sort record-set by non-selected columns and/or expressions.

SELECT *  ---,[NullsLast]= IIF(ProductCategoryId IS NULL,1,0)
FROM #Products
ORDER BY  CASE 
            WHEN ProductCategoryId IS NULL THEN 1
            ELSE 0
          END 
          ,ProductCategoryId

The NULL expressions results can explain the “unexpected” results  when used in anti-semi joins, but that will be covered in a separate post.

Conclusion

Three-valued logic is unique to the DB world. The ANSI SQL Standard provides rules/guidelines about NULL values. However, different RDBMSs may handle NULLs differently in different situations. It is important to understand how Sql Server treats NULL values in different elements of the T-SQL language. Using NULLs in the context of DB design can make 3VL work for you.

Thanks for reading.

Dean Mincic