Category Archives: SQL Server Administration

Data providers and User Options


Summary

Some time ago I decided to write a quick post about ANSI_WARNINGS, one of Sql Server's user options. Halfway through, I discovered many interesting things around the mechanisms that set the Client connection/session user settings. That was definitely more fun to research and blog about 🙂
Sql Server's configuration settings include a group of settings called User options. Those options, along with some other settings, define the user connection environment related to query processing. The options define e.g. how queries handle 3VL (Three Valued Logic – ANSI_NULLS) or how they enforce the atomicity of explicit transactions (XACT_ABORT) etc.
There are a few levels where the options can be set: Server, Database and session level. The values can also be independently set by the Sql Server engine and the data providers after a Client connection is successfully established. Different Db tools like SSMS may have their own settings on top of the previously mentioned ones. This blog aims to shed some light on the processes that change the connection/session user option settings.

SQL Server Configuration Settings overview

Sql Server’s configuration settings can be set on three different levels.

  1. Server configuration settings
  2. Database configuration settings
    2.1 Connection settings set by Sql Server/Data providers during the Database Login process/after the Client connection is successfully established.
  3. SQL Server session(connection) configuration settings (SET statements on a session level)

For most of the settings, the values can be set on more than one level. The overlap introduces the precedence of the values set on the lower levels over those set on the higher levels, e.g. the QUOTED_IDENTIFIER setting value defined on the session level (3) overrides the same setting value on the database (2) and/or Server level (1).
The concept may seem confusing (this is only my personal feeling), since there are a number of different ways to assign values to the same settings, on a few different levels.
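
For example, a session level SET statement wins over the database level setting. A minimal sketch (assuming the TestDB database used later in this post):

ALTER DATABASE TestDB SET QUOTED_IDENTIFIER OFF;
GO
USE TestDB;
GO
SET QUOTED_IDENTIFIER ON;                    -- session level override
SELECT SESSIONPROPERTY('QUOTED_IDENTIFIER'); -- returns 1 (ON)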

Some of the settings can only be set on certain levels, e.g. Max worker threads (configures the number of worker threads that are available to SQL Server processes) can be set only on the server (instance) level, and AUTO_CREATE_STATISTICS (if not already available, the Query Optimizer creates statistics on individual columns used in a predicate) can only be set on the database level*, as sketched below.
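
A quick sketch of the two examples above (note that Max worker threads is an advanced option, so 'show advanced options' has to be enabled first):

-- AUTO_CREATE_STATISTICS: database level only
ALTER DATABASE TestDB SET AUTO_CREATE_STATISTICS ON;

-- Max worker threads: server(instance) level only
EXEC sys.sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sys.sp_configure 'max worker threads', 0; -- 0 = dynamically managed
RECONFIGURE;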

NOTE: All database settings are inherited from the model db during the database creation.

As mentioned, this post will be focusing only on a sub-set of the server settings, the User Options settings.

User Options

Figure 1 below shows the categories of the user options and their values. The bit settings are presented as decimal numbers – this will be explained later in the blog.


Figure 1, User Options

User Options can be set up on three different levels: Server, Database and Session level.

Server level

Server level defines User Option settings for all Client connections. User options on this level can be managed in a few different ways.

  • sys.sp_configure system stored procedure
EXEC sys.sp_configure 
    @configname = 'user_options'


Figure 2, Default User Option(s)

The system stored procedure manages server configuration settings.
@configname is an option name or, in our case, the name of a group of options. config_value is a combination of the values presented in Figure 1. The default config_value is 0, representing the default state of the user settings – all set to OFF.

More about config_value

The config_value is a small, positive integer (a smallint). This means that the maximum value that can be used is 32768 – 1 = 32767, or 2^15 – 1. One bit is used for the integer sign.

The config_value is also a 2-byte (16-bit) bitmap used to store the user setting values. The first 15 bits are used to store the values (although the 1st one, DISABLE_DEF_CNST_CHK, has been discontinued since Sql Server 2012).

Let's say we want to turn ON the following user settings:

ANSI_WARNINGS  (decimal value 8)
ARITHABORT (decimal value 64) and
CONCAT_NULL_YIELDS_NULL (decimal value 4096)

Figure 3 is a graphical representation of the 2-byte bitmap and the corresponding binary and decimal values. The last bit (binary 2^15, decimal 32768) is not in use.

Figure 3, User Options – config_value

To turn ON the corresponding bits, we just add the decimal values: 8 + 64 + 4096 = 4168 (or use bitwise OR, i.e. SELECT (8 | 64 | 4096)). We then pass the value to the @configvalue parameter of the sys.sp_configure system stored procedure.

EXEC sys.sp_configure @configname = 'user_options'
                     ,@configvalue = 4168
RECONFIGURE;

This means that all subsequent client connections may have the settings turned ON. More on this in the following sections.
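
To verify which options a given config_value represents, we can test the individual bits with the bitwise AND operator. A quick sketch, using the value set above:

DECLARE @cfg int = 4168; -- the config_value set above

SELECT IIF(@cfg & 8    > 0, 'ON', 'OFF') AS ANSI_WARNINGS
      ,IIF(@cfg & 64   > 0, 'ON', 'OFF') AS ARITHABORT
      ,IIF(@cfg & 4096 > 0, 'ON', 'OFF') AS CONCAT_NULL_YIELDS_NULL;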

  • SSMS GUI, Server Settings/Properties/Connections

Another way to manage User Options on the server level is by using the SSMS tool. Figure 4 shows the user_options accessed through the UI.


Figure 4, User Options – SSMS

The default state of the user option values on the Server level  is all OFF.

Database level

Similar to the Server level user options, the database level user options define the user option values for all the Clients that connect to the particular database. The database settings are supposed to override the same settings set on the higher, server level.
Database level user options, as well as server level user options, are by default all set to OFF.
NOTE: The Database level user options become more interesting in the context of Contained Sql Server (2012+) databases or their Cloud counterpart, the Azure SQL Database.

To change user options on the database level, use the ALTER DATABASE statement.

ALTER DATABASE TestDB
 SET QUOTED_IDENTIFIER ON
     ,NUMERIC_ROUNDABORT ON 
GO

To check the current user option values we can use system functions or system views.

SELECT DATABASEPROPERTYEX('TestDB','IsNumericRoundAbortEnabled')

--OR

SELECT [name]
      ,is_ansi_warnings_on
      ,is_ansi_padding_on
      ,is_ansi_nulls_on
      ,is_ansi_null_default_on
      ,is_arithabort_on
      ,is_numeric_roundabort_on
      ,is_quoted_identifier_on
      ,is_cursor_close_on_commit_on
      ,is_concat_null_yields_null_on
FROM sys.databases 
WHERE [name] = N'TestDB'

Figure 5 shows a sub-set of the User Options available on the database level.

Database user options settings
Figure 5, Database level user options

The following user options are available on the server level but not on the database level.
– ARITHIGNORE
– NOCOUNT
– XACT_ABORT
– IMPLICIT_TRANSACTIONS

The database level ANSI NULL Default option (ANSI_NULL_DEFAULT) represents both of the Server level options ANSI_NULL_DFLT_ON and ANSI_NULL_DFLT_OFF.
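
The effect of the two options is easy to demonstrate: whichever of them is ON decides the default nullability of a column created without an explicit NULL/NOT NULL. A minimal sketch, using the hypothetical tables t1 and t2:

SET ANSI_NULL_DFLT_ON ON;        -- implicitly sets ANSI_NULL_DFLT_OFF to OFF
CREATE TABLE dbo.t1 (col1 int);  -- col1 is created as NULLable

SET ANSI_NULL_DFLT_OFF ON;       -- implicitly sets ANSI_NULL_DFLT_ON to OFF
CREATE TABLE dbo.t2 (col1 int);  -- col1 is created as NOT NULL

SELECT OBJECT_NAME([object_id]) AS tableName
      ,[name]
      ,is_nullable
FROM sys.columns
WHERE [object_id] IN (OBJECT_ID('dbo.t1'), OBJECT_ID('dbo.t2'));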

By default, server level user option settings should be overwritten by the corresponding database level settings. However, for most of the options this is not true. The data providers responsible for establishing connections between client applications and the database will override the settings defined on both levels.

Connection settings set by Sql Server/ Data provider

I was not able to find many white papers that cover the Sql Server Data providers' role in configuring user options. The following conclusions are based on my personal research and experiments.

For a start, let's set up a simple Extended Events session that reports on the following events:

  • Login event. The main Event field of interest will be the option_text field.
  •  sql_batch_starting and sql_batch_completed – sql_text field
  •  sql_statement_starting and sql_statement_completed – sql_text field

The last two events will be used in the subsequent experiments.


Figure 6, Login event

Now, let’s connect to a test database using a simple powershell script and observe the Login event attributes

## Create a simple connection to Sql Server using .NET Framework Data Provider for Sql Server

$sqlConn = New-Object System.Data.SqlClient.SqlConnection;

$sqlConn.ConnectionString = @('Data Source=tcp:(local);' `
                             +'Initial Catalog=TestDB;' `
                             +'Integrated Security=True;' `
                             +'Pooling=False;' `
                             +'Application Name=TestDataProviderConnSettings;');

## Open and Close the connection. 
$sqlConn.Open(); ##This is where we capture the Login event
$sqlConn.Close();   
$sqlConn.Dispose();


Figure 7, options_text field value

By analysing the output we can observe a couple of interesting things.

  • The Server/Database user settings have been overwritten by new values, e.g. ANSI_PADDING is originally set to OFF on both levels and then turned ON during the SqlConnection.Open() process.
  • Some other connection settings are set as well, e.g. dateformat, transaction isolation level etc.

In addition, I compared the user option values set by different data providers:

  • .NET Framework Data Provider for Sql Server/ODBC/OLEDB (pshell script)
  •  ODBC v3 and v7 (Python and sqlCmd)
  •  Microsoft JDBC Driver 7.0 for SQL Server (java app)

The results were always exactly the same – the identical set of user option values.

Now, the question I was trying to answer was: what was the process that set the values presented in Figure 7? The option_text event field description says:
“Occurs when a successful connection is made to the Server. This event is fired for new connection or when connections are reused from a connection pool”
This may suggest that the Data Providers are responsible for setting the option values at some stage during the login process.
To further investigate this, I ran a PowerShell script similar to the one used in the previous test. This time I used the ODBC Data Source Administrator (for the 32bit environment) to trace the ODBC communication (function calls) between the Client app and Sql Server.

## Create a simple connection to Sql Server using .NET Framework Data Provider for ODBC
$sqlConn = New-Object System.Data.Odbc.OdbcConnection;
$sqlConn.ConnectionString =  @('Driver={SQL Server Native Client 11.0};' `
                             +'Server=tcp:(local);' `
                             +'Database=TestDB;' `
                             +'Trusted_Connection=yes;');
##Note: Connection pooling is disabled for all ODBC connections
##      see odbcad32.exe (or odbcad64.exe) Connection Pooling Tab


## Open and Close the connection
$sqlConn.Open();
$sqlConn.Close();   
$sqlConn.Dispose();


Figure 8, ODBC Administrator – Trace ODBC function calls

The trace did not show any ODBC function calls that set up the user options captured by the Login event. It was a “nice and clean” log.
In the next experiment, I used a simple sqlcmd.exe script to connect to the same database. The Sqlcmd utility (at least ver. 14.0.1, the one I have on my laptop) uses ODBC Driver 13 for Sql Server. This is visible in the ODBC trace file – Figure 10.

C:\Users>sqlcmd -S tcp:(local) -d TestDB

Again, no “interesting” function calls during the login phase. However, there were two function calls (SQLSetStmtAttrW) that took place after the connection was successfully established.

Figure 9, Post Login events

Figure 10, ODBC Trace snapshot  – SQLSetStmtAttrW() function

The log shows that there were two sets of changes to the user options initiated by the data providers:

1. Pre-login/During the login process (captured by the Login event)
2. Post-login changes (captured by sql_batch/sql_statement events)

Post-login changes

In this experiment I've tried to consolidate and document the post-login user option values set by different providers. The idea is to execute a simple program, using different data providers, that will:

  1. connect to the test database
  2. execute a view that selects the user option values for the current session.
  3. output the results of the view
    e.g. C:\Users>sqlcmd -S tcp:(local) -d TestDB -q "select * from dbo.vwGetSessionSETOptions"

The view definition:

CREATE OR ALTER VIEW [dbo].[vwGetSessionSETOptions]
AS 
	SELECT TAB.OptionName
		  ,Setting = IIF(TAB.Setting = 0,'OFF','ON')
	FROM (VALUES ('DISABLE_DEF_CNST_CHK'    ,(1 & @@OPTIONS))	   --not in use
				,('ANSI_WARNINGS'           ,(8 & @@OPTIONS))	   --iso settings statements
				,('ANSI_PADDING'            ,(16 & @@OPTIONS))
				,('ANSI_NULLS'              ,(32 & @@OPTIONS))
				,('ANSI_NULL_DFLT_ON'       ,(1024 & @@OPTIONS))
				,('ANSI_NULL_DFLT_OFF'      ,(2048 & @@OPTIONS))
				,('ARITHABORT'              ,(64 & @@OPTIONS))	   --query execution statements
				,('ARITHIGNORE'             ,(128 & @@OPTIONS))
				,('NUMERIC_ROUNDABORT'      ,(8192 & @@OPTIONS))
				,('NOCOUNT'                 ,(512 & @@OPTIONS))
				,('QUOTED_IDENTIFIER'       ,(256 & @@OPTIONS))	   --miscellaneous statements
				,('CURSOR_CLOSE_ON_COMMIT'  ,(4 & @@OPTIONS))
				,('CONCAT_NULL_YIELDS_NULL' ,(4096 & @@OPTIONS))
				,('XACT_ABORT'              ,(16384 & @@OPTIONS))  --transaction statements
				,('IMPLICIT_TRANSACTIONS'   ,(2 & @@OPTIONS))				
	) TAB(OptionName,Setting)
UNION ALL
	SELECT '-------','-------'
UNION ALL 
	SELECT  'PROGRAM_NAME: ' +[program_name] + '** CLIENT_INTERFACE_NAME : '+client_interface_name 
	       ,'CLIENT_VERSION: ' +CAST(client_version as varchar(3))
		FROM sys.dm_exec_sessions
		WHERE session_id = @@SPID
GO

The output shows the user options that would be applied to any batch, stored proc, function or ad-hoc query executed within the active connection (or the @@SPID session from SQL Server's perspective). The view also includes the data provider's details available through the sys.dm_exec_sessions dynamic management view.

Figure 11, Post-login user options values

From the result (the figure above may be an overkill 🙂 ) we can conclude that data providers can change some of the user settings, and that can produce unexpected results. Consider the scenarios below:

  • If we execute a Python script that e.g. inserts some rows into a table…
import pyodbc;
#import pytds 
conn = pyodbc.connect('Driver={SQL Server};'
                      'Server=tcp:localhost;'
                      'Database=TestDB;'
                      'Trusted_Connection=yes;'
                      'Pooling=false')

cursor = conn.cursor()
cursor.execute('INSERT INTO dbo.testImplicitTran(Id) SELECT 511')

.. the INSERT will NOT be committed, since the data provider has set IMPLICIT_TRANSACTIONS to ON. To see how SQL Server implements the post-login change, run the script along with the Extended Event Session – Figure 9.

  • If we execute an sqlcmd query to e.g. insert some values into a table that has a filtered index, we'll get the following error:
C:\Users>sqlcmd -S tcp:(local) -d TestDB -q "INSERT INTO dbo.testQIdentifierOFF SELECT 555"

Msg 1934, Level 16, State 1, Server ENETT-NB290, Line 1 INSERT failed because the following SET options have incorrect settings: 'QUOTED_IDENTIFIER'. Verify that SET options are correct for use with indexed views and/or indexes on computed columns and/or filtered indexes and/or query notifications and/or XML data type methods and/or spatial index operations.

Other data providers may change other settings, e.g. the dbLib data provider sets the ANSI_WARNINGS user option to OFF. This may cause logical errors that are difficult to debug, as sketched below.
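
A minimal sketch of the kind of silent error this can cause: with ANSI_WARNINGS OFF, string truncation no longer raises Msg 8152.

SET ANSI_WARNINGS OFF;

CREATE TABLE #t (val varchar(5));
INSERT INTO #t (val) SELECT 'ABCDEFGHIJ'; -- succeeds, the value is silently truncated
SELECT val FROM #t;                       -- returns 'ABCDE'
DROP TABLE #t;

SET ANSI_WARNINGS ON; -- with the option ON, the same INSERT fails with Msg 8152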

SSMS post login changes

The SSMS application performs its own set of post-login changes. The changes can be managed through the application UI on two levels:

  • Current session level (query editor) – the changes affect only the current session.
    • Right Click on the query editor area/ Query Options/Execution – ANSI
  • Application level – the changes affect all sessions.
    • Tools/Options/Query Execution/ SQL Server/ANSI

Developers should be aware of the fact that the same query executed from SSMS and from a Client application may behave differently and produce different results, e.g. the INSERT query executed through the Python script above would commit the changes if executed in SSMS.

Pre-login changes

As mentioned before (Figure 7), there are nine User settings set by the Data providers (or the Sql Server engine) during the login process. In the next experiment I'll try to show how the nine Server level settings get affected/overridden by the pre-login changes – loosely speaking 🙂

Initially, all server level user settings are turned OFF (Figure 2), and that includes the 9 options mentioned before.
Let's see what happens on the session level if we set ALL Server level user options to ON. For the experiment I'll use a PowerShell script, similar to the one used before, to connect to Sql Server.

## Create a simple connection to Sql Server using .NET Framework Data Provider for Sql Server
$sqlConn = New-Object System.Data.SqlClient.SqlConnection;

$sqlConn.ConnectionString = @('Data Source=tcp:(local);' `
                             +'Initial Catalog=TestDB;' `
                             +'Integrated Security=True;' `
                             +'Pooling=False;' `
                             +'Application Name=TestDataProviders;');

## Open and Close the connection
$sqlConn.Open();

###$sqlCmd = New-Object System.Data.OleDb.OleDbCommand;
$sqlCmd = $sqlConn.CreateCommand()
$sqlCmd.CommandText = "SELECT * FROM dbo.vwGetSessionSETOptions";

$result = $sqlCmd.ExecuteReader();

$table=New-Object System.Data.DataTable;
$table.Load($result);

$format = @{Expression={$_.OptionName};Label="Option Name";width=90},@{Expression={$_.Setting};Label="Setting"; width=50}

$table | Format-Table $format;   

$sqlConn.Close();   
$sqlConn.Dispose();

Before executing the script, let's change the server level user settings and check the effects of the changes through the SSMS GUI, Figure 4.

DECLARE @user_settingsBitMask SMALLINT;

SET @user_settingsBitMask = ( POWER(2,15) - 1 ) -- all 15 settings turned on
                            - 1024  -- turn off ANSI_NULL_DFLT_ON
                            - 64;   -- turn off ARITHABORT
EXEC sys.sp_configure
     @configname = 'user_options'
    ,@configvalue = @user_settingsBitMask; --31679
GO
RECONFIGURE;

NOTE: Some of the user settings are mutually exclusive.

  • ANSI_NULL_DFLT_ON and ANSI_NULL_DFLT_OFF
  • ARITHABORT and ARITHIGNORE

The figure below shows the session level user option settings for the two Server level scenarios. The snapshot combines the two outputs.

Figure 12, Session level user settings

As we can see, the user options business is pretty unclear and confusing. Combining the findings from the previous experiments, I compiled a table that may explain the sequence in which the user options are set. But before that, just one more thing to mention – the ANSI_DEFAULTS setting. The setting controls (sets to ON) a group of user option values. This is to provide the ANSI standard behavior for the options below:

  • ANSI_NULLS
  • ANSI_PADDING
  • ANSI_WARNINGS
  • QUOTED_IDENTIFIER
  • ANSI_NULL_DFLT_ON
  • CURSOR_CLOSE_ON_COMMIT
  • IMPLICIT_TRANSACTIONS

The status of the ANSI_DEFAULTS can be checked using the query below.

SELECT ansi_defaults
FROM sys.dm_exec_sessions 
WHERE session_id = @@SPID

When checked, the status is usually 0 (not set), because some of the user settings from the list get overridden by the pre-login/post-login processes mentioned before.

Figure 13, User Option settings sequence

During an application login attempt, the data provider/Sql Server engine turns ON the set of user options grouped under the ANSI_DEFAULTS setting. This overrides the user options set on the server level (Phase 1). Right after that change, another set of changes overrides a couple of the ANSI_DEFAULTS options and a few server level user options (Phase 2). At this stage the login process is finished, and 9 out of the 15 option values are reported by the login xEvent (option_text field). The color of the user options at the final session level shows the levels the values came from.

*ARITHABORT and ARITHIGNORE show a different behavior:

  • The options are mutually exclusive if set on the server level (only one of them can be set to ON).
  • If one of the settings is set on the server level, the setting will not be changed through the login process.
  • If none of the settings is set on the server level, ARITHABORT will be set to ON during Phase 2.
  • Only the ARITHABORT value is reported by the login xEvent.

Data providers like ODBC and dbLib perform additional changes to the user options after successfully establishing a database connection.

Session level setting

Once connected, the Client code can override the previously set user options by using SET statements directly in the code. This is the final override of the user option values 🙂 . The scope of the settings is the current session (or, from the Client's perspective, the current connection). The following code shows a few user options set in a stored procedure's header…

CREATE OR ALTER PROCEDURE dbo.uspSetUserOptions
AS
BEGIN
	SET ANSI_DEFAULTS ON;
	SET IMPLICIT_TRANSACTIONS OFF;
	SET ARITHABORT ON;
	SET QUOTED_IDENTIFIER ON;

	SET NOCOUNT ON;
	SET XACT_ABORT ON;

	-- code ....

	RETURN;
END

Conclusion

The User Options are a set of configurable elements that affect query behavior, e.g. the handling of 3VL (three valued logic), division by zero situations, data truncation rules etc. As Sql developers we need to be aware of the environment in which our queries are executed. There are several different levels where user options can be set: Server, Database and session level. The session level has precedence over the other two levels. However, Sql Server can set up its own default user option values during and after the login phase – when a client application makes an attempt to connect to Sql Server. Data providers can also independently change user option settings, during and after the login process. The impact of the changes can introduce logical errors, such as inserts/updates/deletes that are always rolled back (implicit transactions set to ON), or code execution errors, such as insert failures on a table with a filtered index. There are scenarios where user option settings can cause sneaky logical errors that are very difficult to debug, e.g. what if NULL = NULL suddenly evaluates to true (ansi nulls off)?
The session level provides a place where we can set user option values which will override all previously set options. Designing templates for Sql Server objects (stored procedures, functions etc.) that include the option settings in the header may be a good way to put the whole user option confusion to rest 🙂

Thanks for reading.

Dean Mincic


Client Requests & Sql events in Sql Server


Summary

Very often we, the sequel guys, use MS Profiler/Sql Server's Extended Events to intercept and analyse the traffic between client applications and our precious Sql Server. Sometimes we capture an hour/day of traffic, and sometimes we just want to capture a slow running stored procedure's input param values. We also get involved in data layer code reviews in order to keep the Client data requests as optimal as possible. This post is a sort of internal notes/documentation about the different types of requests that can be sent from Clients to Sql Server and the corresponding events visible in MS Profiler. The article also gives a high level overview of the space between ADO.NET and Sql Server.

MDAC – Microsoft Data Access Components

To start off, let's first take a brief look at the MDAC framework and where ADO.NET sits in the stack. MDAC technology allows applications to communicate with almost any data-store.
MDAC architecture implements three logical layers:

  • Programming interface layer (ADO, ADO.NET)
  • DB access layer (ODBC, OLEDB, .NET Managed providers)
  • Data store

All of the layers are accessible through the MDAC API. (+ MS Sql Server network library)*


Figure 1, MDAC architecture

NOTE: The Microsoft Network Library (Net-lib) is also part of MDAC and sits between the TDS (Tabular Data Stream) Formatter and the network protocol. The DLLs are specifically used by SQL Server to communicate with the Clients on the network level. It supports the network protocols: Named Pipes, TCP/IP, Shared Memory and VIA.

ADO.NET

ADO.NET provides a generic interface (a set of classes that expose data access services) to a number of different data-stores (SQL Server, CSV, Excel…) from .NET applications. It is a set of libraries included in the MS .NET Framework.
The main objectives of ADO.NET are to:

  • connect to a data source
  • submit queries
  • process results
  • provide a powerful disconnected data structure to perform data operations offline.

There are two main components of ADO.NET for accessing and manipulating data:

    • .NET Framework data providers – a set of class libraries that understand how to interact with a specific data platform, e.g. Sql Server, or with a common data layer, i.e. OLE DB. Data providers transport data between proprietary data platforms and the generic ADO.NET data layer. The ADO.NET framework includes three providers:
      • The MS Sql Server Provider – exposed through the System.Data.SqlClient namespace
      • The OLE DB provider – exposed through the System.Data.OleDb namespace.
      • The ODBC provider – exposed through the System.Data.Odbc namespace.
    • DataSet – a class(System.Data namespace) that provides a disconnected representation of result sets from the Data Source. The object provides a consistent relational programming model regardless of the data source. It includes objects like: DataTable, DataRow,  DataView, DataColumn, PrimaryKey ..etc

The key classes within each provider(each provider has its own specific class names for the same objects) include:

  • Command (SqlCommand),
  • Connection (SqlConnection),
  • DataAdapter (SqlDataAdapter)
  • DataReader (SqlDataReader).


Figure 2, ADO.NET components, high overview

TDS (Tabular data stream)

TDS is an application layer protocol used to transfer tsql commands between client and server. The protocol includes facilities for: authentication and identification, encrypted negotiation, issuing of Sql batch (language events) and stored procedure (RPC) calls, and managing and distributing the outcome of atomic transactions. TDS describes the names, types and optional descriptions of the rows being returned. It was developed by Sybase in 1984 and adopted by Microsoft in 1990. The MS-TDS (Microsoft's version of the protocol) specification can be found here.

Figure 3,Communication flow in TDS

More on how the Client talks to Sql Server

The Client communicates with Sql Server in a layered fashion. Each layer exchanges information with its neighboring layer. Sometimes the layers can be bypassed. Only the Transport protocol layer enables physical communication between the Client and Sql Server. For all other layers the exchange of information is done locally – in memory, between different DLLs.
The layers are:

  1. Application Layer (C# code, etc)
  2. High level data access API (ADO,ADO.NET…)
  3. Client data interface(OLEDB,ODBC, db-lib ..)
  4. Client TDS Formatter. The tsql commands sent to Sql Server have to be in TDS format.
  5. Client net protocol (Shared memory, TCP/IP sockets, net-lib)
  6. Client transport protocol (TCP/IP ..)
  7. Server transport protocol (TCP/IP ..)
  8. Server net protocol (Shared memory, TCP/IP sockets, net-lib)
  9. Server data interface
  10. SQL Server

NOTES: The Application Layer (1) can bypass the ADO.NET level (2) if we directly access the Client data interface (direct API calls to OLE-DB, ODBC etc.).
The Client net protocol (5) and the Client transport protocol (6) can be bypassed if the Client and Sql Server communicate through the Shared memory protocol. In this scenario the Client and SQL Server run on the same machine. The components exchange information through the shared RAM space. The network elements are not required.

Client data requests

The only way clients can interact with Sql Server is by sending requests that contain TSQL commands for the database engine. The requests are sent over the MS-TDS protocol (Microsoft's version of TDS).
There are three main forms of requests:

  • Batch requests
  • RPC (Remote Procedure Requests)
  • Bulk Load Request

In the following examples I'll use ADO.NET components to present the different types of Client requests. I'll use MS Profiler to capture and analyse the Events which correspond to the requests.

Events

When Sql Server receives a message from a client (through TDS), it can be thought of as an event occurring – see Figure 3. First, the client sends a connection request (a login event) and gets back a success or failure response. Once connected, client requests typically fall into one of two categories:

  • Language events
  • RPC events

Language Events

A language event is a tsql batch sent from the client to the server. From the ADO.NET perspective, a tsql batch* is a set of tsql commands defined in the CommandText property of the SqlCommand Class. The command can be executed using one of the methods: ExecuteReader, ExecuteScalar, ExecuteNonQuery and ExecuteXmlReader.

The CommandType enum property of the SqlCommand Class specifies how the command string is interpreted. The field values can be:

  • CommandType.StoredProcedure (contains the name of a stored procedure or a scalar function)
  • CommandType.Text (contains a tsql batch – one or more tsql statements, no GO directives)
  • CommandType.TableDirect (contains a table name. Only supported by the .NET Framework Data Provider for OLE DB)

NOTE: A tsql batch can be defined as one or more tsql statements before the GO* directive (end of batch signal). This is specific to the Microsoft SQL stack tools, e.g. SSMS, sqlcmd, osql (and Sybase's isql). GO is not part of the tsql language and may not be recognised by other similar tools, and it is not permitted within the CommandText property.
After a TDS request reaches SQL Server, the db engine creates a TASK to handle the request. One task handles only one batch (a task may spawn a number of sub-tasks to support parallel executions). When ADO.NET “executes” a batch (one or more tsql commands), one TASK handles the request. When SSMS executes e.g. two batches (the same tsql command twice – GO 2, or two tsql statements divided by the GO terminator), the Sql Server engine (SQLOS) allocates two separate TASKs for the job, as sketched below.
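
A minimal illustration, assuming the AdventureWorks2012 database used throughout this post:

-- executed from SSMS, the GO 2 directive produces two identical batches;
-- SQLOS allocates a separate TASK for each execution
USE AdventureWorks2012;
GO
SELECT COUNT(*) FROM Purchasing.PurchaseOrderDetail;
GO 2 -- SSMS/sqlcmd directive: run the preceding batch twice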

Non-parameterised batch request

This type of request does not have parameters. It can contain local variable declarations. The request is treated by the Sql Server engine as an ad-hoc query.
Ad-hoc queries can be cached and reused only if a subsequent batch matches exactly – an exact textual match, e.g.

SELECT * FROM dbo.Products WHERE [name]='bread';
--and
SELECT * FROM dbo.Products WHERE [name]='butter';

The two query plans above are compiled and cached separately. This behavior can lead to the problem known as “cache pollution”, observable as shown below.
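
The separately cached, single-use plans can be inspected through the plan cache DMVs. A quick sketch (assuming the two dbo.Products queries above have just been executed):

SELECT cp.usecounts      -- 1 for each single-use ad-hoc plan
      ,cp.objtype        -- 'Adhoc'
      ,txt.[text]
FROM sys.dm_exec_cached_plans cp
    CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) txt
WHERE cp.objtype = 'Adhoc'
    AND txt.[text] LIKE '%dbo.Products%'
    AND txt.[text] NOT LIKE '%dm_exec_cached_plans%';
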
The Sql Server optimiser may decide to automatically parameterise an ad-hoc query, but it's very conservative when making those decisions. The query template must be considered safe, meaning that the query plan doesn't change even when the actual parameter values change – the parameterisation must not degrade query performance. Also, the query engine decides on the auto-parameter data type (smallint in the example below). The data type may be changed due to a change in the input parameter size (PurchaseOrderId = 3420 can fit in 2 bytes – smallint, but PurchaseOrderId = 210 fits into 1 byte – tinyint). In that case the optimiser caches another copy of the plan that includes the new data type.
In addition, many query elements disallow automatic parameterisation, such as JOIN, IN, INTO, DISTINCT, TOP… just to name a few.

--this query may be autoparameterised since it uses the same plan regardless of the 
--predicate's value (PurchaseOrderID is a clustered primary key)
USE AdventureWorks2012
GO
SELECT PurchaseOrderID
      ,[Status]
      ,OrderDate
      ,SubTotal
FROM Purchasing.PurchaseOrderHeader
WHERE PurchaseOrderID = 3420; --843, --1 ..
--the plan template looks like
/*
(@1smallint)SELECT[PurchaseOrderID],[Status],[OrderDate],[SubTotal] 
            FROM [Purchasing].[PurchaseOrderHeader] 
            WHERE [PurchaseOrderID]=@1
*/

-- the query below uses a predicate with an uneven distribution of values. This means that the
-- plan shape depends on the predicate value and therefore the auto parameterisation would degrade 
-- the query's performance
USE AdventureWorks2012
GO
SELECT  ProductID
       ,PurchaseOrderID
       ,OrderQty
 FROM Purchasing.PurchaseOrderDetail 
 WHERE ProductID = 707 --319 --391

Let’s execute the second query through a simple console application. The predicate value will be passed as a part of the batch.  Pass 707 as a command line argument; (ProductId = 707)
NOTE: We can also use SSMS to run the batch.

using System;
using System.Data;
using System.Data.SqlClient;

namespace TestClientRequest_batch
{
    class Program
    {
        static void Main(string[] args)
        {
            int ProductId = 0;

            //get arguments
            if (args.Length != 0)
            {
                ProductId = Convert.ToInt16(args[0]);
            }

            using (SqlConnection conn = new SqlConnection())
            {
                conn.ConnectionString = @"Data Source=tcp:(local);
                                          Initial Catalog=AdventureWorks2012;
                                          Integrated Security=True;
                                          Application Name=TestClientRequests";

                using (SqlCommand cmd = new SqlCommand())
                {
                    cmd.Connection = conn;
                    cmd.CommandType = CommandType.Text;
                    cmd.CommandText = @"SELECT  ProductID
                                               ,PurchaseOrderID
                                               ,OrderQty
                                        FROM Purchasing.PurchaseOrderDetail 
                                        WHERE ProductID= " + ProductId.ToString();
                    conn.Open();

                    //use sql reader to execute the query 
                    using (SqlDataReader dr = cmd.ExecuteReader())
                    {

                        Console.WriteLine("ProductID PurchaseOrderID OrderQty\t");
                        while (dr.Read())
                        {
                            Console.WriteLine(dr[0].ToString() + "\t " + dr[1].ToString() + "\t\t\t" + dr[2].ToString());
                        }
                    }
                }
                Console.ReadKey(); //prevent auto close
            }
        }
    }
}

Figure 4, MS Profiler – tsql batch execution events 

The Client request is executed as a tsql batch. SQL:BatchStarting signals that the batch execution is starting. The SQL:StmtStarting event is fired for each statement in the batch (once in this example).

RPC Events

There are two basic ADO.NET client request types which invoke RPC (Remote Procedure Call) events:

  • Parameterised batch request – uses sys.sp_executesql, Sql Server's system extended stored proc
  • Stored procedure and  scalar user defined function call

NOTE: The RPC (Remote Procedure Call) name came from Sybase and, in the context of the Client request, represents a type of formatting of the query inside a TDS packet before it is sent to Sql Server. The requests are pre-formatted by the driver/provider, e.g. OLEDB, ODBC, JDBC etc.
Windows RPC (Remote Procedure Call) is a completely different technology which implements interprocess communication (IPC) that allows applications to talk to each other locally or over the network. More on that RPC here.

Parameterised batch request

A parameterised request has parameter values passed to the tSQL statements. The parameter input is treated as a literal value, not as executable code.
This type of request significantly improves the way Sql Server processes the tsql commands. The sql batch separates the “body” of the query from the data values managed by that query.
Let's change the SqlCommand object from the previous example to use a parameterised tsql statement and see what happens.

...
using (SqlCommand cmd = new SqlCommand())
{
    cmd.Connection = conn;
    cmd.CommandType = CommandType.Text;
    cmd.CommandText = @"SELECT  ProductID
                               ,PurchaseOrderID
                               ,OrderQty
                        FROM Purchasing.PurchaseOrderDetail 
                        WHERE ProductID= @ProductId";

    SqlParameter ProdId = new SqlParameter("@ProductId", SqlDbType.Int);
    cmd.Parameters.Add(ProdId).Value = ProductId;

    conn.Open();
...

Figure 5, MS Profiler – parameterised batch RPC event

This time ADO.NET constructs the request as an RPC (Remote Procedure Call) type. The client sends an RPC message data stream (TDS protocol level) to Sql Server. The message contains the Procedure name (or, in this case, the ProcID) and the Parameters. The ProcID identifies the Sql Server internal sp to be executed. In the example above ProcID = 10. The id value maps to the sys.sp_executesql extended stored procedure in Sql Server.

Figure 6, sys.sp_executesql extended stored procedure

NOTE: We cannot use SSMS to invoke RPC event types since it’s not possible to send Parameters as a part of the TDS binary stream.

The parameterised batch request results in a cached plan, just like a regular stored procedure call.

SELECT   cp.usecounts
        ,cp.cacheobjtype
        ,cp.objtype
        ,txt.[text]
        ,qp.query_plan
FROM    sys.dm_exec_cached_plans cp
    CROSS APPLY sys.dm_exec_sql_text(plan_handle) txt
    CROSS APPLY sys.dm_exec_query_plan(plan_handle) qp
WHERE   cacheobjtype = 'Compiled Plan'
    AND [text] NOT LIKE '%dm_exec_cached_plans%'
    AND [text] LIKE '%FROM Purchasing.PurchaseOrderDetail%'

Figure 7, parameterised batch request – cached plan

Stored procedure call

This type of RPC request is similar to the parameterised batch request. The main difference is that the CommandType enum property of the SqlCommand Class is set to “StoredProcedure”, and the CommandText property value represents the name* of the stored procedure.

Quick note: In order to avoid cache misses/compile locks, use the two-part naming convention when referencing sql objects – dbo.uspGetPurchaseOrderDetails. More about sp caching can be found here.

Let’s encapsulate previous tsql batch in a stored proc…

USE AdventureWorks2012;
GO
CREATE PROCEDURE dbo.uspGetPurchaseOrderDetails
     @ProductId int 
AS 
BEGIN
     SET NOCOUNT ON;

     SELECT   ProductID
             ,PurchaseOrderID
             ,OrderQty
     FROM Purchasing.PurchaseOrderDetail 
     WHERE ProductID= @ProductId;  

     RETURN;
END

and then modify CommandType and CommandText properties as follows…

.....
using (SqlCommand cmd = new SqlCommand())
                {
                    cmd.Connection = conn;
                    cmd.CommandType = CommandType.StoredProcedure;
                    cmd.CommandText = @"dbo.uspGetPurchaseOrderDetails";

                    SqlParameter ProdId = new SqlParameter("@ProductId", SqlDbType.Int);
                    cmd.Parameters.Add(ProdId).Value = ProductId;

                    conn.Open();
...

Figure 8, MS Profiler – RPC Stored proc call

Using the RPC:Starting and/or SP:Starting events we can find out the value of the stored procedure's input parameter(s). It is interesting to notice the absence of the stored procedure's ObjectId within the RPC call. The initial request doesn't know what ID (object_id) Sql Server uses to identify the stored procedure.

SELECT  o.[name]
        ,o.[object_id]
        ,spdef.[definition]
FROM sys.objects o 
    INNER JOIN  sys.all_sql_modules spdef
        ON spdef.[object_id] = o.[object_id]
WHERE o.[name] = N'uspGetPurchaseOrderDetails' 
/* --result
name                        ObjectId      Definition
-----------------------------------------------------
uspGetPurchaseOrderDetails  1364199910    CREATE PROC..

*/

Not directly related to the topic, but nevertheless an interesting observation, is the SP:CacheMiss event. The event describes the failed cache lookup for the stored procedure on its very first run, as the plan was not yet generated and stored in the cache.

Now, let’s execute a  stored procedure that has a nested sp.

--------------------- top level stored proc --------------
USE AdventureWorks2012
GO

ALTER PROCEDURE dbo.uspGetPurchaseOrderDetails
        @ProductId int 
       ,@ProdName dbo.[Name] OUTPUT --NVARCHAR(50)
AS 
BEGIN
    SET NOCOUNT ON;
      
    SELECT   ProductID
            ,PurchaseOrderID
            ,OrderQty
    FROM Purchasing.PurchaseOrderDetail 
    WHERE ProductID= @ProductId

    --get the product name
    EXEC dbo.uspGetProductName
            @ProductId
           ,@ProductName = @ProdName OUTPUT      
    RETURN;
END
go 
--------------------- nested stored proc --------------

CREATE  PROCEDURE dbo.uspGetProductName
      @ProductId INT
     ,@ProductName dbo.[Name] OUTPUT --NVARCHAR(50)
AS
BEGIN
    SET NOCOUNT ON;

    IF @ProductId IS NULL
        RETURN;

    SELECT @ProductName = [Name]
    FROM Production.Product 
    WHERE ProductID = @ProductId;
    
    RETURN;
END 
-----------------------------------------------------

… add an output parameter to the code ..

...
using (SqlCommand cmd = new SqlCommand())
{
          cmd.Connection = conn;
          cmd.CommandType = CommandType.StoredProcedure;
          cmd.CommandText = @"dbo.uspGetPurchaseOrderDetails";

          SqlParameter ProdId = new SqlParameter("@ProductId", SqlDbType.Int);
          SqlParameter ProdName = new SqlParameter("@ProdName", SqlDbType.NVarChar,-1); // -1 = nvarchar(max)
               
          ProdName.Direction = ParameterDirection.Output;
                    
          cmd.Parameters.Add(ProdId).Value = ProductId;
          cmd.Parameters.Add(ProdName);

          conn.Open();

          //use sql reader to execute the query 
          using (SqlDataReader dr = cmd.ExecuteReader())
          {
               Console.WriteLine("ProductID PurchaseOrderID OrderQty\t");
               while (dr.Read())
               {
                 Console.WriteLine(dr[0].ToString() + "\t " + dr[1].ToString() + "\t\t" + dr[2].ToString());
               }
                        
           }
//data reader object must be closed before retrieving the output parameter
Console.WriteLine(ProdName.Value.ToString());
}
....

From the Profiler’s trace below we can see that the only one RPC request was initiated for the top level stored procedure.

Figure 9, RPC request and nested stored procedures

From the trace above we can observe a few interesting things

  • Only the first, top level stored procedure call is initiated through the RPC event. This is the request initiated by ADO.NET.
  • The nested stored procedure call was captured through the SP:Starting and SP:StmtStarting events. The input parameter values are not visible (more about this later).
  • The RPC Output parameter* event is triggered AFTER the RPC:Completed event. This explains why we need to close the Data reader object before retrieving the output parameter (see the C# code above).

NOTE: When the RPC is invoked, some or all of its parameters are designated as output parameters. All output parameters will have values returned from the server. For each output parameter, there is a corresponding return value, sent via the RETURNVALUE token. More about this can be found here.

How to capture nested stored procedure parameter values

In the example above, the nested stored procedure call was captured through the SP:Starting and SP:StmtStarting events. It is not possible to capture the nested procedure's input parameter values only by examining these two events. Sql Profiler is not a debugging tool and it shows only what has been executed. This means that it can show only the parameter values that are initially passed from the data layer, not the values assigned during the procedure execution.
So how can we capture the nested stored procedure's input parameter values? Well, we can get the values if:

  • The top level stored proc parameter(which we can get through the RPC event) value is passed unchanged to the nested sp.
  • The nested stored proc parameters are passed as literals, e.g. … EXECUTE dbo.MyStoredProc @name='John Doe', @Age=42 …

However, there is another, more interesting way to capture the parameter values.

We can include the Showplan XML Statistics Profile event class in our trace. The event class is triggered when Sql Server executes an SQL statement and shows complete, compile-time data, including the parameter values we are interested in.

Figure 10, Showplan XML Statistics Profile event class

We can export the plan(XML format) and then search for the ParameterList element.

<ColumnReference  Column="@ProductName" 
                  ParameterDataType="nvarchar(50)" 
                  ParameterCompiledValue="NULL" 
                  ParameterRuntimeValue="N'HL Crankarm'">
</ColumnReference>
<ColumnReference  Column="@ProductId" 
                  ParameterDataType="int" 
                  ParameterCompiledValue="(319)" 
                  ParameterRuntimeValue="(319)">
</ColumnReference>

Now we can analyse the parameters’ runtime/compiled values and the hunt for the parameter sniffing issues can begin 🙂

NOTE: Showplan XML Statistics Profile is an expensive event to track. It may be a good idea to use server-side tracing instead of the Profiler UI.

Conclusion

In my opinion, it is beneficial for Sql database developers to be familiar with the database access patterns from the client end. Knowing how the different components, e.g. ADO.NET, OLEDB, ODBC, TDS etc., fit together can help when making design decisions. The way the Client constructs its data requests dictates the database engine's behavior, e.g. Sql Server treats non-parameterised queries as ad-hoc queries, whereas it caches the execution plans for parameterised queries the same way as it does for stored procedures. Understanding how the different events (captured by Profiler or Extended Events) relate to the client requests can help us find and understand the root causes of slow running queries, excessive locking etc. that may not be on the Sql Server side but in the data layer.

Thanks for reading

Dean Mincic


Read Only Routing in Sql Server

In this blog, I wanted to share a recent experience of implementing the Read Only Routing functionality on an existing HA/AG Sql Server cluster. The technology significantly offloaded the primary production server and enhanced the read query performance.

The Sql Server HA/AG environment
Windows Failover Cluster

The main business application uses a set of Sql Server databases hosted by a WSFC (Windows Server Failover Cluster). The cluster has two nodes, NODE1 and NODE2. Both nodes run the Windows Server 2012 R2 OS and the Sql Server 2014 SP1 database platform. The failover cluster has the VNN (Virtual Network Name) CLUST1.

Availability group

The system has an Availability Group AG1 (and consequently a WSFC resource group). The AG1 group members are the main business application's backend databases.
Two availability replicas host the AG1 availability group databases. Each replica exists on a different node, the primary on NODE1 and the secondary on NODE2.
The primary replica is read/write. The secondary replica, in addition to providing HA/DR for AG1, is set to allow Read Only access for the clients. The readable secondary replica option is set to “YES”, which allows read-only workloads even without the ApplicationIntent parameter in the connection string definition.

The AG1 availability mode is set to: Synchronous-commit mode
The AG1 failover type is: Automatic failover

The business databases usage pattern

Because of the way the cluster was set up, the clients were connecting to the production databases through the WSFC – CLUST1. All the requests were directed to the current primary replica on NODE1. The secondary replica on NODE2, even though set up to allow read-only connections, was never used for that purpose. It was possible, though, to connect to the current secondary replica directly through NODE2.
The clients who use the databases are:

  1. ~200 concurrent business application users. They perform read/write queries through the app’s UI

  2. ~10 scheduled jobs that run regularly during business hours. The tasks mostly extract data from the business databases (read only operations) in order to provide the required business information to a number of different applications on a number of different platforms, e.g. Windows Active Directory, AWS, AZURE, Linux etc.

  3. An “unknown”, or should I say, NULL 🙂 number of users who regularly run ad-hoc, read-only queries through a number of different applications, e.g. Excel Power Queries, Tableau, Crystal Reports and who knows what else.

  4. A number of custom, in-house built, LAMP model applications (only in this case M stands for Ms Sql Server..) that also perform read only operations.

Even though the high-end hardware keeps the cluster's performance at an acceptable level, there is a lot of room for improvement. One of these improvements, Read-Only Routing, is described in this blog post.

Read Only Routing

Simply speaking, with Read Only Routing in place we can specify, within the connection string, the client application's connection intention. The intention later decides to which server the request will be routed. The intention can be:

  • Read only – Client connection needs only to read data
  • Read and write (Default value) – Client connection needs to read and write data.

This means that if we know that an application only needs to read data, we can use the connection string property ApplicationIntent with the value READONLY to “tell the cluster” to redirect the requests to the secondary, read-only replica, as sketched below. This offloads the NODE1 resources (assuming that the current server role is the primary replica) and “employs” the NODE2 resources to execute the queries.
The connection string property ApplicationIntent is supported by all SQL Server 2014 Clients such as ADO.NET (SqlClient), the Sql Server Native Client 11.0 ODBC driver and the SQLNCLI11 OLEDB Driver.
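
For example, the sqlcmd utility can declare the intent through its -K switch. A sketch, assuming the listener name AG1-Listener and the prod_db database referenced later in this post:

C:\Users>sqlcmd -S AG1-Listener -d prod_db -K ReadOnly -q "SELECT @@SERVERNAME"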
To make the cluster able to recognise and redirect such connections, we need to set up:

  • Availability group Listener for the same subnet ( in this case both replicas are on the same subnet)
  • Read-Only routing list

Availability group listener

The availability group listener is a WSFC resource that is logically bound to the AG through the corresponding WSFC resource group. It's a VNN (virtual network name) with one or more IP addresses attached to it. The Listener runs on a single node (always on the current primary replica) and just handles the connection forwarding. The listener endpoint is always the primary replica.

It is possible to create multiple listeners for an availability group. Only the first listener can be created through tsql. The others can be created using WSFC Failover Cluster Manager or PowerShell script.
Since I was setting up the first listener for AG1, I was able to use tsql.

To set up an AG Listener we need to:

  1. Set up a DNS entry, preferably using a static IP address. In this case the DNS name is AG1-Listener. *
  2. Add the listener to the existing availability group, AG1.

*NOTE: If you decide to use SSMS to create a new listener, SSMS (through WSFC) will create a DNS entry for you. In this case you'll need to provide the WSFC with the required AD credentials to be able to create the entry – the Create Computer Objects permission.

Add the listener to the existing availability group

USE [master]
GO
ALTER AVAILABILITY GROUP AG1
    ADD LISTENER 'AG1-Listener' (
        WITH IP(('10.XX.XX.XX','255.XX.XX.XX'))
       ,PORT = 1433)

Check the listener’s metadata:

SELECT gl.dns_name
      ,gl.port
      --,gl.is_conformant
      ,gl.ip_configuration_string_from_cluster
      ,lip.ip_address
      ,lip.ip_subnet_mask
      ,lip.is_dhcp
      ,lip.network_subnet_ip
      ,lip.network_subnet_prefix_length
      ,lip.network_subnet_ipv4_mask
      ,lip.state_desc
FROM sys.availability_group_listeners gl
INNER JOIN sys.availability_group_listener_ip_addresses lip
    ON gl.listener_id = lip.listener_id

Set up Read-only routing lists

A Read-Only routing URL is a pointer that determines which replica will respond to the client driver's (provider's) Read-Only request.

The Read-Only routing mechanism performs the following sequence of events when allocating a secondary replica to route the client requests to:

  1. Client requests a connection to AG1-listener specifying ApplicationIntent=ReadOnly, Server=’AG1-Listener’, Database=’prod_db’.  The request is routed to the NODE1(primary replica).
  2. NODE1 checks the incoming connection, ApplicationIntent parameter value and checks if the specified database is in the availability group.
  3. The server enumerates and checks the replicas defined in the Read-Only routing list.
  4. The routing target is the first readable secondary replica from the list that accepts ALL or Read-Only connections.
  5. The server sends the secondary replica’s Read-only routing URL to the client.
  6. The client then redirects the connection request to the secondary replica following the provided URL.

To set up the Read-only routing list we need to assign Read-only routing URLs to both replicas, since the replicas may switch roles during a failover. The URL is used to tell the client where to redirect the Read-only requests.
The URL contains the network protocol, the system address and the port number.

/*
Define Read-Only URL on the current primary replica.
The URL will be used when the replica switches role and becomes the secondary replica.
*/
ALTER AVAILABILITY GROUP AG1
	MODIFY REPLICA ON N'NODE1'
	WITH (SECONDARY_ROLE(READ_ONLY_ROUTING_URL=N'tcp://NODE1.myDomain.au:1433'))
GO

/*
Define Read-Only URL on the current secondary readable replica.
*/
ALTER AVAILABILITY GROUP AG1
	MODIFY REPLICA ON N'NODE2'
	WITH (SECONDARY_ROLE(READ_ONLY_ROUTING_URL=N'tcp://NODE2.myDomain.au:1433'))
GO

Check the Read-Only URLs

SELECT replica_server_name 
      ,[endpoint_url]           AS WSFC_NodeEndpointURL
      ,availability_mode_desc   AS AvailabilityMode
      ,failover_mode_desc       AS FailoverType
      ,primary_role_allow_connections_desc AS AllowedConnectionsWhenPrimary
      ,secondary_role_allow_connections_desc AS AllowedConnectonsWhenSecondary
      ,read_only_routing_url 
FROM sys.availability_replicas
WHERE read_only_routing_url IS NOT NULL

..and finally create the Read-Only routing list

--when a client read-only request hits the primary replica, the server will respond with
--the secondary replica (NODE2) URL since that replica is the first on the list
ALTER AVAILABILITY GROUP AG1
    MODIFY REPLICA ON N'NODE1'
    WITH (PRIMARY_ROLE(READ_ONLY_ROUTING_LIST=(N'NODE2',N'NODE1')))
GO

-- and if the replicas switch places, the system will respond with the
-- current secondary replica (now NODE1) URL
ALTER AVAILABILITY GROUP AG1
    MODIFY REPLICA ON N'NODE2'
    WITH (PRIMARY_ROLE(READ_ONLY_ROUTING_LIST=(N'NODE1',N'NODE2')))
GO

Check the read-only routing lists

-- https://blogs.msdn.microsoft.com/alwaysonpro/2014/01/22/modifying-alwayson-read-only-routing-lists/
SELECT  ag.name AS [Availability Group]
       ,ar.replica_server_name AS [When Primary Replica Is]
       ,rl.routing_priority AS [Routing Priority]
       ,CASE 
            WHEN ar.replica_server_name = ar2.replica_server_name 
                THEN 'Read-Only on primary' + '('+ ar.replica_server_name + ')'
            ELSE ar2.replica_server_name
        END AS [Read-Only Routed To]
       ,ar.secondary_role_allow_connections_desc
       ,ar2.read_only_routing_url
FROM    sys.availability_read_only_routing_lists rl
        INNER JOIN sys.availability_replicas ar 
            ON rl.replica_id = ar.replica_id
        INNER JOIN sys.availability_replicas ar2 
            ON rl.read_only_replica_id = ar2.replica_id
        INNER JOIN sys.availability_groups ag 
            ON ar.group_id = ag.group_id
ORDER BY ag.name
       ,ar.replica_server_name
       ,rl.routing_priority

A few more things about the readable secondary replicas

There is a certain data latency between primary and secondary replica. The data movement process usually takes a few seconds and follows the pseudo-sequence below.

  • A user makes a change on the AG databases (Begin transaction)
  • The primary replica sends the log information which describes the actions to be performed on the secondary replica(s).
  • The secondary replica(s) run a dedicated “Redo” process that applies the changes on the secondary replica. At this point in time, the changes are not yet visible to the read-only queries that run on the secondary. The readers are not blocked by the redo actions* as they read the previous version of the data that's being changed.
  • The change is committed on the primary replica (Commit transaction). The changes are now visible on both replicas.

*All read-only queries that run on the secondary databases are automatically executed within the SNAPSHOT transaction isolation level. All locking hints, e.g. (TABLOCK, XLOCK, etc.), are ignored.
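To get a rough idea of the current data latency, we can look at the log send and redo queues exposed through the HADR DMVs. A minimal sketch, run on the primary (the column list is a small subset of what the view offers):

--rough data-latency check per AG database replica
SELECT DB_NAME(drs.database_id)  AS [DatabaseName]
      ,ar.replica_server_name
      ,drs.log_send_queue_size   --KB of log not yet sent to the secondary
      ,drs.redo_queue_size       --KB of log received but not yet redone
      ,drs.last_commit_time
FROM sys.dm_hadr_database_replica_states drs
    INNER JOIN sys.availability_replicas ar
        ON drs.replica_id = ar.replica_id;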

The DDL actions on the primary replica (applied through the redo of the log records) and, conversely, the READ actions on the secondary replica can block each other. The secondary replica’s SELECT queries require Sch-S locks that guarantee the schema stability of the objects in use, e.g. during the execution of SELECT * FROM tab1, the lock manager has to prevent other sessions from changing the tab1 table structure during the select operation. On the other hand, the redone DDL actions require Sch-M (Schema Modification) locks that prevent concurrent access to the locked resource, e.g. TRUNCATE TABLE tab1. Schema locks are acquired regardless of the transaction isolation level. Sch-S is compatible with all other locks except Sch-M (and Sch-M is incompatible with all lock modes). It’s probably not a good idea to perform read operations on the secondary replica if the main system performs frequent DDLs.
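A minimal sketch of the conflict, using the hypothetical table dbo.tab1 from the paragraph above (on a real readable secondary only the blocked reader side can be reproduced, since the Sch-M lock there is taken by the redo thread):

--Session A: an open transaction holding a Sch-M lock (simulates a redone DDL action)
BEGIN TRANSACTION;
    ALTER TABLE dbo.tab1
        ADD NewCol INT NULL; --Sch-M is held until commit/rollback

--Session B: even a plain SELECT needs a Sch-S lock, so it waits for Session A
SELECT *
FROM dbo.tab1;

--Session C: observe the GRANTed Sch-M lock and the WAITing Sch-S request
SELECT request_session_id
      ,resource_type
      ,request_mode   --Sch-M vs Sch-S
      ,request_status --GRANT vs WAIT
FROM sys.dm_tran_locks
WHERE resource_type = 'OBJECT';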

Conclusion

Always On Availability Group active secondary replicas support read-only access. The client requests for read-only access can be automatically redirected to the secondary replica. This helps conserve the primary replica’s resources for the mission-critical workloads. The read-only queries will run faster since the workloads, by default, use row versioning (snapshot isolation level) to remove read/write blocking. The read-only access will have some data latency, and that needs to be considered when deciding to go with this technology.

Thanks for reading.

Dean Mincic

 

XACT_ABORT and Query Timeout in Sql Server

In this blog, I have explored how the XACT_ABORT setting controls Sql Server’s behavior when it interacts with desktop and web applications in a query-timeout runtime-error situation. It is particularly interesting to see how Sql Server works with web applications that use the connection pooling mechanism in those conditions. I also briefly covered MARS (Multiple Active Result Sets) technology and how it relates to XACT_ABORT.

The performed tests demonstrate the behavior of a typical stored procedure when executed in a session with XACT_ABORT set to ON/OFF in a query-timeout situation. The tests covered a desktop application (SSMS) and a web application.

Here is a bit of theory we need to know in order to understand the test results.

XACT_ABORT
It is one of the parameters that define the current session’s behavior. The parameter belongs to the group of Transaction statements, along with IMPLICIT_TRANSACTIONS, TRANSACTION ISOLATION LEVEL, and REMOTE_PROC_TRANSACTIONS (deprecated in Sql Server 2014). More about the parameter can be found here.
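The current session’s value can be checked through the @@OPTIONS bitmask – 16384 is the XACT_ABORT bit (the same bitmask check is used in the stored procedure later in this post):

SET XACT_ABORT ON;
SELECT IIF((@@OPTIONS & 16384) = 16384, 'XACT_ABORT IS ON', 'XACT_ABORT IS OFF');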

Query Timeout

The queries executed on the client side may be canceled after a certain period of time. One of the reasons can be the “query timeout”, a scenario in which the application code decides to cancel the ongoing action if the action takes more time to finish than expected (defined).

Query timeout is a client-side concept. After a certain period of time, a client (more specifically, the db provider) will raise an Attention event which signals the query timeout to the db server. At the next available opportunity, Sql Server will stop/abort the currently executing query in the context of the current connection. The sudden stop will not raise any Sql Server errors.

XACT_ABORT defines how Sql Server handles the ongoing transactions in these situations.
It is worth mentioning the Remote query timeout (EXEC sp_configure 'remote query timeout'). This server-scoped setting relates only to the queries executed by a remote source, e.g. a linked server.
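A quick sketch of the server-scoped setting (the value is in seconds, 0 means no timeout):

EXEC sys.sp_configure 'remote query timeout', 30;
RECONFIGURE;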

When executing Update/Insert/Delete actions within an explicit transaction in SQL Server, there’s a risk of the transaction remaining open (abandoned) if it’s instructed to stop query execution. In such cases, various SQL Server resources may remain locked, making them unavailable to other queries. By default, XACT_ABORT is set to OFF, which means that, in case of a timeout, SQL Server leaves the handling of uncommitted transactions to the caller (the application code).

The application code can be designed to handle exceptions by either rolling back interrupted transactions and closing the connection or by simply closing the connection, allowing any open transactions to be automatically rolled back*.

*A brief note: In SQL Server, a transaction’s scope is defined differently from the server’s and client’s perspectives. From the server’s point of view, a transaction’s scope is a session. However, from the client’s perspective, the scope is a connection.

In SQL Server, a session can have zero, one, or more* related connections. Internal SQL Server engine processes (with SPID <= 50) don’t have related external connections. When using MARS (Multiple Active Result Sets), a session is associated with multiple hierarchical connections. The client’s connection serves as the parent, and together with the child connections, is logically scoped as ‘one per batch’. Consequently, the scope of a transaction within a MARS-enabled connection is a batch, also known as Batch-scoped transactions. This is particularly relevant when the MARS session is in the ‘Local Transaction Active‘ state, where all statements executed in the session run under an explicit or implicit transaction.

To examine the hierarchical organization of transactions, you can use dynamic views like sys.dm_exec_connections, focusing on columns such as net_transport, parent_connection_id, and connection_id.
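A minimal sketch of such a query – on a MARS-enabled session, the child rows reference the parent through parent_connection_id and report ‘Session’ as the transport:

SELECT cn.session_id
      ,cn.connection_id
      ,cn.parent_connection_id --NULL for the parent (physical) connection
      ,cn.net_transport        --'Session' for the MARS logical child connections
FROM sys.dm_exec_connections cn
WHERE cn.session_id = @@SPID;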

MARS technology is a super interesting subject for research and certainly cannot fit into a “brief note”  🙂 

Session IDs are recyclable and are related to the unique transaction IDs.

SELECT *
FROM sys.dm_exec_sessions
WHERE is_user_process = 1; 
--the predicate filters out the internal processes.
Connection Pooling

Connecting to the database is a resource-intensive operation. Each time a client interacts with a database, it establishes a new connection, which involves creating a new physical channel/socket, parsing the connection string, performing database authentication, and more. Because of the disconnected nature of the web-based applications, the connections are closed as soon as the client is finished with the db request. Constant opening and closing connections wouldn’t be a problem if there were a small number of clients who don’t interact with the db often. In reality, there is a large number of concurrent clients fighting for server resources.
The idea behind the Connection Pool (CP) mechanism is to allow already existing connections to be reused by the clients. Essentially, a CP is a container object that contains a list of connections defined by the same connection string. A connection in this context is a pointer to the memory location where the connection object is stored.

Connection Pools are controlled and maintained by the database providers, like the .NET Framework Data Provider for Sql Server or the ADO.NET set of data access components based on the provider. Connection Pooling is active by default (even if not specified in the connection string). The Pool properties are a part of the connection string properties (key-value pairs).
More about connection pooling and pool fragmentation can be found here.

Another thing that would be interesting to explore is the sequence of events between a web application and Sql Server during connection pooling. The following is one of the typical scenarios.

  1. A Client requests data from SqlServer. There is no CP container defined based on the connection string that will be used to define the connection. The client calls the Open() method of a connection object (e.g. based on the SqlConnection class if using the .NET Data Provider for Sql Server).
  2. SqlServer authenticates the user and stores the connection/sql session data. The Audit Login event shows: sessionid, EventSubClass = ”1- Nonpooled”, Connection type = “1- Non DAC”, etc. The requested connection is not part of the connection pool yet.
  3. The client executes the T-SQL batch and closes the connection (the Close() method of the connection object). Even if enclosed, e.g. in the “using” structure, the connection object won’t be destroyed but only referenced by a pointer now stored in a newly created connection pool object container. The object lives in the app server’s memory. At this point, Sql Server is not aware that the client closed (returned) the connection to the pool. From Sql Server’s point of view, the connection is still open. This also means that the abandoned (not committed) transactions, if any, will remain open – this will be a part of the following tests :)
  4. The same or a different client requests to Open() the same connection. The code will not create (instantiate) a new connection object. Instead, the pool manager will provide the memory location of the existing connection object. Again, Sql Server is not aware of what’s happening on the client side and still keeps the information about the original connection.
  5. The client now executes a T-SQL batch, e.g. using the SqlCommand/ExecuteNonQuery object/method in the context of the connection. It is only at this point that Sql Server receives the information about the connection being reused.
    • a) The event Audit Logout is fired indicating that the existing connection is pooled. The EventSubClass = “2 – Pooled”.
    • b) The data access API layer (in this case SqlClient) executes the system stored procedure sp_reset_connection (the sp is not available through T-SQL) to clean* the existing context of the connection and re-validate the authorization of the user principal through the Audit Login and Audit Logout events. In this context, cleaning a connection context means cleaning up SqlServer’s session context, i.e. rolling back any active transaction that may have been left open in the session, dropping any local temporary table left behind by the previous owner of the connection, closing any open cursors and deallocating their resources, resetting CONTEXT_INFO, etc.
    • c) The Audit Login event fires up, again indicating the nature of the connection (EventSubClass = “2- Pooled”)
  6. SqlServer executes the T-SQL batch.
  7. The client calls the Close() method and the connection is again returned to the connection pool. (the pointer to the connection object is now available for the next client)
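There is no server-side view of the client’s pools as such but, since a pool is keyed by the connection string, grouping the user sessions by the attributes that the string defines gives a rough picture. A sketch:

--approximate, server-side view of the client connection pools
SELECT s.[program_name] --Application Name from the connection string
      ,s.login_name
      ,s.[host_name]
      ,COUNT(*) AS [NoOfConnections]
FROM sys.dm_exec_sessions s
WHERE s.is_user_process = 1
GROUP BY s.[program_name]
        ,s.login_name
        ,s.[host_name];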
Test case scenarios:

I’ve performed the following tests to investigate how Sql Server reacts to different XACT_ABORT settings. The client applications will initiate a query timeout in the middle of an explicit, ongoing transaction encapsulated in a stored procedure.

Test 1: Desktop application query timeout. (MARS test included)
Test 2: ASP.NET application query timeout with connection pooling ON

Desktop-based client application query timeout

XACT_ABORT OFF (Default)

For this test, I used SSMS as a Windows-based client app. To prepare for the test, create a test environment using the script below:

--create test database
CREATE DATABASE XACT_ABORT_TEST
GO
USE XACT_ABORT_TEST
GO

-- Create a test table
DROP TABLE IF EXISTS dbo.Products; --tsql2016
GO

CREATE TABLE dbo.Products(ProductId INT 
                            CONSTRAINT PK_ProductID 
                                PRIMARY KEY CLUSTERED
                          ,ProductName VARCHAR(1000) NOT NULL
                          ,ProductPrice MONEY NULL );                      
GO
--Insert sample data
INSERT INTO dbo.Products(ProductId
                         ,ProductName
                         ,ProductPrice)     
 SELECT Tab.*
 FROM (VALUES (100,'Hard Drive',80.99)
             ,(200, 'SSD Drive',250.85)
             ,(300, 'Flash Memory 32GB',8.11)
             ,(400, 'Surface Book',4120.00) ) AS Tab(prodId,name,price);
GO
--Check the table
SELECT * FROM dbo.Products;

Stored procedures:

--the procedure updates a product price by the percent
CREATE PROCEDURE spuUpdateProductPrice
     @ProductId INT
    ,@ByPercent DECIMAL(5,2)
    ,@SetXactAbort BIT = 0 --Off by default
AS
BEGIN
    SET NOCOUNT ON;

    --set the session's xact_abort to match the requested value (bit 16384 of @@OPTIONS)
    IF ( IIF((@@OPTIONS & 16384) = 16384, 1, 0) <> @SetXactAbort )
    BEGIN
        IF (@SetXactAbort = 1)
            SET XACT_ABORT ON;
        ELSE
            SET XACT_ABORT OFF;
    END
    
    --output xact setting
    SELECT IIF( (@@OPTIONS & 16384)=0,'XACT_ABORT IS OFF','XACT_ABORT IS ON')
      
    BEGIN TRANSACTION xactTest;
    BEGIN TRY

        --update price by percent
        UPDATE dbo.Products
            SET ProductPrice += (ProductPrice * (@ByPercent * 0.01))
        WHERE ProductId = @ProductId;

        --this simulates a slow query
        WAITFOR DELAY '00:00:10';

       COMMIT TRANSACTION xactTest;
    END TRY

    BEGIN CATCH
        IF @@TRANCOUNT>0
            ROLLBACK TRANSACTION xactTest;

            ;THROW
    END CATCH
  
    RETURN;
END
GO
--select product information
CREATE PROCEDURE spsGetProductDetails
    @ProductId INT = NULL
AS
BEGIN
    SET NOCOUNT ON;

    SELECT ProductId
          ,ProductName
          ,ProductPrice
    FROM dbo.Products
    WHERE ProductId = @ProductId
        OR @ProductId IS NULL;

    RETURN;
END

SQL Server Profiler:
I’ll use SQL Server Profiler to capture the Attention event sent by the client application. Add the Errors and Warnings/Attention event to the Standard (default) set of the traced events.
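As an alternative to Profiler, an Extended Events session can capture the same event. A minimal sketch (CaptureAttention is just a name I picked; the event and the actions are standard XEvents objects):

CREATE EVENT SESSION CaptureAttention ON SERVER
    ADD EVENT sqlserver.attention
        (ACTION (sqlserver.session_id
                ,sqlserver.client_app_name
                ,sqlserver.sql_text))
    ADD TARGET package0.ring_buffer;
GO
ALTER EVENT SESSION CaptureAttention ON SERVER STATE = START;
GO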

SSMS:
SSMS will act as a desktop client which will cause the query timeout. To simulate the scenario, open (CTRL+N) three sessions.

  • Session 1 – Change the Execution timeout from 0 (unlimited) to 3s. The session will be used to run the sp that will cause the query timeout (SSMS: Right click/Connection/Change Connection)


The code below will be terminated by the timeout.

USE XACT_ABORT_TEST
GO
-- decrease SSD disk price by 10%
EXEC dbo.spuUpdateProductPrice 
    @ProductId = 200 
    ,@ByPercent = -10
  • Session 2 – The session will be used to run the second sp, which reads from the same table used by the procedure initiated in Session 1.
USE XACT_ABORT_TEST
GO
EXEC dbo.spsGetProductDetails
  • Session 3 – Add the code below to monitor the transaction state and the locked resources.
    ------------------------------
    --check the open transactions
    ------------------------------
    SELECT sTran.transaction_id 
       ,sTran.is_user_transaction AS isUsrTran
       ,sTran.open_transaction_count [NoOfTranPerSession]
       ,cn.parent_connection_id
       ,cn.connection_id
       ,aTran.[name]
       ,aTran.transaction_begin_time
       -- lookup table can be found on 
       -- https://msdn.microsoft.com/en-au/library/ms174302.aspx
       ,CASE aTran.transaction_type
          WHEN 1 THEN 'Read/write'
          WHEN 2 THEN 'Read-only'
          WHEN 3 THEN 'System'
          WHEN 4 THEN 'Distributed'
        END AS [Transaction Type]
        ,CASE aTran.transaction_state
          WHEN 0 THEN 'Not fully initialized'
          WHEN 1 THEN 'Initialized, not started'
          WHEN 2 THEN 'Active'
          WHEN 3 THEN 'Ended' 
          WHEN 4 THEN 'Commit initiated'
          WHEN 5 THEN 'Prepared, awaiting resolution'
          WHEN 6 THEN 'Committed'
          WHEN 7 THEN 'Rolling back'
          WHEN 8 THEN 'Rolled back'
        END AS [Transaction State]
        ,cn.session_id
        ----------- connection params ------------------
        ,cn.net_transport [Conection protocol]
        ,cn.connect_time [Connected At]
        ,cn.client_net_address [Client network address]
        ,cn.client_tcp_port [Client TCP port]
        ,cn.last_read
        ,cn.last_write   
    FROM sys.dm_tran_active_transactions aTran
        INNER JOIN sys.dm_tran_session_transactions sTran
            ON aTran.transaction_id = sTran.transaction_id
                RIGHT OUTER JOIN sys.dm_exec_connections cn
                    ON sTran.session_id = cn.session_id
WHERE sTran.transaction_id IS NOT NULL

--------------------------------------
--check locks, transactions, sessions
--------------------------------------
SELECT lck.resource_type
      ,DB_NAME(lck.resource_database_id) AS DBName
      ,lck.resource_description AS resDesc
      ,lck.request_mode
      ,lck.request_type
      ,lck.request_status
      ,lck.request_session_id
      ,lck.request_owner_type
      ,ts.transaction_id
      ,tat.name
      ,con.connection_id
      ,tat.transaction_type
FROM sys.dm_tran_locks lck
  INNER JOIN sys.dm_tran_session_transactions ts
    ON lck.request_session_id = ts.session_id
  LEFT OUTER JOIN sys.dm_tran_active_transactions tat
    ON ts.transaction_id = tat.transaction_id
  LEFT OUTER JOIN sys.dm_exec_connections con
    ON ts.session_id = con.session_id

---------------------------------------------
--list of the waiting tasks
SELECT wt.* 
FROM sys.dm_os_waiting_tasks wt
    INNER JOIN sys.dm_exec_sessions s
        ON s.session_id = wt.session_id
WHERE s.is_user_process = 1
---------------------------------------------

To begin the test, enable the Profiler trace, execute the code in Session 1, then in Session 2 (the sp that just selects data), and finally execute the code in Session 3.


Analysis:

The sp (Session 1) takes more time (~10s) to execute than the client’s execution timeout allows (3s). After 3s, the client sends an Attention event to Sql Server, which causes Sql Server to immediately stop the sp execution, with no error. Because XACT_ABORT is set to OFF (default), Sql Server will leave the transaction open and let the client handle the commit/rollback.

The client’s event handler will send the following error message:

Msg -2, Level 11, State 0, Line 3
Timeout expired.  The timeout period elapsed prior to completion of the operation or the server is not responding.

Profiler – the SPIDs may be different (SPID 55 is SPID 60 in the previous image).

Now, the sp that runs in Session 2 (reads all rows from dbo.Products) has to wait, since Session 1’s active transaction (xactTest) holds an X (Exclusive) lock on the KEY (the table is a clustered index) and an IX (Intent Exclusive) lock on the table (object), which is not compatible with the S (Shared) locks requested by Session 2.
The “SELECT” stored procedure will just wait indefinitely (the default execution timeout for that connection is unlimited).
Session 3 can be used to analyze the open transactions, locked resources, etc.
If we now disconnect Session 1 (right click/Disconnect), Sql Server will automatically ROLLBACK all ongoing transactions within the closed connection’s scope, leaving the DB in a consistent state – the SSD price will not decrease by 10% :). The stored procedure in Session 2 will finish, and if we re-run the code in Session 3 we’ll see that there are no abandoned transactions. All relevant resources will be freed.

Programmers usually use the “using” statement to “wrap” the objects used for the communication with the database. The structure ensures that all of the objects instantiated within the block will be removed, closed, flushed, disposed, etc. (C devs won’t like this kind of a black box 🙂)

using (SqlConnection conn = new SqlConnection(Properties.Settings.Default.ConnectionStr))
{
    using (SqlCommand cmd = new SqlCommand("spuUpdateProductPrice",conn))
    {
        cmd.CommandType = CommandType.StoredProcedure;
        cmd.Parameters.Add("@ProductId", System.Data.SqlDbType.Int).Value = 400; //hardcoded
        cmd.Parameters.Add("@ByPercent", System.Data.SqlDbType.Decimal).Value = -10; //hardcoded
        cmd.Parameters.Add("@SetXactAbort", System.Data.SqlDbType.Bit).Value = 0;//0 - OFF, 1 - ON

        cmd.CommandTimeout = 5; //default is 30s

        conn.Open();
                    
        Console.WriteLine("...about to run dbo.spuUpdateProductPrice");
        try
        {
            cmd.ExecuteNonQuery(); //will take ~10s to execute
        }
        catch (SqlException e)
        {
            //SET UP BREAKPOINT HERE
            Console.WriteLine("Got expected SqlException due to command timeout ");  
            Console.WriteLine(e);
        }
    }    
}

The error handler will handle the timeout and the code will exit the structure, closing the connection. Sql Server will then automatically roll back all open transactions (within the scope of the closed connection), leaving the DB in a consistent state. This will happen regardless of the XACT_ABORT setting.

To show the difference between XACT_ABORT set to ON/OFF, we can set a break-point on any line inside the catch block (see the “SET UP BREAKPOINT HERE” comment in the code above). With XACT_ABORT set to OFF, the uncommitted transaction will remain open until the code exits the structure. With ON, Sql Server will roll back the open transactions immediately after receiving the Attention signal and before the break-point is hit. This can be monitored through the Session 3 code as mentioned earlier.

XACT_ABORT ON

To test the same scenario with XACT_ABORT on, reconnect Session 1 (Right click/Connect – now the default execution timeout is 3s) and include the third input parameter:

USE XACT_ABORT_TEST
GO
-- decrease SSD disk price by 10%
EXEC dbo.spuUpdateProductPrice 
     @ProductId = 200 
    ,@ByPercent = -10
    ,@SetXactAbort = 1 --xact_abort ON

This time Sql Server is set up to handle the uncommitted transactions, and when a run-time error occurs (a query timeout in this case) it will roll back the transactions. Session 2 will be able to execute its code. Session 3 will show the “empty resultsets” – no locked resources 🙂

MARS Enabled

The XACT_ABORT ON/OFF setting has a different effect on a MARS-enabled connection in the example above.
To test this scenario, change the Session 1 connection properties and add an additional connection parameter: MultipleActiveResultSets=True

If we run the test again, the results will show that the XACT_ABORT setting does not affect Sql Server’s behavior* in case of the query-timeout run-time error. The explicit transaction will be automatically rolled back. Session 2 shows us the MARS transaction hierarchy within the same session.
*The behavior would be different if we executed the stored procedure logic as a script and without an explicit transaction.

ASP.NET application query timeout with connection pooling ON/OFF

The following test demonstrates Sql Server’s behavior in case of an application timeout when XACT_ABORT is ON/OFF, in combination with Connection Pooling TRUE/FALSE.

The test environment:

  1. A simple web application that can perform a) select and b) update actions on a database using one of two available connections – one connection authenticates using the Integrated* security and the other one uses Sql Server authentication.
  2. Windows Performance Monitor (perfmon.msc) – to examine connection pool counters
  3. Sql Profiler – for tracking events triggered by the app and data provider.
  4. Sql Server T-SQL code to track active connections, locks, etc. – Session 3 code from the previous example.

Note: The integrated security will use ApplicationPoolIdentity(AP) to connect to the DB. Execute the script below to create a Login for the AP identity and a Login for an sql server user:

-- windows login - application pool
USE master
GO
--create db server login
CREATE LOGIN  [IIS APPPOOL\DefaultAppPool] FROM WINDOWS
GO

USE XACT_ABORT_TEST
GO
--create db user
CREATE USER  [IIS APPPOOL\DefaultAppPool] 
GO
--add securables to the db user
GRANT EXECUTE ON OBJECT::dbo.spsGetProductDetails TO [IIS APPPOOL\DefaultAppPool]
GO
GRANT EXECUTE ON OBJECT::dbo.spuUpdateProductPrice TO [IIS APPPOOL\DefaultAppPool]
GO

-- Sql Server login
USE master
GO
--
CREATE LOGIN testXact WITH 
    PASSWORD = 'testXact'
    ,CHECK_POLICY =OFF
GO

USE XACT_ABORT_TEST
GO

CREATE USER testXact FOR LOGIN testXact
GO
GRANT EXECUTE ON OBJECT::dbo.spsGetProductDetails TO testXact
GO
GRANT EXECUTE ON OBJECT::dbo.spuUpdateProductPrice TO testXact
GO

The Web app.

The interface allows you to use two different connections to select and update the existing data. The update will cause the timeout run-time error.

The results of the select statements will be displayed in the Grid view

The error message will show under the grid view control.

The code is similar to the code used in the previous example.

using System;
using System.Data.SqlClient;
using System.Data;

public partial class _Default : System.Web.UI.Page
{
    protected void Page_Load(object sender, EventArgs e){}
    public string ConnString(string auth)
    {
        if (auth == "IntegratedSecurity")
            return @"Data Source=tcp:(local);
                     Initial Catalog=XACT_ABORT_TEST;
                     Integrated Security=True;
                     Pooling=True;
                     Application Name=WINXactAbort";
        else
            return @"Data Source=tcp:(local);
                     Initial Catalog=XACT_ABORT_TEST;
                     User Id=testXact;
                     Password=testXact;
                     Pooling=True;
                     Application Name=SQLXactAbort";
    }

    //show products
    protected void btnShowProducts_Click(object sender, EventArgs e)
    {
        //show products
        //NOTE: Use the default CommandTimeout: 30s
        using (SqlConnection con = new SqlConnection(ConnString(rBtnLst.SelectedValue)))
        {
            con.Open();

            using (SqlCommand cmd1 = new SqlCommand("spsGetProductDetails", con))
            {
                cmd1.CommandType = CommandType.StoredProcedure;
                cmd1.CommandTimeout = 30;
                try
                {
                    SqlDataReader rd = cmd1.ExecuteReader();
                    Grid1.DataSource = rd;
                    Grid1.DataBind();
                    rd.Close();
                }
                catch (SqlException e1)
                {
                    lblOutputText.Text = e1.Message;
                }
            }
        }
    }

    //Update product
    protected void btnUpdateProducts_Click(object sender, EventArgs e)
    {
        using (SqlConnection con = new SqlConnection(ConnString(rBtnLst.SelectedValue)))
        {
             con.Open();

            using (SqlCommand cmd = new SqlCommand("spuUpdateProductPrice", con))
            {
                cmd.CommandType = CommandType.StoredProcedure;
                cmd.CommandTimeout = 3; //default is 30s

                cmd.Parameters.Add("@ProductId", System.Data.SqlDbType.Int).Value = 400; //hardcoded
                cmd.Parameters.Add("@ByPercent", System.Data.SqlDbType.Decimal).Value = -10; //hardcoded
                cmd.Parameters.Add("@SetXactAbort", System.Data.SqlDbType.Bit).Value = 0;//0 - OFF, 1 - ON

                lblOutputText.Text = "...about to run dbo.spuUpdateProductPrice";
                try
                {
                    cmd.ExecuteNonQuery(); //will take ~10s to execute
                }
                catch (SqlException e2)
                {
                    lblOutputText.Text = e2.Message;
                }
            }
        }
    }
}

Test: Two clients use the app. The first runs an expensive update which causes the timeout, and the second one tries to select data from the same table. Both clients use the same connection string. Connection Pooling is enabled.

Test preparations:
Open two instances of the web application in two browsers. Run the T-SQL code from the previous example that selects the open transactions, Open SQL Profiler (standard trace + Errors/Attention event, Audit Login/ Audit Logout). Include Filter for ApplicationName in (WINXactAbort, SQLXactAbort). Open Performance Monitor and add .NET Data Provider for SqlServer Counters (example).

Analysis:

  1. Two different clients using the same connection type, e.g. Integrated Security authentication, perform the following actions:
    • 1.1 Client1 updates a product price (Update Products). 1st non-pooled connection established.
    • 1.2 During the execution, Client2 tries to read the product prices (Show Products). 2nd non-pooled connection established (EventSubClass = ”1- Nonpooled”). Client2 has been blocked by Client1’s update process. Client2 waits.
    • 1.3 Client1’s update action times out (simulates a long-running query), leaving the Products table locked by an uncommitted, abandoned transaction. The Attention event is raised and the Close() method executed. A new CP container has been created, and the client “returns” the connection to the newly created CP. Sql Server thinks that the connection is still active. The CP now has one connection ready to be reused.
    • 1.4 Client2 still waits for Client1 to finish the update and to unlock the Products table. Eventually, Client2 times out (default command timeout = 30s). The connection is now returned (referenced by a pointer) to the CP. The connection pool now has 2 connections ready to be reused. The Sql Server resources (the Products table) are still locked by Client1.
    • 1.5 At this point, the Connection Pool has two available (idle) connections. The first still has a non-committed transaction in its context. If a new client, Client3, tries to read the same data, the CP will allocate the last available connection using the LIFO (Last In, First Out) algorithm. In this case, the allocated connection will be the one that originally belonged to Client2. As expected, Client3 will get blocked by the same abandoned transaction. Moreover, no client will be able to access the Products table until the status of the transaction in question gets resolved (rolled back/committed).

This scenario is common and can cause a lot of frustration since the effects can vary, i.e. it sometimes works and sometimes doesn’t 🙂

To add to the confusion, sometimes the situation can resolve itself, e.g. the CP will remove idle connections from the pool after 4 to 7 minutes of inactivity, triggering the Audit Logout event which will, among other things, finally instruct Sql Server to roll back the outstanding transactions in the context of the destroyed connection (session). Another possible scenario is if the “doomed” connection gets reused by a new client. In that case, the sp_reset_connection stored proc (mentioned earlier) will instruct Sql Server to “clean up the mess” and prepare a clean connection/session context for the new client.
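A quick way to spot such “doomed” connections from the server side is to look for sleeping sessions that still hold open transactions. A sketch (open_transaction_count is available from Sql Server 2012):

SELECT s.session_id
      ,s.[status]               --'sleeping' = idle, waiting for the next request
      ,s.open_transaction_count --abandoned transaction(s) still open
      ,s.last_request_end_time
      ,s.[program_name]
FROM sys.dm_exec_sessions s
WHERE s.is_user_process = 1
    AND s.[status] = 'sleeping'
    AND s.open_transaction_count > 0;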

Profiler – the trace shows the above sequence of events

Connection, session, and transaction metadata

The situation can be avoided using the following approaches:

  1. Setting XACT_ABORT to ON. This will instruct Sql Server to roll back the ongoing transactions after receiving the Attention event raised as a result of the command timeout issued by the client.
  2. More thorough data-layer coding, i.e. executing the queries within properly defined transactions and properly handling the uncommitted transactions within the catch block.

Both techniques will prevent the unpredictable behavior explained above.

XACT_ABORT ON

To set XACT_ABORT to ON, change the value of the @SetXactAbort parameter to 1 and repeat the test.

//show products
...
  cmd.Parameters.Add("@SetXactAbort", System.Data.SqlDbType.Bit).Value = 1;//0 - OFF, 1 - ON
...

The same test shows that as soon as Client1 raises the Attention event, Sql Server stops the code execution and performs a rollback of the ongoing transaction. All subsequent read requests (Clients 2, 3, …) will be successful.

Using Connection Transactions 

To test the scenario, change the btnUpdateProducts_Click event to include transactions and set the parameter @SetXactAbort back to 0.

//Update product
    protected void btnUpdateProducts_Click(object sender, EventArgs e)
    {
        using (SqlConnection con = new SqlConnection(ConnString(rBtnLst.SelectedValue)))
        {
            con.Open();
            // Naming the transaction will make it easier to track using dmvs. 
            // Note that the explicit transaction defined in
            // the sp is now a nested transaction.
            SqlTransaction t1 = con.BeginTransaction("ExplicitTransactionFromCode");

            using (SqlCommand cmd = new SqlCommand("spuUpdateProductPrice", con))
            {
                cmd.Transaction = t1;
                cmd.CommandType = CommandType.StoredProcedure;
                cmd.CommandTimeout = 3; //default is 30s

                cmd.Parameters.Add("@ProductId", System.Data.SqlDbType.Int).Value = 400; //hardcoded
                cmd.Parameters.Add("@ByPercent", System.Data.SqlDbType.Decimal).Value = -10; //hardcoded
                cmd.Parameters.Add("@SetXactAbort", System.Data.SqlDbType.Bit).Value = 1;//0 - OFF, 1 - ON

                lblOutputText.Text = "...about to run dbo.spuUpdateProductPrice";
                try
                {
                    cmd.ExecuteNonQuery(); //will take ~10s to execute
                    t1.Commit(); //NOTE: The COMMIT TRANSACTION in the sp, if executed, would only
                                 //decrease the @@TRANCOUNT value by 1.
                }
                catch (SqlException e2)
                {
                    lblOutputText.Text = e2.Message;
                    if (t1.Connection != null)
                    {
                        t1.Rollback();
                    }
                }
            }
        }
    }

This time the code will roll back the transaction within the catch block and the resources will not be blocked. If we now set XACT_ABORT to ON and repeat the same test, after the Attention event Sql Server will roll back the transaction (@@TRANCOUNT = 0) and the code will skip executing the Rollback() method, because the Connection property of the transaction object t1 returns null since the transaction is no longer valid.

Conclusion

It is important to understand the XACT_ABORT setting to be able to prevent and/or explain Sql Server’s behavior in case of a query-timeout run-time error. When set to OFF, the default value, Sql Server will let the client handle the timeout situation and take care of the connection elements, i.e. the ongoing transactions. Query timeouts are a client-side concept only and Sql Server does not treat them as errors. As soon as possible after receiving the Attention event, Sql Server will stop the execution, reporting no errors. If XACT_ABORT is set to ON, Sql Server will roll back the unfinished transactions and then return to the client.
Desktop applications may use different approaches when dealing with DB connections. Some applications may handle the query timeout and close the connection (dispose of the connection object). This instructs Sql Server to roll back all unfinished transactions in the scope of the session/connection. In this context, the XACT_ABORT setting will not have a direct effect on the execution. Other applications, e.g. SSMS, will keep connections open after handling query timeouts. In this case, the abandoned transactions may cause other requests to wait to acquire the required locks, and the problem can escalate. In this context, XACT_ABORT set to ON will roll back the abandoned transactions, preventing the resource locks.
Web applications usually use the Connection Pooling technology to optimize the database authentication process. The connections (connection objects) stay “live”, ready to be reused by other clients. Every CP contains a collection of similar connections (connections with the same connection string). The availability of the connections follows the LIFO (Last In, First Out) algorithm. For example, say we have a Connection Pool CP1 with 4 connections in it, ready to be reused: CP1 = {c1,c2,c3,c4}. Connection c3 has left an abandoned transaction, leaving table T1 locked.

  • Scenario 1: A client wants to read data from T2. CP1 provides the client with connection c4. The client clears the connection/session context left by the previous owner and reads the data with no problems.
  • Scenario 2: A client wants to read data from T1. CP1 provides connection c4. After clearing the c4 context, the client gets blocked by the abandoned transaction left in the context of c3. This can cause another timeout.
  • Scenario 3: A client wants to read T1, gets c4, clears the previous context and ends up blocked. At the same time, another client wants to read T1. Now CP1 serves c3 to that client. The client clears the connection’s previous context, which includes rolling back the abandoned transaction. The latest client can now read T1, and so can the previously blocked one.

MARS-enabled connections may affect Sql Server behavior in the case of a query timeout since the technology changes the scope of a transaction (Batch-scoped transactions). This is a technology I would like to blog about in one of my future posts.

There are many scenarios in which the query timeout can cause confusion and unwanted results. There are two main approaches to resolve this: a) more thorough coding on the client side, which includes proper error handling and use of the connection transactions; b) using XACT_ABORT ON on the Sql Server side, which will handle the abandoned transactions.

My personal preference is not to rely on the client’s code and to always use XACT_ABORT set to ON when needed. During more than 10 years of database programming, I have worked with only a handful of web programmers who were interested in this matter. In most cases, their perception of a database is a simple data store, preferably “controlled” by an ORM framework. After a certain period of production life, these “Agile products” usually become slow, unresponsive, and sometimes unpredictable. The Connection pooling/query timeout problem is one of the unwanted scenarios. From our (DB devs) perspective this may not necessarily be a “bad thing” since we all get a “good dollar value” for repairing the mess afterward… but this can be an inspiration for another, more philosophical post 🙂

Thanks for reading.

Dean Mincic

Orphaned DB Users and SQL Server logins


One of the common tasks in the DB community is moving databases from one SQL server instance to another.
For simple environments, it could be just a matter of backing up the database on one server/instance and restoring it on another.

Sometimes, even if we successfully restore the database on a new server, create all necessary logins, and change the application connection string to point to the new instance, we still get the “Database not accessible” or “Login failed” error messages.

The common reason for the errors is the disconnection between the DB users and their logins – the orphaned DB users.
The technique we usually use to re-connect the users is to call a system stored procedure that will do the Login–DbUser remapping.

-- Re-connects an orphaned db user "db_user1" with login "user1"
EXEC sp_change_users_login 
            @Action='update_one'
           ,@UserNamePattern='db_user1' 
           ,@LoginName=user1;
GO

Why does the problem occur in the first place, and what can we do to prevent it? The following is a little bit of theory and a couple of examples that illustrate the problem.

In SQL Server, there are eleven different database user types. The type I am focusing on today is the DB User based on an SQL Server authenticated Login. The complete list can be found here.

Every principal (an entity that can request SQL Server resources) has two main properties:

  • Principal ID or ID
  • Security ID or SID

The scope of influence of a principal depends on the level that the principal operates on, e.g. SQL Server-level principals operate on an SQL Server instance level while the database-level principals operate on a database level.

  • Logins are principals whose scope is SQL Server instance
  • DB Users are principals whose scope is Database
DB Users –  based on SQL Server Logins

When we create an SQL Server Login, SQL Server assigns an ID and a SID to the created principal.

  • ID (INT) – uniquely identifies the Login as an SQL Server securable (a resource managed by SQL Server). The ID is generated by SQL Server.
  • SID (VARBINARY(85)) – uniquely identifies the security context of the Login. The security context depends on how the identifier was created. The Login SIDs can be created by:

    • Windows User/Group. The SID will be unique across the User/group domain and will be created by the domain.
    • SQL Server. The SID is unique within SQL Server and created by SQL Server – used in the example below.
    • Certificate or asymmetric key. (cryptography in SQL Server will be covered in one of the following posts)

The information about the Logins is stored in the master database

The following code will create a few SQL objects to illustrate the problem.

-- Create a couple of test databases
USE master
GO

CREATE DATABASE TestOrphanedUsersDB_1
GO
CREATE DATABASE TestOrphanedUsersDB_2
GO

--Create Sql Server Logins
CREATE LOGIN Login1 
    WITH PASSWORD='abc'
        ,CHECK_POLICY = OFF
GO
CREATE LOGIN User2 --the login name is User2 :)
  WITH PASSWORD ='abc'
      ,CHECK_POLICY = OFF
GO

The logins are stored in the master database and can be viewed using sys.server_principals system view:

SELECT  name         AS [Login Name]
       ,principal_id AS ID
       ,[sid]        AS [SID]
       ,[type_desc]  AS [Login Type]
FROM master.sys.server_principals
WHERE name IN ('Login1','User2')

Query Results (the identifiers may be different on different PCs)

Now we need to create a few database users.

As mentioned before, there are eleven different types of database users. For this exercise, we’ll create DB users based on SQL Server logins.

--Create database users for the Sql Logins.
--The User SIDs will match the Login SIDs
USE TestOrphanedUsersDB_1
GO
CREATE USER User1  
    FOR LOGIN Login1 --User1(SID) = Login1(SID)
GO
CREATE USER User2
    FOR LOGIN User2
GO

USE TestOrphanedUsersDB_2
GO
CREATE USER User1 -- The user names are unique on a database level
    FOR LOGIN Login1

SQL Server has assigned the Principal IDs and Security IDs to the newly created users.

  • ID (INT) – uniquely identifies the DB user as a database securable.
  • SID (VARBINARY(85)) – uniquely identifies the security context of the User. The security context depends on how the identifier was created – in this example, the DB Users’ security context depends on the Logins, and therefore the User SIDs will match the Login SIDs.

The information about the database users is stored on the database level and can be viewed using the sys.database_principals system view.

USE TestOrphanedUsersDB_1
GO
SELECT  name           AS [DbUser Name]
       ,principal_id   AS [Principal ID]
       ,[sid]          AS [SID]
       ,[type_desc]    AS [Db User Type]
FROM sys.database_principals
WHERE name IN ('User1','User2')

…and for the second database…

USE TestOrphanedUsersDB_2
GO
SELECT  name           AS [DbUser Name]
       ,principal_id   AS [Principal ID]
       ,[sid]          AS [SID]
       ,[type_desc]    AS [Db User Type]
FROM sys.database_principals
WHERE name IN ('User1','User2')
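Since the User SIDs are copies of the Login SIDs, the mapping can also be verified by joining the two principal views on SID. A quick sketch (run it in either of the test databases):

SELECT sp.name  AS [Login Name]
      ,dp.name  AS [DbUser Name]
      ,dp.[sid] AS [Shared SID]
FROM sys.database_principals dp
    INNER JOIN sys.server_principals sp
        ON dp.[sid] = sp.[sid]
WHERE dp.name IN ('User1','User2');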

The diagram below shows the relationship between Logins and Users.


Image 1, Logins/Users mapping

Case Scenario:
An application uses two databases, TestOrphanedUsersDB_1 and TestOrphanedUsersDB_2. We decided to move the application’s backend to a new instance by backing up and restoring the two databases on the new server.
The restored databases contain all of the previously defined DB users since the principals are a part of the databases. The original server logins were not transferred because they belong to the original master database.
At this stage, the logins are not mapped to the users and the application is not able to access the backend.

To simulate the scenario, we’ll remove the previously created logins.

 USE master
 GO
 DROP LOGIN Login1
 GO
 DROP LOGIN User2
 GO

If we removed the logins using the SSMS UI, we would get a warning that deleting server logins does not delete the database users associated with those logins.

The users left in the databases are now called “Orphaned Users”. The users without the corresponding logins cannot be used to access the databases. This situation mimics the database restore on a new SQL Server instance.
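Besides the sp_change_users_login report action (shown later), the orphans can be listed by looking for SQL-login based db users whose SID has no match among the server principals. A sketch:

SELECT dp.name AS [Orphaned DB User]
      ,dp.[sid]
FROM sys.database_principals dp
    LEFT OUTER JOIN sys.server_principals sp
        ON dp.[sid] = sp.[sid]
WHERE sp.[sid] IS NULL
    AND dp.[type] = 'S'      --SQL user (based on an SQL Server login)
    AND dp.principal_id > 4; --skip the built-in principals (dbo, guest, etc.)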

The next step is to create new logins. At this stage, we can do two things.

  • Create new logins (as we did before). SQL Server’s engine will assign new SIDs to the logins. These identifiers will not match the existing user SIDs and, consequently, we’ll have to remap the Logins to the Users (to make the SIDs match). To make a match, the process will replace the old user SIDs with the new Login ones.
--Create Sql Server Logins
USE master
GO
CREATE LOGIN Login1 
    WITH PASSWORD='abc'
        ,CHECK_POLICY = OFF
GO
CREATE LOGIN User2
  WITH PASSWORD ='abc'
      ,CHECK_POLICY = OFF
GO

Using the principals’ metadata, we can see the mismatch between the SIDs.

Login SIDs:

To restore the previous mapping (see Image 1) we need to remap the orphaned users as follows: User1 → Login1 (in both databases) and User2 → User2…

.. using the sys.sp_change_users_login system stored procedure.

--list all orphaned users
EXEC TestOrphanedUsersDB_1.sys.sp_change_users_login 
    @Action ='report'
GO
EXEC TestOrphanedUsersDB_2.sys.sp_change_users_login 
    @Action ='report'

-- remap
EXEC TestOrphanedUsersDB_1.sys.sp_change_users_login 
            @Action='update_one'
           ,@UserNamePattern='User1' 
           ,@LoginName=Login1
GO
EXEC TestOrphanedUsersDB_1.sys.sp_change_users_login 
            @Action='update_one'
           ,@UserNamePattern='User2' 
           ,@LoginName=User2
GO

This will replace the user SIDs with the new Login SIDs.
The stored procedure supports the Auto_Fix action type, which can be used in the specific scenario in which we create a missing login with the same name as the orphaned user the login was created for.
More information about the procedure can be found here.

NOTE: The sys.sp_change_users_login procedure is deprecated and may be removed in a future version of SQL Server.
Microsoft recommends using ALTER USER instead.

--remap User1(TestOrphanedUsersDB_2) to login Login1
USE TestOrphanedUsersDB_2
GO
ALTER USER User1
	WITH LOGIN=Login1
GO
  • Create logins explicitly specifying the SIDs to match the DB user SIDs:
USE master
GO
--Create Sql Server Logins
CREATE LOGIN Login1 
    WITH PASSWORD='abc'
        ,CHECK_POLICY = OFF
        --copied from TestOrphanedUsersDB_1.User1 or TestOrphanedUsersDB_2/User1
        ,SID =0X043C965331B69D46B3D6A813C9238090 
GO
USE master
GO
CREATE LOGIN User2 --the login name is User2 :)
  WITH PASSWORD ='abc'
      ,CHECK_POLICY = OFF
      --copied from TestOrphanedUsersDB_1.User2
      ,SID=0XC871212ABD68D04998E89480285DDE70
GO

Now we can test the mapping using the Logins to access the databases.

Conclusion:

The database users created for (or based on) SQL Server authenticated logins must have a valid link to the logins. The link is the security identifier (SID) varbinary.
Due to the different scopes of the principals, DB Users, when restored on a different server, may become disconnected (orphaned) from the corresponding Logins (the new logins may have new SIDs that don’t match the original DB User SIDs). In this situation, the applications are not able to connect to the database.
To remap the principals we can use two approaches: the system sp sys.sp_change_users_login (or ALTER USER … WITH LOGIN) to change the DB user SIDs to match the new Login SIDs, or creating new Logins using the original user SIDs.
It’s worth mentioning that the latter may cause the “Supplied parameter sid is in use” error if the specified SID is already in use by an existing Login.

Thanks for reading.

Dean Mincic