Tag Archives: ado

Client Requests & Sql events in Sql Server

Client Requests & Sql events in Sql Server


Summary

Very often we, the sequel guys, use MS Profiler/Sql Server’s Extended events to intercept and analyse the traffic between client applications and our precious Sql Server. Sometimes we capture an hour/day of traffic, and sometimes we just want to capture a slow running stored procedure’s  input param values.  Also, we are involved in  the data layer code reviews in order to keep Client data requests as optimal as possible. This post is a sort of an internal notes/documentation about different types of requests  that can be sent from  Clients to Sql Server and the corresponding events visible in MS Profiler. The article also gives a high overview on the space between ADO.NET and Sql Server

MDAC – Microsoft Data Access Components

To start off, lets first take a brief look at the MDAC framework and where ADO.NET sits in the stack. MDAC technology allows applications to communicate with almost any data-store.
MDAC architecture implements three logical layers:

  • Programming interface layer (ADO, ADO.NET)
  • DB access layer (ODBC, OLEDB, .NET Managed providers)
  • Data store

All of the layers are accessible through the MDAC API. (+ MS Sql Server network library)*


Figure 1, MDAC architecture

NOTE: Microsoft Network Library(Net-lib) is also part of MDAC and sits between TDS(Tabular data stream) Formatter and  the network protocol. The DLLs are specifically used by SQL Server to communicate with the Clients on the network level.  It supports the network protocols: Named Pipes, TCP/IP, Shared Memory, VIA).

ADO.NET

ADO.NET provides a generic interface(set of classes that expose data access services)  to a number of different data-stores (SQL Server,CSV, Excel..) from .NET applications. It is a set of libraries included in MS .NET Framework.
The main objectives of ADO.NET is to;

  • connect to a data source
  • submit queries
  • process results
  • provide a powerful disconnected data structure to perform data operations offline.

There are two main  components of ADO.NET for accessing and manipulating data

    • .NET Framework data providers – a set of class libraries that understand how to interact with specific data platform e.g Sql Server, or with a common data layer i.e OLE DB. Data providers transport data between proprietary data platforms and the generic ADO.NET data layer. ADO.NET framework includes three providers
      • The MS Sql Server Provider – exposed through the System.Data.SqlClient namespace
      • The OLE DB provider – exposed through the System.Data.OleDb namespace.
      • The ODBC provider – exposed through the System.Data.Odbc namespace.
    • DataSet – a class(System.Data namespace) that provides a disconnected representation of result sets from the Data Source. The object provides a consistent relational programming model regardless of the data source. It includes objects like: DataTable, DataRow,  DataView, DataColumn, PrimaryKey ..etc

The key classes within each provider(each provider has its own specific class names for the same objects) include:

  • Command (SqlCommand),
  • Connection (SqlConnection),
  • DataAdapter (SqlDataAdapter)
  • DataReader (SqlDataReader).


Figure 2, ADO.NET components, high overview

TDS (Tabular data stream)

TDS is an application layer protocol used to transfer tsql command between client and server. The protocol includes facilities for;  authentication and identification, encrypted negotiation, issuing of Sql batch(language events) and stored procedures(RPC) calls, managing and distributing the outcome of atomic transactions. TDS describe the names, types and optional descriptions of the rows being returned.   It was developed by Sysbase in 1984 and adopted by Microsoft in 1990. MS-TDS(Microsoft’s version of the protocol) specification can be found here.

Figure 3,Communication flow in TDS

More on how the Client talks to Sql Server

Client communicates with Sql Server in a layered fashion. Each layer exchanges information with its neighboring layer. Sometimes the layers can be bypassed.  Only the Transport protocol layer enables physical communication between Client and Sql Server. For all other layers the exchange of information is done locally – in RAM memory and  between different dlls.
The layers are:

  1. Application Layer (C# code, etc)
  2. High level data access API (ADO,ADO.NET…)
  3. Client data interface(OLEDB,ODBC, db-lib ..)
  4. Client TDS Formatter. The tsql commands sent to Sql server has to be in TDS format.
  5. Client net protocol (Shared memory, TCP/IP sockets, net-lib)
  6. Client transport protocol (TCP/IP ..)
  7. Server transport protocol (TCP/IP ..)
  8. Server net protocol (Shared memory, TCP/IP sockets, net-lib)
  9. Server data interface
  10. SQL Server

NOTES:  App level(A) can bypass ADO.NET  level(B) if we directly access teh Client data interface (direct API calls to OLE-DB, ODBC..etc).
Client protocol(E) and Client transport protocol(F) can be bypassed if the Client and Sql Server communicate through the Shared memory protocol. In this scenario Client and SQL Server run on the same machine. The components exchange information through the RAM memory space. The network elements are not required.

Client data requests

The only way clients can interact with Sql Server is by sending requests that contain TSQL commands for the database engine. The requests are sent over MS-TDS protocol(Microsoft’s version of TDS).
There are three main forms of the requests

  • Batch requests
  • RPC (Remote Procedure Requests)
  • Bulk Load Request

In the following examples I’ll use ADO.NET components to present the different  types of Client requests. I’ll  use MS Profiler to capture and analyse Events which corresponds to the requests.

Events

When Sql Server receives a message from a client(through TDS), it can be thought of as an event occurring – see  Figure 3, First client sends a connection request(a login event) and gets back success or failure response. Once connected, client requests typically fall into one of two categories:

  • Language events
  • RPC events

Language Events

A language event is a tsql batch sent from the client to the server. From ADO.NET perspective a tsql batch* is a set of tsql commands defined in the CommandText property of the SqlCommand Class. The command can be executed using one of the commands; ExecuteReader, ExecuteScalar, ExecuteNonQuery and ExectueXMLReader.

The commandType enum property of the SqlCommand Class specifies how the command string is interpreted. The Field values can be:

  • CommandType.StoredProcedure (contains the name of a stored procedure or a scalar function)
  • CommandType.Text (contains an tsql batch – one or more tsql statements, no GO directives)
  • CommandType.TableDirect (contains a table name. Only supported by the .NET Framework Data Provider for OLD DB)

NOTE: A tsql batch can be defined as one or more tsql statements before the GO* directive(batch terminator). This is specific to Microsoft SQL stack tools i.e SSMS, sqlcmd, osql. isql- Sybase It is not part of tSql language and may not be recognised by other similar tools. It is not permitted to use GO directive within CommandText property.
After a TDS request reaches SQL Server, the db  engine will create a TASK to handle the request. One task handles only one batch(a task may spawn a number of sub-tasks to support parallel executions). When ADO.NET “executes” a batch(one or more tsql commands), one TASK will handle the request. When SSMS executes i.e two batches(the same tsql command twice – GO 2, or two tsql statements divided by the GO terminator), Sql Server engine(SQLOS) allocates two separate TASKS for the job.

Non-parameterised batch request

This type of request does not have parameters. It can contain local variable declarations. The request is treated by Sql Server engine as an ad-hoc query.
Adhoc queries  can be cached and reused only if a subsequent batch matches exactly – exact textual match.  i.e 

The query plans above are  compiled and cashed separately. This behavior can lead to the problem known as “the cache pollution”.
Sql server optimiser may decide to automatically parameterise an ad-hoc query, but it’s very conservative when making those decisions. The query template must be considered to be safe, meaning that the same query plan doesn’t change even if the actual parameter values change – the paramterisation must not degrade query performances. Also, query engine decides on the auto-parameter data type (smallint in the example below). The datatype may be changed due to change in the input parameter size(PurshaseOrderId = 3420 can fit in 2bytes-smallint, but PurshaseOrderId = 210 fits into 1byte-tinyint). In that case the optimiser cashes another copy of the plan that includes the new data type.
In addition, many query elements disallow automatic parameterisation, such as; JOIN, IN, INTO, DISTINCT, TOP.. just to name a few.

Let’s execute the second query through a simple console application. The predicate value will be passed as a part of the batch.  Pass 707 as a command line argument; (ProductId = 707)
NOTE: We can also use SSMS to run the batch.

Figure 4, MS Profiler – tsql batch execution events 

The Client request is executed as a tsql batch. SQL:BatchStarting signals that the batch execution is starting.  SQL:StmtStarting event was fired for each statement in the batch(once in this example).

RPC Events

There are two basic ADO.NET client request types which invoke RPC(Remote Procedure Call) events

  • Parameterised batch request  – uses sys.sp_executesql Sql Server’s system extended stored proc
  • Stored procedure and  scalar user defined function call

NOTE: RPC(Remote Procedure Call) name came from Sybase and in context of the Client request represents a type of formatting of the query inside a TDS packet before it is sent to Sql Server. The requests are pre-formated by the driver/provider i.e OLEDB, ODBC,JDBC. ..etc
Windows RPC(Remote Procedure Call) is completely different technology which implements the interprocess communication(IPC) that allows applications to talk to each other locally or over the network. More on the RPC  here.

Parameterised batch request

Parameterised type of request does have parameter values passed to tSQL statements. Parameter input is treated as a literal value, not  as an executable code.
This type of request significantly improves the way Sql Server process the tsql commands. The sql batch  separates the “body” of the query from the data values managed by that query.
Lets change SqlCommand object from the previous example to use a parameterised tsql statement and see what happens.

Figure 5, MS Profiler – parameterised batch RPC event

This time ADO.NET constructs the request as a RPC(Remote Procedure Call) type. The client sends an RPC message data stream(TDS protocol level) to Sql server. The message contains: Procedure name( or in this case the ProcID) and the Parameters.  ProcID identifies Sql Server’s internal sp to be executed. In the example above ProcID = 10. The id value maps sys.sp_exeuctesql extended stored procedure in Sql Server.

Figure 6, sys.sp_executesql extended stored procedure

NOTE: We cannot use SSMS to invoke RPC event types since it’s not possible to send Parameters as a part of the TDS binary stream.

The parameterised batch request results with a cached plan, just like a regular stored procedure call.

Figure 7, parameterised batch request – cached plan

Stored procedure call

This type of RPC request is similar to the parameterised batch request. The main difference is that the commandType enum property of the SqlCommand Class is set to “StoredProcedure“, and the CommandText property value represents the name* of the stored procedure.

Quick note: In order to avoid cache misses/compile locks, use the two part naming convention when referencing  sql objects – dbo.uspGetPurchaseOrderDetails. More about sp caching can be found here.

Let’s encapsulate previous tsql batch in a stored proc…

and then modify CommandType and CommandText properties as follows…

Figure 8, MS Profiler – RPC Stored proc call

Using RPC: Starting and/or SP:Starting  events we can find out the value of the stored procedure’s input parameter(s).  It is interesting to notice the absence of the stored procedure’s ObjectId within the RPC call. The initial request doesn’t know whatID (object_id) is used by Sql Server to identify the stored procedure.

Not directly related to the topic but nevertheless interesting observation is the SP:CacheMiss event. The event describes the failed cache lookup for the stored procedure on the very first run as the plan was not generated and stored in the cache.

Now, let’s execute a  stored procedure that has a nested sp.

… add an output parameter to the code ..

From the Profiler’s trace below we can see that the only one RPC request was initiated for the top level stored procedure.

Figure 9, RPC request and nested stored procedures

From the trace above we can observe a few interesting things

  • Only the first, top level stored procedure call is initiated through the RPC event. This is the request initiated by ADO.NET.
  • The nested stored procedure call was captured through SP:Starting and SP:StmtStarting events. The input parameter values are not visible(more about this later).
  • RPC Output parameter* event is triggered AFTER the RPC:Completed event. This explains why we need to close Data reader object before retrieving the output parameter(see the c# code above)

NOTE: When the RPC is invoked, some or all of its parameters are designated as output parameters. All output parameters will have values returned from the server. For each output parameter, there is a corresponding return value, sent via the RETURNVALUE token. More about this can be found here.

How to capture nested stored procedure parameter values

In the example below, the nested stored procedure call was captured through the SP:Starting and SP:StmtStarting events. it is not possible to capture nested procedure(s) input parameter values only by examining this two events. Sql Profiler is not a debugging tool and it shows only what has been executed. This means that it can show only the parameters values that are initially passed from the data layer, not during the procedure execution.
So how can we capture a nested stored procedure input parameter values?  Well, we can get the values if

  • The top level stored proc parameter(which we can get through the RPC event) value is passed unchanged to the nested sp.
  • The nested stored proc parameters are passed as literals i.e  .. EXECUTE dbo.MyStoredProc @name='John Doe', @Age=42 ..

However, there is another, more interesting way to capture the parameter values.

We can include the Showplan XML Statistic Profile  event class to our trace. The event class triggers when Sql Server executes an SQL statement and shows complete, compile-time data, including the parameter values we are interested in.

Figure 10, Showplan XML Statistic Profile event class

We can export the plan(XML format) and then search for the ParameterList element.

Now we can analyse the parameters’ runtime/compiled values and the hunt for the parameter sniffing issues can begin 🙂

NOTE: Showplan XML Statistic Profile is an expensive event to track. It may be a good idea to use the server side tracking instead or the Profiler UI. ..

Conclusion

In my opinion,  It is beneficial for the Sql database developers to be familiar with the database access patterns from the client end. Knowing how different components i.e ADO.NET, OLEDB, ODBC, TDS etc. fit together can help when making design decisions. The way Client constructs its data requests dictates the database engine behavior i.e Sql Server treats non-parametersied queries as ad-hoc queries whereas it caches the execution plans for the parameterised queries the same way as it does for stored procedures. Understanding how different events( captured by Profiler or Extended events) relate to the client requests can help us find and understand the root causes  for the slow running queries, excessive locking etc. that may not be on the Sql Server side but on the data layer.

Thanks for reading

Dean Mincic