Tuesday, May 24, 2011

SQL Server : EXEC and sp_executesql - how are they different?



I wanted to highlight how to avoid SQL Injection when using EXEC. A few people brought up the point that I could have avoided SQL Injection by using sp_executesql. And, while sp_executesql can be used in some cases, it's not always ideal from a performance perspective. So, to really show this, I'm going to start by focusing on the similarities and differences between EXEC and sp_executesql. I'll start here with some performance details, but you'll find that I'm only going to hit the tip of the iceberg with this one. I'll definitely need another post or two!

First, a quick overview:

sp_executesql (also known as "Forced Statement Caching")

  • Allows for statements to be parameterized
  • Only allows parameters where SQL Server would normally allow parameters; however, the statement string itself can still be built dynamically. I'll give more details on this in additional posts.
  • Has strongly typed variables/parameters - and this can reduce injection and offer some performance benefits!
  • Creates a plan on first execution (similar to stored procedures) and subsequent executions reuse this plan

EXEC  (also known as "Dynamic String Execution" or DSE)

  • Allows *any* construct to be built
  • Treats the statement similarly to an ad hoc statement. This means that the statement goes through the same process that ad hoc statements do - it is parsed, possibly parameterized, and possibly deemed "safe" so that subsequent executions can reuse the plan
  • Does not have strongly typed parameters in the ad hoc statement and therefore can cause problems when the statements are executed (I have ways around this)
  • Does not force a plan to be cached.
    • This can be a pro in that SQL Server can create a plan for each execution
    • This can be a con in that SQL Server needs to recompile/optimize for each execution (a quick sketch of a dynamic string execution follows this list)
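
To make the contrast concrete, here's a minimal sketch of that same LIKE query written as a dynamic string execution. This is only a sketch - it assumes the same dbo.member table used in the examples below. Notice that the search value ends up embedded in the statement text rather than passed as a typed parameter; that's exactly where the injection risk (and the compile on every execution) comes from, and it's why I wrap the value with QUOTENAME here:

DECLARE @ExecStr nvarchar(4000), @lastname varchar(15)
SELECT @lastname = 'Tripp'
-- The value is concatenated into the statement text (no typed parameter).
-- QUOTENAME delimits it with single quotes (doubling any embedded quotes)
-- to reduce the injection exposure discussed above.
SELECT @ExecStr =
'SELECT * FROM dbo.member WHERE lastname LIKE ' + QUOTENAME(@lastname, '''')
EXEC (@ExecStr)
go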

Let's start with using sp_executesql to parameterize a query where SQL Server would also allow parameters:

DECLARE @ExecStr nvarchar(4000)
SELECT @ExecStr =
'SELECT * FROM dbo.member WHERE lastname LIKE @lastname'
EXEC sp_executesql @ExecStr, N'@lastname varchar(15)',
'Tripp'
go

Because 'Tripp' is a highly selective name, SQL Server uses an index to look up the data:

So, for our next execution, I'll supply a different lastname - a lastname of Anderson. In this database, Anderson is NOT highly selective:

DECLARE @ExecStr nvarchar(4000)
SELECT @ExecStr =
'SELECT * FROM dbo.member WHERE lastname LIKE @lastname'
EXEC sp_executesql @ExecStr, N'@lastname varchar(15)',
'Anderson'
go

However, the query plan looks exactly the same:

Or does it? It turns out that this query plan does look a tiny bit different but it's not very obvious... it's in the thickness of the lines. In this second execution it clearly shows that more data is being passed between the steps. But, is this a bad thing? Maybe, maybe not. Let's drill in a bit deeper. If I hover over the Index Seek (on member.test), I can see the following tooltip:

The key point here is that it shows an "Estimated Number of Rows" of 1.96 but an "Actual Number of Rows" of 385. That's pretty far off... why? Because this statement's plan was determined by the first execution of sp_executesql. Let's try another execution:

Because the query has LIKE in it, we can use wildcards. And, let's do that! This time I'll supply a wildcard of %e%:

DECLARE @ExecStr nvarchar(4000)
SELECT @ExecStr =
'SELECT * FROM dbo.member WHERE lastname LIKE @lastname'
EXEC sp_executesql @ExecStr, N'@lastname varchar(15)',
'%e%'
go

And, again, the query plan looks the same... but with even thicker lines. Yes, I realize... this is not blindingly obvious:

And, with SET STATISTICS IO ON, we can also review the logical I/Os performed.
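
Capturing these numbers only takes a session setting (a quick sketch - just re-run each of the three sp_executesql calls above with it on, and the reads appear on the Messages tab in Management Studio):

SET STATISTICS IO ON
go
-- now re-run each of the three sp_executesql executions above;
-- each one reports its logical reads against the member table

Here's what each execution reports: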

For 'Tripp' 

Table 'member'. Scan count 1, logical reads 8, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

For 'Anderson'

Table 'member'. Scan count 1, logical reads 772, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

For '%e%'

Table 'member'. Scan count 1, logical reads 10019, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
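
To put those numbers in perspective, it helps to know how big the table actually is. Here's one way to check (a sketch using sys.dm_db_partition_stats; sp_spaceused would work just as well):

-- data pages per index for dbo.member (requires VIEW DATABASE STATE)
SELECT index_id, in_row_data_page_count, used_page_count
FROM sys.dm_db_partition_stats
WHERE object_id = OBJECT_ID(N'dbo.member')
go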

This table only has 144 pages. The cached seek plan does a lookup for every row it finds, so for the less selective values the I/O far exceeds the size of the table - in the cases of 'Anderson' and '%e%', SQL Server would have been better off doing a table scan. But, because I used sp_executesql, I forced SQL Server to cache the plan from the first execution. Subsequent executions reuse that plan whether it's optimal or not.
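
If you want to see that all three executions really did share one cached plan, here's one way to check (a sketch using the plan-cache DMVs; requires VIEW SERVER STATE). You should see a single row for the parameterized statement with an execution_count of 3:

-- one row per cached statement; execution_count shows the reuse
SELECT qs.execution_count,
       qs.total_logical_reads,
       st.[text]
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
WHERE st.[text] LIKE N'%dbo.member WHERE lastname LIKE%'
AND st.[text] NOT LIKE N'%dm_exec_query_stats%'   -- exclude this query itself
go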

And, there are even more complicated scenarios than this. I plan to keep tackling these issues over the next few days and I'll add quite a bit more to this. However, it's late. So, I'll dangle the carrot for now. Within the next couple of days, I'll show a series of EXEC statements that generate both SAFE and UNSAFE plans.


