Correct SQL Server TempDB Spills in Query Plans Caused by Outdated Statistics

Statistics are an integral part of SQL Server and query performance. In short, the query optimizer uses statistics to create query plans that improve the overall performance of the queries you run. Each statistics object is created on a list of one or more table columns and includes a histogram displaying the distribution of values in the first column. The histogram can have a maximum of 200 steps, regardless of the number of rows in the column or index.
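If you want to see what one of these histograms looks like, DBCC SHOW_STATISTICS will display it. Here's a minimal sketch against the Person.Person table used below (the statistics name is only an example; any statistics object on the table will do):

-- Show the histogram (up to 200 steps) for a statistics object
-- The statistics/index name is only an example; substitute one that exists on your table
DBCC SHOW_STATISTICS ('Person.Person', 'IX_Person_LastName_FirstName_MiddleName') WITH HISTOGRAM;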

In this post we’ll take a look at one specific performance issue that you might find in a query’s execution plan. If you’ve ever noticed the spill-to-TempDB warning in an execution plan, then this post is for you.

Within the AdventureWorks2014 DB, I’ll use the following query for my example:

SELECT BusinessEntityID, FirstName, LastName, EmailPromotion
FROM [AdventureWorks2014].[Person].[Person]
WHERE EmailPromotion > 0
ORDER BY LastName

Looking at this query, I can already tell it could benefit from a covering index, so I’ll go ahead and add one:

CREATE NONCLUSTERED INDEX [IX_Person_EmailPromotion_INCLUDES]
ON [Person].[Person] ([EmailPromotion])
INCLUDE ([BusinessEntityID],[FirstName],[LastName])

When the index above was created, statistics for it were automatically created and updated. Since adding this index, I’ve added a few thousand rows to the Person table.
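Before running the query, it's worth checking how stale the statistics on the table have become. A quick way to do that (a sketch that assumes SQL Server 2008 R2 SP2 or later, which is when sys.dm_db_stats_properties was added) is:

-- Check when each statistics object on Person.Person was last updated
-- and how many modifications have accumulated since then
SELECT s.name AS stats_name,
       sp.last_updated,
       sp.rows,
       sp.rows_sampled,
       sp.modification_counter
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id = OBJECT_ID('Person.Person');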

Let’s run the query and make sure the “Include Actual Execution Plan” button is selected.


After the query executes let’s take a look at the execution plan by clicking on the tab in the Results pane:

These warnings were added in SQL Server Management Studio 2012, so if you’re using an older version you may not see them. The spill-to-TempDB warning means that the query was not granted enough memory to finish the operation and spilled over into TempDB to complete it. We all know reading from memory is much faster than reading from disk, and that’s exactly what is happening here: the query did as much work as it could in memory before moving the rest of the operation over to TempDB on disk.
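One common way to address this kind of spill (a sketch, and not necessarily the exact steps the full post walks through) is to refresh the statistics so the optimizer's row estimate, and therefore the memory grant, accounts for the newly added rows:

-- Refresh the statistics behind the covering index so the memory grant is sized
-- for the current row count; FULLSCAN is optional but gives the most accurate histogram
UPDATE STATISTICS [Person].[Person] [IX_Person_EmailPromotion_INCLUDES] WITH FULLSCAN;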

Click here to view the rest of this post.

SQL Server 2016 Compare Showplan Option

Just released in SQL Server 2016 CTP 2.4 is a new feature that allows us to compare execution plans. The main purpose of this feature is to provide a side-by-side comparison of two execution plans, which makes it easier to find similarities and changes in each plan. This is a very nice enhancement that aids in troubleshooting issues such as understanding the impact of rewriting a query or observing how a design change, such as a new index, affects the plan. In the tip below, we’ll go over how it works.

For this post, I’m using the AdventureWorks2014 database and I’m going to use the following query:

SELECT FirstName, LastName, CreditCardID 
FROM Person.Person p 
INNER JOIN Sales.PersonCreditCard c ON p.BusinessEntityID = c.BusinessEntityID
WHERE p.ModifiedDate > '2014-01-01'

Open SQL Server Management Studio (SSMS) 2016 and execute the query with SET STATISTICS IO ON and “Include Actual Execution Plan“ enabled.

Click the Messages tab and you’ll notice that we’re doing 3,819 logical reads on the Person table. This is a sign that this query can be tuned; we’d like to get this number as low as possible.


Click the execution plan tab and you’ll see the plan along with some Missing Index Details.

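The Missing Index Details block gives a good hint about where the tuning will go. For this query it will look roughly like the following (a hypothetical sketch based on the WHERE and SELECT columns; the name and exact column list suggested in your plan may differ):

-- Hypothetical covering index for the ModifiedDate filter; adjust to match the plan's suggestion
CREATE NONCLUSTERED INDEX IX_Person_ModifiedDate_INCLUDES
ON Person.Person (ModifiedDate)
INCLUDE (FirstName, LastName);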

The execution plan is what we’ll focus on in this tip, but I wanted to look at STATISTICS IO also to see if we can improve the number of logical reads.

To be able to compare plans, we’ll need to save this execution plan. Right click anywhere in the plan window and select Save Execution Plan As…, then choose a name and location and hit Save.

Click here to view the rest of this post.

Creating SQL Server performance based reports using Excel

I’m not a big fan of Performance Monitor, but in this post we will review the steps for exporting the results so they can be read more easily in Excel. This post will not discuss how to set up a data collector set and will assume the reader already has results saved to the file system in the .blg file format.

If you’ve ever used PerfMon, you’ve probably noticed .blg files. These are the files that hold all of the performance data collected, in this case over the past week. When you open one of these files in PerfMon, it can be tedious work to get the data into a presentable form. Actually, I don’t know if there’s a good way to do it at all using PerfMon’s GUI.


The first thing we need to do is convert the .blg file to .csv so we can open it in Excel. To convert the file, open a command prompt, navigate to the directory where the file is located, and type:

Relog SQL_BASELINE_20140128.blg -f CSV -o NewFile.csv

Once the command completes successfully, you should see the new file in the same directory:


Open NewFile.csv in Excel:


Not pretty, huh? The first thing we need to do is format the first column, which is the date column. Change the text in cell A1 from “PDH-CSV 4.0…” to “DateTime”:


Next, remove Row 2; we do not need this data. Then highlight column A, right click anywhere in the column, and choose Format Cells:


On the Number tab, click the Date category and select the 3/14/2012 1:30 PM format:


Press Ctrl+Home to select the A1 cell.

Create Pivot Chart

Click on the Insert tab and select Pivot Chart:


The Pivot Chart dialog box should automatically select the correct cells to analyze. If not, make sure all of the cells are selected:


Once the new worksheet opens, drag DateTime to the Axis Fields pane:


Now, depending on what counter(s) you want to analyze, drag it down to the Values pane. For this example, I’ll analyze Processor(_Total)\% Processor Time:


Ta da! You now have a graph that displays your Processor %.


To make this presentable, simply format the graph to your liking and you’re done.

SQL Server Index Properties in Management Studio

Understanding indexes and how they work can be complicated enough for a Jr. DBA, but throw in all the different options and properties and an index can soon become overwhelming. In this post, I’ll discuss the different options available when creating a basic index.

To create an index using SSMS, expand the tree for a table and right click on Indexes and select New Index: (This tip will not discuss the anatomy of an index, but will focus on the properties.)


After choosing the index type, you will notice that there are a few tabs on the left side that define how your index will act. These tabs differ between SSMS versions but contain basically the same information. In this tip, we’ll be using SSMS 2012.

The first tab, General, is where you can set the index name, the key columns, and the included columns (if any). This tab also shows the table name and index type you selected:

The next tab, Options, is where you can view or modify the properties for the index:


Let’s go over each property.

Index Properties Options Page

We will begin with the options page.

Auto recompute statistics

This property defines whether or not you want SQL Server to automatically update the index statistics. Best practice is to leave this option set to True; otherwise, you will have to update the statistics manually.

According to Microsoft, statistics are considered outdated when the following happens:

  • The table size has gone from 0 to >0 rows.
  • The number of rows in the table when the statistics were gathered was 500 or less, and the column modification counter (colmodctr) of the leading column of the statistics object has changed by more than 500 since then.
  • The table had more than 500 rows when the statistics were gathered, and the colmodctr of the leading column of the statistics object has changed by more than 500 + 20% of the number of rows in the table when the statistics were gathered.

Outdated statistics can lead to performance problems.

As the link above states, the statistics auto update is triggered by query optimization or by execution of a compiled plan, and it involves only a subset of the columns referred to in the query.
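For reference, this SSMS property corresponds to the STATISTICS_NORECOMPUTE option in T-SQL. A minimal sketch (the index name and column here are just illustrative):

-- "Auto recompute statistics = True" in SSMS is the same as STATISTICS_NORECOMPUTE = OFF in T-SQL
CREATE NONCLUSTERED INDEX IX_Person_LastName_Demo  -- illustrative name and column
ON Person.Person (LastName)
WITH (STATISTICS_NORECOMPUTE = OFF);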

Ignore duplicate values

This property specifies whether a duplicate key value can be inserted into the column that is part of the unique index. If set to “True”, SQL Server will issue a warning when an INSERT statement is about to create a duplicate key and will ignore the duplicate row. If set to “False”, SQL Server will issue an error message and roll back the INSERT statement.

Example:

In this example (AdventureWorks2012.Person.Person), I have a Non-clustered, unique index. My key column is rowguid.

I’ve set the Ignore duplicate values to “False”


If I try to INSERT a duplicate value for rowguid, I get the following error:

Msg 2601, Level 14, State 1, Line 1 Cannot insert duplicate key row in object ‘Person.Person’ with unique index ‘AK_Person_rowguid’. The duplicate key value is (92c4279f-1207-48a3-8448-4636514eb7e2). The statement has been terminated.

If I change the Ignore duplicate values to “True” and try to INSERT a duplicate value for rowguid I get the following:

(0 row(s) affected)

As you can see, neither of these inserted the duplicate value because it is a UNIQUE index, but one returned an error message and ended the statement while the other didn’t return an error at all. If I were inserting multiple records, the first setting would have rolled back the entire insert, while the second setting would have inserted all the unique values and skipped over the duplicate record.

The default value for this option is “False”, and the option can only be used on UNIQUE indexes.
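In T-SQL, this property maps to the IGNORE_DUP_KEY index option. A minimal sketch (the index name here is hypothetical; AdventureWorks already ships with AK_Person_rowguid on this column):

-- With IGNORE_DUP_KEY = ON, duplicate inserts are discarded with a warning instead of raising an error
CREATE UNIQUE NONCLUSTERED INDEX AK_Person_rowguid_IgnoreDup  -- hypothetical name
ON Person.Person (rowguid)
WITH (IGNORE_DUP_KEY = ON);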

Click here to view the rest of this post.

SQL Server Integration Services Data Type Conversion Testing

There are two ways of converting data types within SQL Server Integration Services (SSIS).  Here is an overview of those options:

  • One way is using T-SQL code: the CAST function can convert one data type into another. This tip is not intended to go into detail about the CAST function, but I will show a quick example right after this list.
  • The second way is using the Data Conversion Transformation task. This task will convert the data in an input column to a different data type and then copy it to the output column. The task itself isn’t too complicated to set up, as I’ll show you in this tip.
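Here is that quick CAST example (a sketch against the same Sales.CurrencyRate table used later in this tip), converting datetime columns to date:

-- Convert the datetime columns to date in plain T-SQL
SELECT CurrencyRateID,
       CAST(CurrencyRateDate AS date) AS CurrencyRateDate,
       CAST(ModifiedDate AS date) AS ModifiedDate
FROM Sales.CurrencyRate;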

SQL Server Integration Services Data Conversion Task to Convert Data Types

Let’s look at the Data Conversion task first…

First, open Visual Studio (or Business Intelligence Development Studio if you’re using a version prior to SQL Server 2012) and create an SSIS project. Next, we can configure a Connection Manager to our database. Right click Connection Managers in Solution Explorer and choose New Connection Manager:

Choose your Connection Manager type; in this example, we’ll use OLE DB. Next, configure the Connection Manager to point to your data source. In this example, I’ll use localhost and the AdventureWorks2008R2 database:
Test the connection and click OK. Next, drag a Data Flow task from the SSIS toolbox onto the design screen:

Right click the Data Flow task and choose Edit. You are now inside the data flow task. This is where all the action happens. Drag an OLEDB source task from the SSIS toolbox to the design screen:

Right click the OLE DB Source task and choose Edit. This screen is where we will select the Connection Manager we created earlier. Under OLE DB connection manager, choose the connection you created. Leave the data access mode as Table or view, and set the name of the table or the view to the table that contains the columns you want to convert. In this example, I’ll use the Sales.CurrencyRate table:

To preview the data click Preview. In my example, I want to change the CurrencyRateDate and the ModifiedDate columns from a datetime data type to a date data type:

Click OK to close the OLEDB Source task. Drag the Data Conversion Transformation task onto the design screen. Connect the OLEDB Source task to the Data Conversion task:

Right click the Data Conversion task and choose Edit. Here is where we will convert our data types. Since I am converting CurrencyRateDate and ModifiedDate I will click on each of them in the Available Input Columns list:


Click here to view the rest of this post.

SQL Server Simple and Forced Parameterization

There are two different parameterization options that one can use in SQL Server: simple parameterization and forced parameterization. Let’s discuss each in a little more detail.

Simple Parameterization

When you execute a SQL statement without parameters, SQL Server internally will add parameters where needed so that it can try to reuse a cached execution plan. For example, if you look at the execution plan of the following statement you will see that SQL Server changes the WHERE value to a parameter (@1):

SELECT * FROM AdventureWorks2012.Sales.CreditCard WHERE CreditCardID = 11


SQL Server builds this execution plan as if a parameter was the input instead of the number 11. Because of this parameterization, the following two statements show an example of SQL Server reusing the same execution plan even though the data results are different:

SELECT * FROM AdventureWorks2012.Sales.CreditCard WHERE CreditCardID = 11
SELECT * FROM AdventureWorks2012.Sales.CreditCard WHERE CreditCardID = 207


This is the default behavior, known as simple parameterization; however, it only applies to a relatively small class of queries. If you want all of your queries parameterized, you will need to enable forced parameterization.

Forced Parameterization

With forced parameterization, the database engine parameterizes any literal value that appears in a SELECT, UPDATE, INSERT, or DELETE statement submitted in any form, with a few exceptions. Refer to this article for a list of those exceptions.

Some applications use queries that pass in literals as opposed to stored procedures that pass in parameters. For these types of applications, you may want to experiment with enabling forced parameterization to see if it has a positive effect on the workload by reducing query compilations.
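Forced parameterization is a database-level setting. A minimal sketch of turning it on and back off (swap in your own database name and test the change against a non-production workload first):

-- Turn forced parameterization on for a database
ALTER DATABASE AdventureWorks2012 SET PARAMETERIZATION FORCED;

-- Revert to the default behavior if it doesn't help
ALTER DATABASE AdventureWorks2012 SET PARAMETERIZATION SIMPLE;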

Running the following query under simple parameterization produces an execution plan where the WHERE clause is not parameterized:

Click here to view the rest of this post.

Detecting CPU Pressure

There are a few different ways to detect CPU pressure, for example, Performance Monitor or SQL Profiler, but in this tip we’ll discuss the wait type SOS_SCHEDULER_YIELD and the DMV sys.dm_os_schedulers.

Some of you might have seen the wait type SOS_SCHEDULER_YIELD and wondered what in the world it is. Books Online (BOL) describes this wait type as:

Occurs when a task voluntarily yields the scheduler for other tasks to execute. During this wait the task is waiting for its quantum to be renewed.

SOS_SCHEDULER_YIELD is a fairly common wait type and occurs when there is CPU pressure. SQL Server runs multiple threads and tries to allow all of them to run without problems. However, when the schedulers are busy, a thread that has used up its quantum voluntarily yields the scheduler to another thread, which in turn produces the SOS_SCHEDULER_YIELD wait type.

You can see the top wait stats and the percentage of waits at the current point in time by running the following query by Tim Ford and Glenn Berry.

WITH Waits AS 
( 
SELECT 
wait_type, 
wait_time_ms / 1000. AS wait_time_s, 
100. * wait_time_ms / SUM(wait_time_ms) OVER() AS pct, 
ROW_NUMBER() OVER(ORDER BY wait_time_ms DESC) AS rn 
FROM sys.dm_os_wait_stats 
WHERE wait_type 
NOT IN 
('CLR_SEMAPHORE', 'LAZYWRITER_SLEEP', 'RESOURCE_QUEUE', 
'SLEEP_TASK', 'SLEEP_SYSTEMTASK', 'SQLTRACE_BUFFER_FLUSH', 'WAITFOR', 
'CLR_AUTO_EVENT', 'CLR_MANUAL_EVENT') 
) -- filter out additional irrelevant waits 
SELECT W1.wait_type, 
CAST(W1.wait_time_s AS DECIMAL(12, 2)) AS wait_time_s, 
CAST(W1.pct AS DECIMAL(12, 2)) AS pct, 
CAST(SUM(W2.pct) AS DECIMAL(12, 2)) AS running_pct 
FROM Waits AS W1 
INNER JOIN Waits AS W2 ON W2.rn <= W1.rn 
GROUP BY W1.rn, 
W1.wait_type, 
W1.wait_time_s, 
W1.pct 
HAVING SUM(W2.pct) - W1.pct < 95; -- percentage threshold;

After running this query on one of my servers, I can see that only 2.22% of my waits are due to CPU pressure, so I can rule that out as a cause of poor performance. (These numbers are cumulative since the last SQL Server service restart.)


If you see a high percentage of this wait type then one way of troubleshooting this is to use the DMV sys.dm_os_schedulers. BOL describes sys.dm_os_schedulers as:

Returns one row per scheduler in SQL Server where each scheduler is mapped to an individual processor. Use this view to monitor the condition of a scheduler or to identify runaway tasks.

This DMV has several columns, including the number of workers, the number of active tasks, and the status of each scheduler, which can help troubleshoot certain problems, but the most important column for measuring queued tasks is runnable_tasks_count. This column displays the number of tasks waiting in the runnable queue for CPU time, which is what produces the SOS_SCHEDULER_YIELD wait type. If this column is frequently greater than 0, CPU pressure may be present and blocking may occur.

To see an average of current tasks and current waiting tasks you can use the following query:

SELECT AVG(current_tasks_count) AS [Avg Current Task], 
AVG(runnable_tasks_count) AS [Avg Wait Task]
FROM sys.dm_os_schedulers
WHERE scheduler_id < 255
AND status = 'VISIBLE ONLINE'
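
If the averages look high, a per-scheduler breakdown of the same DMV (a quick sketch) can show whether runnable tasks are piling up on particular schedulers:

-- Schedulers that currently have tasks waiting in the runnable queue
SELECT scheduler_id,
       current_tasks_count,
       runnable_tasks_count,
       work_queue_count
FROM sys.dm_os_schedulers
WHERE scheduler_id < 255
AND runnable_tasks_count > 0;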

Click here to view the rest of this post.

Client Statistics

A feature often overlooked while tuning queries is Client Statistics, located right on the SSMS toolbar.

Other ways of opening Client Statistics include:

SHIFT + ALT + S, or from the menu bar: Query -> Include Client Statistics


Client Statistics is useful when the user needs to gather information about execution times, processing times, the amount of data sent between client and server, etc. It’s very easy to use: simply turn it on using one of the methods above and execute your query.

In this example I’m using the following query:

SELECT pc.FirstName
,pc.LastName
,pc.EmailAddress
,he.Title
,hh.Rate
FROM Person.Contact pc
JOIN HumanResources.Employee he ON pc.ContactID = he.ContactID
JOIN HumanResources.EmployeePayHistory hh ON he.EmployeeID = hh.EmployeeID
WHERE hh.Rate > 10
ORDER BY hh.Rate

After executing the query you will notice a new tab labeled Client Statistics:


Looking at the statistics provided for Trial 1, you can see the processing time = 39 ms, the execution time = 46 ms, etc.

After running the same query again, I get the following results under Trial 2:

This time the processing time went up to 45 ms and the execution time went up to 51 ms. There is also an average column to compare overall results. The green and red arrows represent differences between trials: green arrows indicate improving statistics and red arrows indicate degrading statistics.

A maximum of 10 trials can be kept, with the 11th trial dropping the 1st, and so on. To reset the statistics, go to Query, Reset Client Statistics on the menu bar.

This is great when tuning indexes because you can see how the statistics change as you add or remove indexes.

Statistics IO and Statistics TIME

STATISTICS IO and STATISTICS TIME can help with performance tuning at a granular level and save time compared to execution plans, SQL Server Profiler, etc. Let’s look at an example of these two commands.

There are a few different ways to turn these commands on inside SQL Server Management Studio. If you want every query window to open with statistics enabled, you can go to Tools, Options, Query Execution, SQL Server, Advanced and check SET STATISTICS TIME and SET STATISTICS IO:

Most of the time you will just want to use statistics for a certain troublesome query. To turn these commands on for a specific query window, open the query window and go to Query, Query Options, Execution, Advanced and check SET STATISTICS TIME and SET STATISTICS IO:

Or you can simply use T-SQL and run the following to turn statistics on:

SET STATISTICS IO ON
SET STATISTICS TIME ON

And to turn them off, simply use:

SET STATISTICS IO OFF
SET STATISTICS TIME OFF

In this example I’m going to run a query from AdventureWorks using statistics:

SET STATISTICS IO ON
SET STATISTICS TIME ON

SELECT pc.FirstName
,pc.LastName
,pc.EmailAddress
,he.Title
,hh.Rate
FROM Person.Contact pc
JOIN HumanResources.Employee he ON pc.ContactID = he.ContactID
JOIN HumanResources.EmployeePayHistory hh ON he.EmployeeID = hh.EmployeeID
WHERE hh.Rate > 10
ORDER BY hh.Rate

After the query finishes, it will show results as normal, but if you click on the Messages tab you will see some extra information:


You will see IO information at the top:

Scan Count – Number of index or table scans
Logical Reads – Number of pages read from the data cache
Physical Reads – Number of pages read from disk
Read-Ahead Reads – Number of pages placed into the cache for the query
LOB Logical Reads – Number of text, ntext, image, or large value pages read from the data cache
LOB Physical Reads – Number of text, ntext, image, or large value type pages read from disk
LOB Read-Ahead Reads – Number of text, ntext, image, or large value type pages placed into the cache for the query

Execution TIME information will be directly below:

CPU Time: How much CPU time the query used
Elapsed Time: How long the query took from start to finish, including any time spent waiting

This statistics output, in general, is an excellent starting point for seeing why you might have a poorly performing query. You can gather results such as execution time, compile time, reads, writes, cost, etc.

SQL Server performance tuning for each layer of an application

Performance tuning is a big subject, and there are a lot of different pieces to troubleshooting a poorly performing database or application. I like to use the five-level process shown below:

  1. Server Hardware
  2. Operating System
  3. SQL Server
  4. Database
  5. Application


Hardware

When troubleshooting poor performance, don’t always assume something is wrong with the database itself; the problem often lies deeper. We’ll start with hardware: if the hardware isn’t up to par, your OS, database, and application will suffer.

One of the best tools for monitoring hardware is the set of counters that are part of Performance Monitor, or PerfMon for short. I’m not going to go into how to use PerfMon, which can be covered in a different tip, but I will list some of the most important counters to watch along with a description of what they measure.

Network Counters:

  • Network Interface: Bytes Total/sec – Bytes Total/sec is the rate at which bytes are sent and received over each network adapter, including framing characters. Network Interface\Bytes Total/sec is a sum of Network Interface\Bytes Received/sec and Network Interface\Bytes Sent/sec. This value should be pretty low.
  • Network Interface: Output Queue Length – Output Queue Length is the length of the output packet queue (in packets). If this is longer than two, there are delays and the bottleneck should be found and eliminated, if possible. Since the requests are queued by the Network Driver Interface Specification (NDIS) in this implementation, this should always be 0.
  • Network Interface: Packets Outbound Errors – Packets Outbound Errors is the number of outbound packets that could not be transmitted because of errors. This value should stay at 0 also.

Click here to view the rest of this post.