SQL Server Index Fragmentation Overview

We’ve all heard about database/index fragmentation (and if you haven’t, continue reading), but what is it? Is it an issue? How do I know if it resides in my database? How do I fix it? These questions could be a tip all in itself, but I’ll try to give you an idea of each in this post.

Without going into a lot of detail, SQL Server stores data on 8KB pages. When we insert data into a table, SQL Server will allocate one page to store that data unless the data inserted is more than 8KB in which it would span multiple pages. Each page is assigned to one table. If we create 10 tables then we’ll have 10 different pages.

As you insert data into a table, the data will go to the transaction log file first. The transaction log file is a sequential record meaning as you insert, update, and delete records the log will record these transactions from start to finish. The data file on the other hand is not sequential. The log file will flush the data to the data file creating pages all over the place.

Now that we have an idea of how data is stored, what does this have to do with fragmentation?

There are two types of fragmentation: Internal Fragmentation and External Fragmentation.

SQL Server Internal Fragmentation

SQL Server Internal Fragmentation is caused by pages that have too much free space. Let’s pretend at the beginning of the day we have a table with 40 pages that are 100% full, but by the end of the day we have a table with 50 pages that are only 80% full because of various delete and insert statements throughout the day. This causes an issue because now when we need to read from this table we have to scan 50 pages instead of 40 which should may result in a decrease in performance. Let’s see a quick and dirty example.

Let’s say I have the following table with a Primary Key and a non-clustered index on FirstName and LastName:

IndexFragmentationI’ll talk about ways to analyze fragmentation later in this tip, but for now we can right click on the index, click Properties, and Fragmentation to see fragmentation and page fullness. This is a brand new index so it’s at 0% fragmentation.

IndexFragmentation

Click here to view the rest of this post.

SQL Server Index Properties in Management Studio

Understanding indexes and how they work can be complicated enough for a Jr. DBA, but throw in all the different options and properties and an index can soon be overwhelming. In this post, I’ll discuss the different options available when creating a basic index.

To create an index using SSMS, expand the tree for a table and right click on Indexes and select New Index: (This tip will not discuss the anatomy of an index, but will focus on the properties.)

SQL Freelancer SQL Server Index Properties

After choosing the index type, you will notice that there a few tabs on the left side that will define how your index will act. These tabs will differ between SSMS versions but basically have the same information. In this tip, we’ll be using SQL Server 2012 SSMS.

The first tab, General, is where you can set the index name, the key columns, and the included columns (if any). This tab also shows the table name and index type you selected:

SQL Freelancer SQL Server Index Properties
The next tab, Options, is where you can view or modify the properties for the index:

SQL Freelancer SQL Server Index Properties

Let’s go over each property.

Index Properties Options Page

We will begin with the options page.

Auto recompute statistics

This property defines whether or not you want SQL Server to automatically update the index statistics. Best practice is to leave this option set to True, otherwise you will have to manually update the statistics.

According to Microsoft, statistics are considered outdated when the following happens:

  • The table size has gone from 0 to >0 rows.
  • The number of rows in the table when the statistics were gathered was 500 or less, and the column modification counters (colmodctr) of the leading column of the statistics object has changed by more than 500 since then.
  • The table had more than 500 rows when the statistics were gathered, and the colmodctr of the leading column of the statistics object has changed by more than 500 + 20% of the number of rows in the table when the statistics were gathered.

Outdated statistics can lead to performance problems.

As the link above states, the statistics auto update is triggered by query optimization or by execution of a complied plan, and it involves only a subset of the columns referred to in the query.

Ignore duplicate values

This property specifies where a duplicate key value can be inserted into the column that is part the index. If set to “True”, SQL Server will issue a warning when an INSERT statement is about to create a duplicate key and will ignore the duplicate row. If this option is set to “False”, SQL Server will issue an error message and rolls back the INSERT statement.

Example:

In this example (AdventureWorks2012.Person.Person), I have a Non-clustered, unique index. My key column is rowguid.

SQL Freelancer SQL Server Index Properties
I’ve set the Ignore duplicate values to “False”

SQL Freelancer SQL Server Index Properties

If I try to INSERT a duplicate value for rowguid, I get the following error:

Msg 2601, Level 14, State 1, Line 1 Cannot insert duplicate key row in object ‘Person.Person’ with unique index ‘AK_Person_rowguid’. The duplicate key value is (92c4279f-1207-48a3-8448-4636514eb7e2). The statement has been terminated.

If I change the Ignore duplicate values to “True” and try to INSERT a duplicate value for rowguid I get the following:

(0 row(s) affected)

As you can see, neither of these inserted the duplicate value because it was a UNIQUE index but one returned an error message and ended the statement while the other didn’t return an error. If I was inserting multiple records the first message would have rolled back the transaction while the second message would have inserted all the unique values and skipped over the unique record.

The default value for this option is “False” and can only be used on UNIQUE indexes.

Click here to view the rest of this post.