Capturing Index Usage Stats

Index usage stats are invaluable for evaluating performance of existing indexes. Every time an index is used by the query-engine, an internal table is updated reflecting that usage. So, every time a scan, seek, lookup, or update operating is completed, a counter is incremented for the affected index. Bravo, SQL Server 2008+! I can tell which indexes are being used, and which aren’t.

Problem is, they aren’t persisted by SQL Server across server reboots, and with Windows Update ensuring your Windows servers get rebooted at least monthly for security patches, the stats captured by SQL Server are likely to be of limited utility. If, like some of the systems I’ve worked on, you have an index-per-column, and you’re not certain which indexes can be dropped safely, you need a way to determine if any given index is being used, or if the system is updating it millions of times a day for no reason. It’s not enough to look at the sys.dm_db_index_usage_stats DMV for a couple of days, then dropping all the “unused” indexes. It’s really important, for business-critical systems, to ensure you take a longer-term look at index usage to ensure you don’t remove an index that is only used rarely, but when it is used is imperative to performance.

Do *you* want to be the one to explain to the CEO why that month-end process is suddenly taking three days instead of three hours?

I wrote a script that can be scheduled in SQL Server Agent to capture index usage stats into a permanent table allowing analysis across multiple months, and even years.

I created a database to hold the index usage stats:

CREATE DATABASE StatsDB;
ALTER DATABASE StatsDB SET RECOVERY SIMPLE;
BACKUP DATABASE StatsDB TO DISK = N'NUL:';
GO

I added the following script to a step inside a SQL Server Agent job. The job is scheduled to run once every 6 hours. Existing rows are replaced every 6 hours unless the server was just rebooted, in which case a new row is created, thereby creating the required history for the usage of each index:

USE StatsDB;

DECLARE @cmd nvarchar(max);
DECLARE @DatabaseName SYSNAME;
--this is the name of the database we're capturing index usage stats for:
SET @DatabaseName = N'Test'; 

IF OBJECT_ID(N'dbo.IndexUsageStats', N'U') IS NULL
BEGIN
CREATE TABLE dbo.IndexUsageStats
(
    DatabaseID int NOT NULL
    , ObjectID int NOT NULL
    , IndexID int NOT NULL
    , RunDateTime datetime NOT NULL
    , user_seeks bigint
    , user_scans bigint
    , user_lookups bigint
    , user_updates bigint
    , system_seeks bigint
    , system_scans bigint
    , system_lookups bigint
    , system_updates bigint
    , TotalAllocatedMB bigint
    , PartitionCount bigint
    , SQLServerStartTime datetime
    , RowLastUpdated datetime 
        CONSTRAINT IndexUsageStats_RowLastUpdated_DF
        DEFAULT (GETDATE())
    , CONSTRAINT IndexUsageStats_pk
        PRIMARY KEY
        CLUSTERED
        (DatabaseID, ObjectID, IndexID, RunDateTime)
        WITH (
            DATA_COMPRESSION = PAGE
            , ALLOW_ROW_LOCKS = ON
            , ALLOW_PAGE_LOCKS = ON
            , FILLFACTOR = 100
            )
) ON [PRIMARY];
END

SET @cmd = N'DECLARE @SQLServerStartTime datetime = (
        SELECT TOP(1) dosi.sqlserver_start_time 
        FROM sys.dm_os_sys_info dosi
    );

IF OBJECT_ID(N''tempdb..#RowsUpdated'', N''U'') IS NOT NULL
DROP TABLE #RowsUpdated;
CREATE TABLE #RowsUpdated
(
    RowNum int NOT NULL PRIMARY KEY CLUSTERED
);

IF OBJECT_ID(N''tempdb..#IndexUsageStats'', N''U'') IS NOT NULL
DROP TABLE #IndexUsageStats;
CREATE TABLE #IndexUsageStats
(
    RowNum int NOT NULL IDENTITY(1,1)
    , DatabaseID int NOT NULL
    , ObjectID int NOT NULL
    , IndexID int NOT NULL
    , RunDateTime datetime NOT NULL
    , user_seeks bigint
    , user_scans bigint
    , user_lookups bigint
    , user_updates bigint
    , system_seeks bigint
    , system_scans bigint
    , system_lookups bigint
    , system_updates bigint
    , TotalAllocatedMB bigint
    , PartitionCount bigint
    , SQLServerStartTime datetime
    , PRIMARY KEY
        CLUSTERED
        (DatabaseID, ObjectID, IndexID, RunDateTime)
        WITH (
            DATA_COMPRESSION = PAGE
            , ALLOW_ROW_LOCKS = ON
            , ALLOW_PAGE_LOCKS = ON
            , FILLFACTOR = 100
            )
);'

SET @cmd = @cmd + N'

INSERT INTO #IndexUsageStats (
      DatabaseID
    , ObjectID
    , IndexID
    , RunDateTime
    , user_seeks
    , user_scans
    , user_lookups
    , user_updates
    , system_seeks
    , system_scans
    , system_lookups
    , system_updates
    , TotalAllocatedMB
    , PartitionCount
    , SQLServerStartTime
)
SELECT DatabaseID       = us.database_id
    , ObjectID          = o.object_id
    , IndexID           = i.index_id
    , RunDateTime       = GETDATE()
    , user_seeks        = COALESCE(us.user_seeks, 0)
    , user_scans        = COALESCE(us.user_scans, 0)
    , user_lookups      = COALESCE(us.user_lookups, 0)
    , user_updates      = COALESCE(us.user_updates, 0)
    , system_seeks      = COALESCE(us.system_seeks, 0)
    , system_scans      = COALESCE(us.system_scans, 0)
    , system_lookups    = COALESCE(us.system_lookups, 0)
    , system_updates    = COALESCE(us.system_updates, 0)
    , TotalAllocatedMB  = SUM(au.total_pages) * 8192E0 / 1048576
    , PartitionCount    = COUNT(p.partition_id)
    , SQLServerStartTime = @SQLServerStartTime
FROM ' + QUOTENAME(@DatabaseName) + N'.sys.objects o 
    INNER JOIN ' + QUOTENAME(@DatabaseName) + N'.sys.indexes i ON o.object_id = i.object_id 
    INNER JOIN ' + QUOTENAME(@DatabaseName) + N'.sys.schemas sc ON o.schema_id = sc.schema_id
    INNER JOIN ' + QUOTENAME(@DatabaseName) + N'.sys.dm_db_index_usage_stats us ON us.object_id = o.object_id
        AND us.index_id = i.index_id
    LEFT JOIN ' + QUOTENAME(@DatabaseName) + N'.sys.partitions p ON o.object_id = p.object_id 
                    AND i.index_id = p.index_id
    LEFT JOIN ' + QUOTENAME(@DatabaseName) + N'.sys.allocation_units au ON (
                    ((au.type = 1 or au.type = 3) AND au.container_id = p.hobt_id) 
                    OR (au.type = 2 AND au.container_id = p.partition_id)
                    )
WHERE o.is_ms_shipped = 0
    AND us.database_id = ' + CONVERT(nvarchar(5), DB_ID(@DatabaseName)) + N'
GROUP BY us.database_id
    , o.object_id
    , i.index_id
    , COALESCE(us.user_seeks, 0)
    , COALESCE(us.user_scans, 0)
    , COALESCE(us.user_lookups, 0)
    , COALESCE(us.user_updates, 0)
    , COALESCE(us.system_seeks, 0)
    , COALESCE(us.system_scans, 0)
    , COALESCE(us.system_lookups, 0)
    , COALESCE(us.system_updates, 0)
ORDER BY o.object_id
    , i.index_id;'

SET @cmd = @cmd + N'

UPDATE dbo.IndexUsageStats
SET 
      RunDateTime = t.RunDateTime
    , user_seeks = t.user_seeks
    , user_scans = t.user_scans
    , user_lookups = t.user_lookups
    , user_updates = t.user_updates
    , system_seeks = t.system_seeks
    , system_scans = t.system_scans
    , system_lookups = t.system_lookups
    , system_updates = t.system_updates
    , TotalAllocatedMB = t.TotalAllocatedMB
    , PartitionCount = t.PartitionCount
    , RowLastUpdated = GETDATE()
OUTPUT t.RowNum INTO #RowsUpdated
FROM #IndexUsageStats t
    INNER JOIN dbo.IndexUsageStats us ON 
            t.DatabaseID = us.DatabaseID
        AND t.ObjectID = us.ObjectID
        AND t.IndexID = us.IndexID
        AND t.SQLServerStartTime = us.SQLServerStartTime
WHERE us.user_seeks <> t.user_seeks
    OR us.user_lookups <> t.user_lookups
    OR us.user_scans <> t.user_scans
    OR us.user_updates <> t.user_updates
    OR us.system_lookups <> t.system_lookups
    OR us.system_scans <> t.system_scans
    OR us.system_seeks <> t.system_seeks
    OR us.system_updates <> t.system_updates;'

SET @cmd = @cmd + N'

DELETE FROM #IndexUsageStats 
WHERE EXISTS (
        SELECT 1 
        FROM #RowsUpdated ru 
        WHERE ru.RowNum = #IndexUsageStats.RowNum
    );

INSERT INTO dbo.IndexUsageStats 
(
      DatabaseID
    , ObjectID
    , IndexID
    , RunDateTime
    , user_seeks
    , user_scans
    , user_lookups
    , user_updates
    , system_seeks
    , system_scans
    , system_lookups
    , system_updates
    , TotalAllocatedMB
    , PartitionCount
    , SQLServerStartTime
)
SELECT 
      DatabaseID
    , ObjectID
    , IndexID
    , RunDateTime
    , user_seeks
    , user_scans
    , user_lookups
    , user_updates
    , system_seeks
    , system_scans
    , system_lookups
    , system_updates
    , TotalAllocatedMB
    , PartitionCount
    , SQLServerStartTime
FROM #IndexUsageStats iust
WHERE NOT EXISTS (
    SELECT 1
    FROM dbo.IndexUsageStats ius
    WHERE ius.DatabaseID = iust.DatabaseID
        AND ius.ObjectID = iust.ObjectID
        AND ius.IndexID = iust.IndexID
        AND ius.SQLServerStartTime = iust.SQLServerStartTime
    );
    
';

EXEC sys.sp_executesql @cmd;
GO

I use the following query to get the list of index usage stats:

SELECT DatabaseName = 'Test'
    , ObjectName = o.name
    , IndexName = i.name
    , IndexType = i.type_desc
    , iug.SQLServerStartTime
    , iug.RowLastUpdated
    , iug.PartitionCount
    , iug.TotalAllocatedMB
    , iug.user_lookups
    , iug.user_scans
    , iug.user_seeks
    , iug.user_updates
    , iug.system_lookups
    , iug.system_scans
    , iug.system_seeks
    , iug.system_updates
FROM dbo.IDX_USG_STATS iug
    LEFT JOIN [Test].sys.objects o ON iug.ObjectID = o.object_id
    LEFT JOIN [Test].sys.indexes i ON iug.ObjectID = i.object_id 
                AND iug.IndexID = i.index_id
WHERE iug.DatabaseID = DB_ID(N'Test');
GO

If you have SQL Server 2016+, I’d suggest looking into using SQL Server Query Store since it offers a lot more functionality. However, the script approach I show above is extremely light-weight, and creates only a tiny extra load on the live SQL Server instance. Query Store, not so much.