CHECKPOINT and Simple Recovery Model

2019-06-27 · by Hannah Vernon · in Internals, troubleshooting

This article shows how checkpoint and simple recovery model works. For databases configured with the “simple” recovery model, it’s important to understand how checkpoints work¹ under simple recovery model to avoid running out of log space. Checkpointing is the process of ensuring changes recorded in memory have been committed to the data file(s) in the affected database. Microsoft defines the checkpoint process like this:

… the Database Engine performs modifications to database pages in memory-in the buffer cache-and does not write these pages to disk after every change. Rather, the Database Engine periodically issues a checkpoint on each database. A checkpoint writes the current in-memory modified pages (known as dirty pages) and transaction log information from memory to disk…

The CHECKPOINT command can be used to manually force a checkpoint, which can be useful when rolling back a transaction if you run out of log space.

Adriaen Van Ostade – Winter Landscape with Ice Skaters – so much more fun than checkpoint and simple recovery!

The code in this post shows how rolling back a transaction may not actually free the used log space, even under the simple recovery model. This is true even for indirect checkpoints issued by SQL Server 2016 and newer.

Monitoring for Checkpoints

Since we’re interested in the checkpoint process, you might want to create an Extended Events session to track checkpoints as detailed in this code.

Once you setup the checkpoint tracking extended events session, you can use the built-in GUI interface to watch the live session data. Alternately, use the code in this post to query the ring buffer for events.

The test-bed

We need to create a new, empty database for use with our minimal, complete, and verifiable example code:

USE master;
IF DB_ID(N'LogSpaceTest') IS NOT NULL
BEGIN
    ALTER DATABASE LogSpaceTest SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
    DROP DATABASE LogSpaceTest;
END
CREATE DATABASE LogSpaceTest
ON (NAME = N'LogSpaceTest_Primary'
    , FILENAME = N'C:\data\LogSpaceTest_Primary.mdf'
    , SIZE = 8000MB
    , FILEGROWTH = 100MB
    , MAXSIZE = 10000MB)
LOG ON (NAME = N'LogSpaceTest_Log'
    , FILENAME = N'C:\log\LogSpaceTest_log.ldf'
    , SIZE = 8000MB
    , FILEGROWTH = 1000MB
    , MAXSIZE = 8000MB);

USE master;

IF DB_ID(N'LogSpaceTest') IS NOT NULL

BEGIN

ALTER DATABASE LogSpaceTest SET SINGLE_USER WITH ROLLBACK IMMEDIATE;

DROP DATABASE LogSpaceTest;

END

CREATE DATABASE LogSpaceTest

ON (NAME = N'LogSpaceTest_Primary'

, FILENAME = N'C:\data\LogSpaceTest_Primary.mdf'

, SIZE = 8000MB

, FILEGROWTH = 100MB

, MAXSIZE = 10000MB)

LOG ON (NAME = N'LogSpaceTest_Log'

, FILENAME = N'C:\log\LogSpaceTest_log.ldf'

, SIZE = 8000MB

, FILEGROWTH = 1000MB

, MAXSIZE = 8000MB);

In order to ensure the database is actually running in simple recovery model, we need to set the recovery model and perform a database backup. Since this is a test database that we don’t actually care about, I’m going to perform the backup to the NUL device. Don’t do this on a production database!

ALTER DATABASE LogSpaceTest SET RECOVERY SIMPLE;
BACKUP DATABASE LogSpaceTest TO DISK = N'NUL';
GO

ALTER DATABASE LogSpaceTest SET RECOVERY SIMPLE;

BACKUP DATABASE LogSpaceTest TO DISK = N'NUL';

Let’s create a table, and add some data:

USE LogSpaceTest;
DROP TABLE IF EXISTS dbo.RollbackTest;
CREATE TABLE dbo.RollbackTest
(
    rid int NOT NULL
        CONSTRAINT RolbackTest_pkc
        PRIMARY KEY CLUSTERED
        IDENTITY(1,1)
    , someCol varchar(8000) NOT NULL
);

INSERT INTO dbo.RollbackTest WITH (TABLOCKX) (someCol) 
SELECT TOP(1000000) REPLICATE('A', 8000)
FROM sys.syscolumns sc1
    CROSS JOIN sys.syscolumns sc2;
GO

USE LogSpaceTest;

DROP TABLE IF EXISTS dbo.RollbackTest;

CREATE TABLE dbo.RollbackTest

(

rid int NOT NULL

CONSTRAINT RolbackTest_pkc

PRIMARY KEY CLUSTERED

IDENTITY(1,1)

, someCol varchar(8000) NOT NULL

);

INSERT INTO dbo.RollbackTest WITH (TABLOCKX) (someCol)

SELECT TOP(1000000) REPLICATE('A', 8000)

FROM sys.syscolumns sc1

CROSS JOIN sys.syscolumns sc2;

The code above adds one million rows to the table. Since each row is 8,004 bytes (plus a small amount of overhead), the table is going to consume 1,000,000 * 8192 bytes, or approximately 7,812.5 MB of space. This will consume 80% of our data file. However, the log will not see substantial growth since we are using the WITH (TABLOCKX) option. This option allows SQL Server to only record the page numbers touched by the DML operation, instead of capturing all the individual rows. If a rollback is required, those pages would simply be marked unallocated, as if they were never touched.

Updating One Million Rows

The second piece of code below updates each row in the table. This operation will require a large amount of log space, since every row will be touched. Before the update, we’re going to show the size of the data and log files, then execute the update statement inside a TRY...CATCH block, mimicking the way an update is typically coded.

--check the status of the log
SELECT d.name
    , mf.name
    , Size = mf.size * 8192E0 / 1048576
    , MaxSize = mf.max_size * 8192E0 / 1048576
    , d.recovery_model_desc
    , d.log_reuse_wait_desc
FROM master.sys.databases d
    LEFT JOIN sys.database_files mf ON d.database_id = DB_ID()
    LEFT JOIN sys.filegroups fg ON mf.data_space_id = fg.data_space_id
WHERE d.name = DB_NAME()
    AND mf.type_desc = N'LOG';

SELECT DatabaseName = DB_NAME()
    , LogSizeMB = su.total_log_size_in_bytes / 1048576
    , LogUsedMB = su.used_log_space_in_bytes / 1048576
    , LogUsedPercent = ROUND(su.used_log_space_in_percent, 2)
FROM sys.dm_db_log_space_usage su;

--check the status of the log

SELECT d.name

, mf.name

, Size = mf.size * 8192E0 / 1048576

, MaxSize = mf.max_size * 8192E0 / 1048576

, d.recovery_model_desc

, d.log_reuse_wait_desc

FROM master.sys.databases d

LEFT JOIN sys.database_files mf ON d.database_id = DB_ID()

LEFT JOIN sys.filegroups fg ON mf.data_space_id = fg.data_space_id

WHERE d.name = DB_NAME()

AND mf.type_desc = N'LOG';

SELECT DatabaseName = DB_NAME()

, LogSizeMB = su.total_log_size_in_bytes / 1048576

, LogUsedMB = su.used_log_space_in_bytes / 1048576

, LogUsedPercent = ROUND(su.used_log_space_in_percent, 2)

FROM sys.dm_db_log_space_usage su;

The output from the above two queries look like:

╔══════════════╦══════════════════╦══════╦═════════╦═════════════════════╦═════════════════════╗
║     name     ║       name       ║ Size ║ MaxSize ║ recovery_model_desc ║ log_reuse_wait_desc ║
╠══════════════╬══════════════════╬══════╬═════════╬═════════════════════╬═════════════════════╣
║ LogSpaceTest ║ LogSpaceTest_Log ║ 8000 ║    8000 ║ SIMPLE              ║ NOTHING             ║
╚══════════════╩══════════════════╩══════╩═════════╩═════════════════════╩═════════════════════╝

╔══════════════╦═══════════╦═══════════╦════════════════╗
║ DatabaseName ║ LogSizeMB ║ LogUsedMB ║ LogUsedPercent ║
╠══════════════╬═══════════╬═══════════╬════════════════╣
║ LogSpaceTest ║      7999 ║        35 ║           0.44 ║
╚══════════════╩═══════════╩═══════════╩════════════════╝

The code that fills the log:

--run out of log space here.
BEGIN TRANSACTION
BEGIN TRY
    UPDATE dbo.RollbackTest 
    SET someCol = REPLICATE('B', 8000);
END TRY
BEGIN CATCH
    DECLARE @msg nvarchar(1000);
    SET @msg = ERROR_MESSAGE();
    PRINT N'rolling back - ' + @msg;
    ROLLBACK TRANSACTION
END CATCH
IF @@TRANCOUNT > 0 
BEGIN
    PRINT N'committing';
    COMMIT TRANSACTION;
END

--run out of log space here.

BEGIN TRANSACTION

BEGIN TRY

UPDATE dbo.RollbackTest

SET someCol = REPLICATE('B', 8000);

END TRY

BEGIN CATCH

DECLARE @msg nvarchar(1000);

SET @msg = ERROR_MESSAGE();

PRINT N'rolling back - ' + @msg;

ROLLBACK TRANSACTION

END CATCH

IF @@TRANCOUNT > 0

BEGIN

PRINT N'committing';

COMMIT TRANSACTION;

END

The message printed by the code above:

rolling back - The transaction log for database 'LogSpaceTest' is full due to 'ACTIVE_TRANSACTION'.

Check the Transaction Log

If we check the status of the log, we’ll see it is very close to full, and sys.databases reports there is an active transaction, even though there are no transactions ongoing:

SELECT d.name
    , mf.name
    , Size = mf.size * 8192E0 / 1048576
    , MaxSize = mf.max_size * 8192E0 / 1048576
    , d.recovery_model_desc
    , d.log_reuse_wait_desc
FROM master.sys.databases d
    LEFT JOIN sys.database_files mf ON d.database_id = DB_ID()
    LEFT JOIN sys.filegroups fg ON mf.data_space_id = fg.data_space_id
WHERE d.name = DB_NAME()
    AND mf.type_desc = N'LOG';

SELECT DatabaseName = DB_NAME()
    , LogSizeMB = su.total_log_size_in_bytes / 1048576
    , LogUsedMB = su.used_log_space_in_bytes / 1048576
    , LogUsedPercent = ROUND(su.used_log_space_in_percent, 2)
FROM sys.dm_db_log_space_usage su;

SELECT d.name

, mf.name

, Size = mf.size * 8192E0 / 1048576

, MaxSize = mf.max_size * 8192E0 / 1048576

, d.recovery_model_desc

, d.log_reuse_wait_desc

FROM master.sys.databases d

LEFT JOIN sys.database_files mf ON d.database_id = DB_ID()

LEFT JOIN sys.filegroups fg ON mf.data_space_id = fg.data_space_id

WHERE d.name = DB_NAME()

AND mf.type_desc = N'LOG';

SELECT DatabaseName = DB_NAME()

, LogSizeMB = su.total_log_size_in_bytes / 1048576

, LogUsedMB = su.used_log_space_in_bytes / 1048576

, LogUsedPercent = ROUND(su.used_log_space_in_percent, 2)

FROM sys.dm_db_log_space_usage su;

Output looks like:

╔══════════════╦══════════════════╦══════╦═════════╦═════════════════════╦═════════════════════╗
║     name     ║       name       ║ Size ║ MaxSize ║ recovery_model_desc ║ log_reuse_wait_desc ║
╠══════════════╬══════════════════╬══════╬═════════╬═════════════════════╬═════════════════════╣
║ LogSpaceTest ║ LogSpaceTest_Log ║ 8000 ║    8000 ║ SIMPLE              ║ ACTIVE_TRANSACTION  ║
╚══════════════╩══════════════════╩══════╩═════════╩═════════════════════╩═════════════════════╝

╔══════════════╦═══════════╦═══════════╦════════════════╗
║ DatabaseName ║ LogSizeMB ║ LogUsedMB ║ LogUsedPercent ║
╠══════════════╬═══════════╬═══════════╬════════════════╣
║ LogSpaceTest ║      7999 ║      7698 ║          96.24 ║
╚══════════════╩═══════════╩═══════════╩════════════════╝

Even though the transaction has been rolled back, the log records will not be cleared until a checkpoint occurs. An automatic checkpoint could be triggered by other ongoing transactions being written to the log, or a manual CHECKPOINT statement could be executed. However, for a database that is not seeing frequent transactions, the log may stay nearly full for an extended period of time. This scenario might be seen often during development where there are a very limited number of transactions being generated. Being aware of the behavior of automatic checkpoints is important, and is documented in Microsoft Docs as such:

An automatic checkpoint occurs each time the number of log records reaches the number the Database Engine estimates it can process during the time specified in the recovery interval server configuration option.
The Database Engine estimates the maximum number of log records it can process within the recovery interval. When a database using automatic checkpoints reaches this maximum number of log records, the Database Engine issues a checkpoint on the database.
Under the simple recovery model, an automatic checkpoint is also queued if the log becomes 70 percent full.

Note, in point 3 above, the documentation states a single checkpoint will be issued when the log becomes 70 % full. Once this checkpoint has been issued, further checkpoints may not be issued even thought the log is over 70% full, even if the automatic checkpoint resulted in no log truncation activity. However, if further transactions are generated, and the database engine believes the recovery time will be longer than the configured recovery interval, SQL Server will issue automatic checkpoints frequently as long as transactions are being generated. If no transactions are being generated, no automatic checkpoints will be issued.

Watching the Extended Events session we setup at the beginning of the post can be instrumental in showing when automatic checkpoints are occurring. They show up in the output with the is_system column set to True.

Summary and Follow-up

As shown in this post, the log_reuse_wait_desc column in sys.databases may not be as up-to-date as you’d like. If you see ACTIVE_TRANSACTION with a very full log, try running a CHECKPOINT command to see if that clears the log and resets the log reuse status to NOTHING.

Let’s run a CHECKPOINT, and see:

CHECKPOINT;

SELECT d.name
    , mf.name
    , mf.physical_name
    , Size = mf.size * 8192E0 / 1048576
    , MaxSize = mf.max_size * 8192E0 / 1048576
    , Growth = mf.growth * 8192E0 / 1048576
    , d.recovery_model_desc
    , d.log_reuse_wait_desc
    , fg.is_default
FROM master.sys.databases d
    LEFT JOIN sys.database_files mf ON d.database_id = DB_ID()
    LEFT JOIN sys.filegroups fg ON mf.data_space_id = fg.data_space_id
WHERE d.name = DB_NAME()
ORDER BY d.name
    , mf.type
    , mf.name;

IF EXISTS (SELECT 1 FROM sys.system_objects o WHERE o.name = 'dm_db_log_space_usage')
BEGIN
    SELECT DatabaseName = DB_NAME()
        , LogSizeMB = su.total_log_size_in_bytes / 1048576.0
        , LogUsedMB = su.used_log_space_in_bytes / 1048576.0
        , LogUsedPercent = su.used_log_space_in_percent
    FROM sys.dm_db_log_space_usage su;
END

CHECKPOINT;

SELECT d.name

, mf.name

, mf.physical_name

, Size = mf.size * 8192E0 / 1048576

, MaxSize = mf.max_size * 8192E0 / 1048576

, Growth = mf.growth * 8192E0 / 1048576

, d.recovery_model_desc

, d.log_reuse_wait_desc

, fg.is_default

FROM master.sys.databases d

LEFT JOIN sys.database_files mf ON d.database_id = DB_ID()

LEFT JOIN sys.filegroups fg ON mf.data_space_id = fg.data_space_id

WHERE d.name = DB_NAME()

ORDER BY d.name

, mf.type

, mf.name;

IF EXISTS (SELECT 1 FROM sys.system_objects o WHERE o.name = 'dm_db_log_space_usage')

BEGIN

SELECT DatabaseName = DB_NAME()

, LogSizeMB = su.total_log_size_in_bytes / 1048576.0

, LogUsedMB = su.used_log_space_in_bytes / 1048576.0

, LogUsedPercent = su.used_log_space_in_percent

FROM sys.dm_db_log_space_usage su;

END

Results!

╔══════════════╦══════════════════╦══════╦═════════╦═════════════════════╦═════════════════════╗
║     name     ║       name       ║ Size ║ MaxSize ║ recovery_model_desc ║ log_reuse_wait_desc ║
╠══════════════╬══════════════════╬══════╬═════════╬═════════════════════╬═════════════════════╣
║ LogSpaceTest ║ LogSpaceTest_Log ║ 8000 ║    8000 ║ SIMPLE              ║ NOTHING             ║
╚══════════════╩══════════════════╩══════╩═════════╩═════════════════════╩═════════════════════╝

╔══════════════╦═══════════╦═══════════╦════════════════╗
║ DatabaseName ║ LogSizeMB ║ LogUsedMB ║ LogUsedPercent ║
╠══════════════╬═══════════╬═══════════╬════════════════╣
║ LogSpaceTest ║      7999 ║        35 ║           0.44 ║
╚══════════════╩═══════════╩═══════════╩════════════════╝

For the curious out there, I’ve left the various configuration settings for the checkpoint process at the defaults in order to mimic what is likely to be found “in the wild”. However, running the tests above with the following setting changes makes no observable change to the outcome.

EXEC sp_configure 'recovery interval','seconds';
ALTER DATABASE ... SET TARGET_RECOVERY_TIME = target_recovery_time { SECONDS | MINUTES };

Let me know if you enjoyed this post, or if you think it could be improved. Perhaps you might like the rest of our posts on troubleshooting.

¹^{– CHECKPOINT is of course important for the FULL and BULK LOGGED recovery models, too!}

CHECKPOINT and Simple Recovery Model

Monitoring for Checkpoints

The test-bed

Updating One Million Rows

Check the Transaction Log

Summary and Follow-up

Related

Leave a ReplyCancel reply

Search

Categories

Pages

Meta