Proving the Contract Holds: Testing Database Code

The previous post ended with a contract smoke test: a query that fails the build if a published view loses a promised column or changes a promised type. That test is worth having, and it is not enough. It proves the shape of the contract is intact. It says nothing about whether the behavior behind the shape is still correct.

A view can have exactly the right columns with exactly the right types and return exactly the wrong answers. A stored procedure can keep its parameters and its result columns and still start double-charging a customer after someone “simplified” a join. The shape is the easy half of the contract. The behavior is the half that pages you, and the only way to defend it is to test it.

A woman engineer in a lab coat running database code through a testing station and catching a failed test

Database Code Is Code, and Untested Code Is a Liability

We have spent this series treating the database as an interface with consumers, a version, and a deployment pipeline. All of that is the apparatus we wrap around application code without thinking, and we extend it to the database precisely because database code is code. A constraint is a rule. A trigger is a side effect. A stored procedure is a function with branches and edge cases. None of it is more self-evidently correct than the C# next to it, and a great deal of it is harder to reason about because it operates over sets and runs inside transactions.

So the same idea that organizes application testing organizes this: the test pyramid.[1] Many fast, isolated unit tests at the bottom; fewer integration tests in the middle; a handful of slow end-to-end tests at the top. For a database that maps to:

  • Unit tests – a single procedure or function, isolated from real data, checking one behavior. Fast, numerous, run on every build.
  • Integration tests – the real schema with migrations applied, checking that a change deploys cleanly and the pieces fit. This is where the contract smoke test lives.
  • End-to-end tests – the application and the database together. Valuable, expensive, kept few.

The mistake most teams make is to own only the top of that pyramid: a handful of slow end-to-end tests, and nothing that pins down an individual procedure. When one of those tests fails, it tells you something is wrong somewhere in the whole stack, which is the least useful thing a test can tell you.

Unit Testing in the Database with tSQLt

The reason database unit testing has a reputation for being hard is that the obvious approach – point a test at a real table, insert some rows, run the procedure, check the result, delete the rows – is slow, order-dependent, and pollutes the database for the next test. tSQLt, an open-source framework that runs entirely in T-SQL, exists to remove exactly those problems.[2]

Two of its features do most of the work. Every test runs inside its own transaction that tSQLt rolls back automatically when the test finishes, so tests never see each other’s data and never leave anything behind. And FakeTable replaces a real table with an empty, constraint-free copy for the duration of a test, so a test can populate just the few columns it cares about without satisfying every foreign key and check constraint in the schema.[3] Together they turn “insert rows into a web of related tables” into “set up the three values this test is actually about.”

That test does not care how many columns customer has, what its constraints are, or what else is in the database. It pins one behavior of the published interface: the view exposes the legal name under the promised name customer_name. Write a few dozen of these and you have a description of the contract that is executable, runs in seconds, and fails the instant someone changes the answer.

Test the Behavior the Contract Actually Promises

Shape tests write themselves; behavior tests take judgment. The useful ones cluster around the places a contract quietly breaks:

  • Invariants. If the data must never violate a rule – no negative balances, no overlapping date ranges, exactly one primary contact per account – write a test that tries to violate it and asserts the database refuses. A constraint with no test is a rule you are hoping is still enforced.
  • Edge cases in values. NULLs, empty strings, the boundary between int and the value that overflows it, duplicate keys the cardinality assumption says cannot happen. These are where set-based logic surprises its author.
  • Result-set shape and order. If a procedure’s consumers read columns by position, a test that pins the column order is the only thing standing between you and a silent reordering.
  • Promised error behavior. If the contract says a procedure raises error 50001 on a duplicate, test that it does. An application catching that number is as coupled to it as to any column name, a point worth remembering when the next post makes retries safe.
  • Compatibility during a migration. This is the contract-specific one. During the expand phase of a breaking change, both the old and the new shape must return the same value. A test that asserts the deprecated email column and the new email_address agree is what lets you carry both safely until every consumer has moved.

That last category is the bridge back to the rest of the series. Expand-migrate-contract only works if “expand” really did preserve the old behavior, and a test is how you know it did rather than how you find out it did not.

Where the Tests Run

None of this matters if the tests only run when someone remembers to run them. The tSQLt suite belongs in the same pipeline as the deployment, executed against a freshly built database on every change, with a failed test failing the build exactly the way a broken contract smoke test does. This is the practice that makes the small, ordered migrations from the last post safe to apply: each change ships with the tests that prove it kept the promises, and the build is the thing that insists on them. It is also the discipline that evolutionary database design has pointed at for two decades, now with the tooling to make it ordinary.[4]

A practical note on the pyramid: keep the unit tests fast and numerous, because those are the ones developers will actually run while they work. Lean on FakeTable so they do not drag the whole schema along. Reserve the slower integration tests for the things unit tests cannot prove – that a migration applies, that the contract smoke test passes against the real built schema – and keep the end-to-end tests to the few journeys that justify their cost.

The Half of the Contract You Can Only Defend With Tests

The shape of a contract you can check with a query against system views. The behavior you can only defend by exercising it: feeding the procedure the duplicate it should reject, the NULL it should handle, the boundary value that overflows, and asserting it does the promised thing every time. Tests are how the promise stops being a comment in the code and becomes something the build enforces.

So far the contract has been about structure and behavior you can see in a single call. The remaining posts turn to the promises that span calls and consumers – the ones that are not visible in a column list at all. The first of those is the one that makes a retried request safe to send twice: Idempotency Is a Contract.

What does your database test suite look like, if you have one? I would like to hear about it on Bluesky or LinkedIn.

References and Footnotes

  1. TestPyramid – Martin Fowler. The model for a balanced automated-test portfolio: many fast unit tests, fewer integration tests, a small number of slow end-to-end tests.
  2. tSQLt – Database Unit Testing for SQL Server. An open-source framework that lets you write SQL Server unit tests in T-SQL, compatible with SQL Server 2005 SP2 and later.
  3. tSQLt User Guide. Documentation for tSQLt features including automatic per-test transaction rollback, FakeTable for isolating a table from its constraints and data, and the assertion procedures.
  4. Evolutionary Database Design – Pramod Sadalage and Martin Fowler. The practices behind small, tested, version-controlled database change that make incremental schema evolution safe.