SQL Server Patching Pre/Post Checks with AI

SQL Server patching should be boring. Apply the CU, restart the service, verify everything’s back to normal, move on. In practice, it’s the “verify everything’s back to normal” part that takes the longest — and it’s where things go wrong when you rush. A service that didn’t restart. An AG replica that’s stuck in RESOLVING. A linked server that lost its credentials. A Resource Governor pool that reverted to defaults.

I’ve been burned enough times that I now capture a comprehensive baseline before patching and compare it afterwards. Doing this manually takes 45 minutes per instance. I asked the agent to automate it.

What the Agent Produced

The agent generated two clean functions: Save-PatchBaseline and Test-PatchBaseline. The baseline export was thorough — it used a combination of dbatools cmdlets and direct DMV queries to capture everything.

What it got right immediately:

  • JSON export/import. The baseline saves as a single JSON file with a timestamp and server name in the filename. Clean and portable.
  • The comparison logic was property-by-property, not just “did the count change.” It detected a specific login going from enabled to disabled, not just “login count went from 42 to 41.”
  • Exit codes. Critical failures (services down, databases offline) returned exit code 1. Warnings (config changes) returned exit code 0 with console warnings. This integrates nicely with SCCM or other patching orchestration.

What needed fixing:

  • It missed certificate-based endpoints. The initial version captured endpoint name and state but not the authentication method. If patching somehow disrupts a certificate binding, I want to know.
  • AG comparison was too coarse. It compared replica roles — but after patching and restarting, the replica might have failed over. A role change isn’t a failure; it’s expected. I needed the comparison to distinguish between “AG is healthy with roles swapped” and “AG is broken.”
  • No smoke test. Verifying metadata is good, but I also want to run a quick connectivity test — can I connect, execute a query, and get results?

The Iteration

The Final Solution

How I Use This

The workflow during a patching window looks like this:

The smoke test queries are simple — just verify you can connect, read from a critical table, and get results. If the query fails, something is fundamentally wrong (permissions changed, database not online, corruption). If it succeeds, the engine is functional at a basic level.

What I Validated

I tested this against a non-production instance by capturing a baseline, intentionally disabling a SQL Agent job and stopping the Full-Text service, then running the validation:

  1. The disabled job was caught as a warning — exactly right, since a disabled job isn’t a critical failure but needs investigation.
  2. The stopped service was caught as a critical failure — correct, since a service that was running pre-patch should be running post-patch.
  3. AG role swaps showed as INFO rather than FAIL — this was the iteration that mattered most. After patching a secondary and restarting, the AG might have failed over. That’s not a problem; it’s expected. But a disconnected replica is a problem.

For more on migration and patching workflows, see Post 11: Migration Planning and Compatibility. For the PowerShell patterns used throughout, see Post 6: PowerShell Automation.

Try This Yourself

Run Save-PatchBaseline against any instance — you don’t need to be patching. The baseline itself is useful as documentation. How many non-default sp_configure values does your production instance have? Are there logins that should be disabled? Linked servers you forgot about?

Then plan your next patching window. Capture baselines before patching, apply the CU, and run validation after. The first time you use this, you’ll probably find something in the baseline you didn’t know about — a service set to Manual start that happens to be running, or a Resource Governor pool that nobody remembers creating. That’s the value: not just validating the patch, but understanding your instance’s actual configuration.


Part of the ALTER DBA ADD AGENT series.