Real-Time Blocking Chain Alerts with AI

Most blocking alerts are useless. “Blocking detected on SQLPROD01” tells you nothing you can act on. You open SSMS, run sp_who2, squint at the output, try to trace the chain, and by the time you’ve identified the head blocker the blocking has either resolved itself or escalated into a production incident.

What you actually need: “Session 55 (CORP\jsmith, DESKTOP-JSMITH7, SSMS) has been blocking 3 sessions for 45 seconds with an uncommitted UPDATE on dbo.Orders. Longest waiter: session 112, waiting 42 seconds for a shared lock.” That’s an alert you can act on in seconds.

I used an AI agent to build blocking chain alerts that include that level of context. Here’s how the conversation went.

A traffic control center with screens showing cascading road blockages traced to one source intersection. A woman controller reaches for the dispatch radio, ready to act on the detailed information.

The Initial Prompt

What the Agent Built

The first version was functional but flat — it found blocking pairs from sys.dm_os_waiting_tasks but didn’t trace the full chain. If session A blocked B, and B blocked C, it showed two separate rows without connecting them. I pushed back:

The agent rewrote the core query with a recursive CTE anchored on sessions that are blocking others but not themselves blocked. That’s the correct approach — the head blocker is a session that appears as a blocking_session_id in sys.dm_os_waiting_tasks but does not appear as a session_id in the same DMV.

Adding Escalation Logic

A 30-second blocking event and a 5-minute blocking event are different urgency levels. I asked the agent to add tiered alerting:

The agent added a CASE expression that set the severity based on the maximum wait_duration_ms in the chain, and used conditional logic to set the @copy_recipients parameter on sp_send_dbmail. Clean implementation.

What I Validated and Changed

I tested by opening an explicit transaction in SSMS on a dev instance, then running a blocking query from another session. The recursive CTE correctly traced a 3-level chain. A few things I adjusted:

  • The head blocker’s SQL text was missing when the session was sleeping. sys.dm_exec_requests only has rows for sessions actively executing a request. For sleeping sessions (the most dangerous blockers — they finished their query but didn’t commit), I needed to pull from sys.dm_exec_connections using most_recent_sql_handle instead. The agent added this as a COALESCE between the two sources.
  • System sessions appeared as false head blockers. Latch waits on system sessions occasionally showed up as blocking. I added s.[is_user_process] = 1 to filter those out.
  • The chain depth was off by one. The anchor of the recursive CTE was level 0 (the head blocker), but the email displayed it as level 1. Minor, but confusing during triage.

The Final Solution

Why Context-Rich Alerts Matter

The difference between “blocking detected” and a full chain table with SQL text, login names, and wait durations is the difference between triage taking 30 seconds and triage taking 5 minutes. During a production blocking event, those minutes compound — every second the head blocker holds locks, more sessions pile up behind it.

The escalation logic also prevents alert fatigue. Transient 5-second blocks during normal workload contention never fire the alert. The 30-second warning gives you early notice. The 2-minute critical gets management attention. Your inbox isn’t full of noise, so when an alert does arrive, you actually read it.

Try This Yourself

  1. Deploy the script on a test instance. Open two SSMS windows — run BEGIN TRAN; UPDATE dbo.SomeTable SET col = col WHERE 1=0; in one, then SELECT * FROM dbo.SomeTable in the other.
  2. Wait 30 seconds, then run the monitoring query manually to see the chain.
  3. Iterate with the agent: ask it to add the estimated number of rows locked by the head blocker, or to include the database name and object name from sys.dm_tran_locks.
  4. Tune the thresholds for your environment — OLTP systems may want 15-second warnings; data warehouse workloads may tolerate 5-minute thresholds.
  5. Schedule as an Agent job every 60–90 seconds. Don’t go below 30 seconds — you’ll overwhelm Database Mail during real incidents.

For the complete approach to building custom monitoring queries and tuning alert thresholds, see Post 13: Building Custom Monitoring Queries and Alerts. For deeper analysis of blocking chains, wait stats, and deadlocks, see Post 8: Wait Stats, Deadlocks, and Blocking Chains.


Part of the ALTER DBA ADD AGENT series.