Can GitHub Copilot Optimize Your T-SQL? I Put It to the Test

I’ve been curious about how well AI coding assistants handle real-world T-SQL optimization — not textbook examples, but production code with system DMVs, Availability Groups, and windowed functions. So I pointed GitHub Copilot CLI at one of my own functions and asked it to analyze, critique, and optimize it.

What followed was an iterative process with some surprises — including an “optimization” that made the estimated cost worse, a memory grant that was 300× larger than needed, and a final version that cut logical reads by 73% and memory grants by 80%.

Here’s the full journey.


The Original Function: is_primary()

This inline table-valued function is the gating mechanism in my server_maintenance database. Every backup job, DBCC check, and maintenance task calls it to answer one question: “Should I run on this server, for this database?”

It handles three scenarios:

  1. The server is the PRIMARY replica for a database in an Availability Group
  2. The database isn’t in any AG (standalone)
  3. The server matches the AG’s automated backup preference (when @for_backup = 1)

The function joins seven system DMVs/catalog views to resolve both local and distributed Availability Group topologies:
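The original source isn't reproduced here, but the general shape is an inline TVF wrapping a CTE over the HADR DMVs. Below is a simplified sketch of that shape, not the production code: the DMV names are real, but only three of the seven joins are shown, and the output columns and the NOT EXISTS condition are illustrative assumptions.

```sql
-- Simplified sketch of the original's shape (NOT the production code).
-- The real function joins seven DMVs/catalog views; three are shown here.
CREATE OR ALTER FUNCTION dbo.is_primary
(
    @database_name sysname,
    @for_backup    bit
)
RETURNS TABLE
AS
RETURN
WITH [src] AS
(
    SELECT  [db].[name]       AS [database_name],
            [ars].[role_desc] AS [ag_role]
    FROM    sys.databases AS [db]
    LEFT JOIN sys.dm_hadr_database_replica_states AS [drs]
           ON [drs].[database_id] = [db].[database_id]
    LEFT JOIN sys.dm_hadr_availability_replica_states AS [ars]
           ON [ars].[replica_id] = [drs].[replica_id]
    WHERE   [db].[state_desc] = N'ONLINE'
)
SELECT  [src].[database_name]
FROM    [src]
WHERE   [src].[database_name] = @database_name   -- filter applied in the outer SELECT
  AND   ([src].[ag_role] = N'PRIMARY' OR [src].[ag_role] IS NULL)
  AND   NOT EXISTS                               -- references [src] a second time
        (SELECT 1 FROM [src] WHERE [src].[ag_role] = N'RESOLVING');
```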

I asked Copilot: “Do you think it is efficient?”


Stage 1: Copilot’s Initial Analysis

Copilot identified three concerns, in order of significance:

1. The CTE is expanded twice (NOT EXISTS re-evaluation)

The NOT EXISTS (SELECT 1 FROM [src] ...) subquery appears in both the @for_backup = 0 and @for_backup = 1 branches. In SQL Server, CTEs are not materialized — they’re re-expanded inline each time they’re referenced. That means the entire 7-way join across the system DMVs executes twice: once for the main SELECT, and again for each NOT EXISTS check.
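A minimal illustration of the expansion behavior, with a cheap read of sys.databases standing in for the expensive 7-way join:

```sql
WITH [src] AS
(
    -- stand-in for the expensive 7-way join across the HADR DMVs
    SELECT [name], [database_id] FROM sys.databases
)
SELECT  [s].[name]
FROM    [src] AS [s]                    -- expansion #1
WHERE   NOT EXISTS
        (
            SELECT 1
            FROM   [src]                -- expansion #2: the underlying query runs again
            WHERE  [src].[database_id] = [s].[database_id] + 10000
        );
```

Checking STATISTICS IO on a query like this shows the base tables being read once per reference, which is exactly what happens to the 7-way join in the original function.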

2. Duplicated branch logic (~80% identical)

The @for_backup = 0 and @for_backup = 1 branches share nearly all their logic — the three-way AG topology check, the NOT EXISTS subquery, and the distributed AG role check. The only real difference is one line that checks backup_preferred_replica and matches the AG role to the backup preference. This duplication is a maintenance risk.

3. Late predicate filtering

The @database_name filter is applied in the outer SELECT (WHERE [src].[database_name] = @database_name), not inside the CTE. This means the CTE’s 7-way join processes all online databases before filtering down to the one we care about.

Copilot’s verdict: “Efficient enough for its purpose” since the system DMVs are small in-memory metadata. But it had ideas for improvement.


Stage 2: The First Rewrite — Windowed Functions and Predicate Pushdown

Copilot’s first rewrite made three structural changes:

Change 1: Push @database_name into the CTE

This lets the optimizer filter to a single database before performing the 7-way join.
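Sketched with a simplified two-table stand-in for the real join (the variable name and columns are illustrative):

```sql
DECLARE @database_name sysname = N'my_database';   -- hypothetical value

-- Before: the join runs for every online database, then the outer SELECT filters
WITH [src] AS
(
    SELECT [db].[name] AS [database_name]
    FROM   sys.databases AS [db]
    LEFT JOIN sys.dm_hadr_database_replica_states AS [drs]
           ON [drs].[database_id] = [db].[database_id]
    WHERE  [db].[state_desc] = N'ONLINE'
)
SELECT * FROM [src] WHERE [src].[database_name] = @database_name;

-- After: the parameter filters inside the CTE, before the join fans out
WITH [src] AS
(
    SELECT [db].[name] AS [database_name]
    FROM   sys.databases AS [db]
    LEFT JOIN sys.dm_hadr_database_replica_states AS [drs]
           ON [drs].[database_id] = [db].[database_id]
    WHERE  [db].[state_desc] = N'ONLINE'
      AND  [db].[name] = @database_name           -- predicate pushed down
)
SELECT * FROM [src];
```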

Change 2: Replace NOT EXISTS with a windowed flag

This eliminates the second CTE expansion entirely — the secondary check is computed once as a window aggregate.
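The pattern, again on a simplified stand-in for the real join (column names assumed):

```sql
-- The secondary check becomes a window aggregate computed in the same pass
WITH [src] AS
(
    SELECT  [db].[name]       AS [database_name],
            [ars].[role_desc] AS [ag_role],
            MAX(CASE WHEN [ars].[role_desc] = N'SECONDARY' THEN 1 ELSE 0 END)
                OVER () AS [has_secondary]        -- adds Segment/Sort operators
    FROM    sys.databases AS [db]
    LEFT JOIN sys.dm_hadr_database_replica_states AS [drs]
           ON [drs].[database_id] = [db].[database_id]
    LEFT JOIN sys.dm_hadr_availability_replica_states AS [ars]
           ON [ars].[replica_id] = [drs].[replica_id]
)
SELECT [database_name]
FROM   [src]
WHERE  [has_secondary] = 0;   -- replaces NOT EXISTS (SELECT 1 FROM [src] ...)
```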

Change 3: Factor out shared conditions

This looked like a clean improvement on paper. Then I asked Copilot to actually run it and compare execution plans.
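The factored shape is a single WHERE clause rather than two near-identical branches. A fragment sketch, with hypothetical column names:

```sql
WHERE   -- shared three-way topology check (identical in both branches)
        (   [src].[ag_role]  = N'PRIMARY'        -- primary replica of a local AG
         OR [src].[ag_role] IS NULL              -- standalone database, no AG
         OR [src].[dag_role] = N'PRIMARY')       -- primary side of a distributed AG
        -- the only real branch difference, expressed once
  AND   (   @for_backup = 0
         OR [src].[backup_preferred_replica] = 1)
```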


Stage 3: The Surprise — Estimated Cost Goes UP

I had Copilot deploy both versions to my local SQL Server instance and capture actual execution plans with SET STATISTICS XML ON and SET STATISTICS IO ON.
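The capture harness is just the standard session options wrapped around a call to the function (the database name here is a placeholder):

```sql
SET STATISTICS XML ON;   -- emits the actual execution plan per statement
SET STATISTICS IO  ON;   -- reports logical/physical reads per table

SELECT * FROM dbo.is_primary(N'my_database', 0);

SET STATISTICS XML OFF;
SET STATISTICS IO  OFF;
```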

The results were… unexpected:

| Metric | Original | v2 (Rewrite) |
| --- | --- | --- |
| Estimated cost | 67.4 | 114.4 ⚠️ |
| Operators | 179 | 104 |
| CPU time | 15–16 ms | 0 ms |
| Elapsed time | 6–7 ms | 3–4 ms |
| Total logical reads | 133 | 26 |
| Memory granted | 64,440 KB | 72,888 KB |
| Memory used | 224 KB | 224 KB |

The estimated cost nearly doubled, yet actual performance was better across every real metric. The optimizer was wrong.

Why? The MAX() OVER() windowed aggregate and ROW_NUMBER() require Sort operators. The optimizer priced these Sorts based on its estimate of 52,000 rows flowing through the CTE (it can’t predict DMV cardinality accurately). In reality, with @database_name pushed into the CTE, only 1 row flows through.

The predicate pushdown slashed logical reads by ~5×, but the Sort operators inflated the cost estimate. This is a textbook example of why estimated cost is not execution time.


Stage 4: Tackling the Real Problem — The Memory Grant

Both versions shared one glaring issue: excessive memory grants. The original requested 64 MB but used only 224 KB — a 300× overallocation. Every call to this function (and it’s called for every database on every maintenance job) needlessly reserves tens of megabytes from the workspace memory pool.

The root cause was the windowed functions:

  • ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ...) requires a Sort operator
  • MAX(...) OVER (PARTITION BY ...) requires a Segment + Sequence Project
  • The optimizer estimates these will process 52K rows, so it requests memory proportional to 52K × row_size

The fix was to eliminate all windowed functions:

Replace ROW_NUMBER with TOP 1

A TOP 1 ... ORDER BY uses a Top N Sort, which only maintains a buffer of N rows (in this case, 1). Its memory footprint is negligible regardless of the estimated input size.
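The two shapes side by side, using sys.databases as a stand-in. They are equivalent here because only one partition remains once @database_name is pushed into the CTE:

```sql
-- Before: ROW_NUMBER forces a full Sort sized for the estimated input
SELECT [name]
FROM  (
        SELECT [name],
               ROW_NUMBER() OVER (ORDER BY [name]) AS [rn]
        FROM   sys.databases
      ) AS [x]
WHERE [x].[rn] = 1;

-- After: the same result via a Top N Sort that buffers a single row
SELECT TOP (1) [name]
FROM   sys.databases
ORDER BY [name];
```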

Replace the windowed has_secondary flag with NOT EXISTS

Wait — didn’t the original use NOT EXISTS too? Yes, but the original’s CTE processed all databases, so re-expanding it was expensive. Now that @database_name is inside the CTE, re-expanding it only processes one database’s worth of rows through the 7-way join. The cost is trivial.

This gave us zero windowed functions, zero full Sort operators, and only a single cheap Top N Sort.


The Final Version

Here’s the complete optimized function:
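The actual function isn't reproduced here; the sketch below is an illustrative reconstruction that combines the three optimizations (pushdown, TOP 1, factored conditions). The DMV names and sys.fn_hadr_backup_is_preferred_replica are real, but the join list is abbreviated and the NOT EXISTS condition is an assumption:

```sql
-- Illustrative reconstruction, NOT the production code. The real function
-- joins seven DMVs/catalog views and handles distributed AGs as well.
CREATE OR ALTER FUNCTION dbo.is_primary
(
    @database_name sysname,
    @for_backup    bit
)
RETURNS TABLE
AS
RETURN
WITH [src] AS
(
    SELECT  [db].[name]       AS [database_name],
            [ars].[role_desc] AS [ag_role],
            sys.fn_hadr_backup_is_preferred_replica([db].[name])
                              AS [backup_preferred_replica]
    FROM    sys.databases AS [db]
    LEFT JOIN sys.dm_hadr_database_replica_states AS [drs]
           ON [drs].[database_id] = [db].[database_id]
          AND [drs].[is_local]    = 1
    LEFT JOIN sys.dm_hadr_availability_replica_states AS [ars]
           ON [ars].[replica_id] = [drs].[replica_id]
    WHERE   [db].[state_desc] = N'ONLINE'
      AND   [db].[name]       = @database_name    -- predicate pushed into the CTE
)
SELECT TOP (1) [src].[database_name]              -- Top N Sort instead of ROW_NUMBER
FROM    [src]
WHERE   (   [src].[ag_role] = N'PRIMARY'          -- primary replica
         OR [src].[ag_role] IS NULL)              -- standalone database
  AND   (   @for_backup = 0                       -- factored branch difference
         OR [src].[backup_preferred_replica] = 1)
  AND   NOT EXISTS                                -- cheap now: CTE is one database
        (SELECT 1 FROM [src] WHERE [src].[ag_role] = N'RESOLVING')
ORDER BY [src].[database_name];
```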


Final Results

| Metric | Original | Optimized | Improvement |
| --- | --- | --- | --- |
| Memory granted | 64,440 KB (63 MB) | 12,608 KB (12 MB) | 80% reduction |
| Memory used | 224 KB | 488 KB | — |
| Sort operators | 3 (52K est. rows) | 1 (1 est. row) | Top N Sort only |
| Total operators | 179 | 158 | 12% fewer |
| Total logical reads | 133 | 36 | 73% reduction |
| Estimated cost | 67.4 | 1.0 | 98% reduction |
| Elapsed time | 5–7 ms | 5–6 ms | Comparable |
| Duplicated logic | Yes (2 branches) | No (factored out) | Easier to maintain |

The remaining 12 MB memory grant comes from Hash Match operators the optimizer uses for the LEFT JOINs — that’s inherent to the join strategy and harder to eliminate without fundamentally changing the query shape.


Three Optimizations, Summarized

  1. Push @database_name into the CTE — The biggest single win. Reduces the 7-way join from processing all online databases to processing one. This cut logical reads from 133 to 36.

  2. Replace ROW_NUMBER() with TOP 1 ... ORDER BY — Eliminates full Sort operators. A Top N Sort buffers only 1 row regardless of estimated cardinality, slashing the memory grant.

  3. Factor out shared WHERE conditions — The three-way AG topology check, NOT EXISTS, and distributed role check were duplicated across both @for_backup branches. Now they’re written once, with only the two-line branch difference inside a final OR.


Lessons Learned

Estimated cost is not execution time. The intermediate v2 rewrite had an estimated cost of 114 (vs 67 for the original) because the optimizer overprices windowed functions on DMVs. Yet it was measurably faster. If I’d trusted estimated cost alone, I would have rejected a genuine improvement.

Windowed functions have hidden costs. ROW_NUMBER() and MAX() OVER() are elegant, but they require Sort operators that request memory proportional to estimated row counts. When the optimizer overestimates (as it does with DMVs), you get 300× memory overallocation.

CTEs are not temp tables. Every reference to a CTE re-expands the entire query. The original function’s NOT EXISTS (SELECT 1 FROM [src] ...) caused the 7-way join to execute twice. This is fine when the CTE is cheap (as it became after pushing @database_name in), but it was wasteful when the CTE scanned all databases.

AI assistants can optimize T-SQL iteratively. Copilot didn’t nail it on the first try — the v2 rewrite actually increased the memory grant. But by deploying to a real instance, capturing actual execution plans, and analyzing the results, it identified the Sort operators as the culprit and arrived at a version that’s better by every measurable metric. The key was treating it as a conversation, not a one-shot prompt.

The next post will detail the process of doing a code review with GitHub Copilot.

This post was produced through an interactive session with GitHub Copilot CLI (powered by Claude). The optimization was performed iteratively: Copilot analyzed the original function, proposed changes, deployed them to a live SQL Server instance, captured actual execution plans with SET STATISTICS XML ON and SET STATISTICS IO ON, identified where its first approach fell short, and refined the solution. All code in this post is from that real session.