Performance and Operations: Measuring Instead of Arguing

This is Part 4 of the “Where Does Business Logic Live?” series. Parts 2 and 3 made the principled cases for the database and the application tier. Principles are where these debates start; measurements are where they should be settled. This post is about what actually happens to performance and to your operational life depending on where the logic lives.

I am going to resist the temptation to declare a winner, because there isn’t one in the abstract. There are workloads where the database wins by a mile and workloads where the application tier wins by a mile, and the skill is in recognizing which one you are looking at. Let’s measure.

An engineer on a monitoring platform compares a fast set-based data train against many small carts circling a database cylinder, depicting performance tradeoffs.

Execution Plans and Plan Reuse

Every query SQL Server runs needs an execution plan, and compiling a plan costs CPU. The economics of plan reuse are one of the clearest performance differences between the two approaches.

A stored procedure’s statements get parameterized plans that are compiled once and reused. Call the procedure ten thousand times with different arguments and, in the normal case, you compile once and reuse the cached plan. That is efficient, and it is predictable – you can inspect the plan, tune it, and know what will run.

Ad-hoc SQL assembled by an application can go either way. A well-written parameterized query through any decent data-access layer also gets a reusable cached plan. But the ORM pathologies from Part 2 break this: when generated SQL embeds literal values or builds variable-length IN lists without parameterizing them, every call is a new query text, every call compiles a new plan, and the plan cache fills with single-use plans that evict the ones you wanted to keep.[1] You pay twice – in compilation CPU and in cache churn.

Plan reuse has a famous dark side that cuts in both directions, so I will not pretend it only helps the database case: parameter sniffing. SQL Server compiles a plan based on the parameter values it sees first, and if those values are unrepresentative, every later execution inherits a plan that may be terrible for them. This bites stored procedures and parameterized application queries alike. It is enough of a real-world problem that recent SQL Server versions ship Parameter Sensitive Plan optimization specifically to mitigate it.[2] The point for this series: plan-related performance is a property of how the SQL is written and parameterized, not simply of which tier emitted it. A disciplined application tier and a disciplined procedure layer both get good plan reuse; an undisciplined version of either does not.

Network Round Trips and the N+1 Problem

This is where the database’s home-field advantage shows up most dramatically, and it has a name the whole industry recognizes: the N+1 selects problem.[3]

The shape is always the same. You load a list of N parent rows with one query, then – because the object model makes it natural – you touch a child collection on each parent, and the ORM lazily fires one more query per parent. One query becomes N+1. Load 500 orders and then read each order’s customer, and you have issued 501 queries where a single join would have done. Each of those queries is a network round trip, and round trips are where latency compounds.

Logic that lives next to the data does not have this problem, because the natural way to express the work there is a set-based join that returns everything in one trip. Logic that lives above an object model has to actively resist N+1 – eager loading, batching, projections – and “has to actively resist” is exactly the kind of thing that gets forgotten under deadline.

The general principle: prefer chunky calls over chatty ones. A stored procedure is chunky by construction – one call does a unit of work. An application orchestrating the same work tends toward chatty unless the developer is deliberate. Across a low-latency in-datacenter network the difference is annoying; across availability zones or to a cloud database it can be the whole performance story.

Data Transfer Volume

Round-trip count is one axis; payload size is the other. Moving business logic to the application tier means moving the data that logic operates on to the application tier, and for data-heavy operations that transfer dominates everything else.

The late-fee example from Part 2 is the clean illustration. Done as a set-based UPDATE in the database, zero rows move across the wire – the engine updates them in place. Done by loading every overdue invoice into application objects, mutating, and saving, every one of those rows travels up the network and back down. Add the ORM’s tendency to SELECT every column of every mapped table – which also inflates memory grants on the server side – and a conceptually tiny operation becomes a bandwidth and memory event.[4]

The rule of thumb that has never let me down: move the logic to the data when the data is large, and move the data to the logic when the data is small. A rule that touches a million rows belongs where the million rows are. A rule that touches one row the user is editing on screen does not.

Caching Changes the Math

Here the application tier earns a genuine, large performance win, and DBAs should acknowledge it freely.

The fastest database query is the one you never run. A caching layer in front of the database – Redis, a distributed cache, even in-process memory – absorbs read traffic that would otherwise hit SQL Server, and at high scale this is not optional, it is how the system survives. Stack Overflow’s own architecture is the textbook example: it serves enormous traffic from SQL Server precisely because layers of caching, including very heavy use of Redis, keep the database from seeing most of the reads.[5]

Caching is an application-tier concern – the database cannot cache for you across application instances – and logic that lives in the application can make caching decisions the database has no visibility into. The catch, and there is always a catch, is invalidation: a cache is a second copy of the truth, and keeping it consistent with the authoritative data is one of the genuinely hard problems in computing. Logic in the database sees the freshest data by definition; logic in front of a cache trades freshness for speed and takes on the burden of knowing when the cached answer has gone stale. That is a real and worthwhile trade, but it is a trade, not a free win.

Service-to-Service Calls and Transaction Scope

Microservice architectures move logic not just out of the database but into separate services that call each other over the network – and that reintroduces, between services, the same round-trip and consistency costs we just discussed between app and database, often amplified.

The sharpest version is transaction scope. A stored procedure’s transaction is short and local: the engine opens it, does the work next to the data, and commits, holding locks for the minimum possible window. An application-managed transaction – the Spring @Transactional method or the equivalent – is held open across the service’s logic and its database round trips, which is more flexible but also easier to hold open too long, lengthening lock duration and hurting concurrency.

Push the work across multiple services, each owning its own data, and a single database transaction is no longer even possible. You are now in the world of distributed consistency: sagas, eventual consistency, compensating actions, and idempotency keys to survive retries. This is well-trodden ground at the organizations operating at that scale, but it is unambiguously harder than BEGIN TRAN ... COMMIT, and it is a cost that belongs on the application-tier side of the ledger. Distributing your logic buys you independent deployability and scaling; it bills you in consistency engineering.

Horizontal Scaling

The asymmetry here is the entire reason the cloud era tilted toward the application tier. Stateless application instances scale out almost linearly – add nodes behind a load balancer. The relational database is stateful, and scaling it out is the hard part: read replicas help reads but add replication lag, and write scaling means sharding or partitioning, which is a major engineering effort. GitHub’s published work on partitioning its relational databases to keep scaling is a good window into how much effort the stateful tier demands compared to adding another app server.[6]

So if a piece of logic is CPU-bound and not data-gravity-bound, running it on the elastic app tier is cheaper and scales more easily. If it is data-gravity-bound – large sets, heavy joins, aggregation – running it in the database is cheaper despite the harder scaling, because the alternative is dragging the data to the app tier across the network. The scaling argument does not pick a universal winner; it picks a winner per workload.

When Each Approach Outperforms

Pulling the performance discussion into a single view:

Workload characteristic Tends to favor Why
Large set-based data manipulation Database Data stays put; one set operation beats row-by-row transfer
Multi-step transactional unit of work Database Short, local transaction; minimal lock duration
Heavy read traffic on cacheable data Application tier Cache absorbs reads the database never has to serve
CPU-bound, non-data-centric computation Application tier Runs on cheap, elastic compute instead of the precious database core
Cross-system orchestration Application tier Coordinates external calls; database should not reach out
A single row the user is editing Application tier Data is tiny; round-trip and transfer costs are negligible
High-volume reporting/aggregation Database Aggregation next to the data avoids moving it
Write scaling beyond one box Depends App tier scales trivially; database needs sharding/partitioning either way

If you take one thing from the performance half of this post: the question is never “which tier is faster?” It is “where is the data gravity for this operation, and is this operation data-bound or compute-bound?” Answer that and the placement usually answers itself.

Operational Reality: What It’s Like to Run Each

Performance is what users feel. Operations is what the team lives. The day-to-day operational profile of the two approaches differs in ways that matter at least as much as throughput, and as a DBA this is the half I have the strongest opinions about.

Change management and deployment

Application deployments are a solved problem culturally: build artifact, blue-green or rolling deploy, instant rollback by pointing back at the previous version. Database changes are intrinsically harder because the database holds state – you cannot “roll back” a table that has already been altered and written to without thinking carefully about the data. This is a genuine operational advantage for app-tier logic, and it is the legitimate kernel inside the “database is a deployment bottleneck” complaint.

The mitigation, from Part 3, is that database code belongs in the same pipeline as application code – SSDT/DACPAC or migration tooling under CI/CD – so schema and procedure changes are versioned, reviewed, and deployed automatically rather than applied by hand. That closes most of the gap. What it cannot fully erase is that stateful changes need more care than stateless ones. A coordinated change that spans a procedure and the application that calls it also needs ordering discipline – expand/contract migrations, backward-compatible steps – regardless of which tier “owns” the logic.

Auditing

Auditing is a quiet win for the database side. Because the database sees every write, audit mechanisms anchored there – temporal tables, triggers, change data capture, a SQL audit specification – capture every change no matter which application or person made it. Audit logic in the application only captures changes that went through that application; the data-fix script and the second service slip past it. For regulated data, “we logged it in the app” is a weaker guarantee than “the database recorded every change,” and auditors increasingly know the difference.

Troubleshooting and monitoring

Where the logic lives determines where you go to debug it, and which tools you reach for.

Database-side logic is observable through a mature, decades-old toolset: execution plans, DMVs, Extended Events, and Query Store, which captures plan history and lets you spot regressions and force a known-good plan.[7] The flip side is that a 3,000-line procedure is genuinely hard to step through, and T-SQL debugging tooling is rudimentary next to an application IDE.

Application-side logic is observable through APM tools, distributed tracing, structured logging, and a debugger that developers love. The flip side is that the actual database interaction is often hidden behind the ORM, so the developer sees “the save took 4 seconds” without seeing the 501 queries that caused it. The hardest production problems I deal with sit exactly on this seam: the application’s traces say the database is slow, the database’s traces say it is being asked to do something absurd, and nobody owns the whole picture.

Ownership boundaries

That seam is, ultimately, an organizational problem, and it is worth naming directly.

Logic placement Typically owned by Common failure mode
Stored procedures DBAs / database developers Developers feel blocked waiting on database changes
Application services Application developers DBAs see pathological SQL they cannot change
SPA / client Front-end developers Validation mistaken for a security control
The seam between them Nobody, by default Each team’s tools blame the other tier

The most effective teams I have worked with do not “solve” this by giving one tier all the logic. They solve it by making the boundary explicit and shared: DBAs in the pull requests for data-access code, developers who understand execution plans, and a culture where “the ORM generated this” is the start of a conversation, not the end of one. I have written about bringing database review into the pull-request workflow precisely because it is the practical place this boundary gets negotiated.

Production support

When the pager goes off, the difference is concrete. A bad application deploy is usually fixed by rolling back to the previous version in minutes. A bad database change, or a plan regression, or a blocking chain from a transaction held open too long by an app-tier orchestration, is fixed by someone who understands the data – and that someone is usually the DBA, regardless of which tier the logic technically lives in. This is the quiet truth underneath the whole debate: moving logic out of the database does not move the consequences out of the database, and it does not eliminate the need for someone who understands what the database is doing. We will hit that misconception squarely in Part 5.

The Takeaway

Performance does not favor a tier; it favors matching the work to the resource – data-bound work to the data, compute-bound and cacheable work to the elastic tier. Operations does not favor a tier either; it favors putting all code under engineering discipline and making the DBA/developer boundary explicit instead of letting it be the place where problems go to be ignored.

In Part 5 we turn to security and compliance – permissions, injection surface, row-level security, and data masking – then take apart the four myths this debate keeps recycling, and lay out the hybrid patterns that actually work in production.

How does the DBA/developer boundary work, or not work, on your team? You can find me on Bluesky and LinkedIn.

References and Footnotes

  1. Erik Darling, “Software Vendor Mistakes With SQL Server: Passing Long IN Clauses To Queries From An ORM” – plan cache bloat and constant compilation from unparameterized, variable-length IN lists. erikdarling.com. ↩ back
  2. “Parameter Sensitive Plan Optimization,” Microsoft Learn – SQL Server’s built-in mitigation for the parameter-sniffing problem, which affects procedures and parameterized application queries alike. learn.microsoft.com. ↩ back
  3. “What is the ‘N+1 selects problem’ in ORM (Object-Relational Mapping)?” – the canonical explanation, with over 850,000 views. stackoverflow.com. ↩ back
  4. Erik Darling, “Software Vendor Mistakes With SQL Server: Selecting Everything From Everywhere” – wide ORM result sets inflate data transfer and memory grants. erikdarling.com. ↩ back
  5. Nick Craver, “Stack Overflow: The Architecture – 2016 Edition” – heavy use of Redis and layered caching in front of SQL Server. nickcraver.com. ↩ back
  6. “Partitioning GitHub’s relational databases to scale,” GitHub Engineering – the effort required to scale the stateful database tier. github.blog. ↩ back
  7. “Monitor Performance by Using the Query Store,” Microsoft Learn – capturing plan history, identifying regressions, and forcing known-good plans. learn.microsoft.com. ↩ back

Next up: Security, Misconceptions, and Hybrid Architectures – permissions, injection surface, row-level security, compliance, the four myths everyone repeats, and the hybrid patterns that work.


Part of the “Where Does Business Logic Live?” series – Part 1, Part 2, Part 3.