I Called Myself Dumb in a Code Comment. I Was Wrong About the Cursor.

I went looking through an old reconciliation script this week, the kind you write once for a quarter-end reconciliation and forget about, and I found a note from my past self. Right above a cursor, in plain text, I had written that I was too dumb to figure out a set-based way to do the job.

Here is the comment, more or less as I found it:

I laughed, then I felt a little defensive on behalf of the woman who wrote it. Because here is the thing: she was not dumb, and the cursor was the right call. The only mistake in that whole block was the comment.

A woman developer smiles as she peels an anxious, apologetic sticky note off her monitor to reveal a calm, confident note beneath, with a small friendly helper robot beside her and a tiny looping train representing a modest cursor.

What the Cursor Actually Did

The job was small and annoying. A reference table stored a set of legislative acts as camelCase codes, the kind of value an application emits and a human never wants to read. The report needed them as proper sentences. So personalPropertySecurity had to become The Personal Property Security Act., and a few dozen siblings needed the same treatment.

The rule is easy to say and fiddly to do in T-SQL: walk the string, and every time you hit an uppercase letter that is not the first character, insert a space in front of it. Then capitalize the first letter[1] and wrap the whole thing in The ... Act..

Here is a faithful, cleaned-up version of what I wrote. I lean on a pattern I like for read-only cursors: declare it LOCAL FORWARD_ONLY STATIC READ_ONLY,[2] read the row count once with @@CURSOR_ROWS,[3] and count down. No priming fetch, no @@FETCH_STATUS dance.

It runs in the blink of an eye over a few dozen rows, it is easy to read top to bottom, and anybody who has written a cursor before knows exactly what it does. That is the whole story. And yet there I was, apologizing for it in writing.

The Set-Based Version, Since You Asked

To be fair to set-based dogma, this can be done without a cursor. The trick is a tally, a small table or derived set of numbers, so you can treat the string as a set of character positions instead of looping over them. STRING_AGG[4] with WITHIN GROUP (ORDER BY ...) then stitches the characters back together in order, inserting the spaces as it goes.

That is genuinely nice, though it has a wrinkle worth knowing about. The per-character CASE has to live in its own derived table, because STRING_AGG refuses to mix an outer reference (the [code] column) with another column (the tally position) inside a single aggregate; expand first, aggregate second. Past that, it is one statement, it is set-based, and on a big table it would leave the cursor in the dust. ASCII between 65 and 90 sidesteps any collation case-sensitivity worry,[5] and STRING_AGG keeps the order honest. It needs SQL Server 2017 or later for STRING_AGG, which is worth a thought if you still support older instances.

So why did I not write it the first time? Honestly, because the cursor was done before I would have finished debugging the aggregate’s outer-reference rule and the WITHIN GROUP ordering, and it was correct, and it ran against a few dozen rows once a quarter. The set-based version is better engineering in the abstract and roughly equal engineering in the actual situation.

The Part I Want to Push Back On

We are trained, correctly, to reach for set-based code. The relational engine is built to chew through sets, and a cursor that grinds through a million-row table one row at a time is a real and common performance problem. “Avoid cursors” is good default advice.

But it is a default, not a law, and the honest question is never just “is this set-based?” It is three questions stacked together:

  • How many rows does this actually touch?
  • How often does it run, and on what schedule?
  • Which version will the next person, including future me, understand fastest?

Run the camelCase job through those three questions and the cursor wins on two of them and ties on the third. A few dozen rows, once a quarter, in code that reads like a sentence. Choosing a tally-table-plus-STRING_AGG construction there is not rigor, it is reaching for cleverness the problem did not ask for. Premature set-based optimization is still premature optimization.

The cursor would be the wrong call the moment any of those answers changed. A million rows, or a tight loop in an OLTP path, or a nightly job competing for a maintenance window, and I would happily spend the extra effort on the set-based version. Scale decides the tool. Dogma does not.

About That Comment

Here is the bit that actually nagged at me. The cursor was fine. The comment was the problem, and not because it was wrong about the SQL.

A code comment is not a diary. The next person to open that file reads it as fact, and “I’m too dumb to figure this out” quietly tells them two untrue things: that the cursor is a confession rather than a decision, and that the author did not understand her own options. Both are false. I knew the set-based shape existed. I weighed it against a tiny, infrequent workload and picked the readable one on purpose. That is a design decision, and it deserved a comment that said so:

Same code. Completely different message to whoever reads it next. One apologizes; the other documents a tradeoff.

I think a lot of us, and I will gently note it is a pattern I see more in women in this field than I would like, narrate our work in a register of apology that the work never earned. The fix is not false confidence. It is accuracy. Say what you decided and why, drop the self-deprecation, and let the comment carry information instead of doubt.

The cursor stays. The comment got rewritten. And past me was not dumb; she just needed a better editor.

If you have a self-deprecating comment buried in your own code, or a strong opinion on where the cursor line really sits, I would love to hear it. You can find me on Bluesky and LinkedIn.

References

  1. STUFF (Transact-SQL) – Microsoft Learn. Replacing the first character when re-capitalizing the label.
  2. Cursors (Transact-SQL) – Microsoft Learn. Cursor types and options, including LOCAL, FORWARD_ONLY, STATIC, and READ_ONLY.
  3. @@CURSOR_ROWS (Transact-SQL) – Microsoft Learn. Why a STATIC cursor returns an accurate row count after OPEN.
  4. STRING_AGG (Transact-SQL) – Microsoft Learn. Ordered aggregation with WITHIN GROUP, available in SQL Server 2017 and later.
  5. ASCII (Transact-SQL) – Microsoft Learn. Collation-independent character codes for the A-Z (65-90) test.