How treating an AI assistant as a senior pair programmer turned a months-old performance backlog into two fully optimized APIs — with interactive architecture diagrams and every prompt that shaped the work.
A production REST API was responding in 10–12 seconds. The endpoint powered a high-traffic discovery flow on mobile and web — every second of latency directly impacted conversion. The “fix it” backlog had been growing for months because nobody had time to do a deep performance audit.
I decided to try something different: use an AI assistant as a pair programmer for the entire engagement — analysis, planning, implementation, and debugging. This is what happened.
Instead of treating AI as a “complete this snippet” tool, I treated it as a senior engineer who walked into the codebase for the first time. The workflow had six stages, each driven by a specific kind of prompt.
By asking “how would you build it?” instead of “fix this,” I forced the AI into architectural thinking. It traced the full request lifecycle, mapped every database query, identified every external dependency, and produced a structured report.
The planning prompt asked for an in-place optimization plan to get the endpoint under 500ms. “In-place” prevented the AI from suggesting a rewrite. “Under 500ms” gave it a measurable target. “Plan” forced it to break the work into shippable phases instead of producing one giant patch.
| Phase | Change | Expected Impact |
|---|---|---|
| 0 | Per-stage timing instrumentation | Measure before cutting |
| 1 | Database indexes on hot query paths | 3–5s reduction |
| 2 | Eliminate duplicate queries and lazy loads | 1–2s reduction |
| 3 | Push filters into SQL + database-level pagination | 1–2s reduction |
| 4 | Response cache on the deterministic endpoint | ~10s → ~50ms on cache hit |
| 5 | Configuration memoization + loop optimizations | 100–300ms reduction |
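Phase 0 is worth making concrete. Here's a minimal Python sketch of per-stage timing; the stage names and handler are illustrative, not the production code:

```python
import time
from contextlib import contextmanager

@contextmanager
def timed_stage(timings, stage):
    # Record wall-clock time for one stage of the request, in milliseconds.
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = round((time.perf_counter() - start) * 1000, 1)

def handle_request():
    timings = {}
    with timed_stage(timings, "db_queries"):
        time.sleep(0.05)        # stand-in for the real query work
    with timed_stage(timings, "serialization"):
        time.sleep(0.01)        # stand-in for response building
    return timings

print(handle_request())  # e.g. {'db_queries': 50.4, 'serialization': 10.3}
```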
Interactive C4-style diagrams show the system from users down to individual queries. Click a tab and watch the data flow.
Discovery endpoint with 13 filter params — city, capacity, add-ons, pagination, distance sort. After optimization: 8 queries typical, 2–3 on holiday dates.
Authenticated detail endpoint. Original had 30–40 queries across 5 database sessions. Optimized version: ~5 queries in a single session.
The original code re-ran an identical filtered SELECT just to change the result ordering. Replaced with an in-memory sort. Saved one database round-trip per request.
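A minimal sketch of that pattern, with a hypothetical Venue model standing in for the real schema:

```python
from dataclasses import dataclass

@dataclass
class Venue:
    name: str
    price: int
    distance_km: float

def fetch_filtered_rows():
    # Stand-in for the single filtered SELECT that remains.
    return [Venue("A", 300, 2.5), Venue("B", 150, 8.0), Venue("C", 220, 1.2)]

rows = fetch_filtered_rows()                    # one round-trip, not two
by_price = sorted(rows, key=lambda v: v.price)  # re-order in memory
by_distance = sorted(rows, key=lambda v: v.distance_km)
```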
Capacity ranges, category type, add-on filters — all happening in application code after fetching every row. Pushing them into the WHERE clause cut the result set by an order of magnitude.
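In SQLAlchemy terms (assuming that's the ORM in play; the article doesn't name it), the pushdown might look like this, with hypothetical model and column names:

```python
from sqlalchemy import Integer, String, create_engine, select
from sqlalchemy.orm import DeclarativeBase, Mapped, Session, mapped_column

class Base(DeclarativeBase):
    pass

class Venue(Base):
    __tablename__ = "venues"
    id: Mapped[int] = mapped_column(Integer, primary_key=True)
    city: Mapped[str] = mapped_column(String)
    capacity: Mapped[int] = mapped_column(Integer)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([Venue(city="Pune", capacity=50),
                     Venue(city="Pune", capacity=500)])
    session.commit()

    stmt = (
        select(Venue)
        .where(Venue.city == "Pune", Venue.capacity >= 100)  # filters in SQL...
        .limit(10)   # ...and pagination in SQL, not in Python
        .offset(0)
    )
    print([v.capacity for v in session.scalars(stmt)])  # [500]
```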
The endpoint executed five separate batch queries and then checked if the date was a public holiday. On holidays, all that work was thrown away. A pre-check skipped the batch queries entirely when applicable.
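A sketch of the pre-check, with a stand-in holiday set and batch function in place of the real ones:

```python
import datetime

PUBLIC_HOLIDAYS = {datetime.date(2024, 12, 25)}   # stand-in holiday table

def run_batch_queries(day):
    return [f"slot for {day}"]   # stand-in for the five batch queries

def availability_for(day):
    # Cheap check first: on a holiday the batch results were always
    # discarded anyway, so skip the five queries entirely.
    if day in PUBLIC_HOLIDAYS:
        return []
    return run_batch_queries(day)

print(availability_for(datetime.date(2024, 12, 25)))  # []
```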
Configuration data was being expanded on every single request. A 5-minute in-process cache eliminated that work.
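A minimal version of that cache; load_and_expand_config is a hypothetical stand-in for the real expansion work:

```python
import time

_CACHE = {}          # key -> (inserted_at, value)
TTL_SECONDS = 300    # the 5-minute window

def cached_config(key="default"):
    now = time.monotonic()
    hit = _CACHE.get(key)
    if hit is not None and now - hit[0] < TTL_SECONDS:
        return hit[1]                    # fresh enough: skip the expansion
    value = load_and_expand_config()     # the formerly per-request work
    _CACHE[key] = (now, value)
    return value

def load_and_expand_config():
    return {"expanded": True}            # stand-in for the real expansion
```

A library TTL cache would do the same job, but a hand-rolled dict also respects the zero-new-imports constraint the deployment failures later established.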
A nested loop with cubic complexity was rewritten using set membership checks. The complexity dropped to roughly linear in the input size.
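The original loop isn't shown, so this sketch only illustrates the technique: build sets once, and every membership check becomes constant time:

```python
items = ["a", "b", "c", "d"] * 125      # ~500 items, like the endpoint
allowed = ["b", "c"] * 50
tags = ["c", "d"] * 50

# Before (illustrative): every `in` on a list is a linear scan, so the
# work multiplies across the nested checks.
slow = [item for item in items if item in allowed and item in tags]

# After: build each set once; every membership check is then O(1),
# so the whole pass is roughly linear in the input size.
allowed_set, tag_set = set(allowed), set(tags)
fast = [item for item in items if item in allowed_set and item in tag_set]

assert slow == fast
```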
The same data structure was being serialized to logs three times per item — and the endpoint processed up to 500 items per request. That meant 1,500+ log writes per request, each performing serialization on a large object. Cut to one log per item.
A separate database lookup was being made for data that was already present in a previous query's result set. Eliminated the round-trip.
When AI code breaks production — and how each failure taught the AI something it couldn't learn from reading code alone.
This was the most instructive part of the entire engagement. The AI produced correct, well-reasoned code that passed all syntax checks — and it still broke the application three times.
The AI applied all optimizations at once. The container failed to boot. Standard library imports it had added were incompatible with the production runtime's cooperative scheduling model.
The AI couldn't diagnose it — the real error was hidden above the pasted logs. Then came the critical prompt: "After reverting your changes, the build went through."
This single sentence told the AI: (a) the issue is definitely in its code, not the environment, and (b) it needs to isolate which change caused the crash.
The AI removed the suspect import but kept other related ones. Before testing, upstream changes were merged in — so the new attempt was never tested in isolation.
The AI reverted everything to the known-good baseline. Then reapplied only four small, surgical logic fixes — zero new imports, zero new module-level code. Build succeeded on the first try.
This established the core constraint: in this runtime environment, any new module-level code is suspect until proven safe in deployment.
With the constraint now understood, the AI rewrote all the performance optimizations to use zero new import lines — using stdlib modules already imported, skipping instrumentation that would have required new dependencies.
The prompt behind that rewrite was "As this API is taking 10 to 12 seconds, I'm more interested in fixing performance", which reframed the AI from “be cautious” to “be fast, within the constraints you've now learned.”
After the performance optimizations deployed, the API returned empty results for certain query parameter combinations. The original application-side filter treated a value of 0 as “no filter applied” because of language truthiness rules. The new database-side filter treated 0 as “filter for the literal value zero,” which matched nothing.
Providing the exact failing input let the AI trace both code paths and find the semantic difference in one shot. One-line fix.
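Reduced to a sketch (the capacity parameter and row shape are hypothetical, but the semantics match the bug described above):

```python
rows = [{"capacity": 50}, {"capacity": 100}]

def app_side_filter(rows, capacity):
    # Original behavior: `if capacity:` is False for both 0 and None,
    # so capacity=0 silently meant "no filter".
    if capacity:
        return [r for r in rows if r["capacity"] == capacity]
    return rows

def db_side_filter_buggy(rows, capacity):
    # Naive pushdown: always applies the predicate, so capacity=0
    # becomes WHERE capacity = 0 and matches nothing.
    return [r for r in rows if r["capacity"] == capacity]

def db_side_filter_fixed(rows, capacity):
    if not capacity:    # one-line guard restoring the truthiness convention
        return rows
    return [r for r in rows if r["capacity"] == capacity]

assert app_side_filter(rows, 0) == rows
assert db_side_filter_buggy(rows, 0) == []       # the production regression
assert db_side_filter_fixed(rows, 0) == rows     # parity restored
```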
After the first API, the same methodology was applied to a second endpoint — simpler parameters but far worse internals.
The AI found 30–40 queries per request for just 10 items: three N+1 loops, five separate database sessions, and a function called N times with identical parameters.
Per-item queries replaced with batch queries. One batch function already existed in the codebase — it just wasn't wired up. Still 5 seconds.
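The batching pattern, sketched with hypothetical names in place of the codebase's actual helper:

```python
def attach_addons(items, fetch_addons_batch):
    # One WHERE item_id IN (...) query instead of one query per item.
    ids = [item["id"] for item in items]
    addons_by_item = fetch_addons_batch(ids)
    for item in items:
        item["addons"] = addons_by_item.get(item["id"], [])
    return items

def fake_batch(ids):
    # Stand-in for the batch helper that already existed in the codebase.
    return {i: [f"addon-{i}"] for i in ids}

items = [{"id": 1}, {"id": 2}]
print(attach_addons(items, fake_batch))
# [{'id': 1, 'addons': ['addon-1']}, {'id': 2, 'addons': ['addon-2']}]
```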
Three helper functions each opened their own database connection. Under a pool of 5, concurrent requests queued for connections. Inlined everything into one session. Down to ~2 seconds.
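A sketch of the consolidation, assuming a SQLAlchemy-style engine; the helpers and queries are illustrative:

```python
from sqlalchemy import create_engine, text

engine = create_engine("sqlite://")

# Before, each helper opened its own connection, so one request could
# hold several of the pool's five connections at once.
def load_owner(conn, item_id):
    return conn.execute(text("SELECT 'owner-' || :i"), {"i": item_id}).scalar()

def load_pricing(conn, item_id):
    return conn.execute(text("SELECT 'price-' || :i"), {"i": item_id}).scalar()

def handle_detail_request(item_id):
    # After: one connection per request, threaded through every helper.
    with engine.connect() as conn:
        owner = load_owner(conn, item_id)
        pricing = load_pricing(conn, item_id)
    return {"owner": owner, "pricing": pricing}

print(handle_detail_request(7))  # {'owner': 'owner-7', 'pricing': 'price-7'}
```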
Static lookup instead of a query. Pre-filtered config. Released connection before response building. Down to ~1 second.
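The static-lookup half of that phase, sketched with a hypothetical CATEGORY_LABELS constant; the comment marks where the connection can be handed back:

```python
# Before: SELECT id, label FROM categories on every request.
# After: a module-level constant, loaded once at import time.
CATEGORY_LABELS = {1: "banquet", 2: "lawn", 3: "rooftop"}

def fetch_rows(item_ids):
    return [{"id": i, "category_id": 1} for i in item_ids]  # stand-in query

def handle_request(item_ids):
    rows = fetch_rows(item_ids)   # all database work finishes here, so the
                                  # connection can be released before the
                                  # CPU-bound response building below
    return [
        {"id": r["id"], "category": CATEGORY_LABELS[r["category_id"]]}
        for r in rows
    ]

print(handle_request([10, 11]))
```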
| Metric | Before | After |
|---|---|---|
| DB sessions per request | 5 | 1 |
| Total queries (10 items) | 30–40 | ~5 |
| N+1 query loops | 3 | 0 |
| Implicit lazy loads | 2–4 per parent | 0 |
| Prompt | What It Taught the AI |
|---|---|
| "After reverting your changes, the build went through" | The problem is in the code, not the environment. Isolate. |
| "Still getting the same error, can you revert all your changes" | Stop guessing. Go back to known-good. Re-approach incrementally. |
| "[exact failing input]" | Concrete reproduction lets the AI diagnose in one shot instead of guessing. |
| "As this API is taking 10 to 12 seconds, I'm more interested in fixing performance" | Reframe from “be careful” to “be fast, within the constraints you've now learned.” |
| "Now re-analyze with the pulled code" | After upstream changes, prior analysis is no longer valid. Start fresh. |
| "Can you cross-check if the response structure is the same and not changed?" | Every optimization is suspect until the contract is verified field by field. |
| What Went Wrong | What to Do Instead |
|---|---|
| Applied all optimizations at once | Ship one change at a time, verify each |
| Added new dependencies without testing in deployment | Test the minimal change in the actual deployment environment first |
| Assumed edge values would behave the same in both filter implementations | Check how the original code interprets every edge value |
| Got distracted by downstream symptom logs instead of finding the root error | Always grab the full error log before guessing |
The most valuable output wasn't the code. It was the systematic analysis that mapped twenty bottlenecks across multiple files in a single pass.
The AI had to learn things it couldn't learn from reading code alone: which stdlib primitives are unsafe in the production runtime, that web framework default values don't protect against empty strings, that certain numeric edge values are treated as “no filter” by convention.
The big-bang approach failed. The surgical approach (small fixes with zero new imports) worked on first deploy. When the AI has earned a constraint the hard way, every subsequent change should respect it.
Every optimization was audited against the original response shape. Performance work that breaks the API isn't a win — it's a regression with extra steps.
Same methodology, second API: under an hour instead of a full day. The real value of AI-assisted engineering compounds as patterns get reused.
All the application-level optimizations together saved a few seconds. The remaining latency budget lives in database indexes and response caching. Sometimes the smartest engineering decision is recognizing where the real bottleneck lives.
Three things, mostly: