An audit asked a small towing company for three years of dispatch records out of a national billing portal. The portal is the kind of system where every detail page is its own request and every request stares at you for five seconds before the data arrives. Roughly twenty-eight thousand records, one at a time, through a UI. That is weeks of full-time clicking, and the audit window did not have weeks.
The portal's UI is a thin wrapper around two JSON endpoints. Open the network tab, click around for a minute, and the URLs reveal themselves:
POST /D3Dispatch/orange/calls/datatable/filtered # listing, paged
GET /D3Dispatch/orange/mcd/load/full/{call_info}/… # one call, full detail
From there it is a Python script with the session cookies, a loop over months, a loop over pages of the listing, a loop over the per-call detail endpoint, and a half-second sleep between detail fetches with a ten-second backoff on a 429[1]. A progress file on disk so a crash does not cost the run.
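The innermost loop is the part worth sketching. What follows is a minimal sketch, not the script that actually ran: `fetch_detail` is a hypothetical callable standing in for the authenticated GET against the detail endpoint, and it is assumed to return a `(status_code, payload)` pair. The month and page loops over the listing are left out because the listing's request body is not shown here. The half-second throttle, the ten-second backoff on 429, and the progress file name are the ones described in this post.

```python
import json
import time
from pathlib import Path

PROGRESS_FILE = Path("enrichment_progress.json")  # checkpoint file on disk
THROTTLE_S = 0.5   # pause between detail fetches
BACKOFF_S = 10.0   # pause after an HTTP 429 before retrying


def load_progress(path=PROGRESS_FILE):
    """Return results saved by an earlier run, keyed by call id, or {}."""
    if path.exists():
        return json.loads(path.read_text())
    return {}


def save_progress(done, path=PROGRESS_FILE):
    """Write to a temp file, then rename, so a crash mid-write
    cannot leave a half-written checkpoint behind."""
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(done))
    tmp.replace(path)


def enrich(call_ids, fetch_detail, path=PROGRESS_FILE, sleep=time.sleep):
    """Fetch detail for each call id, skipping ids already checkpointed.

    fetch_detail(call_id) is a hypothetical callable (an assumption of
    this sketch) returning (status_code, payload).
    """
    done = load_progress(path)
    for call_id in call_ids:
        key = str(call_id)  # JSON object keys are always strings
        if key in done:
            continue  # fetched in an earlier run; resume past it
        while True:
            status, payload = fetch_detail(call_id)
            if status != 429:
                break
            sleep(BACKOFF_S)  # rate limited: wait, then retry the same call
        if status != 200:
            # Everything fetched so far is already on disk.
            raise RuntimeError(f"call {call_id}: unexpected HTTP {status}")
        done[key] = payload
        save_progress(done, path)  # checkpoint after each success
        sleep(THROTTLE_S)
    return done
```

Rewriting the whole file after every record is the simplest thing that resumes correctly. At tens of thousands of records the rewrite cost grows with the file, so an append-only log would be the obvious next step if the checkpoint ever became the bottleneck.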
Three years of records, two ways
Bars are drawn to the same time scale. The UI bar is pure spinner time and ignores the clicks, copy-pastes, and tab-switches that go between each load. The script bar is the lower bound on HTTP at the throttle the script actually used.
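For a rough sense of scale, the two floors can be worked out from the numbers already in this post. This is back-of-envelope arithmetic under stated assumptions, not the values behind the chart: the record count is the rounded "twenty-eight thousand", the UI floor counts only the five-second wait per detail page, and the script floor counts only the half-second sleep between fetches, ignoring however long the server takes to answer each request.

```python
RECORDS = 28_000     # "roughly twenty-eight thousand" records
UI_WAIT_S = 5.0      # spinner time per detail page in the UI
THROTTLE_S = 0.5     # script's sleep between detail fetches


def hours(seconds):
    """Convert seconds to hours."""
    return seconds / 3600


ui_hours = hours(RECORDS * UI_WAIT_S)          # 140,000 s of waiting
script_hours = hours(RECORDS * THROTTLE_S)     # 14,000 s of sleeping

print(f"UI spinner floor:      {ui_hours:.1f} h")      # 38.9 h
print(f"Script throttle floor: {script_hours:.1f} h")  # 3.9 h
```

Both figures are lower bounds. The UI number leaves out every click and copy-paste, and the script number leaves out server response time, which is why the real run filled a weekend rather than an afternoon.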
Started Friday night, done by Monday. The full date range was on the auditor's desk in a CSV they could open in Excel.
-
[1] The detail-fetch loop sleeps 0.5 s between requests and backs off 10 s on a 429 response, with progress checkpointed to enrichment_progress.json after each successful enrichment so a crash or restart resumes mid-stream rather than from the top.