A weekend in place of a quarter

Three years of dispatch records from a portal that shows them one click at a time. The receipt was a CSV by Monday.

Three years of records pulled in a weekend. CSV on the auditor's desk Monday morning.

Python · requests · throttled HTTP · session cookies · progress checkpointing

An audit asked a small towing company for three years of dispatch records out of a national billing portal. The portal is the kind of system where every detail page is its own request and every request stares at you for five seconds before the data arrives. Roughly twenty-eight thousand records, one at a time, through a UI. That is weeks of full-time clicking, and the audit window did not have weeks.

The portal's UI is a thin wrapper around two JSON endpoints. Open the network tab, click around for a minute, and the URLs reveal themselves:

POST /D3Dispatch/orange/calls/datatable/filtered     # listing, paged
GET  /D3Dispatch/orange/mcd/load/full/{call_info}/…  # one call, full detail

From there it is a Python script with the session cookies, a loop over months, a loop over pages of the listing, a loop over the per-call detail endpoint, and a half-second sleep between detail fetches with a ten-second backoff on a 4291. A progress file on disk so a crash does not cost the run.

Three years of records, two ways

click through the UI ~39 h of just spinners throttled script ~3.9 h of HTTP, ran over a weekend 28,000 detail fetches scale: 1 second per pixel
UI at 5 s/record script at 0.5 s/record

Bars are drawn to the same time scale. The UI bar is pure spinner time and ignores the clicks, copy-pastes, and tab-switches that go between each load. The script bar is the lower bound on HTTP at the throttle the script actually used.

Started Friday night, done by Monday. The full date range was on the auditor's desk in a CSV they could open in Excel.

  1. The detail-fetch loop sleeps 0.5 s between requests and backs off 10 s on a 429 response, with progress checkpointed to enrichment_progress.json after each successful enrichment so a crash or restart resumes mid-stream rather than from the top.