Building a Multi-Stage Deployment Pipeline for QuickDumpNow's Job Dispatch System

Last week, we successfully pushed QuickDumpNow's dispatch tool to production after completing a complex multi-service deployment involving CloudFront rewrites, S3 job state management, and real-time job tracking. This post walks through the architecture decisions and implementation details that made this deployment reliable and traceable.

What We Built

The core requirement was to deploy a job dispatch system that could:

Create new job records with unique tracking tokens
Rewrite HTTP requests at the edge (CloudFront) to route /book/ and /track/ paths correctly
Serve dynamic dashboard and tracking pages from S3 with proper cache invalidation
Maintain a canonical jobs.json state file that drives the entire system
Support multi-environment promotion (staging → production)

We needed to promote the build we had already tested in staging to production without any downtime.

The Multi-Stage Infrastructure Setup

QuickDumpNow runs on a CloudFront + S3 + CloudFront Functions architecture distributed across multiple stages:

S3 Origin Buckets: Separate buckets for dashboard, tracking pages, and booking pages, each with staging and production variants
CloudFront Distributions: Two main distributions, one for quickdumpnow.com and one for the API endpoint, each with its own ETag versioning
CloudFront Functions: An edge-deployed function that rewrites incoming requests; its source lives at /Users/cb/Documents/repos/sites/quickdumpnow.com/cf/qdn-track-rewrite.js
Jobs State Store: Single source of truth in S3 as jobs.json, fetched fresh on each page load to avoid stale data

The decision to use CloudFront Functions (not Lambda@Edge) for request rewriting gave us sub-millisecond latency at 400+ edge locations without cold start penalties. This was critical since every track request and booking request flows through this layer.

The Rewrite Logic: Edge Request Transformation

The CF function in qdn-track-rewrite.js performs two critical transformations:

GET /track/fcdc1c82cb284dbe
  → rewrites to /track.html?token=fcdc1c82cb284dbe
  
GET /book/soderblom-ave
  → rewrites to /book.html?location=soderblom-ave
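
The two rewrites above can be sketched as a viewer-request CloudFront Function. This is a minimal illustration rather than the contents of qdn-track-rewrite.js: the regular expressions and the allowed characters for tokens and location slugs are assumptions.

```javascript
// Hypothetical sketch of a viewer-request CloudFront Function.
// The path patterns and allowed characters below are assumptions.
function handler(event) {
    var request = event.request;
    request.querystring = request.querystring || {};

    // /track/<token> -> /track.html?token=<token>
    var track = /^\/track\/([A-Za-z0-9]+)\/?$/.exec(request.uri);
    if (track) {
        request.uri = '/track.html';
        request.querystring.token = { value: track[1] };
        return request;
    }

    // /book/<location-slug> -> /book.html?location=<location-slug>
    var book = /^\/book\/([A-Za-z0-9-]+)\/?$/.exec(request.uri);
    if (book) {
        request.uri = '/book.html';
        request.querystring.location = { value: book[1] };
        return request;
    }

    // Any other path passes through unchanged.
    return request;
}
```

Returning the modified request makes this an internal rewrite: CloudFront fetches /track.html, but the viewer is not redirected.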

Why rewrite at the edge instead of using S3 routing rules?

Consistency: Every request, regardless of which CloudFront PoP handles it, gets the same transformation
Cache behavior: Every /track/<token> request maps to the same /track.html object, so CloudFront can cache one copy of the page (as long as the query string is not part of the cache key), reducing origin requests
Extensibility: Adding new path patterns (e.g., /admin/) only requires function updates, not S3 policy changes

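The function cannot inject anything into the static HTML; it only changes the request that CloudFront sends to the origin. The page's JavaScript reads the token or location from the URL, for example with new URLSearchParams(window.location.search), and then looks up the matching job record in jobs.json. One caveat: when the rewrite is internal rather than a redirect, the browser's address bar still shows /track/<token>, so window.location.search is empty and the script has to parse window.location.pathname instead.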
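
A minimal sketch of the page-side lookup, assuming the token may arrive either in the query string or in the path. The helper names extractToken and findJob are hypothetical; only URLSearchParams and the jobs.json shape come from this post.

```javascript
// Hypothetical helpers for track.html; the names are illustrative.
function extractToken(loc) {
    // Prefer ?token=... when it is present in the browser URL.
    const fromQuery = new URLSearchParams(loc.search).get('token');
    if (fromQuery) return fromQuery;

    // Otherwise fall back to the /track/<token> path form.
    const match = /^\/track\/([A-Za-z0-9]+)\/?$/.exec(loc.pathname);
    return match ? match[1] : null;
}

function findJob(state, token) {
    if (!token || !state || !state.jobs) return null;
    // hasOwnProperty guards against inherited keys such as "toString".
    return Object.prototype.hasOwnProperty.call(state.jobs, token)
        ? state.jobs[token]
        : null;
}
```

In the page this would be called as findJob(state, extractToken(window.location)), with a null result driving the "Job not found" message.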

Job State Management: Why jobs.json?

Every job dispatch system needs a state layer. We chose a single JSON file in S3 rather than DynamoDB because:

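Atomicity: Each S3 PUT replaces the whole object atomically, so readers never see a half-written file and there is no transaction logic to maintain. The caveat is that adding a job is a read-modify-write of the full file, so concurrent writers can overwrite each other's changes; this is acceptable only while writes are infrequent or come from a single writer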
Cost: S3 is cheaper for high read volumes (every tracking page load hits jobs.json)
Visibility: Engineers can inspect the entire state by downloading one file
Disaster recovery: With bucket versioning enabled, S3 keeps every prior version of the file, so we can roll back accidental job deletions

The structure is straightforward:

{
  "jobs": {
    "fcdc1c82cb284dbe": {
      "id": "fcdc1c82cb284dbe",
      "customer_name": "Mark",
      "location": "Soderblom Ave",
      "status": "ready_for_pickup",
      "token": "fcdc1c82cb284dbe",
      "created_at": "2024-05-23T14:32:00Z",
      "updated_at": "2024-05-23T15:45:00Z"
    }
  }
}

When Mark texted that morning, we generated a unique job ID and tracking token, serialized his job object, and pushed it to S3. The tracking page immediately became available at quickdumpnow.com/track/fcdc1c82cb284dbe without any database migrations or schema changes.
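
The create-and-merge step can be sketched as follows. This is an illustration under assumptions, not the production code: newToken and addJob are hypothetical names, and using 8 random bytes is inferred only from the 16-hex-character shape of the example token.

```javascript
const { randomBytes } = require('crypto');

// 8 random bytes -> 16 hex characters, the same shape as the example token.
function newToken() {
    return randomBytes(8).toString('hex');
}

// Returns a new state object with the job added; the input is not mutated.
function addJob(state, fields, now = new Date()) {
    const existing = state.jobs || {};

    // Regenerate in the unlikely event of a token collision.
    let token = newToken();
    while (Object.prototype.hasOwnProperty.call(existing, token)) {
        token = newToken();
    }

    // Drop milliseconds to match the timestamp format used in jobs.json.
    const timestamp = now.toISOString().replace(/\.\d{3}Z$/, 'Z');
    const job = {
        id: token,
        customer_name: fields.customer_name,
        location: fields.location,
        status: fields.status,
        token: token,
        created_at: timestamp,
        updated_at: timestamp,
    };

    const jobs = Object.assign({}, existing, { [token]: job });
    return { state: Object.assign({}, state, { jobs: jobs }), job: job };
}
```

The returned state is what would be serialized with JSON.stringify and uploaded as the new jobs.json.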

The Deployment Pipeline: Parallel Promotion Strategy

Here's the sequence we executed:

1. Fetch current state: Pull existing jobs.json from S3 to avoid losing any prior records
2. Update CF function: Apply the latest rewrite logic and publish to the LIVE stage with ETag validation
3. Parallel deployments:
- Promote staged dashboard HTML to production S3
- Promote staged tracking page to production S3
- Promote staged booking pages to production S3
- Upload Mark's new job record to jobs.json in S3
4. Cache invalidation: Invalidate CloudFront caches for both distributions covering /track/* and /book/* paths

Running deployments in parallel (step 3) reduced overall deployment time from ~45 seconds to ~12 seconds. However, we serialized the final cache invalidations because they must happen after all files are in place; invalidating before the uploads complete could let the edge re-cache the old objects, or a 404 for files that do not exist yet.
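
The ordering constraint (sequential setup, concurrent uploads, invalidation last) can be sketched with Promise.all. The deploy function and the keys of the steps object are hypothetical; each entry stands in for a real AWS call.

```javascript
// Hypothetical orchestration sketch. Every function in `steps` is a
// placeholder for a real AWS call and must return a promise.
async function deploy(steps) {
    const completed = [];
    const run = async (name, fn) => {
        await fn();
        completed.push(name);
    };

    // Sequential: read the current state, then publish the edge function.
    await run('fetch-state', steps.fetchState);
    await run('publish-function', steps.publishFunction);

    // Concurrent: the uploads do not depend on each other. Promise.all
    // rejects on the first failure, which skips the invalidation below.
    await Promise.all([
        run('dashboard', steps.promoteDashboard),
        run('tracking', steps.promoteTracking),
        run('booking', steps.promoteBooking),
        run('jobs-json', steps.uploadJobs),
    ]);

    // Last: invalidate only once every upload has finished.
    await run('invalidate', steps.invalidate);
    return completed;
}
```

Because a failed upload rejects before the invalidation runs, the edge keeps serving the previously cached objects instead of caching a partial deployment.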

The Track Page Error: Why "Job Not Found"?

When Mark's tracking link initially returned "Job not found, or this tracking link is no longer valid," the root cause was that:

The CF function rewrite sent his request to /track.html with token in the query string ✓
The track.html page loaded successfully ✓
But the JavaScript fetch of jobs.json either hit stale cache or failed silently

We fixed this by invalidating the CloudFront distribution cache explicitly, ensuring new requests got fresh jobs.json. The invalidation pattern was:

aws cloudfront create-invalidation \
  --distribution-id E1ABC2DEF3GHIJ \
  --paths "/*" \
  --profile quickdump-prod

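We used the wildcard /* rather than specific paths so that every cached object, including jobs.json, was refreshed. A wildcard path counts as a single path for invalidation billing, so the real trade-off is not the invalidation fee but a cold cache: every object is evicted, and the next requests for each one go back to the origin.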
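
To keep a failed jobs.json request from being silent in the first place, the page-side fetch can check the response explicitly. This is a sketch under assumptions: loadJobs is a hypothetical name, and fetchImpl is injected only so the example can run outside a browser. Note that cache: 'no-store' only controls the browser's HTTP cache; it does not by itself bypass the CloudFront edge cache.

```javascript
// Hypothetical loader for jobs.json that surfaces failures instead of
// swallowing them. `fetchImpl` defaults to the global fetch in a browser.
async function loadJobs(url, fetchImpl = fetch) {
    const response = await fetchImpl(url, { cache: 'no-store' });
    if (!response.ok) {
        throw new Error('jobs.json request failed with HTTP ' + response.status);
    }

    const state = await response.json();
    if (!state || typeof state.jobs !== 'object' || state.jobs === null) {
        throw new Error('jobs.json has no "jobs" object');
    }
    return state;
}
```

A caller can then distinguish "the state file could not be loaded" from "the token is not in the state file" and show a different message for each.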

Key Decisions and Trade-offs

Single jobs.json vs. per-job files: A single file simplifies consistency but requires re-uploading the entire file on each new job. At current volumes (<50 jobs/day), this is fine. At 1000+ jobs/day, we'd partition by date (jobs-2024-05-23.json) to reduce upload size.
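
A sketch of what date partitioning could look like, assuming the key follows the jobs-YYYY-MM-DD.json pattern mentioned above and is derived from created_at in UTC. Both function names are hypothetical.

```javascript
// Hypothetical key scheme based on the jobs-YYYY-MM-DD.json example.
function partitionKey(createdAt) {
    const date = new Date(createdAt);
    if (Number.isNaN(date.getTime())) {
        throw new Error('invalid created_at: ' + createdAt);
    }
    // toISOString is always UTC, so partitions roll over at midnight UTC.
    return 'jobs-' + date.toISOString().slice(0, 10) + '.json';
}

// Splits a single-file state into one { jobs: {...} } object per day.
function partitionJobs(state) {
    const parts = {};
    for (const job of Object.values(state.jobs)) {
        const key = partitionKey(job.created_at);
        if (!parts[key]) parts[key] = { jobs: {} };
        parts[key].jobs[job.id] = job;
    }
    return parts;
}
```

One consequence of this design is that a token alone no longer tells the tracking page which file to fetch, so the date would need to be encoded in the link or looked up in a small index.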