```html

Preventing S3 Deployment Regressions: A Case Study in Staging-First Architecture and Pre-Flight Validation

Last week, a deployment to queenofsandiego.com wiped three working features by pushing a stale local index.html over a newer version in production S3. The hero JADA→BOOK NOW crossfade and the Stripe embedded checkout flow vanished, and a hero line that had previously been removed came back, all from a single cp command. This post documents the failure mode, the architectural gaps that enabled it, and the hard rules we've added to prevent recurrence.

What Went Wrong: The Incident

The QOS deployment pipeline runs like this:

  • Local development at /Users/cb/Documents/repos/sites/queenofsandiego.com/
  • Staging S3 bucket: staging.queenofsandiego.com
  • Production S3 bucket: queenofsandiego.com
  • CloudFront distribution: d1j7ixr1hqpi2s.cloudfront.net (prod alias: www.queenofsandiego.com)

The agent pushed both staging and prod in a single command, without pulling the current prod version first to diff against local. The local index.html was ~6 hours stale (last touched during a prior session), but the prod S3 version contained 3 hours of newer CSS classes, Stripe integration code, and hero markup. Result: three features reverted simultaneously, invisible until the CloudFront cache expired.

Root causes:

  • No pre-flight S3 diff before overwrite
  • Staging and prod deployed in a single command (violates separation of concerns)
  • Prior session summary warned about stale local files; warning ignored
  • No snapshot of prod before overwriting (S3 versioning not enabled on this bucket)
  • No proof block printed to chat before the destructive cp

The Fix: Eight Hard Rules for S3 Deployments

We codified these rules into /Users/cb/Documents/repos/sites/queenofsandiego.com/CLAUDE.md, which auto-loads at the start of every QOS session. Each rule is numbered D1–D8 and written in plain English so an AI agent can parse and follow them without interpretation:

  • D1 — Pull and Diff Before Edit: Before touching any file bound for S3, run aws s3 cp s3://queenofsandiego.com/index.html ./index.html.s3-prod and diff -u index.html.s3-prod index.html. Print the diff to chat. If prod is ahead, stop and escalate.
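  • D2 — Staging-Only Single-Target Deploys: Deploy to staging first, alone. Never deploy staging and prod in the same command. Leave at least 5 minutes between the two commands and verify staging in that window. With a 3600s max-age on HTML, cached copies will not have expired by then, so check the object in S3 directly or with a hard refresh.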
  • D3 — One File Per Logical Change: Each commit and each deployment targets one file or one cohesive feature block. If you're deploying index.html and styles.css in the same push, split it.
  • D4 — Obey Your Own Prior Session Warnings: If a prior session summary says "local files may be stale," treat it as blocking. Pull S3 and verify timestamps before proceeding.
  • D5 — Snapshot Prod Before Overwrite: Since S3 versioning is not enabled, manually copy prod to a dated backup: aws s3 cp s3://queenofsandiego.com/index.html s3://queenofsandiego.com/backups/index.html.$(date +%Y%m%d-%H%M%S). Print the backup path to chat.
  • D6 — Print a Six-Line Proof Block Before Any cp: Before executing aws s3 cp, print to chat: file path, S3 target, local timestamp, S3 timestamp, the diff summary, and an explicit "PROCEED Y/N?" prompt. Wait for approval.
  • D7 — Maintain a Feature Token Registry: Keep a plaintext file at sites/queenofsandiego.com/S3_FEATURES.txt listing every deployed feature and a unique grep token (e.g., "JADA-CROSSFADE: data-fade-state="). Before deploying, grep the S3 prod version for all tokens. If any token is missing, prod has regressed—stop and investigate.
  • D8 — Escalate When S3 Is Ahead of Local: If the diff in D1 shows prod is newer, do not overwrite. Post the diff to the session, mark it BLOCKING, and message CB with the diff and a timestamp.
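
The local parts of D1, D5, D6 and D7 can be sketched as a few shell functions. This is a minimal sketch, not the script QOS actually runs: the function names (missing_tokens, proof_block, preflight) are hypothetical, the registry is assumed to hold one "NAME: token" entry per line (inferred from the example in D7), and date -r FILE is assumed to print a file's modification time, as it does on GNU coreutils and recent macOS.

```shell
# Pre-flight helpers sketching rules D1, D5, D6 and D7. Illustrative only.

# D7: print the name of each registry feature whose token is absent from a file.
# Assumes one "NAME: token" entry per line in the registry.
missing_tokens() {
  local registry="$1" target="$2" line name token
  while IFS= read -r line || [ -n "$line" ]; do
    [ -z "$line" ] && continue
    name="${line%%: *}"   # text before the first ": "
    token="${line#*: }"   # text after the first ": "
    grep -qF -- "$token" "$target" || printf '%s\n' "$name"
  done < "$registry"
}

# D6: print the six-line proof block. The S3 timestamp is passed in as an
# argument, so this function never touches the network.
proof_block() {
  local local_file="$1" prod_copy="$2" s3_target="$3" s3_time="$4" changed
  # Count added/removed lines; "|| true" keeps a zero count from being an error.
  changed=$(diff "$prod_copy" "$local_file" | grep -c '^[<>]' || true)
  printf 'file:        %s\n' "$local_file"
  printf 'target:      %s\n' "$s3_target"
  printf 'local mtime: %s\n' "$(date -r "$local_file" '+%Y-%m-%d %H:%M:%S')"
  printf 's3 mtime:    %s\n' "$s3_time"
  printf 'diff:        %s changed lines vs prod\n' "$changed"
  printf 'PROCEED Y/N?\n'
}

# D1 + D5 + D6 + D7 wired together. Needs the AWS CLI and credentials;
# run from the site directory. Nothing here uploads the local file.
preflight() {
  local s3_time
  aws s3 cp s3://queenofsandiego.com/index.html ./index.html.s3-prod || return 1
  diff -u index.html.s3-prod index.html   # D1: non-empty output means the files differ
  aws s3 cp s3://queenofsandiego.com/index.html \
    "s3://queenofsandiego.com/backups/index.html.$(date +%Y%m%d-%H%M%S)" || return 1
  s3_time=$(aws s3api head-object --bucket queenofsandiego.com \
    --key index.html --query LastModified --output text) || return 1
  missing_tokens S3_FEATURES.txt index.html.s3-prod
  proof_block index.html index.html.s3-prod \
    s3://queenofsandiego.com/index.html "$s3_time"
}
```

Any name printed by missing_tokens is a feature whose token is not in the pulled prod copy, which D7 treats as a stop condition. The actual upload stays a separate, manually approved step after the PROCEED prompt.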

Infrastructure and Deployment Mechanics

QOS uses a straightforward S3 + CloudFront + Route53 setup:

  • S3 buckets: queenofsandiego.com (prod) and staging.queenofsandiego.com (staging). Both have public-read ACLs on index.html, styles.css, and assets.
  • CloudFront distribution: d1j7ixr1hqpi2s.cloudfront.net, with CNAME alias www.queenofsandiego.com. Cache TTL is 3600s (1 hour) for HTML, 86400s for static assets. Invalidation is manual via AWS CLI.
  • Deployment command (staging):
    aws s3 cp index.html s3://staging.queenofsandiego.com/index.html \
      --content-type text/html \
      --acl public-read \
      --cache-control "max-age=3600"
  • Deployment command (prod):
    aws s3 cp index.html s3://queenofsandiego.com/index.html \
      --content-type text/html \
      --acl public-read \
      --cache-control "max-age=3600"
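  • CloudFront invalidation (after prod deploy). Note that --distribution-id takes the distribution's ID (an uppercase alphanumeric string, typically starting with "E"), not the d1j7ixr1hqpi2s prefix of its domain name. DISTRIBUTION_ID below is a placeholder for that value:
    aws cloudfront create-invalidation \
      --distribution-id "$DISTRIBUTION_ID" \
      --paths "/*"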

Route53 points queenofsandiego.com and www.queenofsandiego.com to the CloudFront distribution via A record (alias). staging.queenofsandiego.com points directly to the staging S3 bucket endpoint.
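
One way to make the single-target rule (D2) hard to break is to wrap the upload in a function that accepts exactly one environment name. The sketch below is an assumption about how that could look, not the existing QOS tooling: deploy_one and the DRY_RUN variable are hypothetical names, and the aws s3 cp flags are the ones listed above.

```shell
# deploy_one: upload index.html to exactly one named environment (rule D2).
# Set DRY_RUN=1 to print the command instead of running it.
deploy_one() {
  local bucket
  if [ "$#" -ne 1 ]; then
    echo "deploy_one: exactly one target required (staging or prod)" >&2
    return 2
  fi
  case "$1" in
    staging) bucket="staging.queenofsandiego.com" ;;
    prod)    bucket="queenofsandiego.com" ;;
    *) echo "deploy_one: unknown target '$1'" >&2; return 2 ;;
  esac
  # Rebuild the argument list as the full upload command.
  set -- aws s3 cp index.html "s3://$bucket/index.html" \
    --content-type text/html --acl public-read --cache-control "max-age=3600"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}
```

A session would then run deploy_one staging, verify the staging site, and only later run deploy_one prod followed by the CloudFront invalidation. Calling deploy_one staging prod fails with exit code 2 before anything is uploaded.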

Why These Rules Matter

An S3 overwrite cannot be undone at the API level: a PUT replaces the whole object, and there is no "undo" unless versioning is enabled or a backup exists. A single cp command can destroy hours of work. The