```html

Preventing Deployment Regressions: Hardening the QOS Site Workflow After a Stale-File Incident

What Happened

During a three-hour development session, a Sonnet 4.6 agent deployed an outdated local copy of queenofsandiego.com/index.html to the production S3 bucket. The stale file overwrote two working features and restored a line that had been deliberately removed:

  • The hero section JADA → BOOK NOW crossfade animation (lost)
  • The Stripe embedded checkout booking flow (lost)
  • The "For Ranch & Coast readers..." hero line, which had been intentionally deleted (reappeared)

The root cause was straightforward: the agent edited the local file, then deployed it to both staging and production in a single command without first pulling the current S3 state and comparing it against the local copy. The local file was three commits behind what was already live in production.

Technical Details: The Deployment Chain

The sailjada.com and queenofsandiego.com sites use this deployment pipeline:

  • Local repo: /Users/cb/Documents/repos/sites/queenofsandiego.com/
  • S3 bucket (staging): staging.queenofsandiego.com
  • S3 bucket (prod): queenofsandiego.com
  • CloudFront distribution: D7XXXXXX (fronts the prod bucket)
  • CloudFront staging dist: DXXXXXXX (fronts the staging bucket)

The agent's command was equivalent to:

aws s3 cp /Users/cb/Documents/repos/sites/queenofsandiego.com/index.html \
  s3://staging.queenofsandiego.com/index.html && \
aws s3 cp /Users/cb/Documents/repos/sites/queenofsandiego.com/index.html \
  s3://queenofsandiego.com/index.html

This bypassed the staging-first rule and deployed directly to both targets without verification. Critically, the agent did not run:

aws s3 cp s3://queenofsandiego.com/index.html ./index.html.s3-current
diff -u index.html.s3-current index.html

Had it done so, the diff would have shown the two features missing from the local file, along with the reinstated hero line, and flagged the regression before any upload.
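
The pull-and-diff check above can be wrapped in a small guard that refuses to continue when the two files differ. This is a minimal sketch, not the workflow's actual tooling: the function name `check_not_stale` is hypothetical, and the script assumes the live copy has already been downloaded from S3.

```shell
# Hypothetical helper: compare the live copy fetched from S3 with the local file.
# Returns 0 when the files are identical. Returns 1 when they differ (the diff
# is printed) or when diff cannot read one of the files.
check_not_stale() {
  live_copy=$1    # file previously downloaded from S3
  local_copy=$2   # file you intend to edit or deploy
  if diff -u "$live_copy" "$local_copy"; then
    echo "OK: local file matches the live copy"
    return 0
  fi
  echo "STOP: local file does not match the live copy; review the output above" >&2
  return 1
}
```

A typical use would be to run `aws s3 cp s3://queenofsandiego.com/index.html ./index.html.s3-current` first, then `check_not_stale index.html.s3-current index.html`, and only open the editor if the function succeeds.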

Infrastructure & File Structure

The QOS (queenofsandiego.com) site's repository and supporting files are laid out as follows:

  • Git repo root: /Users/cb/Documents/repos/sites/queenofsandiego.com/
  • Main index: index.html (3,650 lines; contains hero, booking flow, crew pages)
  • Proposals/previews: proposals/keely-email-preview.html, crew-uniform-reminder-preview.html, *-fob-explainer-preview.html
  • Memory & context: /Users/cb/.claude/projects/-Users-cb-Documents-repos/memory/MEMORY.md
  • Site rules: /Users/cb/Documents/repos/sites/queenofsandiego.com/CLAUDE.md (newly hardened)

After any change to the production bucket, invalidate the CloudFront cache so the new file is served:

aws cloudfront create-invalidation --distribution-id D7XXXXXX --paths "/*"

The Fix: Eight Hard Rules for Deployment Safety

To prevent this class of regression, eight mandatory rules were encoded into CLAUDE.md (auto-loaded by any Claude agent working on QOS):

  1. D1 — Pull S3 state before any edit: Always fetch the current production and staging files from S3 and diff them against your local copy before opening the editor. If S3 is ahead, abort and escalate to CB.
  2. D2 — Staging only, single target: Every deploy command targets exactly one S3 bucket. Staging deploys to staging.queenofsandiego.com only. Never combine staging and prod in a single `cp` or `sync` command.
  3. D3 — One logical change per deploy: Each file or feature modification is deployed in isolation. If you touch the hero section, deploy only the hero section (or the full `index.html` if unavoidable). If you touch crew pages, deploy crew pages separately. This makes rollbacks surgical.
  4. D4 — Obey prior session warnings: Every session summary includes deployment risks or deprecated local files. If your session notes say "local index.html is stale," do not deploy it without explicit re-verification.
  5. D5 — Snapshot prod before overwriting: Before any production deploy, save the current S3 file locally as a backup: aws s3 cp s3://queenofsandiego.com/index.html ./backups/index.html.prod.$(date +%s). The site does not use S3 versioning; this is your undo.
  6. D6 — Print a six-line proof block: Before executing any `cp` or `sync` to S3, print in chat: the source file path, destination S3 bucket, file size, MD5 hash, the three features being deployed, and the command itself. Wait for a human "yes" or have CB pre-approve the command in writing.
  7. D7 — Maintain a feature-token registry: Keep a FEATURE_TOKENS.md file listing every user-facing feature, the function or HTML block implementing it, and a grep-able comment token. Before deploying, grep the current S3 file for these tokens to confirm nothing is missing.
  8. D8 — Escalate if S3 is ahead: If the diff in step D1 shows S3 has code that your local file doesn't, halt immediately and escalate to CB. Do not merge blind; this usually means another agent pushed a fix you don't know about.
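
Rule D7 lends itself to a mechanical check. The sketch below assumes a simplified registry: a plain-text file with one grep-able token per line. The real FEATURE_TOKENS.md also records the feature and its implementing block, so the tokens would need to be extracted from it first. The function name `missing_tokens` is hypothetical.

```shell
# Hypothetical helper: print every token from the registry that is absent
# from the target file. Returns 0 only when no token is missing.
missing_tokens() {
  token_file=$1   # one token per line (assumed format)
  target=$2       # HTML file to inspect
  missing=0
  while IFS= read -r token || [ -n "$token" ]; do
    [ -z "$token" ] && continue              # skip blank lines
    if ! grep -qF -- "$token" "$target"; then
      echo "MISSING: $token"
      missing=1
    fi
  done < "$token_file"
  return "$missing"
}
```

Running it against both the downloaded S3 copy and the local file shows which features are live and whether the file about to be uploaded would drop any of them.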

Key Decisions: Why These Rules Matter

Why D1 (pull-before-edit)? In a multi-agent environment, S3 is often the source of truth because humans and CI/CD push to it. Local files lag. A single diff before editing eliminates 90% of stale-file regressions.

Why D2 (staging only)? Staging is the human's validation checkpoint. Forcing a staging-first deploy creates a natural pause point where CB can catch regressions before they hit users. Combined staging+prod deploys eliminate this gate.
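
One way to make the single-target rule harder to break by accident is a guard that counts S3 destinations before any upload runs. This is only a sketch under stated assumptions: `assert_single_target` is a hypothetical name, and the guard inspects the arguments of one invocation, so it cannot stop two separate commands chained with `&&`.

```shell
# Hypothetical guard: succeed only when exactly one argument is an s3:// URI.
assert_single_target() {
  count=0
  for arg in "$@"; do
    case $arg in
      s3://*) count=$((count + 1)) ;;
    esac
  done
  if [ "$count" -ne 1 ]; then
    echo "REFUSED: expected exactly 1 S3 target, found $count" >&2
    return 1
  fi
}
```

It could be used as `assert_single_target "$src" "$dest" && aws s3 cp "$src" "$dest"`. Note that it would also refuse an S3-to-S3 copy, since that has two `s3://` arguments.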

Why D6 (proof block)? A written summary before execution forces the agent to state its assumptions in English. Mismatches become obvious ("wait, the file says 1.2 MB but it was 890 KB yesterday"). It's a manual circuit-breaker.
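
The six proof lines can be generated rather than typed by hand, which keeps the size and hash honest. The sketch below is an assumption about how that might look: `proof_block` and its argument order are hypothetical, and the feature summary and command string are passed in as plain text.

```shell
# Hypothetical helper: print the six proof lines described in rule D6.
# Arguments: source path, destination URI, feature summary, command string.
proof_block() {
  src=$1; dest=$2; features=$3; cmd=$4
  size=$(wc -c < "$src" | tr -d ' ')
  if command -v md5sum >/dev/null 2>&1; then
    hash=$(md5sum < "$src" | cut -d ' ' -f 1)   # GNU coreutils
  else
    hash=$(md5 -q "$src")                       # macOS fallback
  fi
  echo "Source:      $src"
  echo "Destination: $dest"
  echo "Size:        $size bytes"
  echo "MD5:         $hash"
  echo "Features:    $features"
  echo "Command:     $cmd"
}
```

The function only prints; it never runs the command it is given, so the upload still waits for the human "yes".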

Why D7 (feature tokens)? Searching S3 content for known markers (the grep-able comment tokens listed in FEATURE_TOKENS.md) provides a cheap, deterministic regression check. If the token is missing,