```html

Preventing Deployment Regressions: Hard Rules for S3-Based Static Site Deploys

During a recent development session on queenofsandiego.com, a stale local index.html was deployed to S3 production, reverting three changes that had been live moments before: the hero JADA→BOOK NOW crossfade animation, the embedded Stripe checkout flow, and the removal of a marketing line (the stale file brought the deleted line back). The local file had a newer modification time but older content. This post documents the root causes, the infrastructure decisions that enabled the regression, and the hard rules now in place to prevent it.

What Happened: The Regression Incident

The incident unfolded across two S3 buckets and a local repo filesystem:

  • S3 prod bucket: s3://queenofsandiego.com (served via CloudFront distribution E2A7EXAMPLE) contained the correct index.html with all three features intact.
  • Local filesystem: /Users/cb/Documents/repos/sites/queenofsandiego.com/index.html was checked out from git but based on a commit from several hours earlier, before those three features had been added.
  • The deploy command: A single deploy step using aws s3 cp with --recursive overwrote prod with the stale local version, and the same step also deployed to staging (violating the single-target rule that should have been followed).

No local copy of prod was pulled before the overwrite. S3 versioning was not enabled on the bucket. No diff was printed before the operation executed. The prior session summary had explicitly warned about stale local files, but the warning was ignored.

Technical Root Causes

1. No pull-before-edit discipline. The standard practice when working with S3-backed static sites is: always fetch the current S3 state, diff it locally, then edit. Instead, the workflow was: edit local, push to S3. This is backwards when S3 is the source of truth during active feature development.

2. Single-target deploy rule not enforced. The deploy pushed to both s3://queenofsandiego.com/ (prod) and staging in one operation. When the local files are stale, a combined push overwrites every environment it touches at once, so staging cannot act as a checkpoint before prod.

3. No proof block before destructive operations. The aws s3 cp executed without printing a diff or manifest of what was about to change. A six-item proof block (source file count, names of the key files being deployed, their local timestamps, a feature-token grep check, the full command, and a confirmation prompt) must be printed in the chat before any S3 overwrite.

4. Feature-token registry not consulted. Each feature in the codebase was already tagged with a grep-able token (e.g., /* FEATURE_HERO_CROSSFADE */, /* STRIPE_EMBEDDED_CHECKOUT */). Before deploying, the current S3 version should have been scanned for these tokens and compared to the local version. A single missing token is a regression signal.

5. Escalation rule was absent. When the state of S3 is unknown or ahead of local, the rule must be: pause and escalate to CB (the human owner) rather than proceeding.
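
The token comparison described in cause 4 can be sketched as a small shell function. This is a minimal sketch, not the site's actual tooling: the function name missing_tokens, the registry file format (one token per line), and the paths in the usage comment are assumptions made for illustration.

```shell
# missing_tokens REMOTE_FILE LOCAL_FILE TOKEN_LIST
# Prints every token found in REMOTE_FILE (the copy pulled from S3) but absent
# from LOCAL_FILE, and returns non-zero if any were printed.
# TOKEN_LIST is a hypothetical registry file with one token per line.
missing_tokens() {
  remote="$1"; local_copy="$2"; tokens="$3"
  status=0
  while IFS= read -r token || [ -n "$token" ]; do
    [ -n "$token" ] || continue   # skip blank lines
    if grep -qF -- "$token" "$remote" && ! grep -qF -- "$token" "$local_copy"; then
      printf 'REGRESSION: %s is in S3 but missing locally\n' "$token"
      status=1
    fi
  done < "$tokens"
  return "$status"
}

# Usage (paths are illustrative):
#   missing_tokens ./s3-current/index.html ./index.html ./feature-tokens.txt || echo "stop and escalate"
```

Matching with grep -F treats each token as a fixed string, so tokens containing characters such as * or / are not interpreted as patterns.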

Infrastructure: S3, CloudFront, and Versioning

The setup is straightforward but undefended:

  • S3 bucket: s3://queenofsandiego.com (no versioning enabled, no MFA delete, public-read ACL on objects).
  • CloudFront distribution: E2A7EXAMPLE (origin: S3 bucket, TTL 300 seconds for HTML, 86400 for assets). Cache invalidation is fast but requires explicit aws cloudfront create-invalidation commands.
  • Local repo: Git history is authoritative only if the working tree is synced. During active S3 edits (e.g., uploading PDFs, pushing email templates), git can lag.

The bucket should have versioning enabled to allow point-in-time recovery. Until that is done, every deploy must be preceded by a local snapshot, taken with: aws s3 sync s3://queenofsandiego.com ./s3-snapshot-$(date +%s)/
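
Enabling versioning is a one-time bucket setting. A minimal sketch using the AWS CLI's s3api subcommands follows. The enable_versioning wrapper and the AWS override variable are illustrative assumptions, added so the commands can be dry-run with AWS=echo before touching the real bucket.

```shell
# Hypothetical wrapper: turn on versioning for a bucket, then read the setting back.
# Set AWS=echo to print the commands instead of running them.
enable_versioning() {
  bucket="$1"
  "${AWS:-aws}" s3api put-bucket-versioning \
    --bucket "$bucket" \
    --versioning-configuration Status=Enabled &&
  "${AWS:-aws}" s3api get-bucket-versioning --bucket "$bucket"
}

# Usage (bucket name taken from this post):
#   enable_versioning queenofsandiego.com
```

Versioning only retains prior versions of objects written after it is enabled, so the local snapshot is still needed for anything overwritten before that point.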

The Eight Hard Rules (D1–D8)

These rules are now baked into /Users/cb/Documents/repos/sites/queenofsandiego.com/CLAUDE.md and auto-loaded at the start of every session:

  • D1: Pull S3 state before any edit. Run aws s3 sync s3://queenofsandiego.com ./s3-current/ --delete to mirror the bucket into a scratch directory, then diff that directory against local. If S3 is ahead, stop and escalate.
  • D2: Single-target deploys only. One aws s3 cp command targets one bucket (either staging or prod, never both). Staging first, always.
  • D3: One logical change per deploy. A single feature (hero animation, email template, PDF link) gets one commit and one push. Multi-feature deploys hide which change broke what.
  • D4: Obey your own prior warnings. If a prior session summary says "local files may be stale," that's a stop condition. Verify before proceeding.
  • D5: Snapshot prod before overwriting. aws s3 sync s3://queenofsandiego.com ./prod-snapshot-$(date +%Y%m%d-%H%M%S)/ creates a local backup. No S3 versioning means this is the only undo.
  • D6: Print proof block before any cp. Before executing aws s3 cp, print to chat: (a) file count in source, (b) names of 3–5 key files being deployed, (c) their local timestamps, (d) a grep-check for feature tokens, (e) the full command, (f) confirmation prompt. Do not execute until reviewed.
  • D7: Maintain a feature-token registry. Key features in index.html are tagged with /* FEATURE_TOKEN_NAME */. Before deploy, grep both the local file and the current S3 version for all tokens. Any token present in S3 but missing in local is a regression.
  • D8: Escalate to CB when S3 is ahead. If S3 has content not in the local repo, do not overwrite. Escalate: "S3 is ahead of local. [File list]. What do you want to do?"
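
The proof block required by D6 could be produced by a helper along these lines. This is a sketch under stated assumptions, not the script used on the site: the function name print_proof_block, the FEATURE_TOKENS variable (a space-separated token list), and the staging bucket placeholder in the usage comment are all hypothetical. The function only prints and never runs the copy.

```shell
# print_proof_block SRC_DIR TARGET KEY_FILE...
# Prints items (a) through (f) of the D6 proof block for a planned deploy.
# FEATURE_TOKENS is a hypothetical space-separated list of feature tokens.
print_proof_block() {
  src="$1"; target="$2"; shift 2
  echo "(a) files in source: $(find "$src" -type f | wc -l | tr -d ' ')"
  echo "(b) key files: $*"
  echo "(c) local timestamps:"
  for f in "$@"; do ls -l "$src/$f"; done
  echo "(d) feature tokens in $src/index.html:"
  for t in ${FEATURE_TOKENS:-}; do
    if grep -qF -- "$t" "$src/index.html"; then
      echo "    present: $t"
    else
      echo "    MISSING: $t"
    fi
  done
  echo "(e) command: aws s3 cp $src $target --recursive"
  echo "(f) Reviewed? Type 'yes' to proceed."
}

# Usage (STAGING_BUCKET is a placeholder, not a real bucket name):
#   FEATURE_TOKENS="FEATURE_HERO_CROSSFADE STRIPE_EMBEDDED_CHECKOUT"
#   print_proof_block ./sites/queenofsandiego.com s3://STAGING_BUCKET index.html
```

Keeping the printer separate from the command that performs the copy means the review step cannot deploy anything by accident.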

Command Examples (Sanitized)

# Correct workflow: pull-diff-edit-snapshot-deploy-verify

# 1. Pull current S3 state
aws s3 sync s3://queenofsandiego.com ./s3-current/ --delete

# 2. Diff against local
diff -r ./s3-current/ ./sites/queenofsandiego.com/ | head -20

# 3. Snapshot prod before touching it
aws s3 sync s3://queenofsandiego.com ./prod-backup-$(date +%s)/ --no-progress

# 4. Feature-token check
grep -