```html

Preventing CloudFront Cache Corruption: Lessons from a Stale S3 Deploy Gone Wrong

What Happened

During a routine feature deployment to queenofsandiego.com, a stale local copy of index.html was deployed to the S3 bucket, overwriting a newer production version. This single mistake cascaded into three separate feature regressions:

  • The hero section JADA → BOOK NOW crossfade animation disappeared
  • The Stripe embedded checkout booking flow was wiped
  • A previously-deleted "For Ranch & Coast readers..." hero line resurfaced from an old local build

Root cause: the developer deployed directly from a stale local file without first pulling the current S3 state and diffing against it. With no S3 versioning enabled and CloudFront caching the corrupted version, the production site served broken functionality for multiple hours.

Technical Details: The Failure Chain

Step 1: Local State Was Out of Sync

The local repository at /Users/cb/Documents/repos/sites/queenofsandiego.com/ contained an index.html that had not been synced with recent S3 changes. The developer did not run a git pull from the remote or fetch the current S3 version before beginning edits.

Step 2: Deploy Skipped Staging Validation

The deployment command deployed to both staging and prod in a single operation:

aws s3 cp /Users/cb/Documents/repos/sites/queenofsandiego.com/index.html \
  s3://queenofsandiego.com/ --recursive

This violated the established staging-first rule: changes should always land on staging first, be reviewed and validated, then promoted to production as a separate deliberate step.

Step 3: CloudFront Cache Became Corrupted

The CloudFront distribution (queenofsandiego.com) was not invalidated after the S3 upload. Each edge location kept serving its cached copy of the old, correct version until that copy expired, then fetched the stale file from S3. Because edge caches expire independently, visitors saw intermittent breakage depending on which edge location served them.
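For reference, an invalidation can be requested from the AWS CLI once a corrected file is back in S3. The following is a minimal sketch, not the command used during the incident: the function name invalidate_paths and the EXAMPLEDISTID placeholder are hypothetical, and the real distribution ID would have to be substituted.

```shell
# Hypothetical placeholder: substitute the real CloudFront distribution ID.
DISTRIBUTION_ID="${DISTRIBUTION_ID:-EXAMPLEDISTID}"

# Ask CloudFront to drop its cached copies of the given paths so that the
# next request at each edge location is fetched fresh from the S3 origin.
invalidate_paths() {
  if [ "$#" -eq 0 ]; then
    echo "usage: invalidate_paths /path [/path ...]" >&2
    return 2
  fi
  aws cloudfront create-invalidation \
    --distribution-id "$DISTRIBUTION_ID" \
    --paths "$@"
}
```

A call such as invalidate_paths /index.html would cover the file affected here. Invalidating only removes cached copies; it does nothing useful if the object in S3 is still the stale one.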

Step 4: No Rollback Path

S3 versioning was not enabled on the production bucket. Once the stale file overwrote the newer version, recovery required manual intervention to restore from local backups or git history.

Infrastructure Context

S3 Bucket: queenofsandiego.com (production content bucket)

CloudFront Distribution: Fronts the S3 bucket with default TTL of 86,400 seconds (24 hours) for HTML files.

Route53 Zones: queenofsandiego.com (primary DNS) with CNAME records pointing to the CloudFront distribution.

Git Repository: /Users/cb/Documents/repos/sites/queenofsandiego.com/ with remote tracking the canonical state of features and content.

The Fix: Eight Hard Rules for Deployment Safety

To prevent this class of failure, the following rules were written into /Users/cb/Documents/repos/sites/queenofsandiego.com/CLAUDE.md (auto-loaded at session start) and summarized in the top-level repository CLAUDE.md:

D1: Pull and Diff Before Edit

Before modifying any file destined for S3, fetch the current S3 version and compare it line-by-line against your local version:

aws s3 cp s3://queenofsandiego.com/index.html ./index.html.s3-current
diff -u index.html.s3-current index.html

If the S3 copy contains changes that your local copy lacks, run git pull to sync with the remote repository before editing.

D2: Staging-Only Single-Target Deploys

Never deploy to production and staging in the same command. Always deploy to staging first:

aws s3 cp ./index.html s3://staging.queenofsandiego.com/index.html

After manual validation on staging, promote to production as a separate, intentional step.
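One possible shape for the promotion step is a bucket-to-bucket copy, so that the file reaching production is the object that was validated on staging and not whatever is in the local working tree. This is a sketch under assumptions: the function name promote_to_prod and the typed confirmation are illustrative and not part of the original rules; the bucket names are the ones used elsewhere in this post.

```shell
STAGING_BUCKET="staging.queenofsandiego.com"
PROD_BUCKET="queenofsandiego.com"

# Copy one already-validated object from the staging bucket to production.
# The copy is S3-to-S3, so no local file is read during promotion.
promote_to_prod() {
  key="$1"
  if [ -z "$key" ]; then
    echo "usage: promote_to_prod <object-key>" >&2
    return 2
  fi
  # Prompt on stderr so stdout carries only the aws output.
  printf 'Promote s3://%s/%s to s3://%s/%s? Type "yes" to continue: ' \
    "$STAGING_BUCKET" "$key" "$PROD_BUCKET" "$key" >&2
  read -r answer
  if [ "$answer" != "yes" ]; then
    echo "Aborted; production was not modified." >&2
    return 1
  fi
  aws s3 cp "s3://$STAGING_BUCKET/$key" "s3://$PROD_BUCKET/$key"
}
```

Running promote_to_prod index.html prompts once and copies a single key, which keeps the production write separate from the staging upload.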

D3: One Logical Change Per Deploy

Deploy only the files that changed for a single feature. Do not bulk-copy entire directories unless you explicitly intend to replace all contents.

D4: Obey Your Own Prior Session Warnings

If your prior session summary identified "stale local files" as a risk, that's a blocking condition for this session. Escalate to CB before proceeding.

D5: Snapshot Production Before Overwrite

Before any cp or sync to S3, save the current production version locally with a timestamp:

aws s3 cp s3://queenofsandiego.com/index.html \
  ./backups/index.html.prod.$(date +%s)

D6: Proof Block Before Deploy

Print a six-line proof of the exact bytes being deployed, including file size, modification time, and the first 200 characters of content. Print this in chat before any cp command executes.
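A proof block along these lines could be produced with standard tools. The rule names only the file size, modification time, and first 200 characters, so the other three lines here (path, target, checksum) and the function name proof_block are assumptions made for illustration. The preview is taken as the first 200 bytes, which equals 200 characters only for single-byte text.

```shell
# Print a six-line summary of the file about to be uploaded.
# The choice of six fields is an assumption; see the note above.
proof_block() {
  file="$1"
  target="$2"
  if [ ! -f "$file" ]; then
    echo "proof_block: no such file: $file" >&2
    return 1
  fi
  # wc pads its output on some systems, so strip the spaces.
  size=$(wc -c < "$file" | tr -d ' ')
  # GNU stat uses -c, BSD/macOS stat uses -f; try one, then the other.
  mtime=$(stat -c '%y' "$file" 2>/dev/null || stat -f '%Sm' "$file")
  # cksum reads stdin here and prints "<crc> <size>"; keep the CRC only.
  checksum=$(cksum < "$file" | cut -d ' ' -f 1)
  # First 200 bytes, with line breaks flattened so the preview is one line.
  preview=$(head -c 200 "$file" | LC_ALL=C tr '\n\r' '  ')
  printf '%s\n' "file:     $file"
  printf '%s\n' "target:   $target"
  printf '%s\n' "size:     $size bytes"
  printf '%s\n' "modified: $mtime"
  printf '%s\n' "cksum:    $checksum"
  printf '%s\n' "head:     $preview"
}
```

For example, proof_block ./index.html s3://staging.queenofsandiego.com/index.html prints the six lines for review before any cp command runs.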

D7: Feature-Token Registry

Maintain a file at sites/queenofsandiego.com/FEATURE_TOKENS.md that lists every active feature by a unique string (e.g., JADA_BOOK_NOW_FADE, STRIPE_EMBEDDED_CHECKOUT). Before deploying, grep the S3 version for these tokens and confirm they are present.
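The token check can be mechanized with grep. The sketch below assumes that FEATURE_TOKENS.md mentions each token as an upper-case identifier containing at least one underscore, and that the same string appears literally in the deployed HTML; the actual layout of the registry is not shown in this post, so the extraction pattern and the function name check_feature_tokens are hypothetical.

```shell
# Report every registered feature token that is missing from an HTML file.
# Returns 0 if all tokens are present, 1 if any is missing,
# and 2 if the registry yields no tokens at all.
check_feature_tokens() {
  registry="$1"
  html="$2"
  # Assumed token shape: UPPER_CASE words joined by underscores.
  tokens=$(grep -oE '[A-Z][A-Z0-9]*(_[A-Z0-9]+)+' "$registry" | sort -u)
  if [ -z "$tokens" ]; then
    echo "check_feature_tokens: no tokens found in $registry" >&2
    return 2
  fi
  missing=0
  for token in $tokens; do
    # -F matches the token as a fixed string, not a regular expression.
    if ! grep -qF -- "$token" "$html"; then
      echo "MISSING: $token" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Running the function against both the freshly downloaded S3 copy and the local candidate file shows whether a feature present in production would be dropped by the deploy.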

D8: Escalate on S3 Drift

If S3 is newer than local git, or if the diff contains unexpected changes, stop and escalate to CB with the full diff. Do not proceed unilaterally.
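A drift check can be turned into a guard that returns a non-zero status, so that a wrapper script stops instead of proceeding. This is a sketch only: the function name check_s3_drift and its return codes are assumptions, and it flags any difference between the two files, leaving the judgment about which differences are expected to a person.

```shell
# Compare the live S3 object with a local file and report any drift.
# Returns 0 when identical, 1 when they differ, 2 if the download fails.
check_s3_drift() {
  s3_uri="$1"
  local_file="$2"
  current=$(mktemp) || return 2
  # Download the live object to a temp file; suppress progress output.
  if ! aws s3 cp "$s3_uri" "$current" --only-show-errors; then
    rm -f "$current"
    return 2
  fi
  # diff prints the unified diff, which is the evidence to escalate with.
  if diff -u "$current" "$local_file"; then
    status=0
  else
    status=1
    echo "DRIFT: $s3_uri differs from $local_file; stop and escalate." >&2
  fi
  rm -f "$current"
  return "$status"
}
```

A deploy script might call check_s3_drift s3://queenofsandiego.com/index.html ./index.html as its first step and exit on any non-zero status, which leaves the full diff on screen.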

Key Decisions and Trade-offs

Why not enable S3 versioning? Versioning costs storage and complicates lifecycle management. Instead, we enforce the snapshot rule (D5) and treat git as the source of truth. Every production-ready version is tagged in git; S3 is treated as ephemeral.

Why staging-only deploys? Staging mirrors production architecture but receives zero user traffic. It's the safe place to validate that your changes don't break the booking flow, hero animations, or Stripe integration. Once validated, promotion is a mechanical copy of the exact file that was reviewed, which removes the chance of shipping a different version from the one that was tested.

Why feature tokens? A visual inspection of index.html is error-prone. Tokens are grep-able and provide a mechanical check that critical features survived the deploy.

What's Next

These rules are now part of the permanent onboarding for queenofsandiego.com sessions. They will be loaded at the start of any new conversation working on that codebase. The rules apply to all three properties in the repo (sailjada.com, queenofsand