Preventing Deployment Regressions: How We Fixed a Three-Feature Rollback on queenofsandiego.com
What Happened
During a recent development session, a deployment to production caused three regressions on queenofsandiego.com by pushing a stale local index.html over a newer S3 version:
- The JADA → BOOK NOW hero image crossfade animation was wiped
- The Stripe embedded checkout booking flow was wiped
- A previously deleted "For Ranch & Coast readers..." hero line, which should have stayed removed, reappeared
The root cause: deploying without first pulling the current S3 state and diffing it against local. The deployment also violated the staging-first rule by pushing directly to both staging and prod in a single command.
Why This Matters
In a multi-agent workflow where Claude sessions run sequentially but independently, local file state can drift from production. Without a defensive pre-deployment check, a stale local copy gets treated as truth and overwrites live, working code. This is especially dangerous with static site hosting on S3 + CloudFront: with bucket versioning disabled, as it is on our buckets, there is no built-in rollback, and the last write wins.
The Fix: Eight Hard Rules for Safe Deployments
We added an executable ruleset to /Users/cb/Documents/repos/sites/queenofsandiego.com/CLAUDE.md that loads automatically in every new QOS session. These rules are:
D1: Pull and Diff Before Any Edit
aws s3 cp s3://queenofsandiego-prod/ ./s3-prod-snapshot/ --recursive
diff -r --exclude=.git ./s3-prod-snapshot/ ./public/
Before touching any file, snapshot the current S3 prod state and compare it to local. If S3 is ahead, escalate to CB before proceeding.
D2: Single-Target, Staging-First Deploys
Never deploy to staging and prod in the same command. Always deploy to staging first:
aws s3 cp /Users/cb/Documents/repos/sites/queenofsandiego.com/index.html \
s3://queenofsandiego-staging/index.html --cache-control "max-age=0"
Only after CB approves the staging preview, promote to prod as a separate, logged action.
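One way to make D2 hard to violate by accident is to route every deploy through a small guard. The sketch below is a hypothetical helper, not the site's actual tooling: the function name deploy_cmd and the PROD_APPROVED environment variable (standing in for CB's sign-off) are assumptions made for illustration. It accepts exactly one target and prints the command for review instead of running it.

```shell
# Hypothetical single-target deploy guard (D2).
# PROD_APPROVED is an assumed variable standing in for CB's staging sign-off.
deploy_cmd() {
  # Exactly one target per invocation, never staging and prod together.
  if [ "$#" -ne 1 ]; then
    echo "refusing: exactly one target per deploy" >&2
    return 2
  fi
  local bucket
  case "$1" in
    staging)
      bucket="queenofsandiego-staging"
      ;;
    prod)
      # Prod is only allowed after staging has been approved.
      if [ "${PROD_APPROVED:-no}" != "yes" ]; then
        echo "refusing: prod deploy needs staging approval first" >&2
        return 3
      fi
      bucket="queenofsandiego-prod"
      ;;
    *)
      echo "refusing: unknown target '$1'" >&2
      return 2
      ;;
  esac
  # Print the command rather than executing it, so it can be reviewed and logged.
  echo "aws s3 cp ./index.html s3://${bucket}/index.html --cache-control \"max-age=0\""
}
```

Calling deploy_cmd staging prints the staging upload command, while deploy_cmd staging prod and an unapproved deploy_cmd prod both fail with a non-zero status.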
D3: One Logical Change Per Deploy
Each deploy represents exactly one feature or bug fix. If you're updating the hero section and the footer simultaneously, split them into two separate deploys. This makes regressions traceable and rollback-scoped.
D4: Obey Your Own Prior Session-Summary Warnings
If your previous session summary says "local index.html is stale, refresh before deploy," treat that as a hard blocker. Read it. Follow it. Don't dismiss it as context noise.
D5: Snapshot Prod Before Overwriting
Versioning is disabled on our S3 buckets for cost reasons, so an overwritten object cannot be restored from S3 itself. Before you overwrite a file, save a timestamped local copy:
aws s3 cp s3://queenofsandiego-prod/index.html \
./backups/index.html.backup.$(date +%s)
This gives CB a manual recovery point if something goes wrong.
D6: Print a Six-Line Proof Block Before Any cp
Before running the deploy command, print this in chat (with exact values filled in):
SOURCE: /Users/cb/Documents/repos/sites/queenofsandiego.com/index.html
TARGET: s3://queenofsandiego-prod/index.html
SIZE: [bytes]
FEATURE: [what you're changing]
TESTED: [yes/no, and where]
DIFF LINES: [number of lines changed]
This forces a pause and a chance for CB to catch mistakes before the upload happens.
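The SIZE and DIFF LINES values are easy to get wrong when filled in by hand, so they can be computed from the files themselves. The following is a minimal sketch, assuming the current remote file has already been downloaded to a local path; the function name print_proof_block and its argument order are hypothetical. DIFF LINES here counts every line that diff marks with < or >, so a modified line counts twice (once per side).

```shell
# Hypothetical helper: print the six-line proof block (D6).
# Usage: print_proof_block SOURCE TARGET REMOTE_COPY FEATURE TESTED
print_proof_block() {
  local src="$1" target="$2" remote_copy="$3" feature="$4" tested="$5"
  local size diff_lines
  # Byte size of the file about to be uploaded.
  size="$(wc -c < "$src" | tr -d ' ')"
  # Lines present on only one side of the diff ('<' or '>' markers).
  # '|| true' keeps the count of 0 from being treated as a failure.
  diff_lines="$(diff "$src" "$remote_copy" | grep -c '^[<>]' || true)"
  printf 'SOURCE: %s\n' "$src"
  printf 'TARGET: %s\n' "$target"
  printf 'SIZE: %s\n' "$size"
  printf 'FEATURE: %s\n' "$feature"
  printf 'TESTED: %s\n' "$tested"
  printf 'DIFF LINES: %s\n' "$diff_lines"
}
```

Because the values come straight from the files, a surprisingly large DIFF LINES number is itself a signal that local may be stale.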
D7: Keep a Feature-Token Registry in S3
Maintain a JSON file at s3://queenofsandiego-prod/feature-tokens.json listing active features and their signatures in the live code:
{
"hero_crossfade": "data-hero-state=\"jada-to-book\"",
"stripe_embedded_checkout": "data-stripe-key-live",
"crew_roster": "crew-uniform-reminder-preview.html"
}
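Before deploying, grep both the current S3 copy and the local candidate for each token. If a token is present in the S3 copy but missing from the file you are about to upload, the deploy would regress that feature: stop and escalate immediately.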
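The token check can be sketched as a small function. This is an illustration under stated assumptions, not the registry's actual tooling: the name check_feature_tokens is hypothetical, both files are assumed to be local copies already, and the tokens are assumed to have been extracted from feature-tokens.json beforehand and passed as arguments.

```shell
# Hypothetical helper: flag feature tokens that are live in prod
# but missing from the file about to be deployed (D7).
# Usage: check_feature_tokens REMOTE_COPY LOCAL_CANDIDATE TOKEN...
check_feature_tokens() {
  local remote="$1" candidate="$2"
  shift 2
  local token missing=0
  for token in "$@"; do
    # -F matches the token literally; -- protects tokens that start with '-'.
    if grep -qF -- "$token" "$remote" && ! grep -qF -- "$token" "$candidate"; then
      echo "REGRESSION: token is live in prod but missing locally: $token" >&2
      missing=1
    fi
  done
  # 0 when no live token would be lost, 1 otherwise.
  return "$missing"
}
```

If jq happens to be installed, jq -r '.[]' feature-tokens.json prints the token values one per line, which is one way to produce the arguments. Tokens that are absent from both files are ignored, since nothing live would be lost.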
D8: Escalate to CB if S3 is Ahead of Local
If your D1 diff shows S3 has code that local doesn't have, stop. Don't assume local is correct. Message CB with the exact diff and wait for direction before overwriting.
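The stop condition in D1 and D8 can be turned into an exit status, so a session cannot quietly continue past it. A minimal sketch, assuming the prod copy has already been downloaded next to the local file and that it runs before any edit is made (so any difference means drift); the function name check_no_drift is hypothetical.

```shell
# Hypothetical pre-edit gate (D1/D8): local must match the prod snapshot.
# Usage: check_no_drift LOCAL_FILE REMOTE_COPY
check_no_drift() {
  local local_file="$1" remote_copy="$2"
  # cmp -s is silent and returns 0 only when the files are byte-identical.
  if cmp -s "$local_file" "$remote_copy"; then
    echo "OK: local matches prod snapshot, safe to start editing"
    return 0
  fi
  # Any difference before editing means local and prod have drifted apart.
  echo "STOP: local differs from prod snapshot; send this diff to CB:" >&2
  diff -u "$local_file" "$remote_copy" >&2
  return 1
}
```

The unified diff goes to stderr so it can be pasted into the escalation message as-is.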
Infrastructure Context
The queenofsandiego.com site runs on:
- S3 bucket: queenofsandiego-prod (index.html, proposals/, crew-uniform-reminder-preview.html, etc.)
- S3 staging bucket: queenofsandiego-staging (for pre-approval testing)
- CloudFront distribution: Points to queenofsandiego-prod, with cache invalidation needed after deploys
- Route53: Handles DNS for queenofsandiego.com
Because versioning is not enabled on these buckets, S3 keeps only the most recent upload of each object: whichever copy was written last wins, and the one it replaced is gone. That is why the diff-first rule (D1) is non-negotiable.
Deployment Command Pattern (Safe Version)
# 1. Pull the current prod copy (D1)
aws s3 cp s3://queenofsandiego-prod/index.html ./index.html.remote
# 2. Compare local against it; if S3 has changes local lacks, stop and escalate (D8)
diff ./index.html ./index.html.remote
# 3. If safe, snapshot prod (D5)
aws s3 cp s3://queenofsandiego-prod/index.html \
./backups/index.html.backup.$(date +%s)
# 4. Print proof block (D6), then deploy to staging (D2)
aws s3 cp ./index.html s3://queenofsandiego-staging/index.html \
--cache-control "max-age=0"
# 5. Test on staging.queenofsandiego.com, get CB approval
# 6. When approved, deploy to prod
aws s3 cp ./index.html s3://queenofsandiego-prod/index.html \
--cache-control "max-age=0"
# 7. Invalidate CloudFront cache
aws cloudfront create-invalidation --distribution-id [DIST_ID] --paths "/*"
Key Decision: Why These Rules Are Mandatory, Not Guidelines
In a synchronous, single-agent workflow, defensive checks feel like overhead. But in a multi-turn, multi-session environment where agents hand off work, stale state is the default condition: every session starts with no knowledge of what the last session deployed. A diff-and-snapshot costs about ten seconds. A regression costs hours of recovery and a broken user experience.