Preventing Deployment Regressions: How We Built Hard Rules Into CI/CD Workflows

Over a three-hour development session, we discovered and remediated a critical deployment regression that removed two live features from production and restored a hero line that had been deliberately deleted. This post documents the incident, the root causes, and the hard rules we implemented to prevent similar failures.

The Incident: What Went Wrong

A prior agent session deployed a stale local index.html file to the S3 production bucket, overwriting the newer version that was live in production. The deployment caused three regressions:

It removed the JADA → BOOK NOW hero crossfade animation
It removed the Stripe embedded checkout booking flow integration
It restored a "For Ranch & Coast readers…" hero line that had been deleted in an earlier iteration

The agent deployed both staging and production in the same command, violating its own guidance from a prior session, and ignored warnings about stale local files that were already recorded in its memory.

Root Cause Analysis

Three layers of failure converged:

No pre-deployment S3 state validation: The agent did not pull the current production index.html from S3 and diff it against the local copy before deploying.
Single-command dual-target deploys: Both staging and prod were overwritten in one operation, eliminating the safety gate of staging-first review.
Memory warnings ignored: The agent's own prior session summary explicitly flagged the risk of stale local files, but this context was not enforced during the deployment phase.

Technical Details: The Deployment Process

The affected repository structure:


/Users/cb/Documents/repos/sites/queenofsandiego.com/
├── index.html                (3,650 lines; hero, booking flow, styles)
├── CLAUDE.md                 (session context + hard rules)
├── crew-uniform-reminder-preview.html
├── proposals/
│   └── keely-email-preview.html
└── ...

The CloudFront distribution serving queenofsandiego.com points to this S3 bucket. When index.html is deployed, a cache invalidation must follow to force edge nodes to fetch the new version.

The deployment command used:


aws s3 cp index.html s3://[bucket-name]/index.html
aws cloudfront create-invalidation --distribution-id [DIST_ID] --paths "/*"

The syntax is correct, but nothing validated the deploy upstream: there was no diff, no staging gate, and no proof block printed to chat before execution.
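
A minimal sketch of the missing diff step, assuming the production file has already been downloaded next to the local one. The function name check_prod_drift and the .prod suffix on the downloaded copy are illustrative, not part of the AWS CLI. It is meant to run before editing begins, so any difference means the local copy is not the same as what is live.

```shell
# check_prod_drift: hypothetical helper. Confirms the local file is identical
# to a freshly downloaded production copy before any editing starts.
# Returns 0 if identical, 1 if they differ, 2 if the production copy is absent.
check_prod_drift() {
    local_file=$1   # working copy in the repo
    prod_copy=$2    # copy just downloaded from the production bucket

    if [ ! -f "$prod_copy" ]; then
        echo "BLOCKED: no production copy at $prod_copy; download it first" >&2
        return 2
    fi

    # cmp -s is silent and only reports through its exit status.
    if cmp -s "$local_file" "$prod_copy"; then
        echo "OK: $local_file matches production"
        return 0
    fi

    # Show the first part of the difference so a human can judge who is ahead.
    diff -u "$prod_copy" "$local_file" | head -n 40
    echo "BLOCKED: $local_file differs from production; escalate before editing" >&2
    return 1
}

# Intended use (commented out so the sketch runs without AWS access):
#   aws s3 cp "s3://[bucket-name]/index.html" index.html.prod
#   check_prod_drift index.html index.html.prod || exit 1
```

The check uses file content rather than modification times, because a local checkout can carry a recent timestamp while still holding old content.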

The Fix: Eight Hard Rules (D1–D8)

We encoded these rules directly into /Users/cb/Documents/repos/sites/queenofsandiego.com/CLAUDE.md so they auto-load on every session:

D1 — Pull S3 and Diff Before Edit: Before modifying any file destined for S3, run aws s3 cp s3://[bucket]/[file] [file].prod and diff the local copy against [file].prod. If S3 is newer, stop and escalate to CB.
D2 — Staging-Only Single-Target Deploys: Never deploy both staging and prod in one command. Always deploy to staging first, pause for review, then promote to prod as a separate operation.
D3 — One Logical Change Per Deployment: Group related HTML, CSS, and JavaScript changes in a single index.html edit, but deploy only that file in isolation. Do not batch unrelated file deploys.
D4 — Obey Prior Session Warnings: Before deploying, search the active session's CLAUDE.md for any "CAUTION," "BLOCKING," or "DO NOT" flags. If found, print them to chat and confirm CB approval.
D5 — Snapshot Production Before Overwrite: Before any aws s3 cp, save the current prod file to a timestamped backup: aws s3 cp s3://[bucket]/index.html index.html.backup.$(date +%s) and commit it to git locally. This is not a replacement for S3 versioning, but a safety net.
D6 — Proof Block Before Deploy: Print a six-line proof block to chat showing: (1) the diff stat, (2) the S3 target path, (3) the CloudFront distribution ID, (4) the git commit hash of the local version, (5) the timestamp of the S3 prod version, (6) the CB approval token or "awaiting CB go/no-go." Do not execute the deploy command until this block is in chat.
D7 — Feature-Token Registry: Maintain a FEATURE_TOKENS.txt file listing every feature currently live in production with a unique token (e.g., HERO_FADE_CROSSFADE_v2, STRIPE_CHECKOUT_EMBEDDED). Before deploy, grep both the local index.html and the current S3 version for every token. If a token is missing from local, halt and escalate.
D8 — Escalate to CB When S3 Is Ahead: If the S3 version is newer (by modification time, git commit hash, or feature-token count) than the local version, stop all deployment. Print the discrepancy to chat, tag @CB, and await explicit instructions before proceeding.
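
Rule D4 is the easiest of the eight to mechanize. The sketch below assumes the flags appear as the literal uppercase strings quoted in the rule; the function names find_session_flags and require_flag_review are hypothetical.

```shell
# find_session_flags: hypothetical helper. Prints every line of a context
# file that contains one of the three flag strings named in D4.
find_session_flags() {
    # -n prefixes each match with its line number; -E enables alternation.
    grep -nE 'CAUTION|BLOCKING|DO NOT' "$1"
}

# require_flag_review: returns 1 when flags exist (approval needed),
# 0 when the file is clean, and 2 when the file cannot be read.
require_flag_review() {
    context_file=$1
    if [ ! -r "$context_file" ]; then
        echo "BLOCKED: cannot read $context_file" >&2
        return 2
    fi
    if flags=$(find_session_flags "$context_file"); then
        echo "Flags found in $context_file; confirm CB approval before deploying:"
        echo "$flags"
        return 1
    fi
    echo "No flags found in $context_file"
    return 0
}
```

Calling require_flag_review CLAUDE.md at the start of a deploy script and stopping on a non-zero status would put the warnings on screen instead of relying on them being remembered.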



Infrastructure Context

For reference, the affected resources (no credentials):

S3 Bucket: queenofsandiego.com (us-east-1, public read access via CloudFront)
CloudFront Distribution: Points to the S3 bucket origin. Cache TTL: 3600 seconds for index.html. Invalidation required after deploy.
Route53: queenofsandiego.com A record points to CloudFront distribution domain.
Git Repo: /Users/cb/Documents/repos/sites/queenofsandiego.com/ tracks all HTML, CSS, and configuration. Deployments are gated by commit history.


Key Decisions: Why These Rules Exist

Why staging-first? Staging allows visual verification before affecting customers. A five-minute staging review would have caught the missing hero fade.
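
One way to make the single-target rule hard to break is a wrapper that rejects more than one target per call. This is a sketch only: deploy_one_target is a hypothetical name, and STAGING_BUCKET, PROD_BUCKET, and DRY_RUN are assumed environment variables, since the staging bucket is not named here.

```shell
# deploy_one_target: hypothetical wrapper that accepts exactly one target.
# STAGING_BUCKET and PROD_BUCKET are assumed environment variables.
# DRY_RUN defaults to 1, so the command is printed instead of executed.
deploy_one_target() {
    if [ "$#" -ne 1 ]; then
        echo "BLOCKED: deploy exactly one target per command" >&2
        return 1
    fi

    case "$1" in
        staging) bucket=${STAGING_BUCKET:-[staging-bucket]} ;;
        prod)    bucket=${PROD_BUCKET:-[bucket-name]} ;;
        *)
            echo "BLOCKED: unknown target '$1' (expected staging or prod)" >&2
            return 1
            ;;
    esac

    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "DRY RUN: aws s3 cp index.html s3://$bucket/index.html"
    else
        aws s3 cp index.html "s3://$bucket/index.html"
    fi
}
```

With this shape, deploy_one_target staging prod fails immediately, and promoting to production requires a second, deliberate invocation after the staging review.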

Why the feature-token registry? It creates an explicit contract: "these features must exist in production." Grepping tokens is faster and more reliable than human code review for feature presence.
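
The token check could look roughly like the following. It assumes FEATURE_TOKENS.txt holds one token per line and that each token appears literally somewhere in index.html (for example in an HTML comment); neither detail is specified by the rule itself, and missing_tokens is a hypothetical helper name.

```shell
# missing_tokens: hypothetical helper. Prints each registry token that does
# not appear in the given HTML file; returns 1 if any token is missing.
# Assumes one token per line; blank lines and lines starting with # are skipped.
missing_tokens() {
    registry=$1
    html_file=$2
    missing=0

    while IFS= read -r token || [ -n "$token" ]; do
        case "$token" in
            ''|'#'*) continue ;;
        esac
        # -F treats the token as a literal string; -q reports via exit status.
        if ! grep -qF -- "$token" "$html_file"; then
            echo "MISSING: $token"
            missing=1
        fi
    done < "$registry"

    return "$missing"
}

# Intended use before a deploy:
#   missing_tokens FEATURE_TOKENS.txt index.html || exit 1
```

Running the same function against the downloaded production copy shows which tokens are live, so a token present in production but absent locally stands out.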

Why the proof block? Forcing six pieces of metadata to be printed before deploy creates a moment of accountability. Mistakes caught in text are free; mistakes in production cost hours.
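
A sketch of a proof-block printer follows. The function name print_proof_block and the line labels are illustrative; the six fields are the ones listed in D6. The caller supplies every value, so the function itself touches neither git nor AWS.

```shell
# print_proof_block: hypothetical helper that prints the six D6 fields.
# Refuses to print a partial block if any field is missing.
print_proof_block() {
    if [ "$#" -ne 6 ]; then
        echo "usage: print_proof_block DIFFSTAT TARGET DIST_ID COMMIT S3_MTIME APPROVAL" >&2
        return 1
    fi
    printf '1. diff stat: %s\n' "$1"
    printf '2. S3 target: %s\n' "$2"
    printf '3. CloudFront distribution: %s\n' "$3"
    printf '4. local commit: %s\n' "$4"
    printf '5. S3 prod timestamp: %s\n' "$5"
    printf '6. CB approval: %s\n' "$6"
}

# One way to gather the inputs (commented out; needs git and AWS access):
#   stat=$(git diff --no-index --stat index.html.prod index.html | tail -n 1)
#   commit=$(git rev-parse --short HEAD)
#   mtime=$(aws s3api head-object --bucket "[bucket-name]" --key index.html \
#             --query LastModified --output text)
#   print_proof_block "$stat" "s3://[bucket-name]/index.html" "[DIST_ID]" \
#     "$commit" "$mtime" "awaiting CB go/no-go"
```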

Why backup the prod file? If S3 versioning is not enabled, local backup + git commit + timestamped filename creates an audit trail and emergency rollback path.
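
The snapshot step might be wrapped as below. Both function names are hypothetical, and the filename pattern follows the index.html.backup.$(date +%s) form used in rule D5. snapshot_prod is defined but not called, because it needs credentials and a real bucket.

```shell
# backup_name: hypothetical helper that builds the timestamped filename.
# The second argument (epoch seconds) is optional and defaults to now.
backup_name() {
    printf '%s.backup.%s\n' "$1" "${2:-$(date +%s)}"
}

# snapshot_prod: downloads the live object to a timestamped local file and
# prints the filename on success so the caller can commit it to git.
snapshot_prod() {
    bucket=$1
    key=$2
    dest=$(backup_name "$key")
    aws s3 cp "s3://$bucket/$key" "$dest" && echo "$dest"
}

# Intended use:
#   snapshot_prod "[bucket-name]" index.html
```

Rolling back is then a matter of copying the chosen backup file over index.html in the bucket and issuing a CloudFront invalidation, which is the same two-command sequence used for a normal deploy.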

What's Next

We also added a condensed version of these rules to the top-level /Users/cb/Documents/repos/CLAUDE.md so non-Q