Comprehensive Infrastructure Snapshot Strategy: Protecting Multi-Site JADA Architecture

```html

Following unexpected data loss during a development session, we implemented a complete v1.0 snapshot of the entire JADA infrastructure ecosystem—three production sites, serverless functions, database configurations, and all deployment tooling. This post details the snapshot architecture, execution strategy, and lessons learned.

What Went Wrong & Why Snapshots Matter

During routine staging updates across queenofsandiego.com, sailjada.com, and salejada.com, development changes were inadvertently reverted, and restoring them required significant token overhead. The incident exposed a critical gap: there was no comprehensive point-in-time backup strategy covering the distributed infrastructure components.

The v1.0 snapshot addresses this by capturing everything simultaneously across multiple infrastructure layers, enabling rapid rollback if needed.

Snapshot Scope & Components

The v1.0 snapshot encompasses:

Storage Layer: 45 S3 buckets across production, staging, and archive environments
Content Delivery: 66 CloudFront distributions with origin configurations and cache behaviors
Compute: 21 Lambda functions with source code, environment variables, and IAM role bindings
DNS & Routing: 16 Route53 hosted zones with all DNS records and health check configurations
Application Code: Google Apps Script (GAS) projects for booking automation and workflow management
Local Tooling: Python deployment scripts, configuration files, and development utilities
Database State: 14 DynamoDB tables with schema definitions (data not exported due to sensitivity)
Infrastructure Config: ACM certificates, API Gateway configurations, SES templates, IAM policies
Server Infrastructure: Lightsail instance snapshots with system state

Execution Architecture

Rather than sequential backups (which would take hours), we parallelized the snapshot into four concurrent agents:

Agent 1: S3 Sync
  └─ aws s3 sync s3://[bucket-name] ./v1.0-snapshot/s3/[bucket-name]/
  └─ Synced 45 buckets (~500MB total)
  └─ Progress tracking: 30/45 → 45/45 completed

Agent 2: Lambda Export
  └─ aws lambda get-function --function-name [name]
  └─ Code download URL read from Code.Location in the get-function response
  └─ Exported 21 functions with config + environment
  └─ Progress tracking: 10/21 → 21/21 completed

Agent 3: AWS Infrastructure Config
  └─ CloudFront: aws cloudfront list-distributions
  └─ Route53: aws route53 list-hosted-zones
  └─ DynamoDB: aws dynamodb describe-table --table-name [name]
  └─ API Gateway, ACM, SES configs exported

Agent 4: Local File Backup
  └─ /Users/cb/Documents/repos/sites/queenofsandiego.com/
  └─ /Users/cb/Documents/repos/sites/sailjada.com/
  └─ /Users/cb/Documents/repos/sites/salejada.com/
  └─ /Users/cb/Documents/repos/tools/
  └─ Clasp pull: main JADA GAS, Rady Shell GAS, EYD GAS
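
The four-agent layout above can be sketched with a thread pool. This is a minimal illustration, not the tooling used for the actual snapshot: `run_agents` and `s3_sync_command` are hypothetical names, and each agent is assumed to be a plain callable that returns a status string.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, List


def run_agents(agents: Dict[str, Callable[[], str]]) -> Dict[str, str]:
    """Run every agent callable concurrently and collect one status per agent."""
    results: Dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max(1, len(agents))) as pool:
        futures = {pool.submit(fn): name for name, fn in agents.items()}
        for future in as_completed(futures):
            name = futures[future]
            try:
                results[name] = future.result()
            except Exception as exc:
                # A failure in one agent is recorded without hiding the others.
                results[name] = f"failed: {exc}"
    return results


def s3_sync_command(bucket: str, snapshot_root: str = "./v1.0-snapshot") -> List[str]:
    """Build the argv for syncing one bucket into the snapshot directory."""
    return ["aws", "s3", "sync", f"s3://{bucket}", f"{snapshot_root}/s3/{bucket}/"]
```

An S3 agent could loop over the bucket list and pass each `s3_sync_command(bucket)` result to `subprocess.run`. Threads are sufficient here because the work is dominated by waiting on network calls rather than CPU.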

Key Infrastructure Components Captured

S3 Bucket Inventory

The snapshot catalogued all 45 S3 buckets. The critical ones include:

queenofsandiego.com — production site content
queenofsandiego.com-staging — staging environment with current test changes
sailjada.com — production booking site
sailjada.com-staging — staging with pending updates
salejada.com — secondary production site
Specialized buckets for Lambda deployment packages, CloudFormation templates, and archive data

Each bucket was synced with versioning metadata preserved where applicable.

CloudFront Distribution Architecture

All 66 CloudFront distributions were exported with:

Origin configurations (S3 bucket origins, custom domain origins, API Gateway origins)
Cache behaviors and path patterns
Origin access identities (OAI) for S3 bucket authentication
Distribution-specific headers and custom error responses
Current invalidation status

Example: d1234example.cloudfront.net serving queenofsandiego.com with origin queenofsandiego.com.s3.amazonaws.com and cache TTL of 86400 seconds for static assets.
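
To make an exported distribution list easier to review or diff later, the JSON can be reduced to the fields listed above. The sketch below assumes the input is the parsed output of `aws cloudfront list-distributions`; the function name and the output keys are illustrative choices, not part of the original tooling.

```python
from typing import Any, Dict, List


def summarize_distributions(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Reduce list-distributions JSON to IDs, aliases, origins, and path patterns."""
    # "Items" is omitted from the response when a list is empty, hence the `or []`.
    items = response.get("DistributionList", {}).get("Items") or []
    summary = []
    for dist in items:
        origins = dist.get("Origins", {}).get("Items") or []
        behaviors = dist.get("CacheBehaviors", {}).get("Items") or []
        summary.append({
            "id": dist["Id"],
            "domain": dist["DomainName"],
            "aliases": dist.get("Aliases", {}).get("Items") or [],
            "origins": [origin["DomainName"] for origin in origins],
            "path_patterns": [behavior["PathPattern"] for behavior in behaviors],
        })
    return summary
```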

Lambda Function Baseline

All 21 Lambda functions were exported including:

Function source code (downloaded from CloudFormation packages)
Environment variables (Secrets Manager references kept as references, never resolved to secret values)
Execution role ARN and inline policies
Memory allocation, timeout, and concurrency settings
Layer dependencies and versions
VPC configuration (security groups, subnet mappings)

Critical functions included booking processors, webhook handlers, and automated deployment triggers.
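
As a rough sketch of how one function's baseline could be assembled, the helper below pulls the settings listed above out of a parsed `aws lambda get-function` response. The function name and the output keys are hypothetical. The download URL in `Code.Location` is deliberately left out of the baseline because it is a short-lived presigned link.

```python
from typing import Any, Dict


def lambda_baseline(response: Dict[str, Any]) -> Dict[str, Any]:
    """Extract the settings worth keeping from one get-function response."""
    config = response.get("Configuration") or {}
    vpc = config.get("VpcConfig") or {}
    concurrency = response.get("Concurrency") or {}
    environment = (config.get("Environment") or {}).get("Variables") or {}
    return {
        "name": config.get("FunctionName"),
        "role_arn": config.get("Role"),
        "memory_mb": config.get("MemorySize"),
        "timeout_s": config.get("Timeout"),
        # Absent unless reserved concurrency was configured for the function.
        "reserved_concurrency": concurrency.get("ReservedConcurrentExecutions"),
        "layers": [layer["Arn"] for layer in config.get("Layers") or []],
        "subnets": vpc.get("SubnetIds") or [],
        "security_groups": vpc.get("SecurityGroupIds") or [],
        # Values are copied as-is; nothing here resolves secret references.
        "environment": dict(environment),
    }
```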

Google Apps Script Projects

GAS projects were pulled via clasp and stored with full project metadata:

BookingAutomation.gs — main JADA booking workflow
Rady Shell replacement script
Rady Shell legacy script (for historical reference)
EYD event management script

Each GAS project's .clasp.json settings file was preserved, so future clasp operations can reference the exact project ID.
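
One way to confirm that every pulled project kept its ID is to read the `scriptId` field from each `.clasp.json` under the snapshot directory. This is an illustrative sketch: the function name and the directory layout are assumptions.

```python
import json
from pathlib import Path
from typing import Dict


def collect_script_ids(root: Path) -> Dict[str, str]:
    """Map each project directory under `root` to the scriptId in its .clasp.json."""
    ids: Dict[str, str] = {}
    for settings in sorted(root.rglob(".clasp.json")):
        data = json.loads(settings.read_text(encoding="utf-8"))
        project_dir = settings.parent.relative_to(root).as_posix()
        ids[project_dir] = data["scriptId"]
    return ids
```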

Technical Decisions & Rationale

Parallel Over Sequential

Rather than syncing buckets one by one (estimated 2-3 hours), we launched four agents in parallel and finished in ~40 minutes. Running them concurrently required careful AWS CLI session management to avoid credential conflicts.

Infrastructure-as-Data, Not Infrastructure-as-Code

The snapshot exports the actual AWS resource configurations rather than CloudFormation templates, so it captures the exact point-in-time state. This trades declarative reproducibility for accuracy, which is critical when the current state may contain undocumented manual changes.

Excluding Sensitive Data

DynamoDB table schemas were captured, but the table data (customer records, booking history) was not. Lambda environment variables were exported, but Secrets Manager references were preserved as-is rather than resolved to actual secret values.
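
For the DynamoDB side, a schema-only export can be sketched as a filter over the parsed `aws dynamodb describe-table` response. The set of fields kept below is an assumption about what counts as schema for this snapshot. `describe-table` never returns items, so customer data stays out as long as no scan or export is run.

```python
from typing import Any, Dict

# Assumed set of structural fields; size and count statistics are left out.
SCHEMA_FIELDS = (
    "TableName",
    "KeySchema",
    "AttributeDefinitions",
    "GlobalSecondaryIndexes",
    "LocalSecondaryIndexes",
    "BillingModeSummary",
    "ProvisionedThroughput",
    "StreamSpecification",
)


def schema_only(describe_table_response: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the structural fields of a describe-table response."""
    table = describe_table_response["Table"]
    return {field: table[field] for field in SCHEMA_FIELDS if field in table}
```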

Local Tooling Inclusion

Often overlooked: deployment scripts, configuration files, and development utilities. Files like update_dashboard.py and the newly-created release.py are production dependencies that live outside version control. These were included in the snapshot.
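
A minimal sketch of how local tooling could be copied into the snapshot with a checksum per file, so a later restore can be verified. The function name is hypothetical and the destination layout is an assumption. It requires Python 3.8 or newer for `dirs_exist_ok`.

```python
import hashlib
import shutil
from pathlib import Path
from typing import Dict


def snapshot_tree(source: Path, dest: Path) -> Dict[str, str]:
    """Copy `source` into `dest` and return {relative path: sha256 hex digest}."""
    shutil.copytree(source, dest, dirs_exist_ok=True)
    checksums: Dict[str, str] = {}
    # Hash what actually landed in the snapshot, not the source tree.
    for path in sorted(dest.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            checksums[path.relative_to(dest).as_posix()] = digest
    return checksums
```

If `dest` already contains files from an earlier run, they are included in the returned mapping, so a fresh destination directory gives the cleanest record.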

Manifest & Verification

A comprehensive MANIFEST.md was generated documenting:

Snapshot creation timestamp: 2026-05-