Comprehensive Infrastructure Snapshot Strategy: Protecting Multi-Site JADA Architecture
Following unexpected data loss during a development session, we implemented a complete v1.0 snapshot of the entire JADA infrastructure ecosystem—three production sites, serverless functions, database configurations, and all deployment tooling. This post details the snapshot architecture, execution strategy, and lessons learned.
What Went Wrong & Why Snapshots Matter
During routine staging updates across queenofsandiego.com, sailjada.com, and salejada.com, development changes were inadvertently reverted, and restoring functionality cost significant token overhead. The incident exposed a critical gap: no comprehensive point-in-time backup strategy existed across the distributed infrastructure components.
The v1.0 snapshot addresses this by capturing everything simultaneously across multiple infrastructure layers, enabling rapid rollback if needed.
Snapshot Scope & Components
The v1.0 snapshot encompasses:
- Storage Layer: 45 S3 buckets across production, staging, and archive environments
- Content Delivery: 66 CloudFront distributions with origin configurations and cache behaviors
- Compute: 21 Lambda functions with source code, environment variables, and IAM role bindings
- DNS & Routing: 16 Route53 hosted zones with all DNS records and health check configurations
- Application Code: Google Apps Script (GAS) projects for booking automation and workflow management
- Local Tooling: Python deployment scripts, configuration files, and development utilities
- Database State: 14 DynamoDB tables with schema definitions (data not exported due to sensitivity)
- Infrastructure Config: ACM certificates, API Gateway configurations, SES templates, IAM policies
- Server Infrastructure: Lightsail instance snapshots with system state
Execution Architecture
Rather than sequential backups (which would take hours), we parallelized the snapshot into four concurrent agents:
Agent 1: S3 Sync
└─ aws s3 sync s3://[bucket-name] ./v1.0-snapshot/s3/[bucket-name]/
└─ Synced 45 buckets (~500MB total)
└─ Progress tracking: 30/45 → 45/45 completed
Agent 2: Lambda Export
└─ aws lambda get-function --function-name [name]
└─ Download each deployment package from the presigned Code.Location URL in the get-function response
└─ Exported 21 functions with config + environment
└─ Progress tracking: 10/21 → 21/21 completed
Agent 3: AWS Infrastructure Config
└─ CloudFront: aws cloudfront list-distributions
└─ Route53: aws route53 list-hosted-zones
└─ DynamoDB: aws dynamodb describe-table
└─ API Gateway, ACM, SES configs exported
Agent 4: Local File Backup
└─ /Users/cb/Documents/repos/sites/queenofsandiego.com/
└─ /Users/cb/Documents/repos/sites/sailjada.com/
└─ /Users/cb/Documents/repos/sites/salejada.com/
└─ /Users/cb/Documents/repos/tools/
└─ Clasp pull: main JADA GAS, Rady Shell GAS, EYD GAS
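The four-agent layout above can be sketched with Python's standard thread pool. This is a minimal illustration, not the orchestration we actually ran: `run_agents` and the agent names are hypothetical, and each callable is assumed to wrap one agent's CLI calls.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_agents(agents):
    """Run each named agent callable concurrently.

    `agents` maps a label (hypothetical, e.g. "s3_sync") to a zero-argument
    callable. Returns {label: return value, or the exception it raised}.
    """
    results = {}
    # One worker per agent; max(1, ...) keeps the pool valid for an empty dict.
    with ThreadPoolExecutor(max_workers=max(1, len(agents))) as pool:
        futures = {pool.submit(fn): name for name, fn in agents.items()}
        for future in as_completed(futures):
            name = futures[future]
            try:
                results[name] = future.result()
            except Exception as exc:
                # A failed agent is recorded instead of aborting the others.
                results[name] = exc
    return results
```

Recording exceptions per agent, rather than letting the first one propagate, means a failed Lambda export does not discard a completed S3 sync.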
Key Infrastructure Components Captured
S3 Bucket Inventory
The snapshot catalogued all 45 S3 buckets. The critical ones include:
- queenofsandiego.com: production site content
- queenofsandiego.com-staging: staging environment with current test changes
- sailjada.com: production booking site
- sailjada.com-staging: staging with pending updates
- salejada.com: secondary production site
- Specialized buckets for Lambda deployment packages, CloudFormation templates, and archive data
Each bucket was synced with versioning metadata preserved where applicable.
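A per-bucket sync loop with the N/total progress reporting described earlier might look like the sketch below. The function names and the injectable `run` parameter are assumptions added for illustration and testability; only the `aws s3 sync` command shape and the `./v1.0-snapshot/s3/` layout come from the snapshot itself.

```python
import subprocess


def sync_command(bucket, snapshot_root="./v1.0-snapshot"):
    """Build the `aws s3 sync` argv for one bucket."""
    return ["aws", "s3", "sync", f"s3://{bucket}", f"{snapshot_root}/s3/{bucket}/"]


def sync_buckets(buckets, run=subprocess.run):
    """Sync each bucket in turn, printing progress; return the buckets that failed."""
    failed = []
    for done, bucket in enumerate(buckets, start=1):
        completed = run(sync_command(bucket), check=False)
        if completed.returncode != 0:
            failed.append(bucket)
        print(f"{done}/{len(buckets)} completed")
    return failed
```

One caveat worth keeping in mind: `aws s3 sync` copies only the current version of each object, so noncurrent versions in a versioned bucket would need a separate export.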
CloudFront Distribution Architecture
All 66 CloudFront distributions were exported with:
- Origin configurations (S3 bucket origins, custom domain origins, API Gateway origins)
- Cache behaviors and path patterns
- Origin access identities (OAI) for S3 bucket authentication
- Distribution-specific headers and custom error responses
- Current invalidation status
Example: d1234example.cloudfront.net serving queenofsandiego.com with origin queenofsandiego.com.s3.amazonaws.com and cache TTL of 86400 seconds for static assets.
Lambda Function Baseline
All 21 Lambda functions were exported including:
- Function source code (downloaded from CloudFormation packages)
- Environment variables (Secrets Manager references kept as references, never resolved to secret values)
- Execution role ARN and inline policies
- Memory allocation, timeout, and concurrency settings
- Layer dependencies and versions
- VPC configuration (security groups, subnet mappings)
Critical functions included booking processors, webhook handlers, and automated deployment triggers.
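To make the baseline fields concrete, here is a small sketch that pulls them out of a parsed `aws lambda get-function` response. The response keys (`Configuration`, `Concurrency`, `VpcConfig`, and so on) follow the shape that call returns; the `lambda_baseline` function and its output field names are hypothetical.

```python
def lambda_baseline(response):
    """Reduce a parsed get-function response to the baseline fields listed above."""
    config = response.get("Configuration", {})
    vpc = config.get("VpcConfig", {})
    return {
        "name": config.get("FunctionName"),
        "role_arn": config.get("Role"),
        "memory_mb": config.get("MemorySize"),
        "timeout_s": config.get("Timeout"),
        # Present only when reserved concurrency is configured for the function.
        "reserved_concurrency": response.get("Concurrency", {}).get(
            "ReservedConcurrentExecutions"
        ),
        "layers": [layer["Arn"] for layer in config.get("Layers", [])],
        "subnet_ids": vpc.get("SubnetIds", []),
        "security_group_ids": vpc.get("SecurityGroupIds", []),
        # Stored verbatim: Secrets Manager references stay references.
        "environment": dict(config.get("Environment", {}).get("Variables", {})),
    }
```

The presigned `Code.Location` URL is deliberately left out of the baseline because it is short-lived; the package it points to has to be downloaded at export time.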
Google Apps Script Projects
GAS projects were pulled via clasp and stored with full project metadata:
- BookingAutomation.gs: main JADA booking workflow
- Rady Shell replacement script
- Rady Shell legacy script (for historical reference)
- EYD event management script
Each GAS project's .clasp.json settings file was preserved, so future clasp operations can reference the exact script ID.
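As a quick check that the preserved project IDs are recoverable, a sketch like the following walks a snapshot directory and reads the `scriptId` field from each `.clasp.json`. The directory layout (one folder per GAS project) and the function name are assumptions.

```python
import json
from pathlib import Path


def collect_script_ids(gas_root):
    """Map each project folder name to the scriptId in its .clasp.json."""
    ids = {}
    for settings in sorted(Path(gas_root).rglob(".clasp.json")):
        data = json.loads(settings.read_text(encoding="utf-8"))
        # Folder name is used as the project label (assumed layout).
        ids[settings.parent.name] = data.get("scriptId")
    return ids
```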
Technical Decisions & Rationale
Parallel Over Sequential
Rather than syncing buckets one-by-one (estimated 2-3 hours), we launched four agents in parallel, completing in ~40 minutes. This required careful AWS CLI session management to avoid credential conflicts.
Infrastructure-as-Data, Not Infrastructure-as-Code
The snapshot exports actual AWS resource configurations (not CloudFormation templates), ensuring exact point-in-time capture. This trades declarative reproducibility for guaranteed accuracy—critical when the current state may contain undocumented manual changes.
Excluding Sensitive Data
DynamoDB table schemas were captured, but not data (customer records, booking history). Lambda environment variables were exported, but Secrets Manager references were preserved as-is rather than resolving actual secret values.
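The schema-only capture can be illustrated as a filter over a parsed `aws dynamodb describe-table` response. That call returns table metadata rather than items, so the sketch mainly strips volatile statistics such as item counts. The key list and function name are assumptions about which fields count as "schema".

```python
# Fields treated as schema for this sketch (an assumed, not exhaustive, list).
SCHEMA_KEYS = (
    "TableName",
    "KeySchema",
    "AttributeDefinitions",
    "GlobalSecondaryIndexes",
    "LocalSecondaryIndexes",
    "BillingModeSummary",
    "ProvisionedThroughput",
    "StreamSpecification",
)


def table_schema(describe_table_response):
    """Keep only schema-defining fields from a describe-table response."""
    table = describe_table_response["Table"]
    return {key: table[key] for key in SCHEMA_KEYS if key in table}
```

Dropping fields like `ItemCount` and `TableSizeBytes` also keeps successive snapshots easy to diff, since only real schema changes show up.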
Local Tooling Inclusion
Often overlooked: deployment scripts, configuration files, and development utilities. Files like update_dashboard.py and the newly-created release.py are production dependencies that live outside version control. These were included in the snapshot.
Manifest & Verification
A comprehensive MANIFEST.md was generated documenting:
- Snapshot creation timestamp: 2026-05-