Building a Comprehensive Infrastructure Snapshot: Lessons from Multi-Region AWS Disaster Recovery

Distributed systems that span multiple AWS regions, S3 buckets, Lambda functions, and Google Apps Script projects carry a substantial risk of data loss and configuration drift. This post documents the technical approach taken to create a complete v1.0 snapshot of the JADA infrastructure, covering 45 S3 buckets, 21 Lambda functions, 66 CloudFront distributions, and four Google Apps Script projects across three production domains.

The Challenge: Distributed State Across Multiple Services

The JADA infrastructure consists of:

Three production domains: queenofsandiego.com, sailjada.com, and salejada.com
Cloud storage: 45 S3 buckets distributed across regions
Content delivery: 66 CloudFront distributions with origin configurations
Serverless compute: 21 Lambda functions with environment variables and deployment packages
DNS management: 16 Route53 hosted zones
Automation: 4 Google Apps Script projects (main JADA, Rady Shell replacement, Rady Shell legacy, EYD)
Local source: Tools, booking automation, code generation scripts across multiple repositories

The core issue: no single AWS API call captures the entire infrastructure state. Configuration lives in different services, each requiring separate export logic. Recovery from a major incident would require stitching together data from a dozen sources, and that assumes each source was backed up correctly in the first place.

Architecture: Parallel Multi-Agent Snapshot Strategy

Rather than sequentially exporting each service (which would take hours), we implemented a four-agent parallel approach:
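
The fan-out itself can be sketched in plain shell. This is a minimal sketch, not the orchestration actually used for the snapshot: run_parallel and the job names in the usage line are hypothetical, and each job is assumed to be a shell function or command that exits non-zero on failure.

```shell
# Sketch: start every job in the background, then wait for each one and
# report failures. Job names are assumed to contain no whitespace.
run_parallel() {
  local pids job entry pid name failed
  pids=""
  failed=0

  for job in "$@"; do
    "$job" &                      # launch the job in a background subshell
    pids="$pids $!:$job"          # remember "pid:name" for the wait loop
  done

  for entry in $pids; do
    pid="${entry%%:*}"
    name="${entry#*:}"
    if ! wait "$pid"; then        # wait returns that job's exit status
      echo "FAILED: $name" >&2
      failed=1
    fi
  done

  return "$failed"
}
```

A call such as run_parallel sync_s3 export_lambda export_aws_config backup_local_source (all four names hypothetical) would return non-zero if any one agent failed, so a partial snapshot is not mistaken for a complete one.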

Agent 1: S3 Bucket Synchronization

All 45 S3 buckets were synced locally using aws s3 sync, with transfer concurrency controlled through the CLI's S3 configuration. Key considerations:

Sync destination: /snapshot/v1.0/s3-buckets/ with one subdirectory per bucket name
Concurrency: aws s3 sync has no --parallel flag; concurrency is set with aws configure set default.s3.max_concurrent_requests 10 (10 is also the CLI default)
Command pattern: aws s3 sync s3://bucket-name /snapshot/v1.0/s3-buckets/bucket-name/
Progress tracking: Real-time monitoring showed 68 MB downloaded across 30+ buckets, with the remaining buckets still queued
Staging buckets: Identified dedicated staging buckets for QOS and sailjada, synced separately to preserve staging-specific content
Size optimization: Prioritized buckets by modification date; older buckets were queued for batch processing to avoid network saturation
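
The per-bucket sync can be wrapped in a loop over the account's bucket list. The sketch below is an illustration under stated assumptions: sync_all_buckets is a hypothetical name, the concurrency value of 10 is taken from the command pattern above, and the loop does not implement the prioritization or separate staging pass described in the list.

```shell
# Sketch: mirror every bucket into one subdirectory per bucket name.
sync_all_buckets() {
  local dest_root buckets bucket status
  dest_root="${1:-/snapshot/v1.0/s3-buckets}"
  status=0

  # aws s3 transfers are parallelised through this CLI setting.
  aws configure set default.s3.max_concurrent_requests 10

  # With --output text the bucket names come back whitespace-separated.
  buckets=$(aws s3api list-buckets --query 'Buckets[].Name' --output text) || return 1

  for bucket in $buckets; do
    mkdir -p "$dest_root/$bucket"
    if ! aws s3 sync "s3://$bucket" "$dest_root/$bucket/" --only-show-errors; then
      echo "sync failed: $bucket" >&2   # keep going, but remember the failure
      status=1
    fi
  done

  return "$status"
}
```

Because aws s3 sync only copies new or changed objects, rerunning the function after an interruption resumes from where the previous run stopped.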

Agent 2: Lambda Function Export

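Lambda functions were exported using the AWS CLI, capturing each function's code package and configuration (including environment variables):

aws lambda list-functions --region us-west-2 --output json > /snapshot/v1.0/lambda/functions-list.json

# For each function: download the deployment package from its presigned URL.
# The URL is short-lived (about 10 minutes), so fetch it immediately.
curl -sSf -o /snapshot/v1.0/lambda/FUNCTION_NAME.zip \
  "$(aws lambda get-function --function-name FUNCTION_NAME --region REGION \
       --query 'Code.Location' --output text)"

# Save the configuration. The raw output includes environment variable values.
aws lambda get-function-configuration --function-name FUNCTION_NAME \
  --region REGION --output json > /snapshot/v1.0/lambda/FUNCTION_NAME-config.json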
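
The "for each function" step can be expressed as a loop over the function names in one region. This is a sketch under assumptions: export_lambda_functions is a hypothetical name, it handles zip-packaged functions only (container-image functions have no downloadable package URL), and it stops at the first failure.

```shell
# Sketch: export code package and configuration for every function in a region.
export_lambda_functions() {
  local region out_dir names name url
  region="$1"
  out_dir="${2:-/snapshot/v1.0/lambda}"
  mkdir -p "$out_dir"

  names=$(aws lambda list-functions --region "$region" \
            --query 'Functions[].FunctionName' --output text) || return 1

  for name in $names; do
    # Code.Location is a short-lived presigned URL, so download right away.
    url=$(aws lambda get-function --function-name "$name" --region "$region" \
            --query 'Code.Location' --output text) || return 1
    curl -sSf -o "$out_dir/$name.zip" "$url" || return 1

    aws lambda get-function-configuration --function-name "$name" \
      --region "$region" --output json > "$out_dir/$name-config.json" || return 1
  done
}
```

For a multi-region account the function would be called once per region, ideally with a region-specific output directory so that functions sharing a name do not overwrite each other.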

Critical details captured per function:

Deployment package (function code and bundled dependencies)
Runtime version and handler configuration
Environment variables (with secret values removed before storage)
IAM execution role ARN
VPC configuration and security groups
Memory allocation and timeout settings
Layers and dependencies
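
The post does not describe how secrets were removed from the environment variables. One possible approach, sketched here as an assumption rather than the method actually used, is to let the CLI's --query option project only the fields listed above and record the names of environment variables without their values. The function name and the EnvironmentVariableNames output key are hypothetical.

```shell
# Sketch: save a function's configuration with environment variable NAMES only.
export_lambda_config_redacted() {
  local name region out_dir
  name="$1"
  region="$2"
  out_dir="${3:-/snapshot/v1.0/lambda}"
  mkdir -p "$out_dir"

  # JMESPath projection: keys() returns the variable names; the `{}` fallback
  # covers functions that define no environment variables at all.
  aws lambda get-function-configuration \
    --function-name "$name" --region "$region" --output json \
    --query '{
      FunctionName: FunctionName,
      Runtime: Runtime,
      Handler: Handler,
      Role: Role,
      VpcConfig: VpcConfig,
      MemorySize: MemorySize,
      Timeout: Timeout,
      Layers: Layers[].Arn,
      EnvironmentVariableNames: keys(Environment.Variables || `{}`)
    }' > "$out_dir/$name-config.json"
}
```

Dropping the values means the snapshot alone cannot restore them, so this only works if the real values live in a separate secret store.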

Agent 3: AWS Infrastructure Configuration Export

CloudFront, Route53, DynamoDB, SES, API Gateway, and IAM configurations were exported using each service's list-* commands (the examples below cover CloudFront, Route53, DynamoDB, and ACM):

# CloudFront distributions (all 66)
aws cloudfront list-distributions --output json > /snapshot/v1.0/aws-config/cloudfront-distributions.json

# Route53 hosted zones (16 zones)
aws route53 list-hosted-zones --output json > /snapshot/v1.0/aws-config/route53-zones.json

# For each hosted zone:
aws route53 list-resource-record-sets --hosted-zone-id ZONE_ID \
  --output json > /snapshot/v1.0/aws-config/zone-ZONE_ID-records.json

# DynamoDB tables
aws dynamodb list-tables --output json > /snapshot/v1.0/aws-config/dynamodb-tables.json

# ACM certificates (tracking expiration dates)
aws acm list-certificates --output json > /snapshot/v1.0/aws-config/acm-certificates.json
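
The "for each hosted zone" step can be looped as follows. This is a minimal sketch with a hypothetical function name; it relies on the fact that list-hosted-zones returns IDs in the form /hostedzone/ZONE_ID, so the prefix is stripped before the ID is used in a file name.

```shell
# Sketch: write one record-set file per Route53 hosted zone.
export_zone_records() {
  local out_dir zone_paths zone_path zone_id
  out_dir="${1:-/snapshot/v1.0/aws-config}"
  mkdir -p "$out_dir"

  zone_paths=$(aws route53 list-hosted-zones \
                 --query 'HostedZones[].Id' --output text) || return 1

  for zone_path in $zone_paths; do
    zone_id="${zone_path##*/}"    # "/hostedzone/Z0123EXAMPLE" -> "Z0123EXAMPLE"
    aws route53 list-resource-record-sets --hosted-zone-id "$zone_id" \
      --output json > "$out_dir/zone-$zone_id-records.json" || return 1
  done
}
```

The same pattern applies to the other list commands, which return summaries: for example, aws cloudfront get-distribution-config --id DISTRIBUTION_ID returns the full configuration for a single distribution, and aws dynamodb describe-table returns the schema for a single table.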

Why this is separate from the Lambda export: CloudFront and Route53 data is configuration-heavy but code-light, so it exports quickly. API rate limits are also enforced per service, so running these exports in parallel does not compete with the Lambda calls for the same quota.

Agent 4: Local Source Code and Configuration

Google Apps Script projects and local tool repositories were backed up using clasp and a filesystem copy:

# Pull from Google Apps Script projects
cd /Users/cb/Documents/repos/sites/queenofsandiego.com/
clasp pull

# Copy entire project trees
cp -r /Users/cb/Documents/repos/sites/queenofsandiego.com \
  /snapshot/v1.0/sites/queenofsandiego.com

cp -r /Users/cb/Documents/repos/tools \
  /snapshot/v1.0/local-tools
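
With four Apps Script projects, the pull-then-copy steps can be looped. The sketch below assumes each project directory contains the .clasp.json file that clasp creates, and that clasp is already logged in; backup_gas_projects and both arguments are hypothetical.

```shell
# Sketch: refresh every clasp project under a repo root, then copy it
# into the snapshot tree at the same relative path.
backup_gas_projects() {
  local repo_root dest_root manifest project_dir rel
  repo_root="$1"
  dest_root="$2"

  # A clasp project is identified by its .clasp.json file.
  find "$repo_root" -name .clasp.json ! -path '*/node_modules/*' |
  while IFS= read -r manifest; do
    project_dir=$(dirname "$manifest")
    rel="${project_dir#"$repo_root"}"
    rel="${rel#/}"

    # Pull the latest script source; still copy the local tree on failure.
    ( cd "$project_dir" && clasp pull ) || echo "clasp pull failed: $rel" >&2

    mkdir -p "$dest_root/$rel"
    cp -R "$project_dir/." "$dest_root/$rel/"   # "/." includes dotfiles
  done
}
```

Copying after the pull means the snapshot holds the script source as it exists in Google's editor, not only what happened to be on disk.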

Key GAS files and projects captured:

BookingAutomation.gs: primary booking system logic
Code.gs: shared utilities and helpers
Rady Shell replacement and legacy GAS projects
EYD automation scripts

Lightsail Instance Snapshot

A Lightsail instance snapshot named jada-agent-v1.0-20260509 was initiated to capture the agent infrastructure itself. This provides a point-in-time image of the compute environment running the export operations, allowing rapid restoration if the agent infrastructure were compromised.
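
Creating such a snapshot from the CLI might look like the sketch below. The instance name is not given in the post, so the jada-agent value used in the comments is an assumption inferred from the snapshot name; snapshot_name and create_agent_snapshot are hypothetical helpers.

```shell
# Sketch: build a dated snapshot name and request a Lightsail instance snapshot.
snapshot_name() {
  # snapshot_name jada-agent v1.0 20260509 -> jada-agent-v1.0-20260509
  # The date defaults to today when the third argument is omitted.
  printf '%s-%s-%s\n' "$1" "$2" "${3:-$(date +%Y%m%d)}"
}

create_agent_snapshot() {
  local instance_name snap
  instance_name="$1"              # assumed instance name, e.g. jada-agent
  snap=$(snapshot_name "$instance_name" "$2")

  aws lightsail create-instance-snapshot \
    --instance-name "$instance_name" \
    --instance-snapshot-name "$snap"
}
```

The call returns before the snapshot is finished; aws lightsail get-instance-snapshot --instance-snapshot-name NAME reports its state, which should be checked before the snapshot is relied on.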

Directory Structure for v1.0 Snapshot

/snapshot/v1.0/
├── s3-buckets/
│   ├── jada-assets/
│   ├── queenofsandiego-prod/
│   ├── queenofsandiego-staging/
│   ├── sailjada-prod/
│   ├── sailjada-staging/
│   └── ... (40 additional buckets)
├── lambda/
│   ├── functions-list.json
│   ├── FUNCTION_NAME.zip
│   ├── FUNCTION_NAME-config.json
│   └── ... (21 functions)
├── aws-config/
│   ├── cloudfront-distributions.json
│   ├── route53-zones.json
│   ├── zone-ZONE_ID-records.json
│   ├── dynamodb-tables.json
│   ├── acm-certificates.json
│   └── api-gateway-apis.json
├── sites/
│   ├── queenofsandiego.com/
│   ├── sailjada.com/
│   └── salejada