Building a Production Snapshot Infrastructure: Comprehensive AWS State Capture for Three E-Commerce Sites

When infrastructure is spread across multiple production sites, a single misconfiguration or bad deployment can cascade into hours of recovery work. This post details how a comprehensive v1.0 snapshot of the JADA infrastructure was created, covering three e-commerce properties (queenofsandiego.com, sailjada.com, salejada.com), their supporting AWS services, and the associated automation code.

The Problem: No Rollback Point

Without a documented, versioned snapshot of infrastructure state at a known-good point, recovery from deployment errors becomes reactive rather than proactive. The goal was to capture a complete point-in-time record of:

All S3 bucket contents and configurations (46 buckets total)
CloudFront distribution settings and cache behaviors (66 distributions)
Lambda function code, environment variables, and IAM roles (21 functions)
Route53 DNS records and hosted zones (16 zones)
Google Apps Script projects powering backend automation (4 GAS projects)
Lightsail instance snapshots for compute resources
RDS, DynamoDB, and database configurations
SES, API Gateway, and integration service configs
Local development tooling and documentation

Technical Architecture: Parallel Distributed Snapshot

Exporting each service sequentially would have taken hours, so the snapshot process ran four agents concurrently:

Agent 1: S3 Sync — AWS CLI recursive downloads of all 45 JADA-related buckets
Agent 2: Lambda Export — Code extraction, environment variable capture, and IAM policy documentation for all 21 functions
Agent 3: AWS Config Export — CloudFront distributions, Route53 zones, DynamoDB tables, SES configuration, ACM certificates
Agent 4: Local Asset Capture — Google Apps Script projects via clasp CLI, development files, LaunchAgent configurations

This parallelization reduced wall-clock time from an estimated 3+ hours (sequential) to approximately 45 minutes (concurrent).
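
The fan-out described above can be sketched with Python's standard thread pool. This is a minimal illustration rather than the tooling actually used: the agent names and the placeholder callables are hypothetical stand-ins for the four export jobs, and a failing agent is recorded instead of aborting the others.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_agents(agents):
    """Run every named agent callable concurrently.

    Returns {name: return value}, or {name: exception} for an agent that failed,
    so one broken export does not hide the results of the others.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max(1, len(agents))) as pool:
        futures = {pool.submit(fn): name for name, fn in agents.items()}
        for future in as_completed(futures):
            name = futures[future]
            try:
                results[name] = future.result()
            except Exception as exc:
                results[name] = exc
    return results


if __name__ == "__main__":
    # Hypothetical placeholders; real agents would call the AWS CLI, clasp, etc.
    agents = {
        "s3-sync": lambda: "buckets synced",
        "lambda-export": lambda: "functions exported",
        "aws-config-export": lambda: "configs exported",
        "local-assets": lambda: "local files captured",
    }
    for name, outcome in sorted(run_agents(agents).items()):
        print(f"{name}: {outcome}")
```

Threads are sufficient here because each agent would spend its time waiting on network or subprocess I/O rather than on CPU work.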

Infrastructure Inventory Captured

AWS Storage & CDN

The snapshot documented all S3 bucket configurations:


# Example: bucket naming conventions used
- qos-prod-site (Queen of San Diego production content)
- qos-staging-site
- sailjada-prod, sailjada-staging
- salejada-prod, salejada-staging
- qos-lambda-layers (shared Lambda dependencies)
- jada-backups-archive
- jada-admin-uploads
- [36 additional specialized buckets for assets, logs, archives]
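
As a rough sketch of how Agent 1 might lay out its work, the helper below builds one `aws s3 sync` command per bucket, targeting the snapshot's `s3-buckets/` directory. The function name and the choice of `--only-show-errors` are assumptions for illustration; the commands are only constructed, not executed. Note that `aws s3 sync` copies objects only, so bucket-level settings (policies, versioning, and so on) would need separate `aws s3api` calls.

```python
from pathlib import PurePosixPath


def sync_command(bucket, snapshot_root="v1.0"):
    """Build the AWS CLI argument list that mirrors one bucket into the snapshot tree."""
    dest = PurePosixPath(snapshot_root) / "s3-buckets" / bucket
    return ["aws", "s3", "sync", f"s3://{bucket}", str(dest), "--only-show-errors"]


if __name__ == "__main__":
    # Bucket names taken from the naming-convention list above.
    for bucket in ["qos-prod-site", "sailjada-prod", "salejada-prod"]:
        print(" ".join(sync_command(bucket)))
```

Each argument list can be handed to `subprocess.run(cmd, check=True)` once credentials and the target directory are in place.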

CloudFront distributions were exported with full cache behavior configurations, origin settings, and WAF rules. Route53 hosted zones captured DNS records, health checks, and failover configurations across all three domains and their subdomains (admin, api, staging variants, etc.).

Compute & Functions

Lambda function snapshots included:

Source code (downloaded via AWS Lambda console export)
Environment variable names and structure (values redacted for security)
IAM execution role permissions
Memory allocation, timeout settings, and concurrency limits
VPC configuration and security group associations
Layer dependencies and their versions

Lightsail instance snapshots were initiated for persistent compute resources, capturing disk state, application configurations, and installed packages.

Automation & Code

Four Google Apps Script projects were pulled using the clasp CLI:


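# Run each command from that project's own directory (e.g. gas-projects/jada-main/).
# clasp clone takes the script ID and writes .clasp.json; after that, a bare
# `clasp pull` in the same directory refreshes the local copy.

# Main JADA GAS project
clasp clone [project-id]

# Rady Shell replacement GAS
clasp clone [project-id]

# Rady Shell old version (maintained for reference)
clasp clone [project-id]

# EYD (Elizabeth Y. Davis) GAS project
clasp clone [project-id]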

These projects handle order processing, customer communication, inventory management, and data synchronization between Shopify, databases, and email services.

Directory Structure & Organization

The v1.0 snapshot was organized hierarchically for easy navigation and future version control:


v1.0/
├── s3-buckets/
│   ├── qos-prod-site/
│   ├── sailjada-prod/
│   ├── salejada-prod/
│   ├── [43 additional buckets]
│   └── MANIFEST.md (file counts, sync timestamps)
├── cloudfront/
│   ├── distributions.json (all 66 distributions)
│   ├── cache-behaviors.json
│   └── origins.json
├── route53/
│   ├── hosted-zones.json
│   └── dns-records/ (per domain)
├── lambda/
│   ├── functions/ (code + config for each of 21)
│   ├── layers/
│   └── permissions.json
├── gas-projects/
│   ├── jada-main/
│   ├── rady-replacement/
│   ├── rady-old/
│   └── eyd/
├── lightsail/
│   ├── snapshots/ (jada-agent-v1.0-20260509)
│   └── instance-configs.json
├── databases/
│   ├── dynamodb-schemas/
│   ├── rds-configs.json
│   └── table-exports/
├── integrations/
│   ├── ses-config.json
│   ├── api-gateway-apis.json
│   └── webhooks.json
└── MANIFEST.md (master inventory)
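
One way a master `MANIFEST.md` like the one in the tree above could be produced is by walking the snapshot directory and counting files per top-level folder. The sketch below is an assumption about the manifest's shape, not the format actually used; `count_files` and `render_manifest` are hypothetical helper names.

```python
import os
from datetime import datetime, timezone
from pathlib import Path


def count_files(root):
    """Return {top-level directory name: number of files anywhere beneath it}."""
    counts = {}
    for entry in sorted(Path(root).iterdir()):
        if entry.is_dir():
            counts[entry.name] = sum(len(files) for _, _, files in os.walk(entry))
    return counts


def render_manifest(counts, title="v1.0 snapshot"):
    """Render the counts as a small Markdown inventory with a UTC timestamp."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    lines = [f"# {title}", f"Generated: {stamp}", ""]
    lines += [f"- {name}/: {n} files" for name, n in counts.items()]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Assumes the snapshot lives in ./v1.0 relative to the working directory.
    if Path("v1.0").is_dir():
        print(render_manifest(count_files("v1.0")))
```

Regenerating the manifest after each snapshot gives later versions a simple baseline to diff against.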

Key Technical Decisions

1. Parallel Agents Over Sequential Exports

Running four independent agents let services with no dependencies on one another export simultaneously. Lambda code doesn't depend on S3 sync completion, so why wait?

2. Redacted Environment Variables

Environment variable names and structure were captured to document dependencies, but the values themselves (API keys, database passwords, credentials) were excluded from the snapshot. The actual values were kept in a separate encrypted file under strict access control.
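
A minimal sketch of that redaction step, assuming each function's configuration is available as the dictionary returned by `aws lambda get-function-configuration` (where variables live under `Environment.Variables`). The helper name and the placeholder string are illustrative choices.

```python
import copy

REDACTED = "<redacted>"


def redact_environment(function_config):
    """Return a copy of a Lambda configuration with every env var value replaced.

    Variable names are preserved so dependencies stay visible; the input dict
    is left untouched.
    """
    safe = copy.deepcopy(function_config)
    variables = safe.get("Environment", {}).get("Variables", {})
    for key in variables:
        variables[key] = REDACTED
    return safe


if __name__ == "__main__":
    # Hypothetical function configuration for illustration only.
    config = {
        "FunctionName": "example-function",
        "MemorySize": 256,
        "Environment": {"Variables": {"API_KEY": "not-a-real-key"}},
    }
    print(redact_environment(config))
```

Redacting before anything is written to disk keeps secrets out of the snapshot directory entirely, rather than relying on a later cleanup pass.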

3. Infrastructure as Documentation

Rather than maintaining separate documentation, the snapshot itself became the source of truth. CloudFront distribution IDs, Lambda function names, S3 bucket structures, and Route53 records are all captured in queryable formats (JSON, markdown manifests).
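
To illustrate what "queryable" can mean in practice, the sketch below looks up CloudFront distribution IDs by domain alias. It assumes `distributions.json` holds the unmodified output of `aws cloudfront list-distributions` (a `DistributionList` with an `Items` array); the IDs in the sample data are made up.

```python
def find_distributions(listing, alias):
    """Return the IDs of all distributions that serve the given alias (CNAME)."""
    items = listing.get("DistributionList", {}).get("Items") or []
    return [
        dist["Id"]
        for dist in items
        if alias in (dist.get("Aliases", {}).get("Items") or [])
    ]


if __name__ == "__main__":
    # Hypothetical sample in the shape of `aws cloudfront list-distributions` output.
    sample = {
        "DistributionList": {
            "Items": [
                {"Id": "E1EXAMPLE", "Aliases": {"Quantity": 1, "Items": ["sailjada.com"]}},
                {"Id": "E2EXAMPLE", "Aliases": {"Quantity": 0}},
            ]
        }
    }
    print(find_distributions(sample, "sailjada.com"))
```

In real use the `listing` argument would come from `json.load` on the snapshot's `cloudfront/distributions.json`.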

4. GAS Projects via Clasp

Using Google's clasp CLI meant GAS source code was versioned alongside infrastructure, enabling side-by-side comparison of automation changes. Each GAS project's appsscript.json manifest was captured to document library dependencies and OAuth scope requirements.
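
Because `appsscript.json` is plain JSON, the library dependencies and OAuth scopes it declares can be summarized with a few lines of code. The sketch below assumes the standard manifest layout (`dependencies.libraries` and `oauthScopes`); the sample library ID and scope list are placeholders.

```python
import json


def summarize_manifest(manifest_text):
    """Extract OAuth scopes and library versions from an appsscript.json document."""
    manifest = json.loads(manifest_text)
    libraries = manifest.get("dependencies", {}).get("libraries", [])
    return {
        "scopes": sorted(manifest.get("oauthScopes", [])),
        "libraries": {lib["userSymbol"]: lib["version"] for lib in libraries},
    }


if __name__ == "__main__":
    # Hypothetical manifest content for illustration only.
    sample = json.dumps({
        "timeZone": "America/Los_Angeles",
        "dependencies": {
            "libraries": [
                {"userSymbol": "ExampleLib", "libraryId": "placeholder-id", "version": "3"}
            ]
        },
        "oauthScopes": ["https://www.googleapis.com/auth/spreadsheets"],
    })
    print(summarize_manifest(sample))
```

Running the same summary over two snapshot versions makes scope or library changes easy to spot in a diff.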

Validation & Verification

Post-snapshot, critical validations were performed:

S3 file counts — Production vs. staging bucket parity confirmed (exact file counts per bucket documented)
Lambda function count — All 21 functions accounted for with code size and memory configuration validated
Route53 records — DNS record counts per zone verified; TTL values captured for failover scenarios
GAS project consistency — Each project's script.json and code files