Building a Comprehensive Infrastructure Snapshot: Lessons from a Multi-Cloud Staging Environment

When working with distributed infrastructure spanning multiple cloud providers, services, and local systems, the ability to capture a complete point-in-time snapshot becomes critical. This post details the technical approach we used to create a v1.0 snapshot of a complex JADA ecosystem comprising three production websites, 45+ S3 buckets, 66 CloudFront distributions, 21 Lambda functions, and multiple Google Apps Script projects.

The Challenge: Capturing Distributed State

The JADA infrastructure spans several layers:

AWS Services: S3 buckets, CloudFront distributions, Lambda functions, Route53 hosted zones, DynamoDB tables, API Gateway endpoints, ACM certificates, and SES configurations
Google Cloud: Four distinct Google Apps Script projects for different business functions
AWS Lightsail: Virtual server instances requiring VM-level snapshots
Local Development: Repository code, tooling scripts, configuration files, and documentation

A traditional point-in-time backup of any single layer would be insufficient. We needed snapshots of every layer, all taken within the same narrow time window, so that the captured state would be consistent across layers.

Technical Architecture: Parallel Multi-Agent Approach

Rather than sequential operations that would require hours, we implemented a four-agent parallel architecture:

Agent 1: S3 Sync Operations
├── sync queenofsandiego.com buckets
├── sync sailjada.com buckets  
├── sync salejada.com buckets
├── sync auxiliary buckets
└── parallel batch processing (A/B/C splits)

Agent 2: Lambda & Code Export
├── export all 21 Lambda function code zips
├── capture environment variables
├── extract IAM role policies
└── document configuration and triggers

Agent 3: AWS Configuration Export
├── describe all 66 CloudFront distributions
├── export Route53 hosted zone configurations
├── capture 14 DynamoDB table schemas
├── extract API Gateway definitions
└── document SES configuration and sending limits

Agent 4: Local System Snapshots
├── clasp pull all GAS projects
├── sync repository directories
├── capture development tooling
└── archive documentation and handoffs
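
The four-agent layout above could be sketched in Python roughly as follows. This is a minimal illustration, not the tooling we ran: the four agent functions are hypothetical placeholders that only return a label, and run_agents and AGENTS are names introduced here for the example.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict


# Hypothetical stand-ins for the four agents. Real versions would call the
# AWS CLI, clasp, and local sync tools instead of returning a label.
def sync_s3_buckets() -> str:
    return "s3 sync finished"


def export_lambda_code() -> str:
    return "lambda export finished"


def export_aws_config() -> str:
    return "aws config export finished"


def snapshot_local_systems() -> str:
    return "local snapshot finished"


def run_agents(agents: Dict[str, Callable[[], str]]) -> Dict[str, str]:
    """Run every agent concurrently and record a result or an error for each."""
    results: Dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max(1, len(agents))) as pool:
        futures = {pool.submit(fn): name for name, fn in agents.items()}
        for future in as_completed(futures):
            name = futures[future]
            try:
                results[name] = future.result()
            except Exception as exc:
                # A failing agent is reported but does not stop the others.
                results[name] = f"FAILED: {exc}"
    return results


AGENTS = {
    "agent-1-s3": sync_s3_buckets,
    "agent-2-lambda": export_lambda_code,
    "agent-3-aws-config": export_aws_config,
    "agent-4-local": snapshot_local_systems,
}

if __name__ == "__main__":
    for name, outcome in sorted(run_agents(AGENTS).items()):
        print(f"{name}: {outcome}")
```

Threads are enough for this shape of work because each agent mostly waits on network or disk I/O. Recording failures per agent keeps one broken export from hiding the results of the other three.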

S3 and CloudFront Infrastructure Details

The snapshot captured the production S3 bucket structure for three primary sites. For each domain, we created corresponding staging buckets with prefixes:

queenofsandiego.com → synced to staging with CloudFront distribution d***staging-qos
sailjada.com → synced to staging with corresponding distribution
salejada.com → synced to staging with corresponding distribution

Each CloudFront distribution required invalidation after staging updates to ensure cache coherency. The invalidation pattern used was:

aws cloudfront create-invalidation \
  --distribution-id D*** \
  --paths "/*" \
  --query 'Invalidation.Id' \
  --output text

Because staging and production sit behind separate distribution IDs, invalidating a staging distribution refreshes the staging cache without touching what production serves.
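
For scripts that prefer Python over the CLI, the same invalidation could be issued through boto3. The sketch below assumes a boto3 CloudFront client is passed in; invalidate_all and the "snapshot-" caller reference prefix are names chosen for this example, not part of our tooling.

```python
import time
from typing import Any, Sequence


def invalidate_all(cloudfront: Any, distribution_id: str,
                   paths: Sequence[str] = ("/*",)) -> str:
    """Create a CloudFront invalidation and return its ID.

    `cloudfront` is expected to be a boto3 CloudFront client, for example
    boto3.client("cloudfront"). It is injected so the function can be
    exercised with a stub instead of real AWS credentials.
    """
    response = cloudfront.create_invalidation(
        DistributionId=distribution_id,
        InvalidationBatch={
            "Paths": {"Quantity": len(paths), "Items": list(paths)},
            # CallerReference must be unique per request; a nanosecond
            # timestamp is one simple way to get that (an assumption here).
            "CallerReference": f"snapshot-{time.time_ns()}",
        },
    )
    return response["Invalidation"]["Id"]
```

The return value corresponds to what --query 'Invalidation.Id' extracts in the CLI form.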

Lambda Function Inventory and Deployment State

We captured all 21 Lambda functions with:

Source code exported via aws lambda get-function, which returns a presigned URL for downloading each deployment package
Environment variable configurations preserved without exposing values
IAM role associations documented for permission matrix reconstruction
Trigger configurations (API Gateway, S3 events, scheduled rules) recorded for infrastructure-as-code regeneration
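
One way to keep environment variable names while leaving their values out of the snapshot is to mask them before writing the configuration to disk. The sketch below assumes the input has the shape returned by aws lambda get-function-configuration; redact_environment and the "<redacted>" marker are illustrative choices, not our actual export code.

```python
from typing import Any, Dict


def redact_environment(function_config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a Lambda configuration with env var values masked."""
    redacted = dict(function_config)
    environment = function_config.get("Environment")
    if environment and "Variables" in environment:
        masked = dict(environment)
        # Keep the variable names so the configuration can be rebuilt later,
        # but never write the actual values into the snapshot.
        masked["Variables"] = {name: "<redacted>" for name in environment["Variables"]}
        redacted["Environment"] = masked
    return redacted
```

The original dictionary is left untouched, so the unmasked values can still be used in memory if a later step needs them.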

The snapshot included the update_dashboard.py deployment utility, which orchestrates Lambda function updates across the JADA ecosystem. This tool was modified during the staging workflow to support better error reporting and batch operations.

Google Apps Script Projects: Clasp Integration

Four GAS projects were pulled via clasp:

Main JADA GAS project (core automation)
Rady Shell replacement GAS
Rady Shell legacy/old GAS
EYD GAS project (event-specific automation)

Each project was exported to /Users/cb/Documents/repos/memory/snapshot-v1.0/gas/ with subdirectories preserving project structure. The clasp workflow used:

clasp pull [project-id] --rootDir /snapshot/path/project-name

This captured not only source code but also .clasp.json configuration files containing project script IDs for future redeploy scenarios.
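
Since each exported project carries its own .clasp.json, the script IDs can be collected into a small inventory for later redeploys. This is a sketch under the assumption that every project lives in its own subdirectory of the gas/ folder; collect_script_ids is a hypothetical helper name.

```python
import json
from pathlib import Path
from typing import Dict


def collect_script_ids(gas_root: Path) -> Dict[str, str]:
    """Map each project directory name to the scriptId in its .clasp.json."""
    script_ids: Dict[str, str] = {}
    for config in sorted(gas_root.glob("*/.clasp.json")):
        data = json.loads(config.read_text(encoding="utf-8"))
        script_ids[config.parent.name] = data["scriptId"]
    return script_ids
```

Directories without a .clasp.json are skipped, which makes a missing entry in the result an easy signal that a pull did not complete.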

Critical Staging Workflow Decisions

During snapshot creation, we encountered and documented three critical staging synchronization issues:

1. Font Rendering in Staging

The staging environment showed letter-spacing inconsistencies in brand headers. We identified the issue in the brand CSS styles and modified the letter-spacing property while preserving the original text-transform declarations. The fix was applied to the staging index files and validated before CloudFront invalidation.

2. Product Pricing References

The Bob Dylan product page contained hardcoded price references that required staging-specific values. We downloaded the production version from the bobdylan bucket, identified all $225 references, updated them for staging, and deployed to the staging path with CloudFront cache invalidation.
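
Locating every hardcoded price before editing can be done with a short scan of the downloaded HTML. The helper below is a sketch with an invented name, find_price_references. It reports line numbers only and leaves the replacement value to the caller, since the staging value is not something the scan can know.

```python
import re
from typing import List, Tuple


def find_price_references(html: str, price: str = "$225") -> List[Tuple[int, str]]:
    """Return (line number, stripped line) for each line that mentions `price`."""
    # re.escape handles the dollar sign; the negative lookahead skips longer
    # amounts such as "$2250" that merely start with the same digits.
    pattern = re.compile(re.escape(price) + r"(?!\d)")
    hits: List[Tuple[int, str]] = []
    for number, line in enumerate(html.splitlines(), start=1):
        if pattern.search(line):
            hits.append((number, line.strip()))
    return hits
```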

3. Navigation and Event Page Consistency

The events page required synchronization between production and staging, particularly for the "James Taylor" and "All Events" navigation elements. We listed the staging objects through the S3 API so they could be compared against production before syncing:

aws s3api list-objects-v2 \
  --bucket sailjada-staging \
  --prefix events/ \
  --output table
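
Comparing two listings by hand gets tedious past a few objects, so the comparison can be scripted. The sketch below assumes each input is the "Contents" list from a list-objects-v2 response for the same prefix; diff_listings and index_by_key are names introduced for this example. ETags are used as a cheap change signal. They can also differ for identical content uploaded in different ways (multipart uploads, for instance), so a "changed" entry means "worth checking" rather than "definitely different".

```python
from typing import Any, Dict, Iterable, List


def index_by_key(objects: Iterable[Dict[str, Any]]) -> Dict[str, str]:
    """Map each object key to its ETag."""
    return {obj["Key"]: obj["ETag"] for obj in objects}


def diff_listings(production: Iterable[Dict[str, Any]],
                  staging: Iterable[Dict[str, Any]]) -> Dict[str, List[str]]:
    """Group keys by how staging differs from production."""
    prod = index_by_key(production)
    stage = index_by_key(staging)
    return {
        "missing_in_staging": sorted(prod.keys() - stage.keys()),
        "extra_in_staging": sorted(stage.keys() - prod.keys()),
        "changed": sorted(
            key for key in prod.keys() & stage.keys() if prod[key] != stage[key]
        ),
    }
```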

Infrastructure Snapshot Storage

The complete v1.0 snapshot was organized into the following directory structure:

/snapshot-v1.0/
├── MANIFEST.md              # Complete inventory and checksum index
├── s3-buckets/              # All 45 synced buckets (68MB+)
├── lambda/                  # 21 function codes + configurations
├── cloudfront/              # 66 distribution definitions
├── route53/                 # 16 hosted zone exports
├── dynamodb/                # 14 table schemas and backups
├── gas/                     # 4 GAS projects with source
├── lightsail/               # VM snapshot identifier
├── local-repos/             # Development repositories
└── tools/                   # Deployment and utility scripts

A MANIFEST.md file was generated documenting every resource, file count, and checksum for verification purposes.
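
A manifest of this kind can be produced by walking the snapshot directory and hashing each file. The sketch below is one possible approach, not the generator we used: SHA-256 and the line layout are assumptions (the snapshot's actual checksum algorithm and manifest format are not specified here), and build_manifest, render_manifest, and sha256_of are illustrative names.

```python
import hashlib
from pathlib import Path
from typing import List, Tuple


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> List[Tuple[str, int, str]]:
    """Return (relative path, size in bytes, sha256) for every file under root."""
    entries: List[Tuple[str, int, str]] = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = path.relative_to(root).as_posix()
            entries.append((relative, path.stat().st_size, sha256_of(path)))
    return entries


def render_manifest(entries: List[Tuple[str, int, str]]) -> str:
    """Format entries as plain text: a file count, then one line per file."""
    lines = [f"Files: {len(entries)}", ""]
    lines += [f"{digest}  {size:>10}  {name}" for name, size, digest in entries]
    return "\n".join(lines) + "\n"
```

Re-running build_manifest against a restored copy and comparing the entries gives a quick integrity check.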

Lessons Learned: Why Parallel Agents Mattered

Sequential operations would have taken 4-6 hours. The parallel four-agent approach reduced this to approximately 30 minutes. More importantly, it brought the snapshot much closer to point-in-time consistency: all components were captured within a narrow time window, which avoids the state drift that often creeps in during long sequential backup operations.

What's Next

With the v1.0 snapshot complete, future work includes:

Automated daily snapshots using the same parallel methodology
Differential backup system to reduce storage and bandwidth
Infrastructure-as-Code generation