Automating Multi-Site Infrastructure Monitoring and GA4 Data Pipeline Recovery
This session involved several parallel streams of work: diagnosing and repairing a broken Google Analytics data pipeline, establishing automated remote access to a distributed daemon orchestrator, refactoring domain naming conventions across multiple hosted properties, and implementing frontend instrumentation fixes. Here's a granular breakdown of what was accomplished and why.
Remote Infrastructure Diagnostics via AWS Lightsail API
The initial task was to verify the health of the jada-agent.service daemon running on a Lightsail instance (34.239.233.28). Rather than maintaining local SSH keys, we used the AWS Lightsail temporary credential API:
aws lightsail get-instance-access-details \
--instance-name [instance-name] \
--region us-east-1
Why this approach? Long-lived SSH private keys stored on developer machines are an operational risk. Requesting time-limited credentials from the Lightsail API avoids key sprawl while keeping access auditable. The temporary credentials were written to a temporary file, used for the SSH connection, and deleted immediately afterward.
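The write, connect, delete sequence can be sketched as a small context manager. This is a minimal illustration, not the script used in the session: the function name temporary_ssh_credentials is hypothetical, and it assumes the caller has already parsed the "accessDetails" object (with its privateKey, certKey, username, and ipAddress fields) out of the JSON printed by the command above.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def temporary_ssh_credentials(access_details):
    """Write short-lived SSH credentials to disk, yield an ssh command, then delete them.

    `access_details` is assumed to be the "accessDetails" object from the
    JSON output of `aws lightsail get-instance-access-details`.
    """
    workdir = tempfile.mkdtemp(prefix="lightsail-ssh-")
    key_path = os.path.join(workdir, "id_temp")
    # OpenSSH looks for a certificate named <key>-cert.pub next to the key.
    cert_path = key_path + "-cert.pub"
    try:
        for path, content in ((key_path, access_details["privateKey"]),
                              (cert_path, access_details["certKey"])):
            # Create each file with owner-only permissions from the start.
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
            with os.fdopen(fd, "w") as handle:
                handle.write(content)
        target = f'{access_details["username"]}@{access_details["ipAddress"]}'
        yield ["ssh", "-i", key_path, "-o", "IdentitiesOnly=yes", target]
    finally:
        # Remove the key material even if the SSH step raised an exception.
        for path in (key_path, cert_path):
            if os.path.exists(path):
                os.remove(path)
        os.rmdir(workdir)
```

The yielded list can be passed to subprocess.run inside the with block; once the block exits, the key and certificate no longer exist on disk.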
The daemon health check revealed:
- jada-agent.service running continuously for 3 days with zero service restarts
- Load average 0.00 during idle periods; CPU utilization 0.65% average, well within normal bounds for a 60-second polling loop
- Memory consumption 144MB / 914MB available (15.8% utilization)
- Disk usage 6.2GB / 39GB (17% used) — adequate headroom for logs and task queuing
- All status checks passing over the last 2 hours
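A check like this can be scripted rather than read by eye. The sketch below is one possible approach, not the tooling used in the session: it assumes the unit state is read with systemctl show jada-agent.service --property=ActiveState,SubState,NRestarts on a systemd version that exposes NRestarts, and all function names are hypothetical.

```python
def parse_systemctl_show(output):
    """Turn the KEY=value lines printed by `systemctl show` into a dict."""
    props = {}
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def is_healthy(props):
    """Healthy here means the unit is active and has not been auto-restarted."""
    return props.get("ActiveState") == "active" and props.get("NRestarts") == "0"


def utilization_percent(used, total):
    """Return used/total as a percentage rounded to one decimal place."""
    return round(100.0 * used / total, 1)


if __name__ == "__main__":
    sample = "ActiveState=active\nSubState=running\nNRestarts=0\n"
    print(is_healthy(parse_systemctl_show(sample)))
    # Memory figures from the health check: 144MB used of 914MB.
    print(utilization_percent(144, 914))
```

With the memory figures reported above, utilization_percent(144, 914) gives 15.8, matching the value in the list.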
Session activity pattern: Three sessions ran in the 00:00–00:05 UTC window. Sessions 1 and 3 hit the 30-turn Claude API limit (exit code 1), which is logged as an error but does not crash the daemon. Session 2 completed successfully and created actionable tasks for manual follow-up. This is expected behavior under heavy task load.
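The "log the failure, keep the daemon alive" behavior can be illustrated with a short loop. This is a sketch of the general pattern only; the daemon's real session runner is not shown in this write-up, and run_sessions is a hypothetical name.

```python
import subprocess
import sys


def run_sessions(commands):
    """Run each session command in turn, recording failures without stopping."""
    results = []
    for index, command in enumerate(commands, start=1):
        completed = subprocess.run(command, capture_output=True, text=True)
        status = "ok" if completed.returncode == 0 else "error"
        if status == "error":
            # Log and continue: one failed session must not stop the loop.
            print(f"session {index} exited with code {completed.returncode}",
                  file=sys.stderr)
        results.append((index, completed.returncode, status))
    return results
```

Because a non-zero exit code is recorded instead of raised, a session that runs out of turns leaves later sessions and the next polling cycle unaffected.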
Identifying and Diagnosing the Google OAuth Token Failure
The daemon health logs revealed a persistent error in the port_sheet_sync module:
[port-sheet] token error: HTTP Error 400: Bad Request
This error occurred every 30 minutes across all log entries from the afternoon onwards. Investigation showed the root cause: the Google OAuth token stored for port_sheet_sync.py had expired or been revoked, breaking the synchronization pipeline to Google Sheets.
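A recurring error like this is easier to confirm with a quick tally over the log than by scrolling. The sketch below assumes only the message format shown above; any timestamp prefix on the line is ignored, and summarize_token_errors is a hypothetical helper.

```python
import re

# Matches the message shown above; anything before it on the line is ignored.
TOKEN_ERROR = re.compile(r"\[port-sheet\] token error: HTTP Error (\d{3}): (.+)")


def summarize_token_errors(log_lines):
    """Count port-sheet token errors per HTTP status code."""
    counts = {}
    for line in log_lines:
        match = TOKEN_ERROR.search(line)
        if match:
            status = int(match.group(1))
            counts[status] = counts.get(status, 0) + 1
    return counts
```

A result where every hit carries the same status code, spaced at the sync interval, points to a credential problem rather than intermittent network trouble.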
Why this matters: port_sheet_sync.py is responsible for pushing task completion status and metrics to a shared Google Sheet. A broken OAuth token means real-time visibility into daemon task processing is lost, and any downstream systems depending on that Sheet data become stale.
Recovery path: Rather than attempting a token refresh (which cannot succeed if the grant has been revoked), the proper solution is to re-authenticate via auth_ga.py, which handles the full OAuth 2.0 flow:
python3 ~/Documents/repos/tools/auth_ga.py --account [service-account@domain.com]
This will prompt for browser-based Google Sign-In, exchange the authorization code for a fresh token, and persist it securely for the next sync cycle.
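Whether a refresh is worth retrying can be decided from the token endpoint's response. The OAuth 2.0 specification uses the error code invalid_grant, returned with HTTP 400, for an expired or revoked grant. The log above shows only the status line, so the sketch below assumes the sync script also has access to the JSON response body; needs_reauthentication is a hypothetical helper, not part of auth_ga.py.

```python
import json


def needs_reauthentication(status_code, body_text):
    """Return True when a token-endpoint response says the grant is unusable."""
    if status_code != 400:
        return False
    try:
        payload = json.loads(body_text)
    except ValueError:
        # Not JSON, so the cause cannot be determined from the body.
        return False
    return isinstance(payload, dict) and payload.get("error") == "invalid_grant"
```

When this returns True, retrying the refresh will keep failing and the interactive flow is the only way forward.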
Domain Refactoring and S3 / CloudFront Deployment
A secondary task involved standardizing directory naming for a new domain property. The directory /Users/cb/Documents/repos/sites/86dfrom.com was renamed to 86from.com to match the actual domain registration.
Why rename at the repository level? Repository structure should mirror production domains to reduce cognitive load during deployments and maintenance. Typos in directory names lead to deployment errors and confusion across the engineering team.
After renaming, the site was deployed to its S3 bucket and CloudFront invalidation was triggered:
aws s3 sync ./sites/86from.com/site s3://[bucket-name]/ --delete
aws cloudfront create-invalidation \
--distribution-id [dist-id] \
--paths "/*"
A CloudFront invalidation on /* removes every cached object from the edge locations, so viewers receive the updated content as soon as the invalidation completes rather than waiting for TTL expiration.
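The two deployment steps can be wrapped in one script so they always run in the same order. This is a minimal sketch using the same CLI commands as above; the function names are hypothetical, the bucket and distribution ID are supplied by the caller, and the default is a dry run that only prints the commands.

```python
import shlex
import subprocess


def build_deploy_commands(site_dir, bucket, distribution_id):
    """Return the sync and invalidation commands as argument lists."""
    sync = ["aws", "s3", "sync", site_dir, f"s3://{bucket}/", "--delete"]
    invalidate = ["aws", "cloudfront", "create-invalidation",
                  "--distribution-id", distribution_id, "--paths", "/*"]
    return [sync, invalidate]


def deploy(site_dir, bucket, distribution_id, dry_run=True):
    """Print the commands (dry run) or execute them in order."""
    commands = build_deploy_commands(site_dir, bucket, distribution_id)
    for command in commands:
        if dry_run:
            print(shlex.join(command))
        else:
            # check=True stops before invalidating if the sync step fails.
            subprocess.run(command, check=True)
    return commands
```

Passing argument lists to subprocess.run avoids shell expansion of the /* path, which would otherwise need quoting.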
New SEO Content Page Deployment
A new content page (/sites/86from.com/site/what-does-86d-mean) was created and deployed alongside the index.html updates. This page targets the informational query intent around the term "86d" — a restaurant industry colloquialism.
Architecture decision: Rather than embedding this as a route in a JavaScript SPA, we deployed it as a static HTML file within the S3 bucket. This approach:
- Improves Core Web Vitals (no JavaScript parsing overhead for content-heavy pages)
- Ensures the page is crawlable by search engines without JavaScript execution
- Reduces time to First Contentful Paint (FCP) for users
- Simplifies caching rules — S3 serves static content with no dynamic generation
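The crawlability point can be checked directly: if the page's main content appears in the raw HTML with script blocks ignored, a crawler that does not execute JavaScript will see it. The sketch below is one way to test that with the standard library; the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collect text that appears outside <script> and <style> elements."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def static_text(html):
    """Return the text a non-JavaScript client would find in the markup."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

Running static_text over the deployed file and searching the result for the page's headline phrase is a quick pre-deploy sanity check.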
Booking Widget JavaScript Template Syntax Correction
The index.html files across multiple properties contained a booking widget with a critical issue: double-brace template syntax ({{ }}) was being used both inside the booking widget block and in the surrounding HTML.
The problem: If the HTML contains unescaped double braces outside the booking widget section, a templating engine (like Handlebars, Jinja, or Django templates) might attempt to parse them, causing syntax errors or unexpected variable substitution.
The fix: We systematically searched for all occurrences of {{ and }} outside the dedicated booking widget script block and replaced them with single braces or escaped syntax. This was done across:
- /sites/sailjada.com/index.html: 14 edits to remove or escape template syntax
- /sites/86from.com/site/index.html: 3 edits to isolate booking widget syntax
After changes, we validated the booking widget JavaScript block using Node.js syntax checking to ensure no parsing errors were introduced.
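The search step can be automated with a small scanner. The sketch below is an approximation of that search, not the exact procedure used: it treats every script element as exempt, whereas the session exempted only the booking widget block, and find_stray_double_braces is a hypothetical name.

```python
import re

SCRIPT_BLOCK = re.compile(r"<script\b[^>]*>.*?</script\s*>",
                          re.IGNORECASE | re.DOTALL)
DOUBLE_BRACE = re.compile(r"\{\{|\}\}")


def find_stray_double_braces(html):
    """Return (line_number, token) pairs for {{ or }} outside script blocks."""

    def blank(match):
        # Replace script contents with spaces but keep newlines,
        # so reported line numbers still match the original file.
        return re.sub(r"[^\n]", " ", match.group(0))

    masked = SCRIPT_BLOCK.sub(blank, html)
    hits = []
    for match in DOUBLE_BRACE.finditer(masked):
        line_number = masked.count("\n", 0, match.start()) + 1
        hits.append((line_number, match.group(0)))
    return hits
```

An empty result after the edits indicates that no double braces remain in the surrounding HTML for a templating engine to pick up.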
Google Apps Script Automation Updates
The BookingAutomation.gs file in the queenofsandiego.com project was updated multiple times to fix trigger logic and improve error handling for booking form submissions. These changes ensure that booking data flows reliably into the underlying Google Sheet without stalling on authentication or quota issues.
Infrastructure Summary and Monitoring
The distributed infrastructure now consists of:
- Daemon Orchestrator: Lightsail instance running jada-agent.service, monitored via CloudWatch metrics and SSH health checks
- Static Sites: S3 buckets with CloudFront distributions for sailjada.com, 86from.com, and queenofsandiego.com
- Analytics Pipeline: GA4 Data API integration for property dangerouscentaur, with OAuth credentials