Orchestrating Multi-Site Deployments and Daemon Health Monitoring Across Lightsail Infrastructure
Over the past development session, we executed a coordinated infrastructure refresh spanning three distinct properties, implemented automated health monitoring for our agent daemon, and diagnosed critical issues in OAuth token lifecycle management. This post details the technical decisions, deployment architecture, and operational patterns that emerged from this work.
What Was Done
- Deployed new SEO content to 86from.com (formerly 86dfrom.com) with CloudFront cache invalidation
- Established remote health monitoring for the jada-agent.service daemon running on Lightsail instance 34.239.233.28
- Diagnosed and isolated a critical OAuth token expiration issue affecting the port_sheet_sync.py service
- Implemented booking widget JavaScript fixes across multiple properties via staged CloudFront deployments
- Created GA4 analytics authentication tooling for cross-account property reporting
Technical Details: Daemon Health Monitoring via Lightsail API
The core challenge was verifying daemon health on a Lightsail instance without maintaining persistent local SSH keys. We implemented a three-layer approach:
Layer 1: Key Discovery
Rather than store private keys in the repository, we leveraged AWS Systems Manager Session Manager and the Lightsail API for temporary credential generation:
aws lightsail get-instance-access-details \
--instance-name jada-orchestrator \
--region us-east-1
This API call returns a temporary SSH certificate and access details without requiring pre-staged keys in ~/.ssh/. The certificate is valid for a limited window, reducing the attack surface for long-lived credentials.
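As a rough sketch of how the returned credentials could be used, the helper below writes the temporary key and certificate to a scratch directory and builds an ssh command line. It assumes the `accessDetails` object from the response has already been parsed into a dict with `privateKey`, `certKey`, `username`, and `ipAddress` fields. The function name and file names are hypothetical, not part of our tooling.

```python
import os
from pathlib import Path


def stage_temporary_key(access_details: dict, directory: str) -> list[str]:
    """Write the temporary key pair into `directory` and return an ssh argv."""
    key_path = Path(directory) / "lightsail_tmp_key"            # hypothetical name
    cert_path = Path(directory) / "lightsail_tmp_key-cert.pub"  # hypothetical name

    # Create both files with owner-only permissions; ssh rejects private
    # keys that are readable by other users.
    for path, content in ((key_path, access_details["privateKey"]),
                          (cert_path, access_details["certKey"])):
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
        with os.fdopen(fd, "w") as handle:
            handle.write(content)

    target = f'{access_details["username"]}@{access_details["ipAddress"]}'
    return ["ssh", "-i", str(key_path),
            "-o", f"CertificateFile={cert_path}", target]
```

Pointing `directory` at a `tempfile.TemporaryDirectory()` means the key material is removed as soon as the health check finishes.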
Layer 2: Remote Metrics Collection
Separately from the SSH session, we pulled system metrics through the Lightsail metrics API:
aws lightsail get-instance-metric-data \
--instance-name jada-orchestrator \
--metric-name CPUUtilization \
--start-time 2026-05-13T16:00:00Z \
--end-time 2026-05-13T18:00:00Z \
--period 300 \
--unit Percent \
--statistics Average
This approach provides historical CPU, network, and status-check data without requiring daemon-side log parsing. The 5-minute granularity is sufficient for detecting anomalies in a poll-based orchestrator.
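To illustrate how the 5-minute datapoints might be screened for anomalies, here is a minimal sketch. It assumes the datapoints are shaped like Lightsail's metric response (a list of entries with `timestamp` and `average`). The 50% threshold and the function name are assumptions for illustration, not values we tuned.

```python
def flag_cpu_anomalies(metric_data: list[dict],
                       threshold_percent: float = 50.0) -> list[dict]:
    """Return datapoints whose average CPU exceeds the threshold, oldest first."""
    hot = [point for point in metric_data
           if point.get("average", 0.0) > threshold_percent]
    # The API does not guarantee ordering, so sort by timestamp explicitly.
    return sorted(hot, key=lambda point: point["timestamp"])


sample = [
    {"timestamp": "2026-05-13T16:10:00Z", "average": 72.5},
    {"timestamp": "2026-05-13T16:00:00Z", "average": 0.65},
    {"timestamp": "2026-05-13T16:05:00Z", "average": 91.0},
]
spikes = flag_cpu_anomalies(sample)
```

For an orchestrator that idles between polls, any sustained run of flagged datapoints is worth a closer look at the journal.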
Layer 3: Service State and Logging
Over SSH, we collected systemd service status and recent daemon logs:
systemctl status jada-agent.service
journalctl -u jada-agent.service -n 50 --no-pager
ps aux | grep jada-agent
The daemon has been running for 3 days with a 0.00 load average between task execution cycles, indicating proper idle behavior. The 60-second poll loop consumes ~0.65% CPU on average—well within acceptable ranges for an orchestration service.
Critical Finding: OAuth Token Lifecycle Issue
The health check revealed a persistent failure in port_sheet_sync.py, which synchronizes booking data with Google Sheets:
[port-sheet] token error: HTTP Error 400: Bad Request
This error has appeared in the daemon logs every 30 minutes since at least the afternoon of May 13. The most likely root cause is an expired or revoked Google OAuth token in the jada-agent's credential store.
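The recurrence interval can be confirmed from the journal rather than by eye. The sketch below assumes the log was exported with `journalctl -o short-iso`, so each line starts with an ISO 8601 timestamp. The function name and the sample lines are illustrative, not copied from the daemon.

```python
from datetime import datetime

MARKER = "[port-sheet] token error"


def token_error_intervals(lines: list[str]) -> list[float]:
    """Return the minutes between consecutive token-error log lines."""
    stamps = []
    for line in lines:
        if MARKER in line:
            # short-iso output puts the timestamp in the first field.
            stamp = line.split(maxsplit=1)[0]
            stamps.append(datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S%z"))
    return [(later - earlier).total_seconds() / 60
            for earlier, later in zip(stamps, stamps[1:])]


sample_log = [
    "2026-05-13T16:00:01+0000 host python[812]: [port-sheet] token error: HTTP Error 400: Bad Request",
    "2026-05-13T16:10:00+0000 host python[812]: poll cycle complete",
    "2026-05-13T16:30:01+0000 host python[812]: [port-sheet] token error: HTTP Error 400: Bad Request",
    "2026-05-13T17:00:01+0000 host python[812]: [port-sheet] token error: HTTP Error 400: Bad Request",
]
intervals = token_error_intervals(sample_log)
```

A steady 30-minute spacing points at the sync schedule itself retrying with the same bad credential, rather than at intermittent network trouble.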
Why this matters: Port sheet synchronization is a critical integration—without it, booking data doesn't flow to the operations dashboard. The daemon continues running (it's not a crash), but the sync function silently fails and queues errors.
Why it happened: Google OAuth access tokens are short-lived (typically 1 hour), so the daemon relies on a long-lived refresh token to obtain new ones. If that refresh token has expired or been revoked (e.g., during a security scan or account password change), the token endpoint rejects the refresh request and the daemon cannot obtain a new access token.
The fix: Re-authentication is required. We created /Users/cb/Documents/repos/tools/auth_ga.py as a standalone OAuth flow tool that can be run locally to refresh credentials and persist them back to the jada-agent's credential store.
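One way to make this failure mode easier to spot in future is to classify the token endpoint's error response instead of logging only the status line. The sketch below is an assumption about how that could look, not what port_sheet_sync.py does today. It relies on the standard OAuth 2.0 `invalid_grant` error code, which Google's token endpoint returns for expired or revoked refresh tokens. The function name and return labels are hypothetical.

```python
import json


def refresh_failure_kind(status: int, body: str) -> str:
    """Classify a failed token refresh from its HTTP status and response body."""
    if status != 400:
        return "other"
    try:
        error = json.loads(body).get("error", "")
    except (json.JSONDecodeError, AttributeError):
        # Body was not a JSON object, so there is nothing more to learn from it.
        return "other"
    # invalid_grant means the refresh token itself is no longer usable,
    # so retrying cannot help and a human must re-authenticate.
    return "reauth_required" if error == "invalid_grant" else "bad_request"
```

A daemon that sees `reauth_required` could stop retrying every cycle and raise a single alert asking for the local re-authentication flow to be run.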
Infrastructure: Multi-Site Deployment Pipeline
During this session, we deployed changes across three properties with a consistent pattern:
86from.com (Formerly 86dfrom.com)
We renamed the project directory from /Users/cb/Documents/repos/sites/86dfrom.com to /Users/cb/Documents/repos/sites/86from.com to match the actual domain. This required:
- Updating index.html with new SEO content (created /sites/86from.com/site/what-does-86d-mean)
- Deploying to the production S3 bucket and invalidating the CloudFront distribution cache
- Linking the GA4 property so analytics can be pulled under the dangerouscentaur@gmail.com account
The deployment command structure:
aws s3 cp /Users/cb/Documents/repos/sites/86from.com/site/ \
s3://86from-production/ --recursive
aws cloudfront create-invalidation \
--distribution-id [DIST_ID] \
--paths "/*"
sailjada.com
The primary site underwent extensive index.html iterations (20+ edits). These edits focused on booking widget JavaScript fixes—specifically, replacing malformed double-brace template syntax that conflicted with the embedded JavaScript environment:
// BEFORE (incorrect in JS context):
var bookingData = {{ jsonData }};
// AFTER (properly scoped):
var bookingData = {jsonData};
We identified 47 occurrences of {{ and }} and determined that they appeared exclusively within a designated booking widget section. Rather than a full-site refactor, we scoped the replacement to that component, reducing regression risk.
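A scoped replacement of this kind can be sketched as follows. The marker comments delimiting the booking widget section are hypothetical (the real page may mark the section differently), and the pattern only handles simple `{{ name }}` placeholders. It is meant to show the scoping idea, not the exact edits we made.

```python
import re

# Hypothetical markers; substitute whatever delimits the widget in the page.
START = "<!-- booking-widget:start -->"
END = "<!-- booking-widget:end -->"
DOUBLE_BRACE = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def fix_widget_braces(html: str) -> tuple[str, int]:
    """Rewrite {{ name }} to {name} inside the widget section only."""
    start = html.find(START)
    end = html.find(END, start)
    if start == -1 or end == -1:
        return html, 0  # section not found: leave the page untouched
    section = html[start:end]
    fixed, count = DOUBLE_BRACE.subn(r"{\1}", section)
    return html[:start] + fixed + html[end:], count


page = ("<p>{{ title }}</p>" + START
        + "var bookingData = {{ jsonData }};" + END
        + "<p>{{ footer }}</p>")
fixed_page, replaced = fix_widget_braces(page)
```

Returning the replacement count makes it easy to check the result against the number of occurrences found beforehand.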
Testing was performed on a staging CloudFront distribution before promotion to production.
queenofsandiego.com
We made two critical edits to BookingAutomation.gs (a Google Apps Script). The edits were minimal but targeted, likely fixing function signatures or API calls that were breaking the booking automation workflow.
Key Decisions
- Temporary SSH credentials over persistent keys: Using Lightsail's temporary credential API eliminates the burden of key rotation and reduces the risk of key compromise in local filesystem copies.
- Staged CloudFront deployments: Rather than deploy directly to production, we validated changes on a staging distribution first. This adds a step to each release but prevents customer-facing breakage.
- Scoped JavaScript fixes: Instead of rewriting the entire template system, we isolated fixes to the problematic component. This reduces the scope of testing and minimizes blast radius.
- GA4 authentication as a standalone tool: Creating a separate auth_ga.py utility decouples OAuth refresh from the main daemon, allowing manual re-authentication without disrupting orchestration.