Managing Multi-Site Deployments and Daemon Health: Building Resilience into the Jada Orchestrator

```html

Over the past development session, we executed a comprehensive infrastructure health check and multi-site deployment pipeline across three distinct properties while debugging a critical OAuth token failure in our background sync daemon. This post details the technical decisions, architecture patterns, and operational insights from this work.

What Was Done

Diagnosed and verified health status of the jada-agent.service orchestrator daemon running on AWS Lightsail instance 34.239.233.28
Implemented cross-site content management for three domains: 86from.com, queenofsandiego.com, and sailjada.com
Created SEO landing page for 86from.com with integrated booking widget
Identified and documented persistent OAuth token failure in port_sheet_sync.py background service
Deployed fixes to staging and production CloudFront distributions with cache invalidation

Technical Details: Daemon Health Assessment

The jada-agent.service runs as a systemd service on the Lightsail instance with a 60-second polling interval for task discovery. Because a current SSH key was not available locally, we connected through AWS Systems Manager Session Manager combined with the Lightsail API's temporary credential generation. This let us collect the metrics below without storing private keys in the local development environment.

Service Health Indicators:

jada-agent.service: Active and running for 3+ days continuously
CPU utilization: 0.65% average with no spike patterns detected
Memory footprint: 144MB of 914MB available (15.8% utilization)
Disk usage: 6.2GB of 39GB (17% used) — adequate headroom for log rotation
System load average: 0.00 — indicates idle periods between task batches
Status checks: 0 failures in preceding 2-hour window
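
The indicators above lend themselves to a simple threshold check. The following is a minimal sketch, not the script used during the session: the `HealthSnapshot` type, the `evaluate` function, and the threshold values are all hypothetical, and the numbers fed in are the ones reported in the list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HealthSnapshot:
    """One point-in-time reading of the instance (field names are illustrative)."""
    cpu_percent: float
    mem_used_mb: float
    mem_total_mb: float
    failed_status_checks: int


def utilization(used: float, total: float) -> float:
    """Return used/total as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * used / total


def evaluate(snapshot: HealthSnapshot, cpu_limit: float = 80.0,
             mem_limit: float = 85.0) -> list:
    """Return human-readable warnings; an empty list means no threshold was crossed."""
    warnings = []
    if snapshot.cpu_percent > cpu_limit:
        warnings.append(f"cpu at {snapshot.cpu_percent:.2f}%")
    mem = utilization(snapshot.mem_used_mb, snapshot.mem_total_mb)
    if mem > mem_limit:
        warnings.append(f"memory at {mem:.1f}%")
    if snapshot.failed_status_checks > 0:
        warnings.append(f"{snapshot.failed_status_checks} failed status checks")
    return warnings


# Values taken from the health indicators listed above.
snapshot = HealthSnapshot(cpu_percent=0.65, mem_used_mb=144,
                          mem_total_mb=914, failed_status_checks=0)
print(evaluate(snapshot))
```

Returning a list of warnings rather than a boolean keeps the check easy to log or forward to an alerting channel.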

Session activity logs showed 3 of 5 daily sessions consumed, with one session completing successfully and two hitting the 30-turn Claude API limit. This is not a service failure but rather expected behavior when agent tasks exceed the turn budget. Session 2 successfully completed meaningful work—processing e-signature page blockers and creating a needs-you task for manual follow-up.

Critical Infrastructure Issue: OAuth Token Expiration

The most significant finding was a broken Google OAuth token in the port_sheet_sync.py script. This background service, which syncs data to Google Sheets every 30 minutes, has been failing with:

[port-sheet] token error: HTTP Error 400: Bad Request

The issue most likely stems from token expiration or revocation: a 400 from the token endpoint is the typical response when a refresh token is no longer valid, although the log line alone does not confirm the exact cause. Unlike the auth_ga.py implementation we developed for Google Analytics API access (which uses OAuth 2.0 with refresh token rotation), the port sheet sync was using a stale credential. The decision to implement separate authentication contexts per service (rather than a centralized token store) prevents cascading failures but requires individual re-authentication when tokens expire.
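
The logged message carries only the HTTP status, which makes the failure hard to triage. OAuth 2.0 token endpoints report the reason in a JSON body with an `error` field (RFC 6749, section 5.2), where `invalid_grant` covers an expired or revoked refresh token. The sketch below shows one way to turn that body into a remediation hint. The function name and the category strings are hypothetical and are not part of port_sheet_sync.py.

```python
import json


def classify_token_error(status: int, body: bytes) -> str:
    """Map a token-endpoint failure to a coarse remediation category."""
    try:
        payload = json.loads(body.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        payload = {}  # body was not JSON; fall through to the generic cases
    code = payload.get("error") if isinstance(payload, dict) else None

    if status == 400 and code == "invalid_grant":
        return "reauthorize"        # refresh token expired or revoked
    if status in (400, 401) and code == "invalid_client":
        return "fix_client_config"  # client ID or secret is wrong
    if status >= 500:
        return "retry"              # provider-side problem
    return "investigate"


print(classify_token_error(400, b'{"error": "invalid_grant"}'))
```

Logging the category alongside the raw status would distinguish a token that needs re-authorization from a transient outage.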

Why This Pattern: Isolating authentication per service reduces blast radius—a compromised or expired token in one service doesn't invalidate credentials across all integrations. However, it increases operational overhead. Future work should implement a credential lifecycle manager that monitors token expiration and alerts before failures occur.
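
As a rough illustration of what such a lifecycle manager might check, the sketch below flags tokens that are expired or close to expiring. It assumes an expiry timestamp is known for each service, which is not always true (a revoked token can only be detected by probing the provider). The function name, the seven-day warning window, and the example dates are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def tokens_needing_attention(expiries: dict, now: datetime,
                             warn_within: timedelta = timedelta(days=7)) -> list:
    """Return (service, state) pairs for tokens that are expired or expiring soon."""
    flagged = []
    for service, expires_at in sorted(expiries.items()):
        if expires_at <= now:
            flagged.append((service, "expired"))
        elif expires_at - now <= warn_within:
            flagged.append((service, "expiring"))
    return flagged


# Illustrative data only: the dates are made up.
now = datetime(2024, 1, 10, tzinfo=timezone.utc)
expiries = {
    "port_sheet_sync": datetime(2024, 1, 8, tzinfo=timezone.utc),
    "auth_ga": datetime(2024, 3, 1, tzinfo=timezone.utc),
}
print(tokens_needing_attention(expiries, now))
```

Because each service keeps its own entry, this kind of check preserves the per-service isolation described above while removing the need to discover failures from logs.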

Multi-Site Deployment Pipeline

Content management across three properties required careful directory structure and deployment sequencing:

Directory Structure:

/Users/cb/Documents/repos/sites/
├── 86from.com/
│   ├── site/
│   │   ├── index.html
│   │   └── what-does-86d-mean/
│   └── [S3 sync target]
├── sailjada.com/
│   ├── index.html
│   └── [CloudFront distribution]
└── queenofsandiego.com/
    ├── BookingAutomation.gs
    └── [Google Apps Script backend]

The 86from.com property required particular attention. The directory was initially named 86dfrom, which we normalized to 86from (removing the stray 'd') to match DNS records and SEO intent. This required:

Directory rename: 86dfrom.com → 86from.com
Content migration of static assets and HTML files
Google Analytics property reconfiguration to track correct domain
CloudFront invalidation to purge cached content under old directory name
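
After a rename like this, leftover references to the old name are easy to miss. The sketch below scans a site directory for lines that still mention it. The function name and the list of file suffixes are hypothetical, and the demo runs against a throwaway directory rather than the real repository.

```python
import tempfile
from pathlib import Path


def find_stale_references(root, old_name="86dfrom",
                          suffixes=(".html", ".js", ".css")):
    """Return (relative_path, line_number) pairs for lines containing old_name."""
    root = Path(root)
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if old_name in line:
                hits.append((path.relative_to(root).as_posix(), lineno))
    return hits


# Demo against a temporary directory with one stale link.
with tempfile.TemporaryDirectory() as tmp:
    site = Path(tmp) / "site"
    site.mkdir()
    (site / "index.html").write_text(
        '<a href="https://86dfrom.com/">home</a>\n', encoding="utf-8")
    print(find_stale_references(tmp))
```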

JavaScript Template Syntax Issue in Booking Widget

During deployment to 86from.com, we found a potential problem in the booking widget embedded within index.html. The widget uses double-brace syntax ({{ variable }}) for client-side template interpolation, which can conflict with any server-side template engine that interprets the same syntax.

The Problem: Double-brace template syntax appears in two contexts:

Within the booking widget JavaScript block (intentional, for client-side rendering)
Potentially in surrounding HTML (conflict risk with server-side templating)

The Solution: We extracted and validated the booking widget JavaScript block separately, confirming double-braces only appear within the widget's scope. This allowed safe deployment without ambiguity. The fix involved:

Identifying the exact line numbers of the <script> tags containing the widget
Syntax-checking the extracted JavaScript block with a parser
Confirming no template conflicts outside the widget section
Adding a version tag with model ID in the widget's HTML comment for tracking
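
The "no conflicts outside the widget" check can be approximated with Python's standard `html.parser` module. The sketch below counts double-brace openers inside `<script>` blocks and records the line of any that appear in other text or in attribute values. It does not syntax-check the JavaScript itself, and the `BraceAudit` class and `audit_braces` function are hypothetical names, not the tooling used in the session.

```python
from html.parser import HTMLParser


class BraceAudit(HTMLParser):
    """Track where '{{' occurs relative to <script> elements."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._in_script = False
        self.inside = 0      # occurrences found within <script> content
        self.outside = []    # start line of each tag or text chunk with a stray '{{'

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
        for _name, value in attrs:
            if value and "{{" in value:
                self.outside.append(self.getpos()[0])

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        count = data.count("{{")
        if count == 0:
            return
        if self._in_script:
            self.inside += count
        else:
            self.outside.append(self.getpos()[0])


def audit_braces(html: str):
    """Return (count_inside_scripts, lines_with_braces_outside_scripts)."""
    parser = BraceAudit()
    parser.feed(html)
    parser.close()
    return parser.inside, parser.outside


page = "<body>\n<p>Book now</p>\n<script>\nconst t = '{{ name }}';\n</script>\n</body>"
print(audit_braces(page))
```

An empty second element means every double-brace sits inside a script block, which is the condition the deployment relied on.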

Deployment Strategy: Staging Before Production

Rather than deploying directly to production CloudFront distributions, we used a staging bucket pattern:

Deploy updated HTML to staging S3 bucket
Invalidate staging CloudFront cache with /index.html and /* patterns
Verify changes in staging environment
Promote to the production CloudFront distribution, which has its own distribution ID
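
The sequence above can be sketched as a small helper that builds the AWS CLI invocations for one environment. The bucket name and distribution ID are placeholders, and the function name is hypothetical. The actual bucket layout and any extra sync flags used in the session are not shown in this post.

```python
def build_deploy_commands(site_dir, bucket, distribution_id,
                          paths=("/index.html", "/*")):
    """Return the AWS CLI commands for one environment as argv lists."""
    sync = ["aws", "s3", "sync", site_dir, f"s3://{bucket}"]
    invalidate = [
        "aws", "cloudfront", "create-invalidation",
        "--distribution-id", distribution_id,
        "--paths", *paths,
    ]
    return [sync, invalidate]


# Placeholder identifiers; substitute the real staging values.
staging = build_deploy_commands("86from.com/site",
                                "example-staging-bucket",
                                "EXAMPLESTAGINGID")
for argv in staging:
    print(" ".join(argv))
```

Each list can be passed to `subprocess.run(argv, check=True)`. Because no shell is involved, the `/*` path reaches the CLI literally instead of being expanded as a glob. Running the same helper with the production bucket and distribution ID after verification gives the promotion step.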

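Verifying in staging first means a bad build is caught before it is cached at the edge in production, where it would keep being served until the TTL expires or an invalidation completes. CloudFront charges $0.005 per invalidation path beyond the first 1,000 paths each month, and a wildcard path such as /* counts as a single path, so we submit one invalidation per deploy with a short path list or a wildcard rather than one request per file.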
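
To make the invalidation arithmetic concrete, the sketch below builds a deduplicated batch in the shape that boto3's `create_invalidation` accepts as `InvalidationBatch`, and estimates the monthly charge. The helper names are hypothetical, and the pricing constants reflect CloudFront's published rates as we understand them (first 1,000 paths per month free, then $0.005 per path), so check the current pricing page before relying on them.

```python
import time

FREE_PATHS_PER_MONTH = 1000   # assumed free allowance per account per month
PRICE_PER_PATH_USD = 0.005    # rate quoted in this post


def build_invalidation_batch(paths, caller_reference=None):
    """Collapse duplicate paths and return an InvalidationBatch-shaped dict."""
    unique = sorted(set(paths))
    if "/*" in unique:
        unique = ["/*"]  # the root wildcard already covers every other path
    return {
        "Paths": {"Quantity": len(unique), "Items": unique},
        "CallerReference": caller_reference or f"deploy-{int(time.time())}",
    }


def monthly_invalidation_cost(paths_submitted: int) -> float:
    """Estimate the charge for the total number of paths submitted in a month."""
    billable = max(0, paths_submitted - FREE_PATHS_PER_MONTH)
    return billable * PRICE_PER_PATH_USD


batch = build_invalidation_batch(["/index.html", "/index.html", "/*"],
                                 caller_reference="example-deploy")
print(batch["Paths"])
print(monthly_invalidation_cost(1500))
```

Since billing is per path rather than per request, the saving comes from deduplicating paths and using wildcards, not from the batching itself.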

Infrastructure: AWS Lightsail, S3, and CloudFront

The daemon runs on AWS Lightsail (not EC2) due to lower management overhead for a dedicated, single-purpose service. Lightsail provides:

Pre-configured networking and firewall rules
Built-in metrics via CloudWatch (CPU, network, status checks)
Temporary SSH credential generation via API (no key rotation needed in local dev)
Fixed IP address (34.239.233.28) registered in Route53 for reliability

Static sites deploy to S3 with CloudFront distributions fronting them. This separation ensures: