```html

Debugging a Multi-Site Python Analytics Pipeline: OAuth Token Expiry, CloudFront Deployments, and Daemon Health Monitoring

This session involved diagnosing and remediating failures across three interconnected infrastructure layers: a Google Analytics data pipeline running on a Lightsail daemon, CloudFront-backed S3 sites, and a booking automation system. The root cause analysis revealed token expiry in an OAuth flow, deployment path inconsistencies, and the need for structured health monitoring of long-running agent processes.

Infrastructure Overview

The tech stack consists of:

  • Lightsail instance (34.239.233.28): Runs jada-agent.service, a systemd-managed daemon that executes multi-turn agent sessions for content generation, site deployment, and data collection tasks
  • S3 + CloudFront: Three production sites—sailjada.com, 86from.com, and queenofsandiego.com—distributed via CloudFront with cache invalidation on deploy
  • Google Analytics API: Custom Python scripts (auth_ga.py, port_sheet_sync.py) that pull GA4 data using OAuth 2.0 user credentials backed by a stored refresh token
  • Google Apps Script: Booking automation workflow in BookingAutomation.gs that integrates with the progress dashboard

Diagnosing Daemon Health via AWS Lightsail API

The initial request was to verify jada-agent.service health on a Lightsail instance without stored SSH keys. Rather than manually regenerating keys, we used the AWS Lightsail API to obtain temporary SSH credentials:

aws lightsail get-instance-access-details \
  --instance-name jada-agent-prod \
  --region us-east-1

This returned a temporary certificate and private key for the SSH protocol, which we used together to open an OpenSSH session. The advantage of this approach is auditability: the API call is logged in CloudTrail, and the credentials are short-lived (the response carries an expiry timestamp), which reduces the attack surface compared to long-lived SSH keys.
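The credential handling described above can be sketched in Python. This is a minimal sketch, assuming the CLI's JSON output has been parsed into a dict with an `accessDetails` object that holds `privateKey`, `certKey`, `username`, and `ipAddress`. The helper name `stage_ssh_credentials` and the file name `lightsail_temp_key` are hypothetical.

```python
import os
from pathlib import Path


def stage_ssh_credentials(response: dict, directory: Path) -> list[str]:
    """Write the temporary key and certificate to disk and return an ssh argv.

    Assumes `response` is the parsed JSON printed by
    `aws lightsail get-instance-access-details`.
    """
    details = response["accessDetails"]
    key_path = directory / "lightsail_temp_key"  # hypothetical file name
    cert_path = directory / "lightsail_temp_key-cert.pub"

    # Create the private key with owner-only permissions; ssh rejects
    # identity files that other users can read.
    fd = os.open(key_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(details["privateKey"])

    # OpenSSH looks for "<identity>-cert.pub" next to the identity file.
    cert_path.write_text(details["certKey"])

    target = f"{details['username']}@{details['ipAddress']}"
    return ["ssh", "-i", str(key_path), "-o", "IdentitiesOnly=yes", target]
```

Writing both files into a temporary directory that is deleted after the session keeps the short-lived key from lingering on disk.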

Once connected, we collected systemd service status, daemon logs, CPU/memory metrics, and session history:

systemctl status jada-agent.service
journalctl -u jada-agent.service -n 100 --no-pager
ps aux | grep jada
top -b -n 1 | head -20

Findings: The daemon had been running for 3 days with 0.65% average CPU utilization and no status check failures. It completed 1 of 3 session runs that day; the other two hit the 30-turn limit on Claude sessions and exited with code 1 without crashing the service. This is expected behavior for complex agentic workloads and suggests the task scope may need partitioning in future iterations.
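For repeatable health checks, the manual inspection above could be reduced to a small script. The following is a sketch under stated assumptions: it relies on `systemctl show`, which prints `KEY=VALUE` lines for the requested properties, and it treats "active/running" as healthy, which is an illustrative threshold and not a rule from this session. The function names are hypothetical.

```python
import subprocess

# systemd unit properties queried for the health summary.
PROPERTIES = ("ActiveState", "SubState", "NRestarts", "ExecMainStatus")


def parse_show_output(text: str) -> dict[str, str]:
    """Turn KEY=VALUE lines from `systemctl show` into a dict."""
    result = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines without an "=" separator
            result[key.strip()] = value.strip()
    return result


def is_healthy(props: dict[str, str]) -> bool:
    """Healthy here means active and running (an assumed definition)."""
    return props.get("ActiveState") == "active" and props.get("SubState") == "running"


def check_service(unit: str = "jada-agent.service") -> bool:
    """Query systemd for the unit and report whether it looks healthy."""
    completed = subprocess.run(
        ["systemctl", "show", unit, "--property=" + ",".join(PROPERTIES)],
        capture_output=True, text=True, check=True,
    )
    return is_healthy(parse_show_output(completed.stdout))
```

Keeping the parser separate from the `subprocess` call means the logic can be exercised against captured output without access to the instance.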

Root Cause: Expired OAuth Token in port_sheet_sync.py

While reviewing daemon logs, we identified a recurring error every 30 minutes:

[port-sheet] token error: HTTP Error 400: Bad Request

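This originates from /Users/cb/Documents/repos/tools/port_sheet_sync.py, which uses Google OAuth to authenticate against the Google Sheets API. A 400 from the token endpoint typically means the refresh request was rejected, most often because the refresh token has expired or been revoked (reported as invalid_grant in the response body).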

Why this happened: Google OAuth 2.0 refresh tokens can stop working if (1) the user changes their password and the token includes Gmail scopes, (2) the client secret is rotated, which makes requests that still use the old secret fail, (3) the token hasn't been used in 6 months, or (4) the user revokes access. In this case, we confirmed that the client credentials still exist in the secrets store but the stored refresh token is stale.
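Since the log line only shows the HTTP status, it can help to surface the OAuth error code from the response body. The sketch below is an illustration, not the logic in port_sheet_sync.py: it assumes the token endpoint returns a JSON body with an `error` field, as described in RFC 6749 section 5.2, and the function name and action labels are hypothetical.

```python
import json


def classify_token_error(status: int, body: str) -> str:
    """Map a token-endpoint failure to a suggested next step."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        payload = {}
    error = payload.get("error", "") if isinstance(payload, dict) else ""

    if status == 400 and error == "invalid_grant":
        return "reauthorize"  # refresh token expired or revoked
    if error == "invalid_client":
        return "check-client-secret"  # client ID or secret rejected
    if status >= 500:
        return "retry"  # server-side failure, likely transient
    return "investigate"
```

With `urllib`, the body is available by calling `read()` on the caught `urllib.error.HTTPError`, so the daemon could log the classified result alongside the status line.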

Solution approach: Re-authorize the Google account by running the auth flow again to obtain a fresh refresh token. The command to run is:

python3 ~/Documents/repos/tools/auth_ga.py --account dangerouscentaur@gmail.com

This script (newly created in this session) uses google-auth-oauthlib to obtain fresh OAuth credentials and store them securely. The script was initially placed at /Users/cb/Documents/repos/tools/auth_ga.py, but the daemon tried to run it from a different working directory, causing a FileNotFoundError. We verified that the file exists and switched to absolute paths so that it resolves regardless of the working directory.
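The shape of such a script might look like the following. This is a sketch, not the contents of auth_ga.py: the scope, the helper names, and the idea of storing the token as a JSON file are assumptions. The `InstalledAppFlow` calls are from google-auth-oauthlib and are imported lazily so the path helpers work without the dependency installed.

```python
import os
from pathlib import Path

# Assumed scope for illustration; the real script may request others.
SCOPES = ["https://www.googleapis.com/auth/spreadsheets.readonly"]


def tools_dir(script_file: str) -> Path:
    """Absolute directory of the script, independent of the caller's cwd."""
    return Path(script_file).resolve().parent


def write_private(path: Path, text: str) -> None:
    """Write text to a file created with owner-only permissions."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(text)


def run_auth_flow(client_secrets: Path, token_path: Path) -> None:
    """Run the browser consent flow and store the resulting credentials."""
    from google_auth_oauthlib.flow import InstalledAppFlow

    flow = InstalledAppFlow.from_client_secrets_file(str(client_secrets), scopes=SCOPES)
    creds = flow.run_local_server(port=0)  # opens a browser for user consent
    write_private(token_path, creds.to_json())
```

Resolving files from `tools_dir(__file__)` rather than the current directory avoids the FileNotFoundError described above. Because `run_local_server` needs a browser, this step would likely be run interactively rather than by the daemon.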

Site Deployment: Consolidating 86dfrom to 86from

During this session, we consolidated a duplicate directory structure. The project /Users/cb/Desktop/86dfrom (development) was renamed to align with the production domain 86from.com:

mv /Users/cb/Documents/repos/sites/86dfrom.com \
   /Users/cb/Documents/repos/sites/86from.com

Why: Keeping directory names synchronized with domain names reduces cognitive load and prevents deploy scripts from targeting the wrong bucket. The site was then deployed to its S3 bucket with CloudFront invalidation:

aws s3 sync /Users/cb/Documents/repos/sites/86from.com/site \
  s3://86from.com --delete
aws cloudfront create-invalidation \
  --distribution-id [DISTRIBUTION_ID] \
  --paths "/*"
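The two deploy steps can be wrapped so that a wrong directory name fails before anything is uploaded. This is a minimal sketch: the function names and the `dry_run` default are hypothetical, and the distribution ID is passed in by the caller, not hardcoded.

```python
import subprocess
from pathlib import Path


def deploy_commands(site_dir: Path, bucket: str, distribution_id: str) -> list[list[str]]:
    """Build the sync and invalidation commands as argv lists."""
    return [
        ["aws", "s3", "sync", str(site_dir), f"s3://{bucket}", "--delete"],
        ["aws", "cloudfront", "create-invalidation",
         "--distribution-id", distribution_id, "--paths", "/*"],
    ]


def deploy(site_dir: Path, bucket: str, distribution_id: str, dry_run: bool = True) -> None:
    """Sync the site, then invalidate the cache; stop at the first failure."""
    if not site_dir.is_dir():
        raise FileNotFoundError(f"site directory not found: {site_dir}")
    for argv in deploy_commands(site_dir, bucket, distribution_id):
        if dry_run:
            print(" ".join(argv))  # show what would run without running it
        else:
            subprocess.run(argv, check=True)
```

Because `check=True` raises on a non-zero exit, a failed sync stops the run before the cache is invalidated.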

We also created a new SEO-focused page at /Users/cb/Documents/repos/sites/86from.com/site/what-does-86d-mean to capture organic search traffic around the term "86d" (restaurant slang).

Fixing Template Syntax in Booking Widget

The sailjada.com index.html file contains an embedded booking widget powered by Google Apps Script. Multiple edits revealed a critical issue: double-brace delimiters {{ and }} used in the widget's JavaScript template were conflicting with the HTML templating engine.

Root cause: The booking widget uses a custom template syntax (e.g., {{booking_id}}) for client-side interpolation, but the server-side template processor was interpreting these as Jinja2 or similar syntax and failing to render.

Fix: We isolated the booking widget's <script> block and replaced the double-brace placeholders inside it with single braces, which the server-side template processor leaves untouched:

// Before:
var booking_id = "{{id}}";

// After:
var booking_id = "{id}";

The corrected HTML was deployed to the staging CloudFront distribution first, then after syntax validation, pushed to production.
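The delimiter change can be applied mechanically so that placeholders outside the widget are left alone. The sketch below is an illustration with hypothetical names. It uses regular expressions, which is adequate for a single well-formed <script> block but is not a general HTML parser.

```python
import re

# Matches a <script ...> ... </script> block, capturing tag, body, and close.
SCRIPT_BLOCK = re.compile(r"(<script\b[^>]*>)(.*?)(</script>)", re.DOTALL | re.IGNORECASE)
# Matches {{name}} with optional whitespace around the name.
DOUBLE_BRACE = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def single_brace_in_scripts(html: str) -> str:
    """Rewrite {{name}} to {name}, but only inside <script> blocks."""
    def rewrite(match: re.Match) -> str:
        opening, body, closing = match.groups()
        return opening + DOUBLE_BRACE.sub(r"{\1}", body) + closing

    return SCRIPT_BLOCK.sub(rewrite, html)
```

Limiting the substitution to script bodies keeps any double-brace placeholders in the surrounding markup available to the server-side processor.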

Infrastructure Decisions & Rationale

  • Lightsail API over SSH keys: Temporary credentials reduce key management overhead and improve audit trails. SSM Session Manager would have been an alternative, but the Lightsail API is simpler for this use case.
  • 30-turn daemon limit: Rather than immediately raising the turn limit, we documented the pattern and recommended task partitioning. Complex workflows should be split into discrete steps that can be retried independently.
  • Absolute paths in scripts: We hardcoded full paths in auth_ga.py to ensure it works regardless of working directory when invoked by cron or systemd.
  • CloudFront invalidation on every deploy: We