```html

Building a Real-Time Analytics Audit Pipeline: GA Code Coverage + Orchestrator Report Integration

Last week we ran a Google Analytics audit across all seven platforms in the sailjada ecosystem, fed the findings into our kanban-driven orchestrator system, and surfaced critical operational gaps on our progress dashboard. Here is how we built the pipeline, what we discovered, and the architectural patterns behind it.

What We Did

We needed visibility into three things simultaneously:

  • Which pages across all platforms were missing GA tracking codes
  • What traffic patterns emerged from the last 30 days of analytics data
  • Which email campaigns were scheduled and what their status was

Rather than manually auditing each site, we built an automated pipeline that spawns a background orchestrator agent, runs concurrent HTML sweeps across all repositories, aggregates findings into structured cards, and surfaces them on the progress dashboard at https://progress.queenofsandiego.com. The entire audit completed in under 3 minutes and generated five actionable cards with deep links.

Technical Architecture

Multi-Site GA Code Audit Pattern

The audit process uses a concurrent file-scanning pattern across multiple repository paths:

Repos scanned:
- /Users/cb/Documents/repos/queen-of-san-diego/ (main web)
- /Users/cb/Documents/repos/memory/ (dashboard + tools)
- /Users/cb/Documents/repos/email-templates/ (Constant Contact blasts)
- /Users/cb/Documents/repos/orchestrator/ (agent system)
- /Users/cb/Documents/repos/infrastructure/ (CDN + DNS configs)
- /Users/cb/Documents/repos/contact-database/ (CRM + exports)

For each repository, we search HTML, template, and JavaScript files for the GA tracking snippet. The audit looks specifically for:

  • Google Analytics 4 measurement ID format: G-[A-Z0-9]{10}
  • gtag.js loading and initialization patterns
  • Missing tracking on key user flows (checkout, booking, email click-through landing pages)

The scanner ignores minified bundles and focuses on source files, making results actionable for the engineering team.
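
The scan described above can be sketched roughly as follows. This is a minimal illustration, not the audit script itself: the function names, the set of file extensions, the gtag.js marker regex, and the use of a thread pool are all assumptions. Only the G-[A-Z0-9]{10} pattern and the rule of skipping minified bundles come from the description above.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Pattern from the audit description: a GA4 ID such as G-XXXXXXXXXX.
GA_ID = re.compile(r"\bG-[A-Z0-9]{10}\b")
# Assumed markers for gtag.js loading and initialization.
GTAG = re.compile(r"googletagmanager\.com/gtag/js|gtag\(\s*['\"]config['\"]")
# Assumed set of source extensions; adjust for the template engines in use.
SOURCE_SUFFIXES = {".html", ".htm", ".js", ".jinja", ".hbs"}


def is_source_file(path: Path) -> bool:
    """True for scannable source files, skipping minified bundles."""
    return path.suffix.lower() in SOURCE_SUFFIXES and ".min." not in path.name.lower()


def scan_file(path: Path) -> dict:
    """Report which GA markers appear in one file."""
    text = path.read_text(encoding="utf-8", errors="ignore")
    return {
        "path": str(path),
        "ga_ids": sorted(set(GA_ID.findall(text))),
        "has_gtag": bool(GTAG.search(text)),
    }


def scan_repos(repo_paths: list[str], workers: int = 8) -> list[dict]:
    """Scan every source file under the given repos concurrently."""
    files = [
        p
        for repo in repo_paths
        for p in Path(repo).rglob("*")
        if p.is_file() and is_source_file(p)
    ]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(scan_file, files))


def missing_tracking(results: list[dict]) -> list[str]:
    """HTML pages with neither a GA ID nor a gtag marker."""
    return [
        r["path"]
        for r in results
        if r["path"].lower().endswith((".html", ".htm"))
        and not r["ga_ids"]
        and not r["has_gtag"]
    ]
```

Calling scan_repos with the repository paths listed above and passing the result to missing_tracking would give a list of candidate pages to review. Threads are a reasonable fit here because the work is dominated by file reads.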

Orchestrator Integration Pattern

Rather than returning raw findings to stdout, the audit spawns a background orchestrator agent with full context:

Agent task ID: a34ff4c6c5127926b
Agent task name: "GA audit + orchestrator report"
Handoff wiki: /Users/cb/.claude/projects/memory/MEMORY.md
Context snapshot: Last 20 dashboard cards, recent email campaigns, infrastructure state
Output destination: progress.queenofsandiego.com kanban board

The orchestrator runs in parallel with the local GA sweep. Once the agent completes, it publishes findings as five separate kanban cards, each with specific action items and deep links.

Deep Link Format and Dashboard Navigation

The progress dashboard at https://progress.queenofsandiego.com uses hash-based routing for card navigation. Card deep links follow this pattern:

https://progress.queenofsandiego.com/#card-{card-id}

For example, the GA audit summary card is:

https://progress.queenofsandiego.com/#card-t-31aa2593

The dashboard HTML includes anchor support in the card grid, and the navigation JavaScript listens for hashchange events on window, then reads window.location.hash to scroll to and highlight the target card. This pattern lets us embed card links in email campaigns, Slack messages, and documentation without breaking the single-page app model.
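
For tooling that generates or validates these links (for example, a script that writes them into emails), the format can be captured in a few lines. This is a sketch only: the helper names are hypothetical, and the dashboard's own navigation code is JavaScript that is not shown here.

```python
from typing import Optional
from urllib.parse import urlsplit

DASHBOARD_URL = "https://progress.queenofsandiego.com/"
CARD_PREFIX = "card-"


def card_deep_link(card_id: str) -> str:
    """Build a deep link in the #card-{card-id} format."""
    return f"{DASHBOARD_URL}#{CARD_PREFIX}{card_id}"


def card_id_from_link(url: str) -> Optional[str]:
    """Extract the card ID from a deep link, or None if the hash is not a card."""
    fragment = urlsplit(url).fragment
    if fragment.startswith(CARD_PREFIX) and len(fragment) > len(CARD_PREFIX):
        return fragment[len(CARD_PREFIX):]
    return None


print(card_deep_link("t-31aa2593"))
# https://progress.queenofsandiego.com/#card-t-31aa2593
```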

Infrastructure and Data Flow

Campaign Log Storage and Deduplication

During the audit, we found that the Mother's Day emergency blast (scheduled for April 29) had already marked contacts as "sent" in our campaign log. The log is stored in S3 at:

s3://sailjada-campaign-logs/mothers-day-2024/dedup-log.json

The blast script reads this file before sending, cross-references it against the Constant Contact export CSV, and skips any contacts already in the log. This prevents duplicate sends if the script is re-run.

Contact data is sourced from:

/Users/cb/Documents/repos/contact-database/exports/constant-contact-2024-05.csv

The dedup check happens in the blast script at runtime, ensuring we never double-mail the same contact even if the orchestrator retries the task.
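
The core of that check might look like the sketch below. The shape of dedup-log.json and the CSV column name are not documented here, so both are assumptions: the log is taken to be a JSON object with a "sent" list of addresses, and the export is taken to have an "email" column. The S3 and file I/O are left out so the logic stays testable; all function names are hypothetical.

```python
import csv
import io
import json


def normalize(email: str) -> str:
    """Compare addresses case-insensitively and without stray whitespace."""
    return email.strip().lower()


def load_sent(log_text: str) -> set[str]:
    """Parse the dedup log. Assumed shape: {"sent": ["a@example.com", ...]}."""
    return {normalize(e) for e in json.loads(log_text).get("sent", [])}


def pending_contacts(
    csv_text: str, sent: set[str], email_column: str = "email"
) -> list[dict]:
    """Return export rows that have not been mailed yet.

    Also drops duplicate rows within the export itself.
    """
    seen = set(sent)
    pending = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        email = normalize(row.get(email_column) or "")
        if not email or email in seen:
            continue
        seen.add(email)
        pending.append(row)
    return pending


def record_sent(log_text: str, email: str) -> str:
    """Return updated log text after a successful send."""
    sent = load_sent(log_text)
    sent.add(normalize(email))
    return json.dumps({"sent": sorted(sent)})
```

Writing the updated log back after each successful send, rather than once at the end, is what would make a retry safe if the script dies partway through a batch.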

Email Campaign State Tracking

Email templates are stored in version control at:

/Users/cb/Documents/repos/email-templates/campaigns/

The audit script scans for:

  • Subject line and booking URL in each template
  • GA tracking parameters appended to outbound links (UTM codes)
  • Scheduled send dates and approval status in campaign metadata files
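
The UTM check in particular is easy to sketch with the standard library. The required parameter set below (utm_source, utm_medium, utm_campaign) is an assumption, as are the class and function names; the audit script may check a different set.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qs, urlsplit

# Assumed minimum set of UTM parameters for an outbound campaign link.
REQUIRED_UTM = ("utm_source", "utm_medium", "utm_campaign")


class LinkCollector(HTMLParser):
    """Collect absolute http(s) hrefs from anchor tags."""

    def __init__(self) -> None:
        super().__init__()
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href and href.startswith(("http://", "https://")):
                self.links.append(href)


def links_missing_utm(template_html: str) -> dict[str, list[str]]:
    """Map each outbound link to the UTM parameters it lacks."""
    collector = LinkCollector()
    collector.feed(template_html)
    report = {}
    for link in collector.links:
        params = parse_qs(urlsplit(link).query)
        missing = [key for key in REQUIRED_UTM if key not in params]
        if missing:
            report[link] = missing
    return report
```

An empty result means every outbound link in the template carries all of the required parameters. Links such as mailto: addresses are skipped because they are not tracked by GA.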

Current campaigns and open issues identified:

  • Mother's Day Emergency Blast: scheduled Apr 29, unapproved, 4 days to event
  • Paul Simon Concert Proof: proof deadline May 12, 6 days out
  • GA Data API Access Issue: no programmatic data pull is possible until the service account is granted access in the GA Admin console

Key Decisions

Why Hash-Based Navigation Instead of Server Routes

The dashboard is a static single-page app served through CloudFront. Using hash routes (#card-{card-id}) instead of server-side routes means:

  • No server-side logic needed; entirely client-side navigation
  • Deep links work without server configuration changes
  • Card state persists in the URL without page reloads
  • Lower latency (no round-trip to origin for card navigation)

Why Spawn a Background Orchestrator Instead of Blocking

The GA audit can take 2–3 minutes depending on repository size. Rather than block the user's terminal, we:

  • Spawn the orchestrator agent as a detached background task
  • Return immediate confirmation with task ID a34ff4c6c5127926b
  • Let the user continue working while findings are processed
  • Push completed findings to the dashboard (a shared team artifact) instead of burying them in logs

This pattern scales to multiple concurrent audits without blocking local development.
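
On a POSIX system, the spawn-and-return step can be approximated with the standard library. This is a sketch under assumptions: the function name, the UUID-based task ID, and the log file are illustrative, and the real orchestrator's task ID format and launch mechanism may differ.

```python
import subprocess
import uuid


def spawn_detached(command: list[str], log_path: str) -> tuple[str, subprocess.Popen]:
    """Start a command in the background and return immediately.

    Returns a hypothetical task ID plus the process handle. The child's
    output goes to log_path so nothing is written to the caller's terminal.
    """
    task_id = uuid.uuid4().hex  # assumed ID scheme, for illustration only
    with open(log_path, "ab") as log:
        proc = subprocess.Popen(
            command,
            stdin=subprocess.DEVNULL,
            stdout=log,
            stderr=subprocess.STDOUT,
            start_new_session=True,  # POSIX: run in a new session, detached from the terminal
        )
    return task_id, proc
```

The caller can print the task ID and return right away. Because the child runs in its own session, closing the terminal does not send it a hangup signal.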

Why Dedup at Send Time, Not at Prep Time

The campaign log lives in S3, not in version control. This allows:

  • Multiple team members to prep blasts in parallel (each writes a local CSV)
  • The S3 log, checked by the send script at runtime, to be the single source of truth for "already sent" state
  • Retries without manual log management
  • Historical audit trail of all sent campaigns in S3

What's Next

Three immediate actions emerged from the audit:

  • Grant GA Data API Access: Add the orchestrator service account as a user in the GA Admin console