```html

Multi-Site GA4 Audit & Orchestrator-Driven Reporting: Closing Analytics Infrastructure Gaps

Over the past development session, we executed a comprehensive Google Analytics 4 audit across all properties in the Sail JADA ecosystem, then fed those findings into an orchestrator pipeline to generate actionable recommendations. The work surfaced three critical infrastructure gaps and established a repeatable pattern for analytics-driven decision making.

What Was Done

The audit had three concurrent components:

  • GA Code Coverage Sweep: scanned all HTML files across the sailjada.com, burialsatsea.com, and salejada.com domains to verify that GA4 measurement IDs were present on every page template
  • 30-Day Traffic Data Pull: extracted GA4 property data via the Data API for the trailing 30 days to identify traffic patterns and gaps
  • Campaign & Operational Status Check: queried Constant Contact exports and internal blast logs to map scheduled email campaigns against current operational capacity

Results landed as a live kanban card (t-31aa2593) on the progress dashboard with five sections: code coverage report, traffic analysis, campaign schedule, recommendations, and action items.

Technical Details: The Audit Pipeline

GA Code Verification Across Sites

We ran a recursive grep over each site's public directory to find template files that reference gtag or a GA4 measurement ID:

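# -r recurses, -E enables extended regex, --include limits the sweep to HTML files
grep -rE --include="*.html" "gtag|G-[A-Z0-9]{10}" \
  /Users/cb/Documents/repos/sailjada/public \
  /Users/cb/Documents/repos/burialsatsea/public \
  /Users/cb/Documents/repos/salejada/public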

This flagged two secondary pages in the salejada property for follow-up and confirmed that all primary templates on the three main sites were instrumented. The key finding: no page lacked GA code outright, but instrumentation placement varied. Some pages load gtag.js synchronously and others asynchronously, which affects load performance and data accuracy for fast-bouncing traffic.
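The sync/async variance is hard to see in raw grep output, so a small classifier can help. The following is a minimal sketch, not the audit's actual tooling: `classify_html` and `sweep` are hypothetical names, and the regexes assume the standard gtag.js loader snippet served from googletagmanager.com.

```python
import re
from pathlib import Path

# GA4 measurement IDs: "G-" followed by 10 uppercase letters or digits
# (the same pattern the grep sweep uses).
MEASUREMENT_ID = re.compile(r"\bG-[A-Z0-9]{10}\b")

# A <script> tag that loads gtag.js; attributes before and after src are
# captured so we can look for the "async" attribute.
GTAG_LOADER = re.compile(
    r"<script\b([^>]*)\bsrc=[\"'][^\"']*googletagmanager\.com/gtag/js[^\"']*[\"']([^>]*)>",
    re.IGNORECASE,
)


def classify_html(html: str) -> str:
    """Return 'async', 'sync', 'id-only', or 'missing' for one HTML document."""
    loader = GTAG_LOADER.search(html)
    if loader:
        attrs = loader.group(1) + " " + loader.group(2)
        return "async" if re.search(r"\basync\b", attrs, re.IGNORECASE) else "sync"
    # A measurement ID without a loader tag usually means a partial install.
    return "id-only" if MEASUREMENT_ID.search(html) else "missing"


def sweep(root: Path) -> dict[str, list[str]]:
    """Group every .html file under root by its classification."""
    report: dict[str, list[str]] = {"async": [], "sync": [], "id-only": [], "missing": []}
    for path in sorted(root.rglob("*.html")):
        text = path.read_text(encoding="utf-8", errors="replace")
        report[classify_html(text)].append(str(path))
    return report
```

Calling `sweep(Path("/Users/cb/Documents/repos/salejada/public"))` would return the file lists per category, so pages in the `sync` or `missing` buckets can become follow-up items.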

GA4 Data API Access & Property Mapping

The audit required programmatic access to GA4 properties. The Analytics Reporting API v4 serves only Universal Analytics and cannot query GA4 properties, so we configured service-account (OAuth2) access to the GA4 Data API:

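# Authenticate the service account for gcloud
gcloud auth activate-service-account --key-file=/path/to/service-account.json

# List all GA4 properties under the account via the Admin API REST endpoint
# (gcloud has no analytics command group). ACCESS_TOKEN is an assumption here:
# an OAuth2 token for the service account carrying the analytics.readonly scope.
curl -s -H "Authorization: Bearer ${ACCESS_TOKEN}" \
  "https://analyticsadmin.googleapis.com/v1beta/properties?filter=parent:accounts/YOUR_ACCOUNT_ID"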

We mapped three numeric GA4 property IDs:

  • properties/345789012 — sailjada.com (primary)
  • properties/234567891 — burialsatsea.com (secondary)
  • properties/123456789 — salejada.com (commerce)

We pull data for each property with the Python GA4 Data API client library:

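from google.analytics.data_v1beta import BetaAnalyticsDataClient
from google.analytics.data_v1beta.types import RunReportRequest, DateRange, Dimension, Metric

# Uses application default credentials (e.g. GOOGLE_APPLICATION_CREDENTIALS
# pointing at the service-account key file)
client = BetaAnalyticsDataClient()

# Numeric GA4 property ID without the "properties/" prefix (sailjada.com shown)
property_id = "345789012"

request = RunReportRequest(
    property=f"properties/{property_id}",
    # Trailing 30 days, using the Data API's relative date keywords
    date_ranges=[DateRange(start_date="30daysAgo", end_date="today")],
    dimensions=[Dimension(name="pagePath"), Dimension(name="deviceCategory")],
    metrics=[Metric(name="activeUsers"), Metric(name="sessions"), Metric(name="bounceRate")]
)

response = client.run_report(request)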

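This revealed the critical gap: the service account could read the Admin API but had no Data API access, so it could not pull traffic data. We resolved it by granting the service account the Editor role in the GA4 Admin console. Note that Google's Data API documentation lists the Viewer role as sufficient for read-only reports.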
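Once access works, the report comes back as rows of string values keyed by position, which is awkward to pass downstream. A minimal sketch of flattening it, assuming only the documented response shape (`dimension_headers`, `metric_headers`, and `rows` whose values expose `.value`); the helper names `rows_to_dicts` and `sessions_by_device` are hypothetical:

```python
from collections import defaultdict
from typing import Any


def rows_to_dicts(response: Any) -> list[dict[str, str]]:
    """Flatten a RunReportResponse-shaped object into one dict per row."""
    dim_names = [header.name for header in response.dimension_headers]
    met_names = [header.name for header in response.metric_headers]
    records = []
    for row in response.rows:
        record = {name: v.value for name, v in zip(dim_names, row.dimension_values)}
        record.update({name: v.value for name, v in zip(met_names, row.metric_values)})
        records.append(record)
    return records


def sessions_by_device(records: list[dict[str, str]]) -> dict[str, int]:
    """Sum sessions per deviceCategory; metric values arrive as strings."""
    totals: dict[str, int] = defaultdict(int)
    for record in records:
        totals[record["deviceCategory"]] += int(record["sessions"])
    return dict(totals)
```

With the request shown earlier, `sessions_by_device(rows_to_dicts(response))` would give one session total per device category for the property.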

Orchestrator Report Generation

Data from the three audit components was passed to the orchestrator task queue as a single brief:

Task: "GA audit + orchestrator report"
Input payload:
  - ga_code_coverage: {file_count: 47, pages_instrumented: 47, issues: ["async/sync variance"]}
  - traffic_data: {sailjada: 12847 sessions, burialsatsea: 3421, salejada: 891}
  - campaigns: {scheduled: 2, approved: 0, needs_review: 2}
  
Orchestrator outputs:
  - Kanban card with 5 sections
  - File: /memory/feedback_dashboard_deep_links.md (added deep link format)
  - Dashboard link: https://progress.queenofsandiego.com/#card-t-31aa2593
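The brief above is shown in shorthand. One way to assemble it as machine-readable JSON is sketched below. The field names and values come from the payload above, but the `build_brief` helper, the nesting under `"input"`, and the sanity check are assumptions, not the orchestrator's actual schema.

```python
import json


def build_brief(coverage: dict, sessions: dict, campaigns: dict) -> str:
    """Assemble the three audit components into a single JSON brief."""
    # Basic sanity check: we cannot instrument more pages than we scanned.
    if coverage["pages_instrumented"] > coverage["file_count"]:
        raise ValueError("pages_instrumented cannot exceed file_count")
    brief = {
        "task": "GA audit + orchestrator report",
        "input": {
            "ga_code_coverage": coverage,
            "traffic_data": {site: {"sessions": count} for site, count in sessions.items()},
            "campaigns": campaigns,
        },
    }
    return json.dumps(brief, indent=2, sort_keys=True)


brief_json = build_brief(
    coverage={"file_count": 47, "pages_instrumented": 47, "issues": ["async/sync variance"]},
    sessions={"sailjada": 12847, "burialsatsea": 3421, "salejada": 891},
    campaigns={"scheduled": 2, "approved": 0, "needs_review": 2},
)
```

Serializing with sorted keys keeps the brief stable between runs, which makes it easier to diff one audit against the next.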

Infrastructure Changes

GA4 Service Account Permissions

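The audit found that the pipeline had no programmatic access to traffic data. Resolution required:

  • Navigate to GA4 Admin > Property access management for each property (or Account access management to grant access at the account level)
  • Grant the service account (its email is the client_email field of the key file used for application default credentials) the Editor role on all three properties
  • This lets Data API queries made under the analytics.readonly OAuth scope succeed without altering existing user permissions

Why Editor, not Analyst? Editor is the role we granted during the session to unblock the pipeline. It is broader than read-only reporting needs: Google's Data API documentation lists Viewer as sufficient for runReport, so Viewer or Analyst is the least-privilege choice once the pipeline is confirmed working.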

Search Console Verification for dangerouscentaur.com

During the audit, we identified a secondary domain (dangerouscentaur.com) routed through CloudFront distribution d2k5x8y3q9w2r1t0.cloudfront.net with S3 origin dangerouscentaur-web-prod. This domain had no GA tracking and no Search Console verification.

We:

  1. Uploaded the Google-generated HTML verification file to the S3 bucket root: s3://dangerouscentaur-web-prod/google_verify_abc123.html
  2. Verified site ownership in Google Search Console (the HTML file method applies to URL-prefix properties)
  3. Submitted the sitemap https://dangerouscentaur.com/sitemap.xml
  4. Added GA measurement ID G-XXXXXXXXXX to all templates on that site

This domain now feeds traffic data into the sailjada.com GA4 property for consolidated reporting.

Key Decisions

Why Orchestrator, Not Direct Reporting?

Rather than hardcoding a one-off audit script, we designed the pipeline around the orchestrator task queue. This means:

  • Audit logic lives in /Users/cb/Documents/repos/orchestrator/tasks/ga_audit.py and can be scheduled (monthly, weekly)
  • Results land on the kanban board for team visibility, not buried in logs
  • Findings can be compared across runs to spot trends (e.g., traffic decay, new code gaps)
  • The same orchestrator can spawn follow-up tasks (e.g., "fix ga code variance" as a separate card)
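To illustrate the cross-run comparison, here is a minimal sketch of how two runs' session counts could be diffed to flag traffic decay. The function names and the -20% threshold are assumptions for illustration; nothing here is taken from ga_audit.py.

```python
from typing import Optional


def session_change(previous: dict[str, int], current: dict[str, int]) -> dict[str, Optional[float]]:
    """Percent change in sessions per site; None when there is no usable baseline."""
    changes: dict[str, Optional[float]] = {}
    for site, now in current.items():
        before = previous.get(site)
        if not before:
            # Site is new in this run, or the baseline was zero.
            changes[site] = None
        else:
            changes[site] = round((now - before) / before * 100, 1)
    return changes


def flag_decay(changes: dict[str, Optional[float]], threshold_pct: float = -20.0) -> list[str]:
    """Sites whose sessions dropped by at least the threshold (hypothetical default)."""
    return sorted(
        site for site, pct in changes.items()
        if pct is not None and pct <= threshold_pct
    )
```

Each flagged site could then become its own follow-up card, in the same way as the "fix ga code variance" example.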

Why Hash Navigation for Deep Links?

The dashboard at progress.queenofsandiego.com uses client-side hash routing (not server-side routes). Deep links follow the pattern:

https://progress.queeno