```html

Building a Real-Time Analytics Audit Pipeline: GA4 Integration, Multi-Site Tracking Verification, and Orchestrator-Driven Reporting

What Was Done

We executed a comprehensive Google Analytics 4 audit across five distinct properties (sailjada.com, burialsatsea.com, salejada.com, dangerouscentaur.com, and queenofsandiego.com) to establish baseline traffic data, identify tracking gaps, and generate actionable operational recommendations. The work involved provisioning GA4 Data API access via service account authentication, building a reusable Python authentication module, auditing HTML across all site repositories for tracking code coverage, pulling 30-day historical traffic data, and feeding all findings into an orchestrator-driven report generation system that surfaced results as a kanban card on our progress dashboard.

Technical Details: GA4 Service Account Authentication

The first blocker was programmatic access to the GA4 Data API. We wrote a reusable OAuth2 service account flow in /Users/cb/Documents/repos/tools/reauth_ga.py:

# Service account key location pattern:
# ~/.config/google/analytics-service-account.json

# OAuth scope required:
# - https://www.googleapis.com/auth/analytics.readonly
# (the broader https://www.googleapis.com/auth/analytics scope also works,
#  but it is not needed for read-only reporting)

# Token exchange:
# 1. Load service account JSON (client_email, private_key, client_id)
# 2. Build JWT assertion signed with private_key (RS256)
# 3. POST to https://oauth2.googleapis.com/token
# 4. Cache access_token until it expires (expires_in, typically 3600 seconds)
# 5. Use Bearer token for all GA4 Data API calls
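A minimal sketch of steps 2 through 4, using only the standard library. It is not the contents of reauth_ga.py: the function names, the TokenCache class, and the 60-second refresh margin are assumptions made for illustration. The RS256 signing step itself is left out because it needs a crypto library; in practice the google-auth package handles signing and the token exchange.

```python
import base64
import json

TOKEN_URL = "https://oauth2.googleapis.com/token"
SCOPE = "https://www.googleapis.com/auth/analytics.readonly"


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT segments require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def build_signing_input(client_email: str, now: int, lifetime: int = 3600) -> str:
    """Return the 'header.claims' string that is then signed with the private key."""
    header = {"alg": "RS256", "typ": "JWT"}
    claims = {
        "iss": client_email,      # service account email from the JSON key
        "scope": SCOPE,
        "aud": TOKEN_URL,
        "iat": now,
        "exp": now + lifetime,    # Google caps assertion lifetime at one hour
    }
    parts = (header, claims)
    return ".".join(b64url(json.dumps(p, separators=(",", ":")).encode()) for p in parts)


def token_request_body(signed_jwt: str) -> dict:
    """Form fields for the POST to TOKEN_URL (step 3)."""
    return {
        "grant_type": "urn:ietf:params:oauth:grant-type:jwt-bearer",
        "assertion": signed_jwt,
    }


class TokenCache:
    """Hypothetical cache: returns the token until shortly before it expires."""

    def __init__(self, margin: int = 60):
        self.margin = margin      # assumed safety margin in seconds
        self.token = None
        self.expires_at = 0.0

    def store(self, token: str, expires_in: int, now: float) -> None:
        self.token = token
        self.expires_at = now + expires_in

    def get(self, now: float):
        if self.token and now < self.expires_at - self.margin:
            return self.token
        return None               # caller should sign a new assertion
```

When get() returns None, the caller builds and signs a fresh assertion and repeats the POST; there is no refresh token involved.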

Why this approach: Service account authentication eliminates the need for interactive OAuth consent flows in CI/CD pipelines and background jobs. The private key is stored locally with restricted file permissions (0600), and the JWT assertion is self-contained: there is no refresh token to manage, because a new assertion is simply signed whenever the access token expires.

The critical step was granting the service account email address (the client_email value in the JSON key) the Editor or Analyst role in GA4 Admin > Account Access Management. Viewer is also sufficient for read-only Data API queries. Without this grant, which is made in Google Analytics and not in Google Cloud IAM, API calls return 403 Forbidden. We documented this in a preflight checklist at /Users/cb/Documents/repos/tools/preflight_check.py to prevent future deployment friction.
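The local half of such a preflight check might look like the sketch below. It is not the actual preflight_check.py: check_key_file and REQUIRED_FIELDS are hypothetical names, and the permission test assumes a POSIX filesystem. The GA4 role grant cannot be verified offline; a 403 on the first API call is the signal that it is missing.

```python
import json
import os
import stat

# Fields the token exchange relies on (assumed minimum set).
REQUIRED_FIELDS = ("client_email", "private_key", "client_id")


def check_key_file(path: str) -> list[str]:
    """Return human-readable problems; an empty list means the key file looks usable."""
    if not os.path.isfile(path):
        return [f"key file not found: {path}"]

    problems = []
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & 0o077:  # any group/other permission bit set
        problems.append(f"permissions too open: {oct(mode)} (expected 0o600)")

    try:
        with open(path, encoding="utf-8") as fh:
            key = json.load(fh)
    except (OSError, json.JSONDecodeError) as exc:
        return problems + [f"cannot parse key file: {exc}"]

    if not isinstance(key, dict):
        return problems + ["key file is not a JSON object"]

    for field in REQUIRED_FIELDS:
        if not key.get(field):
            problems.append(f"missing field: {field}")
    return problems
```

Running it against the key path before any scheduled job starts turns a confusing runtime failure into an explicit list of fixes.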

Technical Details: GA Property ID Mapping

GA4 uses two identifiers: a numeric property ID (e.g., 407...), which the Data API expects, and a measurement ID (e.g., G-ABC...), which identifies a web data stream and is embedded in the gtag snippet. We mapped each site to its property by searching the repositories:

  • sailjada.com: Property 407... (dev snapshot confirmed in gtag calls)
  • burialsatsea.com: Secondary property in same account
  • salejada.com: Third property, mapped via Search Console
  • dangerouscentaur.com: New property, verified via GSC HTML file upload to S3 origin
  • queenofsandiego.com: Dashboard/reporting site, self-tracking property

The mapping process:

# 1. List all GA4 accounts and properties (requires analytics.readonly scope)
# 2. Cross-reference gtag measurement IDs in HTML/JS against property IDs
# 3. Verify DNS/GSC ownership before marking property "verified"
# 4. Flag any sites missing gtag initialization entirely
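Steps 2 and 4 can be sketched as a small cross-reference pass. The regular expression for measurement IDs, the function names, and the shape of the streams lookup (measurement ID to property ID, as would be gathered in step 1) are assumptions for illustration, and the IDs in the usage note are placeholders.

```python
import re

# Assumed pattern: "G-" followed by uppercase letters and digits.
MEASUREMENT_ID = re.compile(r"\bG-[A-Z0-9]+\b")


def find_measurement_ids(html: str) -> set[str]:
    """Collect every measurement ID that appears in a blob of HTML/JS."""
    return set(MEASUREMENT_ID.findall(html))


def map_sites(site_html: dict[str, str], streams: dict[str, str]) -> dict[str, dict]:
    """Cross-reference measurement IDs found per site against known properties.

    site_html: site name -> concatenated HTML/JS source for that site
    streams:   measurement ID -> numeric property ID
    """
    report = {}
    for site, html in site_html.items():
        ids = find_measurement_ids(html)
        report[site] = {
            "measurement_ids": sorted(ids),
            "properties": sorted({streams[i] for i in ids if i in streams}),
            "unknown_ids": sorted(i for i in ids if i not in streams),
            "missing_gtag": not ids,  # step 4: no gtag initialization found
        }
    return report
```

For example, map_sites({"a.example": page_source}, {"G-TEST123": "000000000"}) reports which property a site feeds, any measurement IDs that match no known property, and whether the site has no gtag snippet at all.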

Infrastructure: Dashboard Deep Linking and Card Creation

The progress dashboard at https://progress.queenofsandiego.com implements hash-based routing for deep-linking individual task cards. The URL format is:

https://progress.queenofsandiego.com/#card-{card-id}

The orchestrator created card t-31aa2593 with full audit results across five sections. The card deep link is:

https://progress.queenofsandiego.com/#card-t-31aa2593

This required no new infrastructure, since the dashboard JS already supported hash navigation, but the orchestrator had to inject well-formed card IDs into the reporting payload.
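On the orchestrator side, producing and validating these links can be as small as the sketch below. The allowed character set for card IDs is an assumption inferred from the one ID shown, and both function names are hypothetical.

```python
import re
from urllib.parse import urlsplit

DASHBOARD_URL = "https://progress.queenofsandiego.com/"
# Assumed ID alphabet: letters, digits, and hyphens (e.g. t-31aa2593).
CARD_ID = re.compile(r"^[A-Za-z0-9-]+$")


def card_deep_link(card_id: str) -> str:
    """Build the hash-routed URL for a card, rejecting malformed IDs."""
    if not CARD_ID.match(card_id):
        raise ValueError(f"malformed card id: {card_id!r}")
    return f"{DASHBOARD_URL}#card-{card_id}"


def card_id_from_link(url: str):
    """Inverse of card_deep_link; returns None if the URL is not a card link."""
    fragment = urlsplit(url).fragment
    if fragment.startswith("card-"):
        return fragment[len("card-"):]
    return None
```

Validating the ID before it goes into the payload keeps a stray space or slash from producing a link that the dashboard router silently ignores.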

Technical Details: Audit Automation and Orchestrator Integration

We structured the audit as a multi-stage orchestrator task:

  • Stage 1: Code Audit — Scan all HTML files across five site repositories for gtag initialization. Pattern: <script async src="https://www.googletagmanager.com/gtag/js?id=G-...">. Flag any page missing this.
  • Stage 2: Historical Data Pull — Query GA4 Data API for the last 30 days of traffic across all five properties. Dimensions: page path, device category, country. Metrics: active users, sessions, page views, bounce rate.
  • Stage 3: Campaign Status Check — Query Constant Contact API (or read campaign log exports) to list all scheduled email blasts, including Mother's Day (scheduled April 29) and Paul Simon (proof due May 12).
  • Stage 4: Recommendations Engine — Analyze traffic patterns, identify pages with high bounce rates, flag untracked pages, and generate operational excellence recommendations.
  • Stage 5: Card Generation — Serialize all findings into a structured kanban card with hyperlinks to deep-linked sub-cards.
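For Stage 2, the request sent to the GA4 Data API runReport method could be assembled roughly as follows. The dimension and metric names are the API's identifiers for the fields listed above (page views map to screenPageViews); the helper names and the choice of "yesterday" as the end date, so that only complete days are counted, are assumptions. The sketch builds the request and flattens a response without performing the HTTP call.

```python
def run_report_request(property_id: str, days: int = 30) -> tuple[str, dict]:
    """Return (url, JSON body) for a Data API runReport call on one property."""
    url = (
        "https://analyticsdata.googleapis.com/v1beta/"
        f"properties/{property_id}:runReport"
    )
    body = {
        "dateRanges": [{"startDate": f"{days}daysAgo", "endDate": "yesterday"}],
        "dimensions": [
            {"name": n} for n in ("pagePath", "deviceCategory", "country")
        ],
        "metrics": [
            {"name": n}
            for n in ("activeUsers", "sessions", "screenPageViews", "bounceRate")
        ],
    }
    return url, body


def rows_to_dicts(response: dict) -> list[dict]:
    """Flatten a runReport response into one dict per row, keyed by header name."""
    dim_names = [h["name"] for h in response.get("dimensionHeaders", [])]
    met_names = [h["name"] for h in response.get("metricHeaders", [])]
    out = []
    for row in response.get("rows", []):
        record = {}
        for name, cell in zip(dim_names, row.get("dimensionValues", [])):
            record[name] = cell["value"]
        for name, cell in zip(met_names, row.get("metricValues", [])):
            record[name] = cell["value"]  # metric values arrive as strings
        out.append(record)
    return out
```

The body is POSTed as JSON with the Authorization: Bearer header from the token exchange, once per property ID.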

Key Decisions: Why Service Accounts Over User OAuth

Service accounts are preferable for backend analytics pipelines because:

  • No user session dependency — Runs 24/7 without requiring a person to stay logged in.
  • Auditability — All API calls are attributed to the service account email, making it trivial to audit who/what accessed analytics.
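  • Scoping — Tokens are requested with only the analytics.readonly scope, so API calls made with them cannot delete properties or modify filters, whatever role the service account holds in GA4.
  • Secret rotation — The private key can be rotated in the Google Cloud Console (IAM & Admin > Service Accounts > Keys) without affecting user accounts.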

Why not rely on credentials attached to GCP infrastructure? Our tooling runs on developer machines and Lambda functions, not on GCP, so we store the JSON key locally and load it from the application tier. The service account still belongs to a GCP project, but day-to-day analytics access does not depend on GCP billing or project management.

Key Decisions: Deep Linking on the Dashboard

Rather than dumping audit output into a console log or Slack message, we created a permanent, linkable artifact on the dashboard. This enables:

  • Persistence — The card stays on the board until manually closed, so nothing gets buried in chat history.
  • Sharing — A single URL (https://progress.queenofsandiego.com/#card-t-31aa2593) is shareable with the team and can be referenced in tickets.
  • Structure — The card format enforces organization: sections for tracking gaps, traffic recommendations, campaign deadlines, and next steps.

What's Next

Three immediate action items emerged from the audit:

  1. Mother's Day Blast Approval — The campaign is scheduled for April 29 (4 days out) but still unapproved. The orchestrator created a "needs-you" card blocking this.
  2. Paul Simon Blast Proof — Proof must be sent by May 12. We prepared the proof using the email template at /repos/email-templates/paul-