Building an Automated GA4 Audit and Email Campaign Orchestration System
Over the past development session, we built a comprehensive audit and reporting pipeline that combines Google Analytics 4 data collection, multi-platform tracking validation, and email campaign orchestration into a unified dashboard view. This post covers the technical architecture, infrastructure decisions, and automation patterns we implemented.
What We Built
The system consists of four integrated components:
- GA4 Data Pipeline: Programmatic extraction of last-30-days traffic data across all properties (sailjada.com, burialsatsea.com, salejada.com, dangerouscentaur.com)
- Site Audit Engine: Automated scanning of all HTML files across repositories to validate GA4 tracking code presence and configuration
- Campaign Orchestrator: Centralized scheduling, validation, and status tracking for email blasts with proof-of-concept approval workflows
- Dashboard Aggregation: Deep-linked kanban cards displaying findings, recommendations, and action items with real-time status
GA4 API Integration and OAuth Flow
The primary challenge was establishing programmatic access to the GA4 Data API without manual credential juggling. We implemented a service account pattern:
# Service account OAuth flow (pseudocode)
python3 /Users/cb/Documents/repos/tools/reauth_ga.py
# Generates:
# - OAuth2 access token with analytics.readonly scope
# - Cached token for subsequent API calls (no re-auth needed)
# - Property ID mapping for all GA4 accounts
Why a service account? Web app OAuth requires interactive login and refresh token management. Service accounts eliminate this overhead for backend processes. We grant the service account the Viewer role at the GA4 property level, which provides read-only access to analytics data without account-level administrative rights. (The Editor role would also allow configuration changes, which the pipeline does not need.)
The audit discovered three GA4 properties across our infrastructure:
- JADA/QOS property (sailjada.com primary)
- Burial at Sea tracking property
- SaleJADA e-commerce property
We then pulled last-30-days traffic using the GA4 Data API's runReport method, filtering for dimensions like pagePath and sessionSource, with metrics for activeUsers, sessions, and screenPageViews.
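As a rough illustration of that query, the sketch below builds the JSON body for the Data API's REST `runReport` endpoint using the dimensions and metrics named above. The function name `build_run_report_request` and the placeholder property ID are assumptions for illustration, not taken from our scripts, and the authenticated HTTP call is deliberately left out.

```python
import json

# REST endpoint for the GA4 Data API v1beta runReport method.
GA4_RUN_REPORT_URL = (
    "https://analyticsdata.googleapis.com/v1beta/properties/{property_id}:runReport"
)


def build_run_report_request(property_id: str, days: int = 30) -> tuple[str, dict]:
    """Return (url, body) for a last-N-days traffic report.

    Hypothetical helper: the real pipeline may structure this differently.
    """
    if days < 1:
        raise ValueError("days must be at least 1")
    body = {
        # Relative date strings such as "30daysAgo" are accepted by the API.
        "dateRanges": [{"startDate": f"{days}daysAgo", "endDate": "today"}],
        "dimensions": [{"name": "pagePath"}, {"name": "sessionSource"}],
        "metrics": [
            {"name": "activeUsers"},
            {"name": "sessions"},
            {"name": "screenPageViews"},
        ],
    }
    return GA4_RUN_REPORT_URL.format(property_id=property_id), body


if __name__ == "__main__":
    # "123456789" is a placeholder, not one of our property IDs.
    url, body = build_run_report_request("123456789")
    print(url)
    print(json.dumps(body, indent=2))
```

The body would be sent as a POST with a bearer token carrying the analytics.readonly scope; keeping request construction separate from the network call makes the query easy to inspect and reuse per property.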
Multi-Platform HTML Tracking Code Audit
The site audit swept across three repository roots to find every HTML template and static page:
# Audit command pattern
find /Users/cb/Documents/repos \( -name "*.html" -o -name "*.jinja" -o -name "*.hbs" \) -print0 | \
xargs -0 grep -lE "gtag|GA4|googletagmanager" | \
sort -u
We checked for three tracking implementation patterns:
- Global site tag (gtag.js) with property ID embedded
- Google Tag Manager container tag (GTM-XXXXX)
- analytics.js (legacy Universal Analytics implementation, deprecated)
Key finding: Some template branches and static documentation sites were missing tracking entirely. We documented these gaps by file path, severity level, and remediation priority.
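A per-file classifier along these lines could tell the three patterns apart and flag files with no tracking at all. This is a minimal sketch: the regular expressions and the `classify_tracking` name are assumptions about what a reasonable check looks like, not the audit engine's actual rules.

```python
import re

# Hypothetical detection rules, one per implementation pattern.
PATTERNS = {
    # gtag.js loader with a GA4 measurement ID (G-XXXXXXX).
    "gtag": re.compile(r"googletagmanager\.com/gtag/js\?id=(G-[A-Z0-9]+)"),
    # Google Tag Manager container ID.
    "gtm": re.compile(r"\b(GTM-[A-Z0-9]+)\b"),
    # Legacy Universal Analytics loader.
    "analytics_js": re.compile(r"google-analytics\.com/analytics\.js"),
}


def classify_tracking(html: str) -> dict[str, list[str]]:
    """Map each detected pattern name to the sorted unique matches found.

    An empty dict means the page carries no recognized tracking code.
    """
    found: dict[str, list[str]] = {}
    for name, pattern in PATTERNS.items():
        matches = pattern.findall(html)
        if matches:
            found[name] = sorted(set(matches))
    return found


if __name__ == "__main__":
    sample = '<script async src="https://www.googletagmanager.com/gtag/js?id=G-ABC123XYZ"></script>'
    print(classify_tracking(sample))
    print(classify_tracking("<html><body>No tracking here</body></html>"))
```

Running this over each file from the find sweep, and recording every path that yields an empty result, would produce the gap list described above.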
Dashboard Deep-Linking Architecture
The progress dashboard at https://progress.queenofsandiego.com uses client-side hash routing to support deep links into specific cards:
# Deep link format
https://progress.queenofsandiego.com/#card-{card-id}
# Example: GA audit report card
https://progress.queenofsandiego.com/#card-t-31aa2593
The dashboard JavaScript checks the URL hash on page load and, if a valid card ID is present, automatically scrolls to and opens that card. This lets us share a link directly to a specific finding instead of restating it in the body of an email or Slack message.
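On the tooling side, generating and validating these links is a small string operation. The sketch below follows the deep link format shown above; the helper names `build_card_link` and `parse_card_id` are hypothetical, and the dashboard's own JavaScript may validate card IDs differently.

```python
from typing import Optional
from urllib.parse import urlsplit

DASHBOARD_URL = "https://progress.queenofsandiego.com/"
CARD_PREFIX = "card-"


def build_card_link(card_id: str) -> str:
    """Return a shareable deep link for a dashboard card."""
    return f"{DASHBOARD_URL}#{CARD_PREFIX}{card_id}"


def parse_card_id(url: str) -> Optional[str]:
    """Extract the card ID from a deep link, or None if the hash is not a card."""
    fragment = urlsplit(url).fragment
    if fragment.startswith(CARD_PREFIX) and len(fragment) > len(CARD_PREFIX):
        return fragment[len(CARD_PREFIX):]
    return None


if __name__ == "__main__":
    link = build_card_link("t-31aa2593")
    print(link)
    print(parse_card_id(link))
```

Because the card ID lives entirely in the fragment, it is never sent to the server, which is why this scheme works unchanged behind static hosting.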
Email Campaign Orchestration: Mother's Day and Paul Simon Blasts
The orchestrator discovered two active email campaigns with different urgencies:
- Mother's Day Emergency Blast: Scheduled for April 29 (4 days out), still unapproved. Template at /Users/cb/Documents/repos/email-templates/mothers_day_2024.html. Uses Constant Contact API for contact list management.
- Paul Simon Promotional Blast: Proof needed by May 12. Template validation and dedup logic in place.
The campaign orchestration script (/Users/cb/Documents/repos/tools/blast_orchestrator.py) handles:
# Campaign orchestration patterns:
# 1. Load template and subject line
# 2. Fetch contacts from CSV export (deduplicated against campaign log in S3)
# 3. Generate proof copy sent to stakeholder for final sign-off
# 4. On approval, execute blast through Constant Contact API
# 5. Log all sent contacts to S3 campaign log (no double-sends)
Campaign logs are stored in S3 at a predictable key path pattern. The dedup logic reads the existing log, parses sent contact email addresses, and excludes them from the active contact CSV before sending. This prevents accidental duplicate sends if a campaign is re-run.
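The dedup step can be sketched as a pure function over the log contents and the contact CSV. The log format (one sent address per line) and the `email` column name are assumptions made for this example; fetching the log from S3 is omitted so the logic can be exercised on its own.

```python
import csv
import io


def normalize(email: str) -> str:
    """Compare addresses case-insensitively and ignore stray whitespace."""
    return email.strip().lower()


def load_sent_emails(log_text: str) -> set[str]:
    """Parse a campaign log, assumed here to hold one sent address per line."""
    return {normalize(line) for line in log_text.splitlines() if line.strip()}


def filter_unsent(contacts_csv: str, sent: set[str], email_column: str = "email") -> list[dict]:
    """Return contact rows whose address is neither logged nor repeated in the CSV."""
    reader = csv.DictReader(io.StringIO(contacts_csv))
    seen = set(sent)  # copy so the caller's set is not mutated
    remaining = []
    for row in reader:
        address = normalize(row[email_column])
        if address and address not in seen:
            seen.add(address)
            remaining.append(row)
    return remaining


if __name__ == "__main__":
    log = "b@example.com\n"
    contacts = "email,name\nA@example.com,Ann\nb@example.com,Bob\na@example.com,Ann again\n"
    for row in filter_unsent(contacts, load_sent_emails(log)):
        print(row["email"], row["name"])
```

Normalizing case before comparison matters here: an address logged as b@example.com should also suppress B@Example.com on a re-run.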
Infrastructure: S3, CloudFront, and Route53
Several supporting infrastructure decisions emerged during the audit:
- S3 Bucket for Campaign Logs: Central location for all blast audit trails. Named consistently with environment prefixes (prod, staging).
- CloudFront Distributions: We verified the dangerouscentaur.com CloudFront distribution origin bucket, which is used for serving static marketing assets. Confirmed origin security settings block public S3 bucket access.
- Search Console Integration: Added dangerouscentaur.com to Google Search Console via HTML verification file uploaded to S3 origin bucket, then submitted sitemap for indexing.
- Route53 Health Checks: Recommended for monitoring email campaign webhook endpoints, so that an unreachable endpoint is detected before Constant Contact callbacks start failing.
Key Technical Decisions
- Service Account Over User OAuth: Eliminates token refresh burden for background jobs and aligns with infrastructure-as-code principles.
- Hash-Based Deep Links: Avoids server-side routing complexity; works with static hosting and CDN caching without special rules.
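- S3 Campaign Logs as Source of Truth: A durable audit trail for all sent campaigns, with no database to maintain or query. Enabling S3 bucket versioning preserves prior versions of each log object; versioning is a bucket-level setting, not something object metadata provides.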
- Proof-to-Stakeholder Before Blast: Keeps unauthorized or incorrectly formatted messages from going out. Adds 1–2 hours of latency but substantially lowers the risk of a bad send reaching production.
What's Next
Three immediate action items were surfaced on the dashboard:
- Approve Mother's Day blast (card t-31aa2593) or close if canceled
- Prepare Paul Simon proof by May 12 deadline
- Grant GA Data API access to service account in GA Admin console (3-minute fix to enable ongoing programmatic reporting)
Longer-term, we're planning to automate the weekly traffic recommendation email using the GA