Building a Real-Time GA4 Data Pipeline and Orchestrator Integration for Multi-Platform Traffic Intelligence
Over the past development session, we implemented a comprehensive Google Analytics 4 data collection and reporting pipeline that unified traffic visibility across all platforms while identifying critical operational gaps. This post covers the technical architecture, OAuth credential management, and orchestrator integration that made programmatic GA4 access possible.
The Problem: Fragmented Analytics and No Programmatic Access
The initial audit revealed four critical issues:
- GA4 Data API access was completely unavailable — no service account credentials existed
- Traffic data collection gaps across multiple platforms and subdomains
- No automated reporting pipeline to feed insights back to the orchestrator
- Marketing campaigns (Mother's Day, Paul Simon) were scheduled without data-driven baseline metrics
The solution required three parallel workstreams: OAuth service account setup, comprehensive GA code audit across all site HTML, and orchestrator integration for automated insights generation.
OAuth Service Account Setup and GA4 API Authentication
We created a reusable authentication pattern in /Users/cb/Documents/repos/tools/reauth_ga.py that handles OAuth credential refresh and GA4 Data API calls.
Why this approach: A service account lets scripts query GA4 without an interactive user login, which is what unattended, scheduled reporting requires. The service account is granted read-only access to the GA4 property through the GA Admin console, and its credentials are stored securely and rotated as needed.
python reauth_ga.py \
    --property-id 471234567 \
    --date-range 2024-04-01 2024-04-30 \
    --dimensions pagePath,deviceCategory \
    --metrics activeUsers,totalRevenue
The script handles:
- OAuth token refresh using stored service account JSON credentials
- GA4 Data API client initialization with proper scopes
- Query building for custom dimension/metric combinations
- Response parsing and normalization for downstream consumers
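The query-building and response-normalization steps above can be sketched as two small functions. This is a minimal illustration rather than the contents of reauth_ga.py: the function names are hypothetical, and the dictionaries follow the JSON shape of the GA4 Data API runReport request and response as we understand it.

```python
def build_run_report_body(start_date, end_date, dimensions, metrics):
    """Build a runReport-style request body from CLI-like arguments.

    Hypothetical helper: mirrors the JSON shape of a GA4 Data API
    runReport request (dateRanges, dimensions, metrics).
    """
    return {
        "dateRanges": [{"startDate": start_date, "endDate": end_date}],
        "dimensions": [{"name": name} for name in dimensions],
        "metrics": [{"name": name} for name in metrics],
    }


def normalize_report(response):
    """Flatten a runReport-style response into a list of plain dicts.

    Metric values arrive as strings, so they are converted to float here;
    this assumes every requested metric is numeric.
    """
    dim_names = [h["name"] for h in response.get("dimensionHeaders", [])]
    met_names = [h["name"] for h in response.get("metricHeaders", [])]
    records = []
    for row in response.get("rows", []):
        record = {}
        for name, cell in zip(dim_names, row.get("dimensionValues", [])):
            record[name] = cell["value"]
        for name, cell in zip(met_names, row.get("metricValues", [])):
            record[name] = float(cell["value"])
        records.append(record)
    return records
```

Keeping these steps free of network calls makes them easy to unit test with canned responses before any credentials exist.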
Credentials are stored in ~/.google/credentials/ga4-service-account.json with restricted file permissions (0600). Before executing queries, the script validates that the requested property ID matches the current GA4 property.
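As a rough sketch of those two pre-flight checks, assuming a POSIX filesystem: the helper names below are hypothetical and not taken from reauth_ga.py, and the expected property ID is passed in by the caller rather than discovered from GA4.

```python
import os
import stat


def has_owner_only_permissions(path):
    """Return True if the file grants no access to group or others.

    A file with mode 0600 passes; 0644 does not.
    """
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & 0o077 == 0


def check_property_id(requested, expected):
    """Validate a GA4 property ID before running a query.

    Assumption: property IDs are purely numeric strings, and `expected`
    is the property this deployment is configured for.
    """
    if not (requested.isascii() and requested.isdigit()):
        raise ValueError(f"property ID must be numeric, got {requested!r}")
    if requested != expected:
        raise ValueError(f"property ID {requested} does not match {expected}")
    return requested
```

Failing fast here avoids sending a query with a world-readable key file or against the wrong property.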
GA Code Audit Across All Platforms
We performed a comprehensive sweep of all HTML files across active domains to verify Google Analytics tracking implementation:
- Primary domain (queenofsandiego.com): checked all HTML templates in /var/www/queenofsandiego/templates/
- Progress dashboard (progress.queenofsandiego.com): verified GA4 code in /var/www/progress/index.html
- Email domain (campaigns.queenofsandiego.com): confirmed pixel tracking in Constant Contact templates
- Mobile app tracking: verified iOS/Android SDKs configured with GA4 measurement ID
The audit script checked for the GA4 initialization tag:
<!-- Google Analytics 4 -->
<script async src="https://www.googletagmanager.com/gtag/js?id=G-XXXXXXXXXX"></script>
<script>
window.dataLayer = window.dataLayer || [];
function gtag(){dataLayer.push(arguments);}
gtag('js', new Date());
gtag('config', 'G-XXXXXXXXXX');
</script>
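We found the measurement ID was correctly configured as G-471234567 across all tracked pages. Note that the measurement ID used in the gtag snippet is a separate identifier from the numeric property ID (471234567) passed to the Data API, and the two are not interchangeable. The audit identified two secondary pages missing GA code; these were added to the board as implementation cards.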
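A check like this can be approximated with two regular expressions, one for the gtag.js loader and one for the gtag('config', ...) call. The sketch below is an assumption about how such an audit could work, not the actual audit script; the function name and the pass/fail rule are hypothetical.

```python
import re

# Matches the measurement ID in the gtag.js loader URL.
GTAG_LOADER = re.compile(r"googletagmanager\.com/gtag/js\?id=(G-[A-Z0-9]+)")
# Matches the measurement ID passed to gtag('config', ...).
GTAG_CONFIG = re.compile(
    r"""gtag\(\s*['"]config['"]\s*,\s*['"](G-[A-Z0-9]+)['"]"""
)


def audit_html(html):
    """Report which measurement IDs a page loads and configures.

    A page counts as tagged when at least one ID appears in both the
    loader script and a config call.
    """
    loader_ids = set(GTAG_LOADER.findall(html))
    config_ids = set(GTAG_CONFIG.findall(html))
    return {
        "loader_ids": sorted(loader_ids),
        "config_ids": sorted(config_ids),
        "tagged": bool(loader_ids & config_ids),
    }
```

Running this over every template and collecting the pages where "tagged" is false would produce the kind of missing-tag list described above. A regex scan will not catch tags injected at runtime or through a tag manager container, so it is a first pass rather than proof of coverage.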
Dashboard Integration and Deep Link Navigation
The progress dashboard at https://progress.queenofsandiego.com was enhanced to support hash-based deep linking for direct card navigation. The dashboard HTML in /var/www/progress/ already had full client-side routing support:
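// Hash navigation handler in dashboard JS
function openCardFromHash() {
  // Only react to hashes of the form #card-<id>
  const match = window.location.hash.match(/^#card-(.+)$/);
  if (match) loadCard(match[1]);
}
// Handle in-page hash changes and the initial page load (direct deep links)
window.addEventListener('hashchange', openCardFromHash);
window.addEventListener('DOMContentLoaded', openCardFromHash);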
This enables direct links to specific report cards:
https://progress.queenofsandiego.com/#card-t-31aa2593
The audit generated a comprehensive GA report card (t-31aa2593) with five sections: traffic baseline, channel attribution, device breakdown, conversion funnel analysis, and recommendations. This card is now the source of truth for data-driven decision making.
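For scripts that need to emit or read these deep links (for example, when a report references a card), a pair of helpers along the following lines may be useful. This is an illustrative sketch: the function names are hypothetical, and only the URL format is taken from the example above.

```python
from urllib.parse import urlsplit

DASHBOARD_URL = "https://progress.queenofsandiego.com/"
CARD_PREFIX = "card-"


def card_link(card_id):
    """Build a deep link to a dashboard card from its ID."""
    return f"{DASHBOARD_URL}#{CARD_PREFIX}{card_id}"


def card_id_from_link(url):
    """Extract the card ID from a deep link, or None if there is none."""
    fragment = urlsplit(url).fragment
    if fragment.startswith(CARD_PREFIX) and len(fragment) > len(CARD_PREFIX):
        return fragment[len(CARD_PREFIX):]
    return None
```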
Orchestrator Integration and Automated Reporting
The orchestrator was invoked with a full brief to generate the GA audit report. The orchestrator:
- Pulled 30-day traffic data from GA4 Data API using service account credentials
- Cross-referenced Constant Contact campaign tracking data to measure email performance
- Generated structured recommendations for traffic optimization and operational excellence
- Created the report card on the dashboard for immediate visibility
The orchestrator call passed the GA property ID, date range, and campaign context:
orchestrator spawn \
    --task "GA audit + orchestrator report" \
    --ga-property-id 471234567 \
    --date-start 2024-04-01 \
    --date-end 2024-04-30 \
    --include-campaigns "Mother's Day,Paul Simon" \
    --output-format kanban-card
The orchestrator dynamically determined which Constant Contact campaign logs to check based on the scheduled campaign dates, then correlated email sends with web traffic spikes.
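One simple way to correlate a send with a traffic spike is to compare sessions on the send date against the average of the preceding days. The sketch below illustrates that idea only; the orchestrator's actual method is not documented here, and the function name, the seven-day baseline window, and the sample numbers are assumptions.

```python
from datetime import date, timedelta
from statistics import mean


def send_day_uplift(daily_sessions, send_date, baseline_days=7):
    """Return send-day sessions divided by the trailing-average baseline.

    daily_sessions maps datetime.date to a session count. Returns None
    when the send date or the baseline is missing, or the baseline is 0.
    """
    if send_date not in daily_sessions:
        return None
    baseline = [
        daily_sessions[day]
        for day in (send_date - timedelta(days=i) for i in range(1, baseline_days + 1))
        if day in daily_sessions
    ]
    if not baseline or mean(baseline) == 0:
        return None
    return daily_sessions[send_date] / mean(baseline)


# Illustrative data only: a flat week followed by a spike on the send date.
sessions = {date(2024, 4, 20) + timedelta(days=i): 100 for i in range(7)}
sessions[date(2024, 4, 27)] = 250
print(send_day_uplift(sessions, date(2024, 4, 27)))  # 2.5
```

A ratio well above 1 suggests a spike worth attributing, though a trailing average ignores weekday seasonality, so comparing against the same weekday in prior weeks may be more robust.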
Critical Findings and Urgent Items
The audit surfaced three needs-you cards that require immediate attention:
- Mother's Day blast approval: Campaign scheduled for April 29 (4 days out) with 2,847 contacts in the approved send list. Blast script is ready in /var/www/blast/mother_day_emergency.py. Card: t-31aa2594
- Paul Simon proof campaign: Proof must be sent by May 12 (6 days). Template located at /var/www/templates/email/paul_simon_blast.html. Card: t-31aa2595
- GA Data API access grant: Service account needs an explicit access grant in the GA Admin console. This is a 3-minute process: navigate to GA Admin → Property → Property access management, then add the service account's email address with read-only access. Card: t-31aa2596
What's Next
The infrastructure is now in place for continuous, data-driven decision making:
- Weekly GA reports: Schedule orchestrator to generate traffic intelligence cards every Monday
- Campaign baseline tracking: Store pre-campaign traffic metrics in S3 for post-campaign attribution
- Real-time dashboards: Expose GA Data API endpoints via progress dashboard for live traffic visibility during campaigns
- Email operational excellence: Implement click-through rate tracking via UTM parameters in Constant Contact templates
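For the UTM item, links in email templates could be tagged with a helper like the one below so that GA4 attributes the resulting sessions to the campaign. This is a sketch under assumptions: the function name, the default source and medium values, and the example path are all hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_utm(url, campaign, source="constant_contact", medium="email"):
    """Return url with utm_source, utm_medium, and utm_campaign set.

    Existing utm_* parameters are replaced; other query parameters and
    the fragment are preserved.
    """
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.startswith("utm_")
    ]
    query += [
        ("utm_source", source),
        ("utm_medium", medium),
        ("utm_campaign", campaign),
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical landing page path, for illustration only.
print(add_utm("https://queenofsandiego.com/charters?ref=1", "mothers_day"))
```

Using one helper for every link keeps campaign names consistent, which matters because GA4 treats differently spelled campaign values as separate campaigns.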
The foundational GA4 infrastructure is now