Automated GA4 Traffic Auditing and Multi-Site Analytics Instrumentation
We recently completed a comprehensive Google Analytics 4 audit across all Sail JADA properties, implemented programmatic GA4 Data API access, and standardized tracking instrumentation across five distinct web platforms. This post covers the technical architecture, instrumentation gaps discovered, and the infrastructure changes required to enable real-time traffic reporting within our internal dashboard system.
The Problem: Fragmented Analytics Visibility
Our analytics setup had three critical gaps:
- No programmatic GA4 access — All traffic data was trapped in the GA console; no API-driven reporting existed
- Inconsistent tracking codes — Some properties had measurement IDs, others had legacy tags, and several pages were completely untracked
- Manual campaign monitoring — Email blasts and scheduled campaigns had no automated status tracking or traffic correlation
The business impact: We were flying blind on campaign effectiveness and had no way to auto-generate traffic reports for stakeholders without manual GA console work.
Architecture: GA4 Service Account + Dashboard Integration
We implemented a three-tier architecture:
- Service Account Authentication: Created a Google Cloud service account with Editor role on the GA4 properties, stored the JSON key in our secure vault
- GA4 Data API Layer: Built Python wrapper (/Users/cb/Documents/repos/tools/reauth_ga.py) using the google-analytics-data library to fetch time-series traffic data
- Dashboard Integration: Orchestrator spawns GA audit tasks that write results directly as kanban cards to progress.queenofsandiego.com using hash-based deep linking
Technical Implementation: GA4 Data API Setup
Service Account Configuration
First, we granted the service account access at the GA4 property level (not just the Cloud project):
# In GA Admin > Property > Property access management
# Added service-account-email@project.iam.gserviceaccount.com
# With role: Editor (Viewer is sufficient for Data API read access)
Editor or Viewer? We granted Editor, but the Data API's runReport() method only reads data, so the Viewer role on the property is enough when the client requests the analytics.readonly scope. What matters is that the role is granted on the GA4 property itself; Cloud project IAM roles alone do not give the service account access to analytics data.
Python GA4 Client Library
Installed the official Google client:
pip install google-analytics-data
Core authentication pattern in reauth_ga.py:
from google.analytics.data_v1beta import BetaAnalyticsDataClient
from google.oauth2 import service_account

# Load the service account key and request read-only analytics access
credentials = service_account.Credentials.from_service_account_file(
    '/path/to/service-account-key.json',
    scopes=['https://www.googleapis.com/auth/analytics.readonly']
)
# The client reuses these credentials for every report request
client = BetaAnalyticsDataClient(credentials=credentials)
Last 30 Days Traffic Query
We standardized on a simple metric set for the orchestrator report:
# PROPERTY_ID is the numeric GA4 property ID (not the G-XXXXXXXXXX measurement ID)
request = {
    'property': f'properties/{PROPERTY_ID}',
    'date_ranges': [{'start_date': '30daysAgo', 'end_date': 'today'}],
    # One row per page path per day
    'dimensions': [{'name': 'pagePath'}, {'name': 'date'}],
    'metrics': [
        {'name': 'activeUsers'},
        {'name': 'screenPageViews'},
        {'name': 'bounceRate'},
        {'name': 'averageSessionDuration'}
    ]
}
response = client.run_report(request)
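The response comes back as headers plus rows of string values, which is awkward to hand to a report template. The following is a minimal sketch of flattening it into plain dicts; `rows_to_dicts` is a hypothetical helper name, not something from reauth_ga.py, and it assumes every requested metric is numeric.

```python
def rows_to_dicts(response):
    """Flatten a run_report-style response into a list of plain dicts.

    Assumes `response` exposes dimension_headers, metric_headers and rows,
    as the Data API's RunReportResponse does.
    """
    dim_names = [h.name for h in response.dimension_headers]
    metric_names = [h.name for h in response.metric_headers]
    records = []
    for row in response.rows:
        record = {}
        for name, dim in zip(dim_names, row.dimension_values):
            record[name] = dim.value
        for name, metric in zip(metric_names, row.metric_values):
            # The API returns metric values as strings; convert to float
            record[name] = float(metric.value)
        records.append(record)
    return records
```

Calling `rows_to_dicts(response)` on the query above would yield one dict per page path per day, keyed by the dimension and metric names in the request.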
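Why these metrics? They answer three business questions: (1) Are people visiting? (2) Are they engaging? (3) Are they bouncing immediately? Note that GA4 defines bounceRate as the share of sessions that were not engaged, so it is the inverse of engagement rate and is not directly comparable to the older Universal Analytics figure. More granular metrics (conversions, revenue) require stream-level configuration.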
Multi-Site Property Mapping
We discovered five separate GA4 properties across our platforms:
- Queen of San Diego: Property ID 408345892 (primary ecommerce site)
- JADA QOS: Property ID 386721549 (booking/operations hub)
- Sail JADA Blog: Property ID 392814756 (content/SEO property)
- Email Campaign Tracking: Property ID 401923847 (UTM-tagged links only)
- Analytics Dashboard: Property ID 419338291 (internal tracking of the tracker)
Mapping logic stored in /repos/tools/ga_property_registry.json:
{
  "properties": [
    {
      "id": "408345892",
      "domain": "queenofsandiego.com",
      "site_name": "Queen of San Diego",
      "measurement_id": "G-XXXXXXXXXX",
      "stream_id": "123456789"
    }
  ]
}
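A registry in this shape can be turned into the `properties/{id}` resource name the Data API expects. The sketch below shows one way to do that lookup; the function names are hypothetical and the only assumption about the file is the structure shown above.

```python
import json


def load_registry(path):
    """Read the registry file and return its list of property entries."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)["properties"]


def property_for_domain(properties, domain):
    """Return the registry entry whose domain matches, or None if absent."""
    for entry in properties:
        if entry["domain"] == domain:
            return entry
    return None


def property_resource(entry):
    """Build the resource name used in the 'property' field of a report request."""
    return f"properties/{entry['id']}"
```

For example, `property_resource(property_for_domain(props, "queenofsandiego.com"))` would give `properties/408345892` for the entry shown above.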
Instrumentation Audit: Finding the Gaps
We swept all HTML, PHP, and JSX files across the repos for measurement IDs:
grep -r "G-[A-Z0-9]\{10\}" /repos --include="*.html" --include="*.php" --include="*.jsx"
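The grep lists matching lines, but a coverage percentage needs a count of files with and without an ID. Here is a rough Python equivalent, offered as a sketch: `audit_coverage` is a hypothetical name, the pattern is slightly stricter than the grep (it adds word boundaries), and counting files is only an approximation of counting pages.

```python
import re
from pathlib import Path

# GA4 measurement IDs look like G- followed by 10 uppercase letters or digits
MEASUREMENT_ID = re.compile(r"\bG-[A-Z0-9]{10}\b")
EXTENSIONS = {".html", ".php", ".jsx"}


def audit_coverage(root):
    """Return (percent of files containing a measurement ID, untracked files)."""
    tracked, untracked = [], []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in EXTENSIONS:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        if MEASUREMENT_ID.search(text):
            tracked.append(path)
        else:
            untracked.append(path)
    total = len(tracked) + len(untracked)
    coverage = 100.0 * len(tracked) / total if total else 0.0
    return coverage, untracked
```

The list of untracked files is what feeds recommendations such as which templates still need a tag. Pages that load tracking through a shared include or a tag manager container would show up as untracked here, so the output needs a manual pass.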
Findings from the orchestrator audit card t-31aa2593:
- 100% coverage on primary domain: queenofsandiego.com has GTM container on all pages
- 72% coverage on JADA QOS: Blog posts and admin pages missing measurement ID
- 0% coverage on staging/preview environments — No tracking code on dev or staging sites
- Email template tracking — All Constant Contact blasts have UTM parameters but no pixel tracking
Why we didn't instrument staging: Staging traffic would pollute production analytics. On production, we rely on GA4 data filters (internal traffic rules defined by IP range) to exclude our own visits instead.
Dashboard Integration: Deep Linking and Real-Time Reports
The orchestrator generates HTML report cards and writes them to the dashboard at progress.queenofsandiego.com. Deep linking uses hash-based routing:
https://progress.queenofsandiego.com/#card-t-31aa2593
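The URL format is simple enough to build and parse with the standard library. This sketch assumes the fragment is always `card-` followed by the card ID, as in the example link; both function names are hypothetical and not taken from the dashboard code.

```python
from urllib.parse import urlsplit

DASHBOARD_BASE = "https://progress.queenofsandiego.com/"
CARD_PREFIX = "card-"


def card_url(card_id):
    """Build the deep link for a kanban card from its ID."""
    return f"{DASHBOARD_BASE}#{CARD_PREFIX}{card_id}"


def card_id_from_url(url):
    """Extract the card ID from a deep link, or None if the hash is not a card."""
    fragment = urlsplit(url).fragment
    if fragment.startswith(CARD_PREFIX):
        return fragment[len(CARD_PREFIX):]
    return None
```

Because the card ID lives in the URL fragment, it is never sent to the server; the dashboard's client-side code has to read the hash and scroll to or open the matching card.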
Each card includes:
- GA code coverage heatmap by site
- Top 10 pages by traffic (last 30 days)
- Bounce rate analysis by device type
- Scheduled email campaigns with link-click correlation
- Actionable recommendations (e.g., "Add measurement ID to /blog/posts.html")
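As an illustration of how the "Top 10 pages" item could be derived, the sketch below sums page views per path across all days in the reporting window. It assumes the report rows have already been converted to dicts with `pagePath` and `screenPageViews` keys; `top_pages` is a hypothetical helper, not the orchestrator's actual code.

```python
from collections import Counter


def top_pages(records, n=10):
    """Sum screenPageViews per pagePath and return the n most-viewed pages."""
    views = Counter()
    for rec in records:
        # Rows are per page per day, so the same path appears many times
        views[rec["pagePath"]] += rec["screenPageViews"]
    return views.most_common(n)
```

The result is a list of `(pagePath, total_views)` tuples in descending order, which maps directly onto a ranked table in the card.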
Key Decisions and Trade-offs
Service Account vs. OAuth2 User Flow
We chose service account authentication because: