```html

Automated GA4 Traffic Auditing and Multi-Site Analytics Instrumentation

We recently completed a comprehensive Google Analytics 4 audit across all Sail JADA properties, implemented programmatic GA4 Data API access, and standardized tracking instrumentation across five distinct web platforms. This post covers the technical architecture, instrumentation gaps discovered, and the infrastructure changes required to enable real-time traffic reporting within our internal dashboard system.

The Problem: Fragmented Analytics Visibility

Our analytics setup had three critical gaps:

  • No programmatic GA4 access — All traffic data was trapped in the GA console; no API-driven reporting existed
  • Inconsistent tracking codes — Some properties had measurement IDs, others had legacy tags, and several pages were completely untracked
  • Manual campaign monitoring — Email blasts and scheduled campaigns had no automated status tracking or traffic correlation

The business impact: We were flying blind on campaign effectiveness and had no way to auto-generate traffic reports for stakeholders without manual GA console work.

Architecture: GA4 Service Account + Dashboard Integration

We implemented a three-tier architecture:

  1. Service Account Authentication — Created a Google Cloud service account with Editor role on the GA4 properties, stored the JSON key in our secure vault
  2. GA4 Data API Layer — Built Python wrapper (/Users/cb/Documents/repos/tools/reauth_ga.py) using the google-analytics-data library to fetch time-series traffic data
  3. Dashboard Integration — Orchestrator spawns GA audit tasks that write results directly as kanban cards to progress.queenofsandiego.com using hash-based deep linking

Technical Implementation: GA4 Data API Setup

Service Account Configuration

First, we granted the service account access at the GA4 property level (not just the Cloud project):


# In GA Admin > Property > Property Access Management
# Added service-account-email@project.iam.gserviceaccount.com
# With role: Editor (more than the Data API needs; see the note below)

Editor or Viewer? We granted Editor, but the Data API's runReport() method only reads data, so the Viewer role on the property is sufficient. Viewer is the safer least-privilege choice, and it matches the analytics.readonly scope the client requests.

Python GA4 Client Library

Installed the official Google client:


pip install google-analytics-data

Core authentication pattern in reauth_ga.py:


from google.analytics.data_v1beta import BetaAnalyticsDataClient
from google.oauth2 import service_account

credentials = service_account.Credentials.from_service_account_file(
    '/path/to/service-account-key.json',
    scopes=['https://www.googleapis.com/auth/analytics.readonly']
)

client = BetaAnalyticsDataClient(credentials=credentials)

Last 30 Days Traffic Query

We standardized on a simple metric set for the orchestrator report:


request = {
    'property': f'properties/{PROPERTY_ID}',
    'date_ranges': [{'start_date': '30daysAgo', 'end_date': 'today'}],
    'dimensions': [{'name': 'pagePath'}, {'name': 'date'}],
    'metrics': [
        {'name': 'activeUsers'},
        {'name': 'screenPageViews'},
        {'name': 'bounceRate'},
        {'name': 'averageSessionDuration'}
    ]
}

response = client.run_report(request)

Why these metrics? They answer three business questions: (1) Are people visiting? (2) Are they engaging? (3) Are they bouncing immediately? More granular metrics (key events, the GA4 successor to goal completions, and revenue) require stream-level configuration.
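Before these numbers can go on a card, the response has to be flattened into plain records. The following is a minimal sketch of one way to do that, not the code in reauth_ga.py. The helper name rows_to_dicts is hypothetical, and it reads only the dimension_headers, metric_headers, and rows fields of the run_report response.

```python
def rows_to_dicts(response):
    """Flatten a run_report response into one dict per row (hypothetical helper)."""
    dim_names = [header.name for header in response.dimension_headers]
    metric_names = [header.name for header in response.metric_headers]

    records = []
    for row in response.rows:
        record = {}
        # Dimension values (e.g. pagePath, date) stay as strings.
        for name, dim in zip(dim_names, row.dimension_values):
            record[name] = dim.value
        # The API returns metric values as strings, so convert them to float.
        for name, metric in zip(metric_names, row.metric_values):
            record[name] = float(metric.value)
        records.append(record)
    return records
```

Calling rows_to_dicts(response) on the result of the query above would yield one dictionary per (pagePath, date) pair, keyed by the dimension and metric names in the request.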

Multi-Site Property Mapping

We discovered five separate GA4 properties across our platforms:

  • Queen of San Diego — Property ID 408345892 (primary ecommerce site)
  • JADA QOS — Property ID 386721549 (booking/operations hub)
  • Sail JADA Blog — Property ID 392814756 (content/SEO property)
  • Email Campaign Tracking — Property ID 401923847 (UTM-tagged links only)
  • Analytics Dashboard — Property ID 419338291 (internal tracking of the tracker)

The mapping is stored in /repos/tools/ga_property_registry.json:


{
  "properties": [
    {
      "id": "408345892",
      "domain": "queenofsandiego.com",
      "site_name": "Queen of San Diego",
      "measurement_id": "G-XXXXXXXXXX",
      "stream_id": "123456789"
    }
  ]
}
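A registry in this shape can be turned into a lookup table keyed by domain. This is a sketch under assumptions: the function names load_registry and property_resource are hypothetical, and the sample string simply mirrors the entry shown above rather than reading the real file.

```python
import json

# Sample mirroring the registry entry shown above (placeholder IDs kept as-is).
SAMPLE_REGISTRY = """
{
  "properties": [
    {
      "id": "408345892",
      "domain": "queenofsandiego.com",
      "site_name": "Queen of San Diego",
      "measurement_id": "G-XXXXXXXXXX",
      "stream_id": "123456789"
    }
  ]
}
"""


def load_registry(text):
    """Parse registry JSON and index the entries by domain (hypothetical helper)."""
    data = json.loads(text)
    return {entry["domain"]: entry for entry in data["properties"]}


def property_resource(registry, domain):
    """Build the 'properties/<id>' resource name used in Data API requests."""
    return f"properties/{registry[domain]['id']}"


registry = load_registry(SAMPLE_REGISTRY)
print(property_resource(registry, "queenofsandiego.com"))  # properties/408345892
```

Keeping the property ID as a string in the registry avoids any formatting surprises when it is interpolated into the resource name.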

Instrumentation Audit: Finding the Gaps

We swept all HTML, PHP, and JSX files across the repos for measurement IDs:


grep -r "G-[A-Z0-9]\{10\}" /repos --include="*.html" --include="*.php" --include="*.jsx"

Findings from the orchestrator audit card t-31aa2593:

  • 100% coverage on primary domain — queenofsandiego.com has GTM container on all pages
  • 72% coverage on JADA QOS — Blog posts and admin pages missing measurement ID
  • 0% coverage on staging/preview environments — No tracking code on dev or staging sites
  • Email template tracking — All Constant Contact blasts have UTM parameters but no pixel tracking
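Coverage percentages like these can be approximated with a short script. The sketch below is an illustration only, since how the orchestrator actually computed its figures is not described here. It counts a file as tracked when it contains a literal G- measurement ID, so pages instrumented solely through a GTM container (which carries a GTM- ID) would show up as untracked. The function name tracking_coverage is hypothetical.

```python
import re
from pathlib import Path

# Same pattern as the grep sweep: "G-" followed by ten uppercase letters or digits.
MEASUREMENT_ID = re.compile(r"G-[A-Z0-9]{10}")
EXTENSIONS = {".html", ".php", ".jsx"}


def tracking_coverage(root):
    """Return (tracked_files, all_files, percent_tracked) for page files under root."""
    pages = sorted(
        path
        for path in Path(root).rglob("*")
        if path.is_file() and path.suffix in EXTENSIONS
    )
    tracked = [
        path
        for path in pages
        if MEASUREMENT_ID.search(path.read_text(errors="ignore"))
    ]
    # Avoid dividing by zero when a directory has no page files at all.
    percent = 100.0 * len(tracked) / len(pages) if pages else 0.0
    return tracked, pages, percent
```

The difference between the second and first return values is the list of untracked files, which is what the recommendations on each card would be built from.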

Why we didn't instrument staging: Staging traffic would pollute production analytics. For production, we define our internal IP ranges as internal traffic and use GA4 data filters to exclude them instead.

Dashboard Integration: Deep Linking and Real-Time Reports

The orchestrator generates HTML report cards and writes them to the dashboard at progress.queenofsandiego.com. Deep linking uses hash-based routing:


https://progress.queenofsandiego.com/#card-t-31aa2593
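Building and parsing these links is simple string handling. The sketch below assumes the "#card-" prefix seen in the URL above holds for every card; the function names card_url and card_id_from_url are hypothetical.

```python
from urllib.parse import urlsplit

DASHBOARD_URL = "https://progress.queenofsandiego.com/"
CARD_PREFIX = "card-"


def card_url(card_id):
    """Build a deep link to a dashboard card from its ID."""
    return f"{DASHBOARD_URL}#{CARD_PREFIX}{card_id}"


def card_id_from_url(url):
    """Extract the card ID from a deep link, or None if the hash is not a card link."""
    fragment = urlsplit(url).fragment
    if fragment.startswith(CARD_PREFIX):
        return fragment[len(CARD_PREFIX):]
    return None


print(card_url("t-31aa2593"))
# https://progress.queenofsandiego.com/#card-t-31aa2593
```

Because the card ID lives in the URL fragment, it is never sent to the server; the dashboard page has to read it client-side to scroll to or open the card.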

Each card includes:

  • GA code coverage heatmap by site
  • Top 10 pages by traffic (last 30 days)
  • Bounce rate analysis by device type
  • Scheduled email campaigns with link-click correlation
  • Actionable recommendations (e.g., "Add measurement ID to /blog/posts.html")
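As an illustration of the "Top 10 pages" item, per-day rows can be rolled up by page path. This is a sketch, not the orchestrator's code: it assumes each row is a dictionary with pagePath and screenPageViews keys (the dimension and metric names from the query), and the function name top_pages is hypothetical.

```python
from collections import Counter


def top_pages(rows, limit=10):
    """Sum page views per pagePath across all dates and return the top entries."""
    views = Counter()
    for row in rows:
        views[row["pagePath"]] += row["screenPageViews"]
    # most_common sorts by total views, highest first.
    return views.most_common(limit)


sample_rows = [
    {"pagePath": "/", "date": "20240101", "screenPageViews": 5},
    {"pagePath": "/blog", "date": "20240101", "screenPageViews": 3},
    {"pagePath": "/", "date": "20240102", "screenPageViews": 2},
]
print(top_pages(sample_rows))  # [('/', 7), ('/blog', 3)]
```

The sample rows are made up for the example and do not reflect real traffic.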

Key Decisions and Trade-offs

Service Account vs. OAuth2 User Flow

We chose service account authentication because: