Injecting Structured Data at Scale: Event Page JSON-LD Automation Across Multi-Domain Event Infrastructure
The Problem: SEO Visibility Without Machine-Readable Context
We discovered that twelve active event pages across four concert-branded subdomains (paulsimonradyshell.com, bobdylanradyshell.com, and others) rendered correctly in browsers but carried no structured data markup. Search engines could still crawl the HTML, but without machine-readable event details, venues, or dates the pages were not eligible for Event rich results.
This is a common infrastructure gap: static site generation tools often prioritize rendering HTML for users while omitting the JSON-LD schemas that make a page eligible for Google's Event rich results and help search engines associate the page with a known venue.
What We Built: Programmatic Structured Data Injection
Rather than manually editing 12 pages across 4 domains, we implemented a Python-based injection script that:
- Scanned all event pages for existing Event and LocalBusiness schema
- Generated compliant JSON-LD blocks with Event date/time, performer, venue, and ticket information
- Injected schemas into the <head> tag without breaking existing markup
- Deployed updated HTML to the correct S3 buckets and invalidated CloudFront caches
Technical Implementation: The Injection Pipeline
Step 1: Audit Existing Markup
We first mapped the event page architecture across subdomains. The Rady Shell event pages live in a shared templating system at /Users/cb/Documents/repos/sites/queenofsandiego.com/rady-shell-events/. A Python script (render_event_sites.py) generates the static HTML for each concert subdomain by applying templates to event metadata.
We discovered none of the 12 active event pages contained Event or LocalBusiness JSON-LD. This was the root cause: pages had correct semantic HTML but lacked the structured data that powers Google's Event rich results and Knowledge Panel enrichment.
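The audit step can be sketched with nothing but the standard library. This is a minimal illustration, not the script we ran: the class and function names are hypothetical, and it only inspects top-level @type values (it does not walk @graph containers).

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def find_schema_types(html):
    """Return the set of top-level @type values declared in a page's JSON-LD."""
    collector = JsonLdCollector()
    collector.feed(html)
    types = set()
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # a malformed block is treated as absent
        items = data if isinstance(data, list) else [data]
        for item in items:
            if not isinstance(item, dict):
                continue
            declared = item.get("@type", [])
            types.update([declared] if isinstance(declared, str) else declared)
    return types


page = "<html><head><title>Concert</title></head><body></body></html>"
missing = {"Event", "LocalBusiness"} - find_schema_types(page)
print(sorted(missing))
```

A page with no JSON-LD reports both types as missing, which is the state all 12 pages were in.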
Step 2: Write the Injection Script
We created /Users/cb/Documents/repos/tools/inject_structured_data.py with the following logic:
import json

# Sketch of the injection logic. extract_date and extract_performer are
# helpers assumed to be defined elsewhere in the script; event_url is
# passed in by the caller (the original sketch left it undefined).
def inject_event_schema(html_content, event_metadata, event_url):
    # Parse event details from filename/frontmatter
    event_date = extract_date(event_metadata)
    performer = extract_performer(event_metadata)
    venue_name = "The Rady Shell at Jacobs Park"

    # Generate Event JSON-LD
    event_schema = {
        "@context": "https://schema.org",
        "@type": "Event",
        "name": f"{performer} at {venue_name}",
        "description": event_metadata.get("description"),
        "startDate": event_date.isoformat(),
        "endDate": event_date.isoformat(),
        # schema.org enumeration values are written as full URLs
        "eventStatus": "https://schema.org/EventScheduled",
        "eventAttendanceMode": "https://schema.org/OfflineEventAttendanceMode",
        "location": {
            "@type": "Place",
            "name": venue_name,
            "address": {
                "@type": "PostalAddress",
                "addressLocality": "San Diego",
                "addressRegion": "CA",
                "addressCountry": "US"
            }
        },
        "offers": {
            "@type": "Offer",
            "url": event_url,
            "price": "0",
            "priceCurrency": "USD",
            "availability": "https://schema.org/InStock",
            "validFrom": "2024-01-01T00:00:00Z"
        },
        "organizer": {
            "@type": "Organization",
            "name": "The Rady Shell at Jacobs Park",
            "url": "https://www.radyshell.org"
        }
    }

    # Serialize the schema and inject it just before the closing </head> tag.
    # Escaping "</" keeps a stray "</script>" in the data from ending the tag.
    payload = json.dumps(event_schema, indent=2).replace("</", "<\\/")
    schema_tag = f'<script type="application/ld+json">\n{payload}\n</script>'
    updated_html = html_content.replace("</head>", f"{schema_tag}\n</head>", 1)
    return updated_html
The script also generates LocalBusiness JSON-LD for the venue itself, which gives search engines a consistent venue entity to associate with the event pages and may help them connect those pages to the Rady Shell's Google Business Profile.
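A venue block of this kind might look like the following. It is a sketch only: build_venue_schema is a hypothetical helper name, and the fields are limited to the venue details that already appear in the Event schema above.

```python
import json


def build_venue_schema(name, url, locality, region, country):
    """Build a minimal LocalBusiness JSON-LD dictionary for a venue."""
    return {
        "@context": "https://schema.org",
        "@type": "LocalBusiness",
        "name": name,
        "url": url,
        "address": {
            "@type": "PostalAddress",
            "addressLocality": locality,
            "addressRegion": region,
            "addressCountry": country,
        },
    }


venue = build_venue_schema(
    name="The Rady Shell at Jacobs Park",
    url="https://www.radyshell.org",
    locality="San Diego",
    region="CA",
    country="US",
)
print(json.dumps(venue, indent=2))
```

Keeping the venue name and URL identical to the Event's location and organizer values is what lets a consumer treat the two blocks as describing the same place.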
Step 3: Batch Processing and S3 Deployment
We processed 12 event pages across four subdomains. Each subdomain has its own S3 bucket and CloudFront distribution:
- paulsimonradyshell.com → S3 bucket paulsimonradyshell.com → CloudFront distribution ID (retrieved via AWS CLI)
- bobdylanradyshell.com → S3 bucket bobdylanradyshell.com → CloudFront distribution ID
- Additional event subdomains mapped in a similar pattern
Deployment workflow:
# Sync updated event pages to S3
aws s3 sync ./updated-event-pages/ s3://paulsimonradyshell.com/ \
    --exclude "*" \
    --include "index.html" \
    --include "*event*.html"

# Invalidate CloudFront cache to serve fresh markup
aws cloudfront create-invalidation \
    --distribution-id E1A2B3C4D5E6F7 \
    --paths "/*"
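Repeating those two commands per subdomain is easy to script. The sketch below builds the same AWS CLI invocations for each site and prints them as a dry run by default. The distribution IDs are placeholders, the per-domain source directory layout is an assumption, and the function names are hypothetical.

```python
import shlex
import subprocess

# Domain (which is also the bucket name) -> CloudFront distribution ID.
# The IDs below are placeholders, not real distributions.
SITES = {
    "paulsimonradyshell.com": "EXAMPLEDISTID1",
    "bobdylanradyshell.com": "EXAMPLEDISTID2",
}


def build_deploy_commands(source_dir, bucket, distribution_id):
    """Return the sync and invalidation commands for one site as argv lists."""
    sync = [
        "aws", "s3", "sync", source_dir, f"s3://{bucket}/",
        "--exclude", "*",
        "--include", "index.html",
        "--include", "*event*.html",
    ]
    invalidate = [
        "aws", "cloudfront", "create-invalidation",
        "--distribution-id", distribution_id,
        "--paths", "/*",
    ]
    return [sync, invalidate]


def deploy_all(sites, source_root, execute=False):
    """Print every command (dry run) or run it when execute=True."""
    for domain, distribution_id in sites.items():
        # Assumed layout: one directory of updated pages per domain.
        source_dir = f"{source_root}/{domain}/"
        for cmd in build_deploy_commands(source_dir, domain, distribution_id):
            if execute:
                subprocess.run(cmd, check=True)
            else:
                print(shlex.join(cmd))


if __name__ == "__main__":
    deploy_all(SITES, "./updated-event-pages")
```

Passing argv lists to subprocess.run avoids shell expansion of the "*" patterns, so the include and exclude filters reach the AWS CLI unchanged.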
Infrastructure Architecture: Multi-Domain Pattern
The infrastructure uses a hub-and-spoke model:
- Hub: Shared template system in the queenofsandiego.com repository
- Spokes: Individual S3 buckets for each concert-branded subdomain
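- DNS: Route53 CNAME records pointing each subdomain to CloudFront distributions (where a site is served from the zone apex, a Route53 alias record is needed instead, since a CNAME cannot be placed at the apex)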
- Cache: CloudFront distributions sit in front of S3 buckets with 1-hour TTL for HTML and 1-year TTL for assets
This pattern allows centralized template maintenance while keeping each event's static content isolated, reducing cross-contamination risk and enabling per-event cache invalidation strategies.
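The two TTL tiers can be expressed as a small lookup. This sketch assumes the TTLs are carried in a Cache-Control header set on each object at upload time; they could equally be configured in a CloudFront cache policy, and the function name is hypothetical.

```python
from pathlib import PurePosixPath

HTML_TTL_SECONDS = 60 * 60                 # 1 hour
ASSET_TTL_SECONDS = 365 * 24 * 60 * 60     # 1 year


def cache_control_for(key):
    """Pick a Cache-Control value for an S3 object key based on its extension."""
    suffix = PurePosixPath(key).suffix.lower()
    ttl = HTML_TTL_SECONDS if suffix in {".html", ".htm"} else ASSET_TTL_SECONDS
    return f"public, max-age={ttl}"


for key in ("index.html", "css/site.css", "img/poster.jpg"):
    print(key, "->", cache_control_for(key))
```

The returned string is the kind of value the AWS CLI accepts through the --cache-control option of aws s3 cp and aws s3 sync. The short HTML TTL bounds how long stale markup can be served if an invalidation is missed.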
Why JSON-LD Over Other Schema Formats?
We chose JSON-LD for three reasons:
- Robustness: JSON-LD is decoupled from HTML structure. Microdata is attached to the visible elements through attributes such as itemprop, so it can break when those elements are restructured; JSON-LD does not
- Google Preference: Google's own documentation recommends JSON-LD as the primary structured data format for Search
- Maintainability: Injecting a <script> tag requires no HTML surgery; modifying existing markup risks breaking layout
Key Decisions and Trade-offs
Injection Point: We inject into <head> rather than <body>. Google reads JSON-LD from either location, so this is a maintainability choice: it keeps the structured data grouped with the page's other metadata and away from layout markup.
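A head-only injection is simple to make safe to re-run. The sketch below is an illustration under assumptions, not the production script: inject_into_head is a hypothetical name, the guard skips any page that already has a JSON-LD block (a deliberately coarse check), and the closing tag is matched case-insensitively.

```python
import json
import re

HEAD_CLOSE = re.compile(r"</head\s*>", re.IGNORECASE)
JSONLD_OPEN = re.compile(
    r"<script[^>]*type=[\"']application/ld\+json[\"']", re.IGNORECASE
)


def inject_into_head(html, schema):
    """Insert a JSON-LD script tag before </head>, once per page."""
    if JSONLD_OPEN.search(html):
        return html  # already carries JSON-LD; leave the page untouched
    match = HEAD_CLOSE.search(html)
    if match is None:
        raise ValueError("no closing </head> tag found")
    # Escape "</" so a "</script>" inside the data cannot end the tag early.
    payload = json.dumps(schema, indent=2).replace("</", "<\\/")
    tag = f'<script type="application/ld+json">\n{payload}\n</script>\n'
    return html[:match.start()] + tag + html[match.start():]


page = "<html><HEAD><title>Concert</title></HEAD><body></body></html>"
once = inject_into_head(page, {"@context": "https://schema.org", "@type": "Event"})
twice = inject_into_head(once, {"@context": "https://schema.org", "@type": "Event"})
print(once == twice)
```

Slicing around the match, rather than substituting with re.sub, avoids having to escape backslashes in the JSON payload.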
Script Type: application/ld+json is the standard; older formats like RDFa or microdata require attribute changes throughout the HTML, increasing deployment risk.
Batch