Injecting Structured Data at Scale: Automating JSON-LD Deployment Across Event Subdomains
When you're managing a portfolio of event microsites—each with its own S3 bucket, CloudFront distribution, and Route53 configuration—the gap between "working locally" and "live in production" becomes a critical operational challenge. This post walks through how we automated structured data injection across 12 concert event pages, deployed them to distributed S3 buckets, and invalidated CloudFront caches—all while maintaining consistency across multiple domain structures.
The Problem: Silent SEO Loss on Event Pages
Concert event pages at sailjada.queenofsandiego.com and related event subdomains were ranking but not converting. Analysis revealed zero structured data (JSON-LD) across all active event pages. Search engines couldn't extract event metadata—date, location, ticket availability, performers—even though the HTML contained this information. This meant:
- Google couldn't display rich snippets or event carousels
- Schema.org validation would fail
- No machine-readable event data for knowledge graph enhancement
- Missed opportunity for voice search and assistant integration
The manual fix would have taken hours per site. Instead, we built a script.
Building the Injection Script: inject_structured_data.py
Created at /Users/cb/Documents/repos/tools/inject_structured_data.py, this script does three things:
- Scans all HTML files in event subdomain directories
- Extracts event metadata from page structure (event name, date, venue)
- Injects dual JSON-LD blocks: Event schema and LocalBusiness schema
Why both schemas? Event schema handles ticketing and scheduling; LocalBusiness handles venue identity and local SEO. Search engines use both signals when indexing local events.
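As a rough illustration of the dual-block idea, the sketch below builds both objects from a single metadata dictionary. The function names `build_schemas` and `to_script_tag`, the metadata keys, and every sample value are hypothetical and not taken from inject_structured_data.py; only the property names (`startDate`, `location`, `performer`, `address`, `url`) come from the Schema.org vocabulary.

```python
import json


def build_schemas(meta):
    """Return (event, local_business) JSON-LD dicts for one event page.

    `meta` is a hypothetical dict; the real script's data model may differ.
    """
    venue = {
        "@type": "Place",
        "name": meta["venue_name"],
        "address": meta["venue_address"],
    }
    event = {
        "@context": "https://schema.org",
        "@type": "Event",
        "name": meta["name"],
        "startDate": meta["start_date"],  # ISO 8601 string
        "location": venue,
        "url": meta["page_url"],
    }
    if meta.get("performer"):
        # PerformingGroup is an assumption; a solo artist would be a Person.
        event["performer"] = {"@type": "PerformingGroup", "name": meta["performer"]}
    local_business = {
        "@context": "https://schema.org",
        "@type": "LocalBusiness",
        "name": meta["venue_name"],
        "address": meta["venue_address"],
        "url": meta["site_url"],
    }
    return event, local_business


def to_script_tag(schema):
    """Serialize one schema dict as a JSON-LD <script> element."""
    # Escape "</" so no string value can close the script element early.
    body = json.dumps(schema, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Placeholder values for illustration only.
SAMPLE = {
    "name": "Spring Concert",
    "start_date": "2024-04-20T19:30:00-07:00",
    "venue_name": "Rady Shell",
    "venue_address": "123 Example St, San Diego, CA",
    "page_url": "https://sailjada.queenofsandiego.com/events/concert-2024-spring.html",
    "site_url": "https://sailjada.queenofsandiego.com/",
}
```

Calling `build_schemas(SAMPLE)` yields two independent dictionaries, so each can be rendered with `to_script_tag` and validated on its own.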
The script parses the HTML head tag and injects structured data as the first element after <meta charset>. Placement matters—early insertion ensures validation tools read it first, and it doesn't interfere with analytics or email popup scripts.
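The placement rule can be sketched as follows. This is a minimal version that uses regular expressions rather than a full HTML parser; the function name `inject_after_charset`, the fallback to the opening head tag, and the skip-if-already-present check are assumptions added for illustration, not confirmed behavior of the original script.

```python
import re

# Matches <meta charset="utf-8">, <meta charset=utf-8 />, and similar forms.
CHARSET_RE = re.compile(r"<meta\s+charset\s*=[^>]*>", re.IGNORECASE)
# Matches <head> or <head ...> but not <header>.
HEAD_RE = re.compile(r"<head(\s[^>]*)?>", re.IGNORECASE)


def inject_after_charset(html, script_tags):
    """Insert pre-rendered JSON-LD <script> tags right after <meta charset>.

    Falls back to just after the opening <head> tag when no charset meta
    exists, and returns the input unchanged if the page already has JSON-LD.
    """
    if "application/ld+json" in html:
        return html  # keeps reruns from stacking duplicate blocks
    match = CHARSET_RE.search(html) or HEAD_RE.search(html)
    if match is None:
        raise ValueError("no <head> or <meta charset> tag found")
    block = "\n" + "\n".join(script_tags)
    return html[: match.end()] + block + html[match.end():]
```

Making the function a no-op on pages that already contain JSON-LD means the script can be rerun across the whole portfolio without checking which pages were processed earlier.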
Target Pages: The Event Portfolio
The script targeted 12 HTML files across three event subdomain structures:
- sailjada.queenofsandiego.com (main concerts subdomain)
- Rady Shell event microsites (architecture-specific event pages)
- Related concert event subdomains
Each page contained:
- Event title (extracted from <h1> or page metadata)
- Event date/time (from page content or event-specific data attributes)
- Venue name and address
- Performer/artist information (when applicable)
The script generated schema dynamically rather than hardcoding, ensuring future page additions inherit structured data automatically if they follow the same HTML structure.
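One way the dynamic extraction could look, using only the standard library's `html.parser`. The `data-event-*` attribute convention and the names `EventMetaParser` and `extract_event_meta` are hypothetical; the real pages may expose their metadata differently.

```python
from html.parser import HTMLParser


class EventMetaParser(HTMLParser):
    """Collect the first <h1> text and any data-event-* attributes."""

    def __init__(self):
        super().__init__()
        self.title = None
        self.data = {}
        self._in_h1 = False
        self._h1_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "h1" and self.title is None:
            self._in_h1 = True
        for name, value in attrs:
            # Hypothetical convention: data-event-date, data-event-venue, ...
            if name.startswith("data-event-") and value:
                # First occurrence wins if an attribute repeats on the page.
                self.data.setdefault(name[len("data-event-"):], value)

    def handle_data(self, data):
        if self._in_h1:
            self._h1_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "h1" and self._in_h1:
            self._in_h1 = False
            # Join nested text nodes and collapse runs of whitespace.
            self.title = " ".join("".join(self._h1_parts).split())


def extract_event_meta(html):
    """Return a dict with the page title under "name" plus data-event-* values."""
    parser = EventMetaParser()
    parser.feed(html)
    parser.close()
    return {"name": parser.title, **parser.data}
```

Because nothing here is tied to a particular page, a new event page that follows the same markup conventions would be picked up without changes to the script.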
Infrastructure: S3 Buckets and CloudFront Distribution Coordination
Event subdomains use a distributed deployment model:
| Subdomain | S3 Bucket | CloudFront Distribution |
|---|---|---|
| sailjada.queenofsandiego.com | s3://sailjada-qos-events | E2A1B3C4D5 (example) |
| Rady Shell concert pages | s3://radyshell-events | E2A1B3C4D6 (example) |
| Related event subdomains | s3://event-subdomains-shared | E2A1B3C4D7 (example) |
Each S3 bucket uses the same prefix structure: /events/, with individual event pages named by slug (e.g., /events/concert-2024-spring.html).
Deployment Pipeline
After injection, deployment followed this sequence:
# 1. Sync updated HTML files to S3
aws s3 sync ./sites/sailjada.queenofsandiego.com/events/ s3://sailjada-qos-events/events/ --acl public-read
# 2. Repeat for other event buckets
aws s3 sync ./sites/rady-shell-events/output/ s3://radyshell-events/events/ --acl public-read
# 3. Invalidate CloudFront caches
aws cloudfront create-invalidation --distribution-id E2A1B3C4D5 --paths "/*"
aws cloudfront create-invalidation --distribution-id E2A1B3C4D6 --paths "/*"
aws cloudfront create-invalidation --distribution-id E2A1B3C4D7 --paths "/*"
Why invalidate the entire distribution with /* instead of specific paths? Because:
- Event pages link to each other and share navigation
- Partial invalidation would miss interdependencies
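- CloudFront bills by invalidation path, and a wildcard such as /* counts as a single path no matter how many objects it matches, so one broad invalidation per distribution costs no more than one targeted path and less than 12 separate ones
- Event content changes infrequently, so the cost of repopulating the edge caches after a full invalidation is negligible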
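The same sequence can be driven from Python so that all syncs finish before any invalidation starts. The sketch below is hedged: the paths, bucket names, and placeholder distribution IDs are copied from the commands above, while `build_commands`, `run_all`, and the dry-run default are illustrative additions. Only the two sync sources shown in the post are included, since no local directory is given for the third bucket.

```python
import shlex
import subprocess

# (local directory, S3 destination) pairs taken from the commands above.
SYNC_TARGETS = [
    ("./sites/sailjada.queenofsandiego.com/events/", "s3://sailjada-qos-events/events/"),
    ("./sites/rady-shell-events/output/", "s3://radyshell-events/events/"),
]
# Placeholder distribution IDs from the post, not real ones.
DISTRIBUTION_IDS = ["E2A1B3C4D5", "E2A1B3C4D6", "E2A1B3C4D7"]


def build_commands(sync_targets, distribution_ids):
    """Return the AWS CLI invocations as argument lists, syncs first."""
    commands = [
        ["aws", "s3", "sync", src, dest, "--acl", "public-read"]
        for src, dest in sync_targets
    ]
    commands += [
        ["aws", "cloudfront", "create-invalidation",
         "--distribution-id", dist_id, "--paths", "/*"]
        for dist_id in distribution_ids
    ]
    return commands


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops the pipeline at the first failing command.
            subprocess.run(cmd, check=True)


COMMANDS = build_commands(SYNC_TARGETS, DISTRIBUTION_IDS)
```

Running `run_all(COMMANDS)` only prints the plan; passing `dry_run=False` executes it and requires the AWS CLI with valid credentials. Passing argument lists to `subprocess.run` avoids shell expansion of the `/*` path.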
Schema Validation and Testing
Before production deployment, we validated structured data using:
- Google Rich Results Test: Verified Event schema recognition and rich snippet rendering
- Schema.org Validator: Confirmed JSON-LD syntax and required properties
- Curl + jq parsing: Automated validation of injected blocks
This caught a critical issue: the LocalBusiness block from our first pass omitted the url property. The fix was a one-line change: adding the parent domain URL to each schema block.
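A check of this kind can also be automated in Python. The sketch below is a stand-in for the curl + jq step, not a reproduction of it: it works on HTML that has already been fetched, and the `EXPECTED` property sets are an assumption for illustration, not an official list of required fields.

```python
import json
import re

LD_JSON_RE = re.compile(
    r'<script[^>]*type=["\']application/ld\+json["\'][^>]*>(.*?)</script>',
    re.IGNORECASE | re.DOTALL,
)

# Properties this sketch insists on; an assumption, not an official list.
EXPECTED = {
    "Event": {"name", "startDate", "location"},
    "LocalBusiness": {"name", "address", "url"},
}


def check_page(html):
    """Return a list of human-readable problems found in a page's JSON-LD."""
    problems = []
    found = set()
    for raw in LD_JSON_RE.findall(html):
        try:
            block = json.loads(raw)
        except json.JSONDecodeError as exc:
            problems.append(f"invalid JSON-LD: {exc.msg}")
            continue
        kind = block.get("@type") if isinstance(block, dict) else None
        if not isinstance(kind, str):
            problems.append("block has no string @type")
            continue
        found.add(kind)
        # Report each expected property that the block lacks.
        for prop in sorted(EXPECTED.get(kind, set()) - set(block)):
            problems.append(f"{kind} is missing {prop}")
    # Report schema types that never appeared on the page.
    for kind in sorted(set(EXPECTED) - found):
        problems.append(f"no {kind} block found")
    return problems
```

An empty list means the page passed; a LocalBusiness block without `url` would be reported as `LocalBusiness is missing url`, which is the kind of omission described above.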
Key Decision: Dual Schema Approach
We chose to inject both Event and LocalBusiness schemas rather than picking one. Here's why:
- Event schema: Optimizes for Google Search event carousels, ticketing platforms, and event aggregators
- LocalBusiness schema: Signals venue identity for local search and maps integration
- Signal redundancy: If search engines ignore one schema, the other provides fallback structured data
- Future flexibility: Enables