Injecting Structured Data at Scale: Automating JSON-LD Deployment Across Event Subdomains

When you're managing a portfolio of event microsites—each with its own S3 bucket, CloudFront distribution, and Route53 configuration—the gap between "working locally" and "live in production" becomes a critical operational challenge. This post walks through how we automated structured data injection across 12 concert event pages, deployed them to distributed S3 buckets, and invalidated CloudFront caches—all while maintaining consistency across multiple domain structures.

The Problem: Silent SEO Loss on Event Pages

Concert event pages at sailjada.queenofsandiego.com and related event subdomains were ranking but not converting. Analysis revealed zero structured data (JSON-LD) across all active event pages. Search engines couldn't extract event metadata—date, location, ticket availability, performers—even though the HTML contained this information. This meant:

Google couldn't display rich snippets or event carousels
Schema.org validators reported no structured data to check
No machine-readable event data for knowledge graph enhancement
Missed opportunity for voice search and assistant integration

The manual fix would have taken hours per site. Instead, we built a script.

Building the Injection Script: `inject_structured_data.py`

Created at /Users/cb/Documents/repos/tools/inject_structured_data.py, this script does three things:

Scans all HTML files in event subdomain directories
Extracts event metadata from page structure (event name, date, venue)
Injects dual JSON-LD blocks: Event schema and LocalBusiness schema

Why both schemas? Event schema handles ticketing and scheduling; LocalBusiness handles venue identity and local SEO. Search engines use both signals when indexing local events.
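
To make the two blocks concrete, here is a minimal sketch of how they might be built as Python dictionaries and serialized into script tags. The function names and the choice of properties are assumptions for illustration and are not taken from inject_structured_data.py.

```python
import json


def build_event_schema(name, start_date, venue_name, venue_address, url):
    """Return a schema.org Event as a plain dict (hypothetical helper)."""
    return {
        "@context": "https://schema.org",
        "@type": "Event",
        "name": name,
        "startDate": start_date,  # ISO 8601 date or date-time string
        "url": url,
        "location": {
            "@type": "Place",
            "name": venue_name,
            "address": venue_address,
        },
    }


def build_local_business_schema(name, address, url):
    """Return a schema.org LocalBusiness for the venue (hypothetical helper)."""
    return {
        "@context": "https://schema.org",
        "@type": "LocalBusiness",
        "name": name,
        "address": address,
        "url": url,
    }


def to_script_tag(schema):
    """Serialize a schema dict into a JSON-LD script tag."""
    # Escape "</" so a value containing "</script>" cannot close the tag early.
    # "\/" is a valid JSON escape, so parsers still recover the original text.
    payload = json.dumps(schema, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'
```

Calling to_script_tag once per schema yields two independent blocks, so either one can be validated or removed without touching the other.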

The script locates the <head> element and inserts the structured data immediately after the <meta charset> tag. Search engines read JSON-LD wherever it appears in the page, so the placement is mainly about hygiene: inserting it early keeps the blocks easy to find and out of the way of the analytics and email popup scripts.
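
The insertion step could look roughly like the sketch below. It assumes a regex-based search is good enough for these pages, falls back to the opening <head> tag when no <meta charset> is present, and skips pages that already contain a JSON-LD block. The fallback and the idempotency check are our assumptions, and inject_after_charset is a hypothetical name.

```python
import re

# Matches <meta charset="utf-8"> and similar spellings.
META_CHARSET_RE = re.compile(r"<meta\s+charset\s*=\s*[^>]*>", re.IGNORECASE)
# Matches <head> or <head ...> but not <header>.
HEAD_OPEN_RE = re.compile(r"<head(?:\s[^>]*)?>", re.IGNORECASE)
JSON_LD_MARKER = 'type="application/ld+json"'


def inject_after_charset(html, script_tags):
    """Insert script_tags right after <meta charset>, or after <head> as a fallback."""
    if JSON_LD_MARKER in html:
        return html  # already injected; running the script twice changes nothing
    match = META_CHARSET_RE.search(html) or HEAD_OPEN_RE.search(html)
    if match is None:
        raise ValueError("no <meta charset> or <head> tag found")
    pos = match.end()
    return html[:pos] + "\n" + script_tags + html[pos:]
```

Because the function returns the page unchanged when a JSON-LD block already exists, it is safe to rerun across the whole portfolio after adding new pages.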

Target Pages: The Event Portfolio

The script targeted 12 HTML files across three event subdomain structures:

sailjada.queenofsandiego.com (main concerts subdomain)
Rady Shell event microsites (architecture-specific event pages)
Related concert event subdomains

Each page contained:

Event title (extracted from <h1> or page metadata)
Event date/time (from page content or event-specific data attributes)
Venue name and address
Performer/artist information (when applicable)

The script generates each schema dynamically rather than hardcoding it, so pages added later inherit structured data automatically as long as they follow the same HTML structure.
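
As an illustration of the extraction step, the sketch below pulls the event name from the first <h1> and reads any attributes prefixed with data-event-. That attribute prefix is an assumption made for this example; the real pages may use different attribute names.

```python
from html.parser import HTMLParser


class EventMetadataParser(HTMLParser):
    """Collect the first <h1> text and any data-event-* attributes (assumed naming)."""

    def __init__(self):
        super().__init__()
        self.title = None
        self.data = {}
        self._in_h1 = False
        self._h1_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "h1" and self.title is None:
            self._in_h1 = True
        for key, value in attrs:
            # data-event-date="..." becomes {"date": "..."}; first occurrence wins.
            if key.startswith("data-event-") and value:
                self.data.setdefault(key[len("data-event-"):], value)

    def handle_data(self, data):
        if self._in_h1:
            self._h1_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "h1" and self._in_h1:
            self._in_h1 = False
            # Collapse whitespace left over from nested inline tags.
            self.title = " ".join("".join(self._h1_parts).split())


def extract_event_metadata(html):
    """Return a dict with the page's event name plus any data-event-* values."""
    parser = EventMetadataParser()
    parser.feed(html)
    parser.close()
    return {"name": parser.title, **parser.data}
```

A page that lacks an <h1> yields a name of None, which gives the caller an easy signal to skip or flag the file instead of emitting an incomplete schema.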

Infrastructure: S3 Buckets and CloudFront Distribution Coordination

Event subdomains use a distributed deployment model:

Subdomain	S3 Bucket	CloudFront Distribution
`sailjada.queenofsandiego.com`	s3://sailjada-qos-events	E2A1B3C4D5 (example)
Rady Shell concert pages	s3://radyshell-events	E2A1B3C4D6 (example)
Related event subdomains	s3://event-subdomains-shared	E2A1B3C4D7 (example)

Each S3 bucket uses the same prefix structure: /events/, with individual event pages named by slug (e.g., /events/concert-2024-spring.html).
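
One way to keep this mapping in code is a small table of records, sketched below. The EventSite class and its methods are hypothetical, and the distribution IDs are the example values from the table above. Note that S3 object keys carry no leading slash, so the /events/ prefix becomes events/ in the key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EventSite:
    """One row of the deployment table (hypothetical structure)."""

    label: str
    bucket: str
    distribution_id: str
    prefix: str = "events/"

    def key_for(self, slug):
        """S3 object key for an event page slug."""
        return f"{self.prefix}{slug}.html"

    def s3_uri(self, slug):
        """Full s3:// URI for an event page slug."""
        return f"s3://{self.bucket}/{self.key_for(slug)}"


SITES = [
    EventSite("sailjada.queenofsandiego.com", "sailjada-qos-events", "E2A1B3C4D5"),
    EventSite("Rady Shell concert pages", "radyshell-events", "E2A1B3C4D6"),
    EventSite("Related event subdomains", "event-subdomains-shared", "E2A1B3C4D7"),
]
```

For example, SITES[0].s3_uri("concert-2024-spring") gives s3://sailjada-qos-events/events/concert-2024-spring.html.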

Deployment Pipeline

After injection, deployment followed this sequence:

# 1. Sync updated HTML files to S3
aws s3 sync ./sites/sailjada.queenofsandiego.com/events/ s3://sailjada-qos-events/events/ --acl public-read

# 2. Repeat for the other event buckets (the s3://event-subdomains-shared sync follows the same pattern)
aws s3 sync ./sites/rady-shell-events/output/ s3://radyshell-events/events/ --acl public-read

# 3. Invalidate CloudFront caches
aws cloudfront create-invalidation --distribution-id E2A1B3C4D5 --paths "/*"
aws cloudfront create-invalidation --distribution-id E2A1B3C4D6 --paths "/*"
aws cloudfront create-invalidation --distribution-id E2A1B3C4D7 --paths "/*"
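
If the same sequence needs to run for every site, the commands can be generated rather than typed. The sketch below only builds the argument lists, using the same flags as the commands above, and prints them as a dry run. build_deploy_commands is a hypothetical helper, not part of our tooling.

```python
import shlex


def build_deploy_commands(local_dir, bucket, distribution_id, prefix="events/"):
    """Return the sync and invalidation commands for one site as argv lists."""
    sync = [
        "aws", "s3", "sync", local_dir, f"s3://{bucket}/{prefix}",
        "--acl", "public-read",
    ]
    # The invalidation is listed second so the cache is cleared only after
    # the new objects are in the bucket.
    invalidate = [
        "aws", "cloudfront", "create-invalidation",
        "--distribution-id", distribution_id,
        "--paths", "/*",
    ]
    return [sync, invalidate]


if __name__ == "__main__":
    # Dry run: print shell-quoted commands instead of executing them.
    commands = build_deploy_commands(
        "./sites/sailjada.queenofsandiego.com/events/",
        "sailjada-qos-events",
        "E2A1B3C4D5",
    )
    for argv in commands:
        print(shlex.join(argv))
```

To execute rather than print, each list can be passed to subprocess.run(argv, check=True), which stops the sequence if the sync fails before any cache is invalidated.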

Why invalidate the entire distribution with /* instead of specific paths? Because:

Event pages link to each other and share navigation
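Invalidating only the edited pages could leave stale copies of other pages that depend on them
CloudFront bills invalidations per path submitted, and a wildcard path such as /* counts as one path no matter how many files it matches, so three wildcard invalidations cost less than listing all 12 pages individually
Event content changes infrequently, so the cost of refilling the edge cache after a full invalidation is negligible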

Schema Validation and Testing

Before production deployment, we validated structured data using:

Google Rich Results Test: Verified Event schema recognition and rich snippet rendering
Schema.org Validator: Confirmed JSON-LD syntax and required properties
Curl + jq parsing: Automated validation of injected blocks

This caught a gap in our first pass: the LocalBusiness block omitted the url property. The fix was a one-line change that adds the parent domain URL to each schema block.
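
The automated check can also be written in Python instead of curl and jq. The sketch below extracts every JSON-LD block from a page and reports missing properties. The REQUIRED sets are assumptions chosen for this example (they include url for LocalBusiness to mirror the gap described above) and are not an official list of required properties.

```python
import json
from html.parser import HTMLParser

# Assumed minimum property sets, for illustration only.
REQUIRED = {
    "Event": {"name", "startDate", "location"},
    "LocalBusiness": {"name", "address", "url"},
}


class JsonLdCollector(HTMLParser):
    """Collect the raw text of every application/ld+json script tag."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._buffer = None

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._buffer is not None:
            self.blocks.append("".join(self._buffer))
            self._buffer = None


def find_schema_problems(html):
    """Return a list of human-readable problems; an empty list means the page passed."""
    collector = JsonLdCollector()
    collector.feed(html)
    collector.close()
    problems, seen = [], set()
    for raw in collector.blocks:
        try:
            block = json.loads(raw)
        except json.JSONDecodeError as exc:
            problems.append(f"invalid JSON: {exc.msg}")
            continue
        schema_type = block.get("@type") if isinstance(block, dict) else None
        if not isinstance(schema_type, str):
            problems.append("block has no usable @type")
            continue
        seen.add(schema_type)
        for prop in sorted(REQUIRED.get(schema_type, set()) - set(block)):
            problems.append(f"{schema_type}: missing {prop}")
    for schema_type in sorted(set(REQUIRED) - seen):
        problems.append(f"{schema_type}: block not found")
    return problems
```

Running this over every generated file before the S3 sync turns an omission like the missing url into a failed check rather than something discovered after deployment.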

Key Decision: Dual Schema Approach

We chose to inject both Event and LocalBusiness schemas rather than picking one. Here's why:

Event schema: Optimizes for Google Search event carousels, ticketing platforms, and event aggregators
LocalBusiness schema: Signals venue identity for local search and maps integration
Signal redundancy: If search engines ignore one schema, the other provides fallback structured data
Future flexibility: Enables