Injecting Structured Data into Event Pages: Automating Schema.org Implementation Across Concert Subdomains
When auditing the concert event pages across our subdomain network (Rady Shell, Paul Simon, etc.), we discovered a critical SEO gap: none of the event pages contained structured data markup. While the pages themselves were beautifully designed with rich content, search engines were parsing them as plain HTML with no semantic understanding of events, dates, locations, or pricing. This post details how we automated structured data injection across 12 event pages and deployed the changes to production.
The Problem: Missing Schema.org Event Data
Our event subdomain architecture spans multiple S3-backed CloudFront distributions:
- sailjada.queenofsandiego.com (concert redirect pages)
- Rady Shell event pages across dedicated subdomains
- Paul Simon concert subdomain at paulsimonradyshell.com
Manual inspection of the HTML revealed that while each page contained event details (date, time, location, ticket URL), there was zero JSON-LD structured data. This meant Google's crawlers couldn't extract:
- Event type and name
- Start/end dates and times
- Venue information (LocalBusiness schema)
- Ticket pricing and availability
- Organizer details
The impact: event pages weren't eligible for Google's rich event snippets, which likely suppressed organic traffic and prevented our events from appearing in Google's event carousel.
Technical Approach: Automated Injection via Python Script
Rather than manually editing 12+ HTML files, we built a reusable injection script at:
/Users/cb/Documents/repos/tools/inject_structured_data.py
The script performs these steps:
- Parse HTML files: Locate the <head> tag in each event page
- Generate JSON-LD blocks: Create two schema objects:
  - Event schema: schema.org/Event with date, time, location, URL
  - LocalBusiness schema: schema.org/LocalBusiness for the venue (Rady Shell)
- Insert before closing </head>: Place structured data after meta tags, before body
- Validate output: Ensure JSON is valid and head structure is preserved
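The "Generate JSON-LD blocks" step might look roughly like the sketch below. It is not the contents of inject_structured_data.py: the function names, the choice of fields, and the nested Place used for the event location are all assumptions made for illustration, and any values passed in would come from each page's own event details.

```python
import json


def build_event_jsonld(name, start_date, venue_name, venue_address, url):
    """Return a schema.org Event as a plain dict (hypothetical helper)."""
    return {
        "@context": "https://schema.org",
        "@type": "Event",
        "name": name,
        "startDate": start_date,  # ISO 8601 string with a UTC offset
        "location": {
            "@type": "Place",
            "name": venue_name,
            "address": venue_address,
        },
        "url": url,
    }


def build_venue_jsonld(name, address, url):
    """Return a schema.org LocalBusiness for the venue (hypothetical helper)."""
    return {
        "@context": "https://schema.org",
        "@type": "LocalBusiness",
        "name": name,
        "address": address,
        "url": url,
    }


def to_script_tag(data):
    """Serialize a dict into a JSON-LD <script> element."""
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so a stray "</script>" inside a string value cannot
    # terminate the element early; "\/" is a legal JSON escape for "/".
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'
```

Building the objects as dicts and serializing them with json.dumps, rather than concatenating strings, means the output is valid JSON by construction.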
Key design decisions:
- JSON-LD over RDFa/Microdata: JSON-LD is recommended by Google, easier to maintain in templates, and doesn't clutter HTML markup
- Two schema objects: Event describes the concert; LocalBusiness describes Rady Shell, giving search engines a distinct entity for the venue that they may associate with its reviews and photos
- Idempotent injection: Script checks for existing structured data to avoid duplicates on re-runs
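As a minimal sketch of the idempotent insertion, assuming the script treats any existing application/ld+json block as "already injected" (the actual check in the script may be stricter), the logic could be written as follows. The function name and return shape are hypothetical.

```python
import re

JSONLD_MARKER = "application/ld+json"
HEAD_CLOSE = re.compile(r"</head\s*>", re.IGNORECASE)


def inject_into_head(html, script_tags):
    """Insert script tags just before </head>.

    Returns (new_html, changed). If the page already contains JSON-LD,
    the input is returned untouched so re-runs never duplicate markup.
    """
    if JSONLD_MARKER in html:
        return html, False

    match = HEAD_CLOSE.search(html)
    if match is None:
        raise ValueError("no closing </head> tag found")

    block = "\n".join(script_tags) + "\n"
    # Splice the block in immediately before the closing tag.
    return html[:match.start()] + block + html[match.start():], True
```

Returning a changed flag lets the caller skip rewriting files that were left untouched, which keeps file modification times stable for unchanged pages.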
Identifying Files and Building the Inventory
Before running the script, we catalogued all active event pages:
$ find /Users/cb/Documents/repos/sites -name "*.html" -path "*event*" -type f
/Users/cb/Documents/repos/sites/queenofsandiego.com/rady-shell-events/*/index.html
/Users/cb/Documents/repos/sites/sailjada.queenofsandiego.com/ranch-and-coast.html
We then checked each for existing structured data:
$ grep -l "application/ld+json" /Users/cb/Documents/repos/sites/*/rady-shell-events/*/*.html
Result: Zero pages had structured data. The script was run against all 12 active pages.
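The same inventory check can be done from Python, which is convenient if the result feeds straight into the injection step. This is a sketch under the assumption that every page is a UTF-8 .html file under one root directory; the function name is hypothetical.

```python
from pathlib import Path


def pages_missing_jsonld(root):
    """Return the .html files under root that contain no JSON-LD block."""
    missing = []
    for path in sorted(Path(root).rglob("*.html")):
        text = path.read_text(encoding="utf-8", errors="replace")
        if "application/ld+json" not in text:
            missing.append(path)
    return missing
```

Unlike grep -l, which lists files that match, this returns the files that do not match, which is the list the script actually needs to process.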
Infrastructure: S3 & CloudFront Deployment Pipeline
After injection, we deployed updated pages using a multi-step process to ensure availability:
Step 1: Identify S3 Buckets
Each subdomain is served from a dedicated S3 bucket. We located them via AWS CLI queries:
$ aws s3 ls | grep "rady-shell\|paul-simon"
Bucket targets included:
- sailjada.queenofsandiego.com (hosts redirect pages)
- paulsimonradyshell.com (dedicated Paul Simon event bucket)
- Event-specific buckets for Rady Shell concert pages
Step 2: Sync to S3
$ aws s3 sync /Users/cb/Documents/repos/sites/sailjada.queenofsandiego.com/ s3://sailjada.queenofsandiego.com/ --delete
$ aws s3 sync /Users/cb/Documents/repos/sites/queenofsandiego.com/rady-shell-events/ s3://rady-shell-bucket/ --delete
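The --delete flag removes objects from the bucket that no longer exist in the local source directory, so pages deleted locally don't linger in production. Changed files are overwritten by the sync itself, with or without the flag.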
Step 3: Invalidate CloudFront Distributions
Updating S3 alone doesn't change what end users see: CloudFront keeps serving cached copies from its edge locations until they expire or are invalidated. We invalidated the cache:
$ aws cloudfront create-invalidation --distribution-id E2ABC123DEFG --paths "/*"
$ aws cloudfront create-invalidation --distribution-id E1XYZ456HIJK --paths "/paul-simon/*"
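Using /* invalidates every cached object in the distribution, so the next request for each one is re-fetched from S3. For large sites, a narrower path (e.g., /rady-shell-events/*) keeps unrelated objects cached and preserves the cache hit ratio. Two details are worth knowing: CloudFront only accepts the * wildcard as the last character of an invalidation path, and invalidations are billed per path submitted, so a single wildcard path counts as one path no matter how many objects it matches.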
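For scripted deployments, the same invalidation can be issued through boto3 instead of the CLI. The sketch below assumes boto3 is installed and AWS credentials are configured; the helper names are hypothetical, and the distribution ID is whatever the caller passes in. The request body is built in a separate pure function so it can be inspected without touching AWS.

```python
import uuid


def build_invalidation_batch(paths, caller_reference=None):
    """Build the InvalidationBatch structure for create_invalidation."""
    for path in paths:
        if not path.startswith("/"):
            raise ValueError(f"invalidation path must start with '/': {path!r}")
    return {
        "Paths": {"Quantity": len(paths), "Items": list(paths)},
        # CallerReference must be unique per request; a UUID is one simple way.
        "CallerReference": caller_reference or f"jsonld-deploy-{uuid.uuid4()}",
    }


def invalidate(distribution_id, paths):
    """Submit the invalidation (requires boto3 and AWS credentials)."""
    import boto3  # third-party; imported lazily so the builder works without it

    client = boto3.client("cloudfront")
    return client.create_invalidation(
        DistributionId=distribution_id,
        InvalidationBatch=build_invalidation_batch(paths),
    )
```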
Why CloudFront? It provides:
- Global edge caching (reduces latency for international users)
- DDoS protection via AWS Shield Standard
- HTTPS/TLS termination
- Compression (Gzip/Brotli)
Verification and Search Console Integration
After deployment, we validated the changes:
$ curl -s https://sailjada.queenofsandiego.com/ranch-and-coast.html | grep "application/ld+json"
Expected output: JSON-LD blocks present in the HTML source.
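A grep only confirms that the script tag exists. To also confirm that each block parses as JSON, a small standard-library check along these lines could be used. It is a sketch, not part of our tooling as described above, and the class and function names are hypothetical.

```python
import json
from html.parser import HTMLParser
from urllib.request import urlopen


class JsonLdExtractor(HTMLParser):
    """Collect and parse every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content can arrive in several chunks, so buffer it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            # json.loads raises ValueError if the block is not valid JSON.
            self.blocks.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


def extract_jsonld(html):
    """Return the parsed JSON-LD objects found in an HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.blocks


def fetch_jsonld(url):
    """Download a live page and return its parsed JSON-LD blocks."""
    with urlopen(url, timeout=10) as response:
        html = response.read().decode("utf-8", errors="replace")
    return extract_jsonld(html)
```

Calling fetch_jsonld on a deployed page URL and checking that the returned list contains both an Event and a LocalBusiness object covers the "Validate output" step against what CloudFront is actually serving.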
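We then submitted the updated pages to Google Search Console via the URL Inspection tool to request re-crawling and re-indexing. Detected structured data typically shows up in Search Console's rich result reports within a few days, though the timing depends on Google's crawl schedule and is not guaranteed.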
Template Integration for Future Events
To prevent this issue from recurring, we updated the event page generation templates:
/Users/cb/Documents/repos/sites/queenofsandiego.com/rady-shell-events/tools/render_event_sites.py
The rendering script now includes structured data generation before HTML output, ensuring all future concert pages are born with schema markup.