Injecting Structured Data into Concert Event Pages: A JSON-LD Schema Deployment Strategy
During a recent audit of the JADA event subdomain infrastructure, we discovered that 12 active concert event pages across multiple subdomains were missing critical structured data markup. This absence meant search engines couldn't automatically parse event details, pricing, dates, or venue information—leaving significant SEO value and rich snippet potential on the table. This post details how we identified the gap, built an automated injection system, and deployed the changes across a distributed CloudFront + S3 architecture.
The Problem: Missing Schema Markup at Scale
Our event subdomain structure hosts multiple concert event sites, each with its own S3 bucket and CloudFront distribution:
- paulsimonradyshell.com (S3 bucket: paulsimonradyshell.com)
- sailjada.queenofsandiego.com (S3 bucket: sailjada.queenofsandiego.com)
- Additional event subdomains under the Rady Shell umbrella
Google Search Console reports and manual inspection of these pages showed no JSON-LD structured data at all: no Event, LocalBusiness, or Organization schema. Without this markup, Google has to infer event details from page content alone, which is unreliable and forfeits the rich results that can improve click-through rates.
Solution: Automated Structured Data Injection
Step 1: Audit and Inventory
We first listed all HTML files in the event site repositories:
find /Users/cb/Documents/repos/sites/queenofsandiego.com/rady-shell-events -name "*.html" -type f
This identified 12 active event pages requiring schema injection. We then spot-checked a few with curl to confirm the absence of <script type="application/ld+json"> blocks:
curl -s https://paulsimonradyshell.com/index.html | grep -i "ld+json" || echo "No structured data found"
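The same check can be run against the local checkout before anything is deployed. The following is a minimal sketch, not the audit tool we used: the function name find_pages_missing_schema and the regular expression are illustrative assumptions, and a regex match is only a rough signal that a JSON-LD block exists.

```python
import re
from pathlib import Path

# Matches an opening <script> tag whose type attribute is application/ld+json.
# Assumption: a regex is good enough for an inventory pass; it is not a parser.
LD_JSON_RE = re.compile(
    r"<script[^>]*type\s*=\s*[\"']application/ld\+json[\"']",
    re.IGNORECASE,
)


def find_pages_missing_schema(root):
    """Return sorted paths of .html files under root with no JSON-LD block."""
    missing = []
    for path in sorted(Path(root).rglob("*.html")):
        text = path.read_text(encoding="utf-8", errors="replace")
        if not LD_JSON_RE.search(text):
            missing.append(path)
    return missing


if __name__ == "__main__":
    # Point this at the site checkout, e.g. the rady-shell-events directory.
    for page in find_pages_missing_schema("."):
        print(f"No structured data found: {page}")
```

Running it from the repository root prints one line per page that still needs markup, which makes it easy to re-run after injection to confirm the list is empty.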
Step 2: Build the Injection Script
We created /Users/cb/Documents/repos/tools/inject_structured_data.py, a Python utility that:
- Parses HTML files using BeautifulSoup
- Detects if structured data already exists (to avoid duplication)
- Generates Event and LocalBusiness JSON-LD schemas from page metadata
- Injects the script block into the document <head> (before the closing tag)
- Writes the modified HTML back to disk
The script logic follows this pattern:
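import json

from bs4 import BeautifulSoup


def inject_schema(html_file_path, event_data):
    """
    Parse HTML, inject Event JSON-LD, return modified content.
    Only the Event block is shown here.
    Args:
        html_file_path: Path to .html file
        event_data: Dict with name, date, venue, address, booking_url, price
    Returns:
        Modified HTML string, or None if JSON-LD is already present
    """
    # Read bytes so BeautifulSoup can detect the encoding; the context
    # manager closes the file handle instead of leaking it
    with open(html_file_path, 'rb') as f:
        soup = BeautifulSoup(f, 'html.parser')
    # Skip pages that already carry structured data, to avoid duplicates
    if soup.find('script', {'type': 'application/ld+json'}):
        return None
    # Fail loudly rather than raising AttributeError on soup.head below
    if soup.head is None:
        raise ValueError(f"{html_file_path} has no <head> element")
    # Build Event schema; startDate and endDate should be ISO 8601 strings
    event_schema = {
        "@context": "https://schema.org",
        "@type": "Event",
        "name": event_data['name'],
        "startDate": event_data['date'],
        "endDate": event_data['date'],
        "location": {
            "@type": "Place",
            "name": event_data['venue'],
            "address": event_data['address']
        },
        "offers": {
            "@type": "Offer",
            "url": event_data['booking_url'],
            "price": event_data['price'],
            "priceCurrency": "USD",
            "availability": "https://schema.org/PreOrder"
        }
    }
    script_tag = soup.new_tag('script', type='application/ld+json')
    script_tag.string = json.dumps(event_schema)
    # Appending to <head> places the block just before </head>
    soup.head.append(script_tag)
    # prettify() already returns a str
    return soup.prettify()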
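The pattern above builds only the Event block. A LocalBusiness block could be generated along the same lines; the sketch below is one possible shape, not the contents of inject_structured_data.py. The function names and the keys of the venue dict are hypothetical, and the sample values are placeholders rather than real venue data.

```python
import json


def build_local_business_schema(venue):
    """Build a LocalBusiness JSON-LD dict from a venue dict (hypothetical keys)."""
    schema = {
        "@context": "https://schema.org",
        "@type": "LocalBusiness",
        "name": venue["name"],
        "address": {
            "@type": "PostalAddress",
            "streetAddress": venue["street"],
            "addressLocality": venue["city"],
            "addressRegion": venue["region"],
            "postalCode": venue["postal_code"],
            "addressCountry": venue["country"],
        },
    }
    # Optional properties are emitted only when the venue dict supplies them.
    for key in ("url", "telephone"):
        if venue.get(key):
            schema[key] = venue[key]
    return schema


def to_script_block(schema):
    """Serialize a schema dict as the text of a JSON-LD script element."""
    # "\/" is a valid JSON escape for "/", so this keeps the JSON intact while
    # preventing a literal "</script>" in the data from closing the tag early.
    payload = json.dumps(schema, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


if __name__ == "__main__":
    placeholder_venue = {
        "name": "Example Venue",
        "street": "1 Example Street",
        "city": "San Diego",
        "region": "CA",
        "postal_code": "00000",
        "country": "US",
    }
    print(to_script_block(build_local_business_schema(placeholder_venue)))
```

Keeping each schema in its own builder function means a page can carry an Event block, a LocalBusiness block, or both, without the builders knowing about each other.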
Why JSON-LD in the head? We chose JSON-LD over microdata or RDFa because it's easier to maintain in templates, keeps the markup out of the visible page structure, and is the format Google recommends. Google reads JSON-LD from either the <head> or the <body>; we place it in the <head> so it sits in one predictable location, separate from page content.
Infrastructure: S3 + CloudFront Deployment Pipeline
After injection, we deployed the modified files using our existing sync infrastructure:
S3 Bucket Targets
Each event subdomain maps to a dedicated S3 bucket:
- paulsimonradyshell.com: S3 bucket paulsimonradyshell.com, CloudFront distribution E1ABC2DEF3GHIJ
- sailjada.queenofsandiego.com: S3 bucket sailjada.queenofsandiego.com, CloudFront distribution E2XYZ3QWER4ASDF
We synced changes with:
aws s3 sync /Users/cb/Documents/repos/sites/queenofsandiego.com/rady-shell-events/paulsimon/ \
    s3://paulsimonradyshell.com/ \
    --profile production \
    --delete \
    --exclude ".git/*" \
    --exclude "*.pyc"
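Because --delete removes bucket objects that are absent locally, it can be worth previewing each sync first. The sketch below assembles the same command with the AWS CLI's --dryrun flag so the planned uploads and deletions can be reviewed before anything changes. The function name build_sync_command is a hypothetical helper, not part of our tooling, and the local path in the demo is a placeholder.

```python
import shlex


def build_sync_command(local_dir, bucket, profile="production", dry_run=True):
    """Return the aws s3 sync argument list for one site.

    With dry_run=True the CLI only reports what it would upload or delete.
    """
    cmd = [
        "aws", "s3", "sync", local_dir, f"s3://{bucket}/",
        "--profile", profile,
        "--delete",
        "--exclude", ".git/*",
        "--exclude", "*.pyc",
    ]
    if dry_run:
        cmd.append("--dryrun")
    return cmd


if __name__ == "__main__":
    # Placeholder local path; substitute the real site directory.
    cmd = build_sync_command("path/to/paulsimon/", "paulsimonradyshell.com")
    # Print the command for review; pass cmd to subprocess.run(cmd, check=True)
    # to execute it once the AWS CLI and profile are configured.
    print(shlex.join(cmd))
```

Passing the command as a list rather than a shell string avoids quoting problems with the --exclude patterns.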
CloudFront Cache Invalidation
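So that crawlers and visitors are served the updated pages rather than stale cached copies, we invalidated the CloudFront cache for each distribution. Invalidation clears CloudFront's edge caches only; copies already held in browser caches expire according to their own cache headers.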
aws cloudfront create-invalidation \
    --distribution-id E1ABC2DEF3GHIJ \
    --paths "/*" \
    --profile production
We repeated this for each event subdomain. The /* path invalidates every object in the distribution, not only the HTML files, so edge locations refetch each object from the S3 origin the next time it is requested.
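When only a handful of pages change, a narrower invalidation keeps unchanged assets such as images and stylesheets cached at the edge. The sketch below maps changed local files to invalidation paths and assembles the CLI call. Both function names are hypothetical, the rule that index.html is also served at "/" assumes a default root object is configured on the distribution, and paths containing special characters would additionally need URL encoding.

```python
from pathlib import Path


def invalidation_paths(site_root, changed_files):
    """Map changed local files to CloudFront invalidation paths."""
    root = Path(site_root)
    paths = set()
    for changed in changed_files:
        rel = Path(changed).relative_to(root).as_posix()
        paths.add("/" + rel)
        # Assumption: the distribution serves index.html as its root object.
        if rel == "index.html":
            paths.add("/")
    return sorted(paths)


def build_invalidation_command(distribution_id, paths, profile="production"):
    """Return the aws cloudfront create-invalidation argument list."""
    return [
        "aws", "cloudfront", "create-invalidation",
        "--distribution-id", distribution_id,
        "--paths", *paths,
        "--profile", profile,
    ]


if __name__ == "__main__":
    paths = invalidation_paths("site", ["site/index.html", "site/events/a.html"])
    print(build_invalidation_command("E1ABC2DEF3GHIJ", paths))
```

For a one-off change touching every page, as in this rollout, the wildcard remains the simpler choice.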
Key Decisions and Rationale
1. Why Automate Rather Than Manual Injection?
With 12 pages across 4+ subdomains, manual copy-paste was error-prone and unsustainable. An automated script ensures:
- Consistency: Same schema structure across all pages
- Auditability: Git history tracks exactly what changed
- Repeatability: Future events can use the same tool
- Validation: The script can verify schema validity before writing
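The validation point can be made concrete with a small pre-write check. This is a sketch under stated assumptions rather than the script's actual logic: validate_event_schema is a hypothetical name, the list of required fields is our reading of Google's Event guidance and should be checked against the current documentation, and datetime.fromisoformat accepts only a subset of ISO 8601 on Python versions before 3.11.

```python
from datetime import datetime

# Assumption: fields treated as required for an Event rich result.
REQUIRED_EVENT_FIELDS = ("name", "startDate", "location")


def validate_event_schema(schema):
    """Return a list of problems; an empty list means the basic checks passed."""
    problems = []
    if schema.get("@type") != "Event":
        problems.append("@type must be 'Event'")
    for field in REQUIRED_EVENT_FIELDS:
        if not schema.get(field):
            problems.append(f"missing required field: {field}")
    start = schema.get("startDate")
    if start:
        try:
            # Accepts forms like 2025-06-01 or 2025-06-01T19:30:00-07:00.
            datetime.fromisoformat(start)
        except (TypeError, ValueError):
            problems.append(f"startDate is not ISO 8601: {start!r}")
    return problems


if __name__ == "__main__":
    candidate = {
        "@type": "Event",
        "name": "Example Concert",
        "startDate": "June 1",
        "location": {"@type": "Place", "name": "Example Venue"},
    }
    for problem in validate_event_schema(candidate):
        print(problem)
```

A check like this catches malformed input before it reaches a page; it does not replace testing the deployed markup with Google's Rich Results Test.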
2. Why JSON-LD and Not Microdata?
We chose JSON-LD because:
- Clean separation of concerns: Markup lives in <head>, not mixed into HTML structure
- Easier template integration: Rendering tools like render_event_sites.py can inject JSON without parsing/modifying the body