Building a Real-Time Technical Blog Pipeline for Four Sailing Charter Sites
Overview: Granular Session Capture and Auto-Publishing
This session implemented a technical blog system across four domains: tech.queenofsandiego.com, tech.sailjada.com, tech.dangerouscentaur.com, and tech.burialsatseasandiego.com. The goal was an automated pipeline that captures development work at a granular level (file paths, function names, infrastructure changes) and publishes detailed posts without manual intervention and without exposing credentials.
Technical Architecture
Core Components
- Session Transcript Parser (/Users/cb/Documents/repos/tools/tech_blog_generator.py): Reads Claude Code session transcripts in JSONL format, extracts tool use events, file modifications, and command executions, filters sensitive data (credentials, API keys, tokens), and generates HTML blog posts
- Infrastructure Initialization (/Users/cb/Documents/repos/tools/tech_blog_init.py): Provisions S3 buckets, CloudFront distributions, ACM certificates, and DNS records for each tech blog domain, handling three different DNS providers (Route53 for sailjada.com, Namecheap for dangerouscentaur.com, GoDaddy for burialsatseasandiego.com)
- Claude Code Stop Hook (/Users/cb/.claude/hooks/tech_blog_stop.sh): Executes automatically when a Claude Code session ends, extracts the session transcript, generates the blog post, uploads to S3, and invalidates CloudFront cache
- Settings Integration (/Users/cb/.claude/settings.json): Registers the stop hook globally so it runs for all sessions targeting the repos directory
Data Flow
Claude Code Session
↓
Session ends → Stop hook triggered
↓
Parse transcript (JSONL format)
↓
Extract: files modified, commands run, infrastructure changes
↓
Filter sensitive data (regex patterns for creds, keys, tokens)
↓
Generate HTML post (markdown → HTML conversion)
↓
Upload to appropriate S3 bucket
↓
Invalidate CloudFront distribution
↓
DNS already propagated (pre-configured)
↓
Post live at tech.[domain].com
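The last two pipeline steps (upload, then invalidate) can be sketched as follows. This is a minimal illustration, not the actual hook code: the bucket names come from this post, but the `BUCKETS` lookup and the `build_invalidation_batch` helper are hypothetical names introduced here. The returned dict follows the `InvalidationBatch` shape that CloudFront's CreateInvalidation call expects.

```python
import time

# Bucket names are taken from this post; the lookup table itself is a
# hypothetical helper, not part of the real scripts.
BUCKETS = {
    "tech.queenofsandiego.com": "queenofsandiego-tech-blog",
    "tech.sailjada.com": "sailjada-tech-blog",
    "tech.dangerouscentaur.com": "dc-tech-blog",
    "tech.burialsatseasandiego.com": "bats-tech-blog",
}


def build_invalidation_batch(paths, caller_reference=None):
    """Build an InvalidationBatch dict for CloudFront's CreateInvalidation.

    Paths are de-duplicated and given a leading slash, and Quantity is
    derived from the final list so the two can never disagree.
    """
    normalized = sorted({p if p.startswith("/") else "/" + p for p in paths})
    if not normalized:
        raise ValueError("at least one path is required")
    return {
        "Paths": {"Quantity": len(normalized), "Items": normalized},
        # CallerReference must be unique per request; a timestamp is one option.
        "CallerReference": caller_reference or f"tech-blog-{time.time_ns()}",
    }
```

With boto3 installed, the result could be passed as `boto3.client("cloudfront").create_invalidation(DistributionId=..., InvalidationBatch=batch)`; the distribution ID for each site is left out here because only one is named in this post.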
Infrastructure Decisions and Implementation
S3 Bucket Strategy
Each domain gets its own S3 bucket following the naming convention [domain]-tech-blog, with the two longer domain names abbreviated:
- queenofsandiego-tech-blog (Route53 in sailjada.com hosted zone)
- sailjada-tech-blog (Route53 in sailjada.com hosted zone)
- dc-tech-blog (Namecheap DNS, leveraging existing dc-sites wildcard CF distribution pattern)
- bats-tech-blog (GoDaddy DNS via API integration)
All buckets are configured for static website hosting with index.html as the default document. Bucket policies block public reads and grant read access only to the CloudFront OAI (Origin Access Identity), so objects are reachable through CloudFront but not directly from S3.
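A bucket policy of the kind described above might look like the sketch below. The function name and the `EXAMPLEOAIID` value are placeholders introduced for illustration; the real OAI IDs are not given in this post.

```python
import json


def oai_bucket_policy(bucket, oai_id):
    """Return a policy that lets one CloudFront OAI read objects in `bucket`.

    No other principal is granted access, so with S3 Block Public Access
    left on, objects stay unreachable from the public internet.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowCloudFrontOAIRead",
                "Effect": "Allow",
                "Principal": {
                    "AWS": (
                        "arn:aws:iam::cloudfront:user/"
                        f"CloudFront Origin Access Identity {oai_id}"
                    )
                },
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }


# EXAMPLEOAIID is a placeholder, not a real identity.
policy_json = json.dumps(oai_bucket_policy("sailjada-tech-blog", "EXAMPLEOAIID"), indent=2)
```

One caveat worth keeping in mind: an OAI authenticates against the bucket's REST endpoint, whereas the S3 website endpoint only serves publicly readable objects. This sketch therefore assumes the CloudFront origin is the REST endpoint.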
CloudFront Distribution Strategy
Three approaches were used based on existing infrastructure:
- Route53-managed domains (queenofsandiego.com, sailjada.com): New CloudFront distributions created with ACM wildcard certificates (*.queenofsandiego.com, *.sailjada.com) already validated and available. Alias records added to Route53 hosted zones pointing to CloudFront CNAME.
- Namecheap-managed dangerouscentaur.com: Leveraged existing wildcard CloudFront distribution (ID: E2Q4UU71SRNTMB) already pointing to dc-sites bucket by adding a new origin (dc-tech-blog bucket) and cache behaviors for tech.dangerouscentaur.com/* paths.
- GoDaddy-managed burialsatseasandiego.com: Requested an ACM certificate, obtained its DNS validation records, added the validation CNAME to GoDaddy DNS via API, waited for validation, then created a new CloudFront distribution.
Why this approach? Reusing existing wildcard certs and CF distributions where possible reduces overhead and certificate management complexity. GoDaddy required manual API-driven DNS validation because ACM can create validation records automatically only in Route53 hosted zones; with any other DNS provider the validation CNAME has to be added separately.
DNS Configuration
Route53 (sailjada.com hosted zone):
- queenofsandiego.com CNAME → d[distribution-id].cloudfront.net
- sailjada.com CNAME → d[distribution-id].cloudfront.net
Namecheap (dangerouscentaur.com):
- tech CNAME → d[existing-distribution-id].cloudfront.net
GoDaddy (burialsatseasandiego.com):
- ACM validation CNAME: _[random].[domain] → _[validation-token].acm-validations.aws
- tech A record → CloudFront distribution alias (once cert validated)
Session Transcript Parsing: Technical Details
JSONL Format Extraction
Claude Code sessions are stored as JSONL files in ~/.claude/projects. Each line is a JSON object representing an event. The parser filters for:
- tool_use events with name: "bash" → commands executed
- tool_result events containing file paths and operation types (Write, Edit)
- text events → reasoning and decision notes
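The filtering above can be sketched roughly as follows. The transcript schema here is an assumption: each JSONL line is taken to be an object whose `message.content` is a list of blocks with a `type` field, and tool inputs are assumed to carry `command` and `file_path` keys. As a simplification, this sketch reads file paths from the Write/Edit `tool_use` input rather than from `tool_result` events. `extract_events` is a hypothetical name, not the parser's real function.

```python
import json


def extract_events(lines):
    """Collect commands, file operations, and notes from JSONL transcript lines.

    Malformed lines and records without a content list are skipped rather
    than treated as fatal errors.
    """
    commands, files, notes = [], [], []
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            continue
        message = record.get("message") if isinstance(record, dict) else None
        content = message.get("content") if isinstance(message, dict) else None
        if not isinstance(content, list):
            continue
        for block in content:
            if not isinstance(block, dict):
                continue
            kind = block.get("type")
            name = str(block.get("name", ""))
            params = block.get("input")
            if not isinstance(params, dict):
                params = {}
            if kind == "tool_use" and name.lower() == "bash":
                commands.append(params.get("command", ""))
            elif kind == "tool_use" and name.lower() in ("write", "edit"):
                files.append((name, params.get("file_path", "")))
            elif kind == "text":
                notes.append(block.get("text", ""))
    return {"commands": commands, "files": files, "notes": notes}
```

In use, the function would be handed the lines of a session file, for example `extract_events(open(path, encoding="utf-8"))`, and its output would then go through the sensitive-data filter before any HTML is produced.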
Sensitive Data Filtering
Multiple regex patterns filter credentials before publication:
- AWS: Access keys (AKIA*), secret keys (40+ hex chars), API tokens, ARNs with account IDs redacted
- API Keys: Generic patterns for API keys (32+ char hex/alphanumeric), tokens, bearer tokens
- Passwords: Any line containing password, passwd, pwd, secret, token, key
- Email: Partial redaction of email addresses used in code
- Domain credentials: Namecheap, GoDaddy, Route53 tokens
The filter is aggressive: if a line contains sensitive patterns, the entire line is replaced with [sensitive data redacted]. This prevents accidental exposure while maintaining readability for non-sensitive commands.
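A minimal sketch of this whole-line redaction is shown below. The patterns are illustrative stand-ins for a subset of the categories listed above, not the generator's actual regexes, and `redact_lines` is a hypothetical name.

```python
import re

REDACTED = "[sensitive data redacted]"

# Illustrative patterns only; the real filter's regexes are not shown in this post.
SENSITIVE_PATTERNS = [
    re.compile(r"\bAKIA[0-9A-Z]{16}\b"),  # AWS access key IDs
    re.compile(r"password|passwd|pwd|secret|token|key", re.IGNORECASE),  # keyword match
    re.compile(r"\bBearer\s+\S+", re.IGNORECASE),  # bearer tokens in headers
    re.compile(r"\b[0-9a-fA-F]{32,}\b"),  # long hex strings
]


def redact_lines(text):
    """Replace every line that matches any sensitive pattern with a marker."""
    return "\n".join(
        REDACTED if any(p.search(line) for p in SENSITIVE_PATTERNS) else line
        for line in text.split("\n")
    )
```

Because the keyword pattern matches substrings, harmless lines that mention words such as "keyboard" are redacted too. That trade-off is consistent with the aggressive stance described above: over-redaction costs some detail, while a missed credential is published.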
HTML Generation
Posts are generated with this structure:
<h2>[Specific action]</h2>
<h3>What Was Done</h3>
- Bulleted list of files modified/created
- Count: X files
<h3>Commands Executed</h3>
- Filtered command list with brief