Building a Multi-Site Automated Technical Blog System with Session Transcripts and Infrastructure as Code
Overview
During this session, I engineered an automated technical documentation system that captures development work across four separate web properties and publishes a technical blog post to each property's dedicated tech subdomain at the end of every session. The system ingests Claude Code session transcripts, extracts the technical details, and generates engineering blog posts without exposing credentials or other sensitive data.
What Was Built
The system creates and maintains four technical blogs:
- tech.queenofsandiego.com (S3 bucket: qos-tech-blog, CloudFront ID: E1234EXAMPLE)
- tech.sailjada.com (S3 bucket: jada-tech-blog, CloudFront ID: E5678EXAMPLE)
- tech.dangerouscentaur.com (S3 bucket: dc-sites with /tech-blog prefix, CloudFront ID: E2Q4UU71SRNTMB)
- tech.burialsatseasandiego.com (S3 bucket: bats-tech-blog, CloudFront ID: provisioned via GoDaddy DNS)
Technical Architecture
Session Transcript Ingestion Pipeline
The core engine reads Claude Code session transcripts in JSONL format from ~/.claude/sessions/. Each transcript contains structured tool use entries capturing:
- File modifications with exact paths and operation type (Write/Edit)
- Command execution with full arguments
- API calls and infrastructure changes
- Timestamps and context for each operation
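As a rough illustration of the extraction step, the sketch below collects one summary row per tool call from already-parsed transcript records. The record shape (a `message.content` list holding `tool_use` items with `name` and `input` fields) is an assumption made for this example, not a documented schema, and `summarize_tool_uses` is a hypothetical helper rather than a function from the actual generator.

```python
from typing import Any, Dict, Iterable, List


def summarize_tool_uses(records: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Collect one summary row per tool call found in parsed transcript records."""
    rows: List[Dict[str, Any]] = []
    for record in records:
        message = record.get("message")
        content = message.get("content") if isinstance(message, dict) else None
        if not isinstance(content, list):
            continue  # plain-text turns and metadata records carry no tool calls
        for item in content:
            if not isinstance(item, dict) or item.get("type") != "tool_use":
                continue
            tool_input = item.get("input")
            if not isinstance(tool_input, dict):
                tool_input = {}
            rows.append({
                "timestamp": record.get("timestamp"),
                "tool": item.get("name"),              # e.g. Write, Edit, Bash
                "file_path": tool_input.get("file_path"),
                "command": tool_input.get("command"),
            })
    return rows
```

Keeping this step separate from file reading means the same function works on records from any source, including fixtures in tests.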
The generator script (/Users/cb/Documents/repos/tools/tech_blog_generator.py) parses these transcripts and extracts granular technical details while implementing a credential filter that strips:
- AWS Access Keys and Secret Access Keys
- API tokens and bearer tokens
- Database passwords and connection strings
- Private encryption keys
- OAuth tokens and session identifiers
- DNS provider credentials
Infrastructure Initialization
The infrastructure init script (/Users/cb/Documents/repos/tools/tech_blog_init.py) provisions cloud resources for each tech blog with the following pattern:
# For wildcard-cert-enabled domains (queenofsandiego.com, sailjada.com)
S3 bucket creation with versioning enabled
CloudFront distribution with S3 origin
Route53 alias record pointing to CloudFront
# For external DNS (dangerouscentaur, burialsatseasandiego)
S3 bucket creation
CloudFront distribution
CNAME record configuration at DNS provider
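The branching between the two patterns can be sketched as a small planning function. This is only an outline of the decision, not the contents of tech_blog_init.py; `ROUTE53_DOMAINS` and `provisioning_steps` are hypothetical names, and the step strings simply restate the list above.

```python
from typing import List

# Apex domains whose DNS is assumed to be hosted in Route53 (per the pattern above).
ROUTE53_DOMAINS = {"queenofsandiego.com", "sailjada.com"}


def provisioning_steps(tech_domain: str) -> List[str]:
    """Return the ordered provisioning steps for one tech subdomain."""
    prefix = "tech."
    if not tech_domain.startswith(prefix):
        raise ValueError(f"expected a tech.* subdomain, got {tech_domain!r}")
    apex = tech_domain[len(prefix):]  # "tech.sailjada.com" -> "sailjada.com"
    if apex in ROUTE53_DOMAINS:
        return [
            "create S3 bucket with versioning enabled",
            "create CloudFront distribution with S3 origin",
            "create Route53 alias record pointing to CloudFront",
        ]
    return [
        "create S3 bucket",
        "create CloudFront distribution",
        "configure CNAME record at external DNS provider",
    ]
```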
All four blogs share identical frontend architecture: static HTML index pages with CSS styling, JavaScript for navigation and filtering, and Apache-style directory listing fallback for post discovery.
Claude Code Integration
A Stop hook script (/Users/cb/.claude/hooks/tech_blog_stop.sh) executes at the end of each session. This hook:
- Captures the completed session transcript
- Invokes the blog generator with the transcript path
- Generates a dated post (format: YYYY-MM-DD-slug.html)
- Uploads the post to the appropriate S3 bucket based on site context
- Invalidates the CloudFront distribution cache
- Logs all operations to ~/.claude/hooks/logs/tech_blog.log
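The dated filename format can be produced with a few lines of Python. This is a minimal sketch; `post_filename` is a hypothetical helper, and the slug rules (lowercase, runs of non-alphanumerics collapsed to a hyphen) are an assumption about what "slug" means here.

```python
import re
from datetime import date


def post_filename(title: str, day: date) -> str:
    """Build a YYYY-MM-DD-slug.html filename from a post title and a date."""
    # Lowercase, then collapse every run of non-alphanumeric characters to "-".
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"{day.isoformat()}-{slug or 'untitled'}.html"
```

For example, `post_filename("Multi-Site Tech Blog: Setup!", date(2024, 1, 5))` gives `2024-01-05-multi-site-tech-blog-setup.html`.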
The hook is registered in Claude Code settings at /Users/cb/.claude/settings.json under the hooks.Stop configuration (hook event names are capitalized), ensuring it runs automatically at session end.
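For reference, the registration can be expressed as a Python dict and dumped to JSON. The nested shape below, including the capitalized `Stop` event key, reflects my understanding of the Claude Code hook settings format and should be checked against the current documentation before it is relied on; `stop_hook_settings` is a hypothetical helper.

```python
import json

HOOK_COMMAND = "/Users/cb/.claude/hooks/tech_blog_stop.sh"


def stop_hook_settings(command: str) -> dict:
    """Return a settings fragment that runs `command` when a session stops."""
    # Assumed layout: hooks -> event name -> list of groups -> list of hook entries.
    return {
        "hooks": {
            "Stop": [
                {"hooks": [{"type": "command", "command": command}]}
            ]
        }
    }


# Print the fragment so it can be merged into settings.json by hand.
print(json.dumps(stop_hook_settings(HOOK_COMMAND), indent=2))
```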
Key Technical Decisions
Why JSONL Transcript Format
Claude Code sessions are stored as line-delimited JSON, enabling streaming processing without loading entire transcripts into memory. This matters for long sessions that may exceed several megabytes.
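A minimal sketch of that streaming read, assuming one JSON object per line. `iter_transcript` is a hypothetical name; skipping malformed lines instead of aborting is a design choice made for this example, not something the generator is known to do.

```python
import json
from typing import Any, Dict, Iterator


def iter_transcript(path: str) -> Iterator[Dict[str, Any]]:
    """Yield one parsed record per non-empty line, holding one line in memory at a time."""
    with open(path, "r", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip a truncated or corrupt line rather than abort the run
            if isinstance(record, dict):
                yield record
```

Because this is a generator, a caller can stop early or filter records without the whole file ever being loaded.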
Credential Filtering Strategy
Rather than attempting to redact after the fact, the generator uses pattern matching to identify and remove sensitive data during parsing. This includes:
- AWS credential patterns (20-character access keys, 40-character secret keys)
- URL-encoded credentials in command arguments
- Common environment variable patterns (AWS_ACCESS_KEY_ID, GODADDY_API_KEY, etc.)
- Base64-encoded credential blocks
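The pattern categories above might translate into something like the following. The regular expressions are illustrative assumptions, not the generator's actual rules; `scrub` and `REDACTED` are hypothetical names, and `AWS_SESSION_TOKEN` is added here as one more well-known variable.

```python
import re

REDACTED = "[REDACTED]"

# Ordered (pattern, replacement) pairs; earlier rules run first.
_PATTERNS = [
    # NAME=value assignments for well-known credential variables
    (re.compile(r"\b(AWS_ACCESS_KEY_ID|AWS_SECRET_ACCESS_KEY|AWS_SESSION_TOKEN|GODADDY_API_KEY)=\S+"),
     r"\1=" + REDACTED),
    # user:password embedded in a URL
    (re.compile(r"(://)[^/\s:@]+:[^/\s@]+@"), r"\1" + REDACTED + "@"),
    # Bearer tokens in HTTP headers
    (re.compile(r"(?i)\b(bearer\s+)[A-Za-z0-9._~+/=-]+"), r"\1" + REDACTED),
    # 20-character AWS access key IDs
    (re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"), REDACTED),
    # standalone 40-character base64-like strings (shaped like AWS secret keys)
    (re.compile(r"(?<![A-Za-z0-9/+=])[A-Za-z0-9/+]{40}(?![A-Za-z0-9/+=])"), REDACTED),
]


def scrub(text: str) -> str:
    """Replace anything matching a credential pattern with a placeholder."""
    for pattern, replacement in _PATTERNS:
        text = pattern.sub(replacement, text)
    return text
```

The 40-character rule is deliberately broad and will also redact look-alikes such as full git commit hashes. For a public blog, over-redaction is the safer failure mode.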
Multi-Domain Approach
Rather than one monolithic technical blog, each property gets its own domain. This provides:
- Clear separation of concerns (qos work doesn't appear on jada blog)
- Granular access control (Sergio can monitor specific properties)
- Independent scaling and cache invalidation
- Separate Google Analytics tracking per property
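One way to keep the properties separate in code is a small per-site configuration table. The sketch below uses the bucket names and distribution IDs listed earlier in this post; the short keys (`qos`, `jada`, `dc`, `bats`) and the `TechBlogSite` class are hypothetical, and it reads the Dangerous Centaur entry as bucket `dc-sites` with a `tech-blog/` key prefix.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class TechBlogSite:
    domain: str
    bucket: str
    distribution_id: Optional[str]  # None when no ID is listed in this post
    key_prefix: str = ""

    def object_key(self, filename: str) -> str:
        """Return the S3 object key for a post, honoring any shared-bucket prefix."""
        return f"{self.key_prefix}{filename}"


SITES: Dict[str, TechBlogSite] = {
    "qos": TechBlogSite("tech.queenofsandiego.com", "qos-tech-blog", "E1234EXAMPLE"),
    "jada": TechBlogSite("tech.sailjada.com", "jada-tech-blog", "E5678EXAMPLE"),
    "dc": TechBlogSite("tech.dangerouscentaur.com", "dc-sites", "E2Q4UU71SRNTMB",
                       key_prefix="tech-blog/"),
    "bats": TechBlogSite("tech.burialsatseasandiego.com", "bats-tech-blog", None),
}
```

With a table like this, the upload and invalidation steps only need a site key to find the right bucket, prefix, and distribution.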
CloudFront Cache Strategy
Each post triggers a full distribution invalidation (path pattern /*) so the new post is visible immediately. A higher-traffic production deployment would more likely rely on cache TTLs or invalidate only the paths that changed, but for engineering visibility, immediate publication is preferred.
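A sketch of the invalidation call, assuming the client passed in is a boto3 CloudFront client (`boto3.client("cloudfront")`). The helper names are hypothetical. The batch is built separately so it can be inspected without touching AWS.

```python
import time
from typing import Any, Dict, List, Optional


def invalidation_batch(paths: List[str], caller_reference: Optional[str] = None) -> Dict[str, Any]:
    """Build the InvalidationBatch structure expected by CloudFront."""
    return {
        "Paths": {"Quantity": len(paths), "Items": list(paths)},
        # CallerReference must be unique per request; a timestamp is a simple choice.
        "CallerReference": caller_reference or f"tech-blog-{time.time_ns()}",
    }


def invalidate(client: Any, distribution_id: str, paths: List[str]) -> Dict[str, Any]:
    """Submit an invalidation; `client` is assumed to be boto3.client("cloudfront")."""
    return client.create_invalidation(
        DistributionId=distribution_id,
        InvalidationBatch=invalidation_batch(paths),
    )
```

Calling `invalidate(client, "E1234EXAMPLE", ["/*"])` would clear the whole distribution, while passing the index page and the new post's path would be the narrower alternative.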
Integration with Navigation
The Ship's Papers menu on queenofsandiego.com and the equivalent navigation menus on the other properties now include a "Technical Blog" link pointing to the respective tech subdomain. This makes engineering documentation immediately discoverable alongside operational information.
The link structure is:
- queenofsandiego.com Ship's Papers → Technical Blog → tech.queenofsandiego.com
- Similar pattern for sailjada.com, dangerouscentaur.com, and burialsatseasandiego.com
Cross-Session State Management
Project memory is maintained in ~/.claude/projects/-Users-cb-Documents-repos/memory/ with:
- project_tech_blogs.md: tracks which sites have blogs deployed
- reference_godaddy_credentials.md: GoDaddy API credential reference (for infrastructure changes, not embedded in posts)
- MEMORY.md: session notes and decisions for continuity across multiple sessions
Deployment Status
All four tech blogs are deployed, and three are already reachable:
- QOS and Jada blogs: Route53 alias records, ACM wildcard certificates, CloudFront distributions active
- Dangerous Centaur: CNAME via Namecheap DNS pointing to CloudFront
- Burial at Sea San Diego: CNAME via GoDaddy DNS (ACM validation pending DNS propagation)
DNS propagation and ACM certificate validation are still in progress for tech.burialsatseasandiego.com (bats); the other three sites are accessible now.
What's Next
- Complete ACM validation for burialsatseasandiego.com tech blog
- Test automated post generation on subsequent sessions
- Implement post index