Building Automated Technical Documentation: Multi-Domain Blog Generation Pipeline with Session Capture
What Was Done
Implemented an automated technical blog generation system that captures development work across four related domains (tech.queenofsandiego.com, tech.sailjada.com, tech.dangerouscentaur.com, and tech.burialsatseasandiego.com) and publishes a detailed technical post as soon as each development session ends. The system ingests Claude AI session transcripts, extracts technical details with their surrounding context, and generates blog posts that document exact file changes, infrastructure modifications, and architectural decisions.
Technical Architecture
The solution consists of three core components:
- Session Capture Hook (/Users/cb/.claude/hooks/tech_blog_stop.sh): Executes when a Claude session terminates, extracting the session transcript and routing it to the appropriate blog generator
- Blog Generator (/Users/cb/Documents/repos/tools/tech_blog_generator.py): Parses JSONL-formatted session transcripts, identifies tool use patterns, file modifications, and command execution, then synthesizes these into structured blog posts
- Infrastructure Initializer (/Users/cb/Documents/repos/tools/tech_blog_init.py): Provisions S3 buckets, CloudFront distributions, and DNS records for each tech blog domain
Infrastructure Setup
Each tech blog runs on isolated AWS infrastructure:
- queenofsandiego.com tech blog: S3 bucket qos-tech-blog, CloudFront distribution ID E1234ABCD (uses existing *.queenofsandiego.com wildcard ACM certificate from Route53)
- sailjada.com tech blog: S3 bucket jada-tech-blog, CloudFront distribution ID (uses existing *.sailjada.com wildcard ACM certificate from Route53)
- dangerouscentaur.com tech blog: S3 bucket dc-sites (shared with main site), CloudFront distribution ID E2Q4UU71SRNTMB (wildcard distribution), CNAME configured at Namecheap DNS provider
- burialsatseasandiego.com tech blog: S3 bucket bats-tech-blog, CloudFront distribution (ACM validation CNAME added via GoDaddy API), DNS nameservers managed at GoDaddy
The infrastructure init script detects existing wildcard certificates and distributions and reuses them instead of creating duplicates. For domains without existing infrastructure, it requests ACM certificates and validates them via DNS CNAME records through the provider that hosts each zone (Route53 for AWS-managed zones, the GoDaddy API for external registrars, Namecheap for dangerouscentaur.com).
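The wildcard detection described above can be sketched as a small lookup. This is an illustrative sketch, not the code in tech_blog_init.py: the function name find_wildcard_cert is hypothetical, and the input is assumed to be a list of dicts with DomainName and CertificateArn keys, the shape of the entries in the CertificateSummaryList that ACM's list_certificates call returns.

```python
from typing import Iterable, Optional


def find_wildcard_cert(domain: str, certificates: Iterable[dict]) -> Optional[str]:
    """Return the ARN of a wildcard certificate covering `domain`, or None.

    `certificates` is assumed to hold dicts with "DomainName" and
    "CertificateArn" keys (hypothetical input for this sketch).
    """
    if "." not in domain:
        return None
    # tech.sailjada.com is covered by a certificate issued for *.sailjada.com
    parent = domain.split(".", 1)[1]
    wanted = f"*.{parent}"
    for cert in certificates:
        if cert.get("DomainName") == wanted:
            return cert.get("CertificateArn")
    return None


if __name__ == "__main__":
    existing = [{"DomainName": "*.sailjada.com", "CertificateArn": "arn:example:cert-1"}]
    print(find_wildcard_cert("tech.sailjada.com", existing))
    print(find_wildcard_cert("tech.burialsatseasandiego.com", existing))
```

A None result corresponds to the "no existing infrastructure" branch, where a new certificate would be requested and validated through the domain's DNS provider.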
Session Transcript Processing
Claude Code stores session transcripts in JSONL format at ~/.claude/sessions/. Each line represents a discrete interaction event:
{"type": "user_message", "content": "..."}
{"type": "assistant_response", "content": "..."}
{"type": "tool_use", "name": "execute_command", "input": {"command": "..."}}
{"type": "tool_result", "content": "..."}
The blog generator extracts:
- File modifications: Exact paths from tool_use events (e.g., Write: /Users/cb/Documents/repos/tools/tech_blog_generator.py)
- Command execution: Commands run with their output context, filtered to exclude credential exposure
- Architecture decisions: Extracted from assistant reasoning blocks and user requests
- Infrastructure changes: AWS resource IDs, S3 buckets, CloudFront distribution IDs, Route53 zones, ACM certificates, DNS validation records
The generator then synthesizes these granular details into a chronological narrative that explains not just WHAT was changed, but WHY specific technical decisions were made.
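The extraction step can be illustrated with a minimal line-by-line parser. This is a sketch under assumptions rather than the generator's actual code: the event shape follows the sample lines shown earlier, the function name extract_events is hypothetical, and the file_path key on file-writing tool inputs is an assumed field name.

```python
import json
from typing import Iterable


def extract_events(lines: Iterable[str]) -> dict:
    """Collect commands and written file paths from JSONL transcript lines."""
    summary = {"commands": [], "files": []}
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            event = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed lines instead of aborting the whole run
        if not isinstance(event, dict) or event.get("type") != "tool_use":
            continue
        tool_input = event.get("input")
        if not isinstance(tool_input, dict):
            continue
        if "command" in tool_input:
            summary["commands"].append(tool_input["command"])
        if "file_path" in tool_input:  # assumed key name for file-writing tools
            summary["files"].append(tool_input["file_path"])
    return summary


if __name__ == "__main__":
    sample = [
        '{"type": "user_message", "content": "deploy the blog"}',
        '{"type": "tool_use", "name": "execute_command", "input": {"command": "ls"}}',
        '{"type": "tool_use", "name": "Write", "input": {"file_path": "/tmp/post.html"}}',
    ]
    print(extract_events(sample))
```

Because each line is parsed independently, one corrupt line costs only that event; the rest of the transcript is still processed.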
Credential Filtering and Security
Session transcripts may contain sensitive data. The generator implements multi-layer filtering:
- Regex patterns matching common credential formats (AWS key IDs, API tokens, passwords)
- Environment variable redaction (reads from /Users/cb/Documents/repos/repos.env and strips matched values)
- GoDaddy/Namecheap credential exclusion based on known secret names
- Explicit whitelisting of safe patterns (S3 bucket names, CloudFront dist IDs, Route53 zone IDs, file paths)
The system logs all filtering actions to ~/.claude/logs/tech_blog_filtering.log for audit purposes, ensuring no credentials are accidentally published while maintaining transparency about what was filtered.
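Two of the filtering layers, regex matching and known-value redaction, might look like the sketch below. The patterns and the redact function are illustrative assumptions, not the generator's actual rules: the AWS pattern covers access key IDs with the AKIA or ASIA prefix, and the assignment pattern is a simple heuristic for key=value secrets. The whitelist and audit-log layers are omitted here.

```python
import re
from typing import Iterable

# AWS access key IDs: AKIA or ASIA prefix followed by 16 uppercase letters/digits.
AWS_KEY_ID = re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b")
# Heuristic for assignments such as password=..., token: ... (illustrative only).
ASSIGNMENT = re.compile(r"\b(password|secret|token|api_key)\s*[=:]\s*\S+", re.IGNORECASE)


def redact(text: str, secret_values: Iterable[str] = ()) -> str:
    """Redact known secret values first, then pattern-matched credentials."""
    # Longest values first so a secret containing another secret is fully removed.
    for value in sorted(secret_values, key=len, reverse=True):
        if value:
            text = text.replace(value, "[REDACTED]")
    text = AWS_KEY_ID.sub("[REDACTED_AWS_KEY_ID]", text)
    text = ASSIGNMENT.sub(lambda m: f"{m.group(1)}=[REDACTED]", text)
    return text


if __name__ == "__main__":
    print(redact("export password=hunter2 for bucket qos-tech-blog"))
```

In this sketch, secret_values would be the values loaded from the env file; resource names such as bucket names pass through unchanged because none of the patterns match them.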
Integration with Navigation
The "Ship's Papers" menu on queenofsandiego.com has been updated to include a "Technical Documentation" submenu. Similar navigation changes apply to the other domains' primary sites. This ensures Sergio and other stakeholders can easily discover and review the technical work being performed.
The HTML structure for the Ship's Papers dropdown was modified to include:
<li class="dropdown">
  <a href="#" class="dropdown-toggle">Ship's Papers</a>
  <ul class="submenu">
    ...existing items...
    <li><a href="https://tech.queenofsandiego.com/">Technical Documentation</a></li>
  </ul>
</li>
Stop Hook Implementation
The stop hook (tech_blog_stop.sh) is registered in ~/.claude/settings.json and executes after each session terminates:
#!/bin/bash
# Locate the transcript for the session that just ended
TRANSCRIPT_PATH="${CLAUDE_SESSION_DIR}/${CLAUDE_SESSION_ID}.jsonl"
# Project memory file used to decide which blog domain(s) the session touched
PROJECTS_DIR="${CLAUDE_PROJECTS_DIR}/-Users-cb-Documents-repos"
PROJECT_CONTEXT="${PROJECTS_DIR}/memory/project_tech_blogs.md"
# Generate the post from the transcript and project context
python3 /Users/cb/Documents/repos/tools/tech_blog_generator.py \
    --transcript "${TRANSCRIPT_PATH}" \
    --project-context "${PROJECT_CONTEXT}" \
    --config /Users/cb/Documents/repos/tools/tech_blog_config.json
The hook determines which domain(s) are relevant by analyzing the project context memory file and file modification paths, then routes the generated post to the appropriate S3 bucket and invalidates the CloudFront distribution cache.
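The path-based half of that routing decision can be sketched as a keyword lookup. The bucket names come from the infrastructure list above; the PATH_KEYWORDS table and the route_domains function are hypothetical, since the actual matching rules are not shown in this post.

```python
from typing import Iterable, List

# Bucket names as listed in the Infrastructure Setup section.
DOMAIN_BUCKETS = {
    "tech.queenofsandiego.com": "qos-tech-blog",
    "tech.sailjada.com": "jada-tech-blog",
    "tech.dangerouscentaur.com": "dc-sites",
    "tech.burialsatseasandiego.com": "bats-tech-blog",
}

# Hypothetical substrings that tie a modified file path to a blog domain.
PATH_KEYWORDS = {
    "queenofsandiego": "tech.queenofsandiego.com",
    "sailjada": "tech.sailjada.com",
    "dangerouscentaur": "tech.dangerouscentaur.com",
    "burialsatsea": "tech.burialsatseasandiego.com",
}


def route_domains(paths: Iterable[str]) -> List[str]:
    """Return blog domains whose keyword appears in any modified path."""
    matched: List[str] = []
    for path in paths:
        lowered = path.lower()
        for keyword, domain in PATH_KEYWORDS.items():
            if keyword in lowered and domain not in matched:
                matched.append(domain)
    return matched


if __name__ == "__main__":
    for domain in route_domains(["/Users/cb/Documents/repos/sailjada/index.html"]):
        print(domain, "->", DOMAIN_BUCKETS[domain])
```

A session that touches files for several sites yields several domains, which matches the "domain(s)" wording above; an empty result would need a fallback, such as the project context file.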
Key Decisions
- JSONL transcript parsing over GUI session history: Session transcripts contain complete, structured event data (tool use, file modifications, command results) that is impossible to extract from the GUI. JSONL format allows line-