Building a Real-Time Technical Blog Pipeline for Four Sailing Charter Sites

```html

Overview: Granular Session Capture and Auto-Publishing

This session implemented a technical blog system across four domains: tech.queenofsandiego.com, tech.sailjada.com, tech.dangerouscentaur.com, and tech.burialsatseasandiego.com. The goal was an automated pipeline that captures development work at a granular level (file paths, function names, infrastructure changes) and publishes detailed posts without manual intervention or exposed credentials.

Technical Architecture

Core Components

Session Transcript Parser (/Users/cb/Documents/repos/tools/tech_blog_generator.py): Reads Claude Code session transcripts in JSONL format, extracts tool use events, file modifications, and command executions, filters sensitive data (credentials, API keys, tokens), and generates HTML blog posts
Infrastructure Initialization (/Users/cb/Documents/repos/tools/tech_blog_init.py): Provisions S3 buckets, CloudFront distributions, ACM certificates, and DNS records for each tech blog domain, handling three different DNS providers (Route53 for queenofsandiego.com and sailjada.com, Namecheap for dangerouscentaur.com, GoDaddy for burialsatseasandiego.com)
Claude Code Stop Hook (/Users/cb/.claude/hooks/tech_blog_stop.sh): Executes automatically when a Claude Code session ends, extracts the session transcript, generates the blog post, uploads to S3, and invalidates CloudFront cache
Settings Integration (/Users/cb/.claude/settings.json): Registers the stop hook globally so it runs for all sessions targeting the repos directory

Data Flow

Claude Code Session
       ↓
Session ends → Stop hook triggered
       ↓
Parse transcript (JSONL format)
       ↓
Extract: files modified, commands run, infrastructure changes
       ↓
Filter sensitive data (regex patterns for creds, keys, tokens)
       ↓
Generate HTML post (markdown → HTML conversion)
       ↓
Upload to appropriate S3 bucket
       ↓
Invalidate CloudFront distribution
       ↓
DNS already propagated (pre-configured)
       ↓
Post live at tech.[domain].com
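
The last two active steps of this flow (upload, then invalidate) can be sketched in Python. This is a minimal sketch, not the code in tech_blog_stop.sh: the function name publish_post and its parameters are hypothetical, and the two client objects are assumed to be boto3 S3 and CloudFront clients passed in by the caller.

```python
import time


def publish_post(s3, cloudfront, bucket, distribution_id, key, html):
    """Upload one rendered post, then invalidate its CloudFront path.

    `s3` and `cloudfront` are assumed to be boto3 clients (or compatible
    objects); the function name and arguments are illustrative only.
    """
    # Upload with an explicit content type so browsers render the page.
    s3.put_object(
        Bucket=bucket,
        Key=key,
        Body=html.encode("utf-8"),
        ContentType="text/html; charset=utf-8",
    )
    # Invalidation paths must start with "/", and CallerReference must be
    # unique for each request, so a nanosecond timestamp is used here.
    path = "/" + key.lstrip("/")
    batch = {
        "Paths": {"Quantity": 1, "Items": [path]},
        "CallerReference": f"tech-blog-{time.time_ns()}",
    }
    cloudfront.create_invalidation(
        DistributionId=distribution_id, InvalidationBatch=batch
    )
    return path
```

With boto3 installed, the clients would come from boto3.client("s3") and boto3.client("cloudfront"). Passing them in as arguments keeps the function easy to exercise with stand-in objects.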

Infrastructure Decisions and Implementation

S3 Bucket Strategy

Each domain gets its own S3 bucket named [site]-tech-blog, where [site] is the domain name or a short alias (dc, bats):

queenofsandiego-tech-blog (Route53 in sailjada.com hosted zone)
sailjada-tech-blog (Route53 in sailjada.com hosted zone)
dc-tech-blog (Namecheap DNS, leveraging existing dc-sites wildcard CF distribution pattern)
bats-tech-blog (GoDaddy DNS via API integration)

All buckets are configured for static website hosting with index.html as the default document. Bucket policies block public reads and allow only the CloudFront OAI (Origin Access Identity) to retrieve objects.
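
A bucket policy of this kind could be generated as follows. This is a sketch under stated assumptions: the helper name oai_bucket_policy, the statement ID, and the placeholder OAI ID are hypothetical, and the actual policy applied by tech_blog_init.py is not shown in this post.

```python
import json


def oai_bucket_policy(bucket, oai_id):
    """Return a bucket policy (JSON string) granting read access to one OAI.

    Hypothetical helper: `oai_id` is the CloudFront Origin Access Identity ID.
    """
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowCloudFrontOAIRead",
                "Effect": "Allow",
                # Principal ARN format AWS uses for a CloudFront OAI.
                "Principal": {
                    "AWS": "arn:aws:iam::cloudfront:user/"
                    f"CloudFront Origin Access Identity {oai_id}"
                },
                "Action": "s3:GetObject",
                # Object-level access only; listing the bucket is not granted.
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }
    return json.dumps(policy, indent=2)


if __name__ == "__main__":
    # "EXAMPLEOAIID" is a placeholder, not a real identity.
    print(oai_bucket_policy("sailjada-tech-blog", "EXAMPLEOAIID"))
```

The policy contains no public statement, so objects are readable only through the distribution that holds the identity.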

CloudFront Distribution Strategy

Three approaches were used, depending on the existing infrastructure:

Route53-managed domains (queenofsandiego.com, sailjada.com): New CloudFront distributions were created using ACM wildcard certificates (*.queenofsandiego.com, *.sailjada.com) that were already validated and available. Alias records in the Route53 hosted zones point to each distribution's CloudFront domain name.
Namecheap-managed dangerouscentaur.com: Reused the existing wildcard CloudFront distribution (ID: E2Q4UU71SRNTMB), which already points to the dc-sites bucket, by adding a new origin (the dc-tech-blog bucket) and cache behaviors for tech.dangerouscentaur.com/* paths.
GoDaddy-managed burialsatseasandiego.com: Requested an ACM certificate, obtained its DNS validation records, added the validation CNAME to GoDaddy DNS via API, waited for validation, then created a new CloudFront distribution.

Why this approach? Reusing existing wildcard certificates and CloudFront distributions where possible reduces overhead and certificate management complexity. The GoDaddy domain needed API-driven DNS validation because ACM can create validation records automatically only in Route53 hosted zones; for any other DNS provider the validation CNAME has to be added separately.
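
For the GoDaddy case, the validation records have to be read out of the ACM response before they can be pushed to another provider. The sketch below assumes the response shape returned by the ACM DescribeCertificate call; the function name validation_records is hypothetical, and the GoDaddy API call itself is left out because its details are not covered in this post.

```python
def validation_records(describe_certificate_response):
    """Collect the DNS records ACM expects, as (name, type, value) tuples.

    Expects the dict returned by ACM DescribeCertificate (for example via
    boto3's acm.describe_certificate). Hypothetical helper for illustration.
    """
    certificate = describe_certificate_response["Certificate"]
    records = []
    for option in certificate.get("DomainValidationOptions", []):
        # ResourceRecord can be missing shortly after the certificate is
        # requested, so entries without it are skipped.
        record = option.get("ResourceRecord")
        if record:
            records.append((record["Name"], record["Type"], record["Value"]))
    return records
```

Each tuple maps directly onto one CNAME entry to create at the external DNS provider. A caller would typically retry until the list is non-empty before creating the records.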

DNS Configuration

Route53 (sailjada.com hosted zone):
  - tech.queenofsandiego.com alias (A) → d[distribution-id].cloudfront.net
  - tech.sailjada.com alias (A) → d[distribution-id].cloudfront.net
  
Namecheap (dangerouscentaur.com):
  - tech CNAME → d[existing-distribution-id].cloudfront.net
  
GoDaddy (burialsatseasandiego.com):
  - ACM validation CNAME: _[random].[domain] → _[validation-token].acm-validations.aws
  - tech CNAME → d[distribution-id].cloudfront.net (once cert validated)

Session Transcript Parsing: Technical Details

JSONL Format Extraction

Claude Code sessions are stored as JSONL files in ~/.claude/projects. Each line is a JSON object representing an event. The parser filters for:

tool_use events with name: "Bash" → commands executed
tool_result events containing file paths and operation types (Write, Edit)
text events → reasoning and decision notes
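
The extraction step might look like the sketch below. The transcript schema is an assumption, not something this post documents: each line is taken to hold a message object whose content list contains tool_use blocks with name and input fields, and file paths are read from the input of Write and Edit calls. The function name extract_events is hypothetical.

```python
import json


def extract_events(jsonl_lines):
    """Collect shell commands and touched file paths from transcript lines.

    Assumed (unverified) schema per line:
    {"message": {"content": [{"type": "tool_use", "name": ..., "input": {...}}]}}
    """
    commands, files = [], []
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip a partially written or corrupt line
        if not isinstance(event, dict):
            continue
        message = event.get("message")
        content = message.get("content") if isinstance(message, dict) else None
        if not isinstance(content, list):
            continue  # plain-string content carries no tool calls
        for block in content:
            if not isinstance(block, dict) or block.get("type") != "tool_use":
                continue
            # Compare tool names case-insensitively to tolerate casing differences.
            name = str(block.get("name", "")).lower()
            tool_input = block.get("input") or {}
            if name == "bash" and "command" in tool_input:
                commands.append(tool_input["command"])
            elif name in ("write", "edit") and "file_path" in tool_input:
                files.append(tool_input["file_path"])
    return commands, files
```

Malformed lines are skipped rather than raised, since a stop hook may read a transcript whose last line is still being written.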

Sensitive Data Filtering

Multiple regex patterns filter credentials before publication:

AWS: Access key IDs (AKIA*), secret access keys (40-character base64-style strings), API tokens, ARNs with account IDs redacted
API Keys: Generic patterns for API keys (32+ char hex/alphanumeric), tokens, bearer tokens
Passwords: Any line containing password, passwd, pwd, secret, token, key
Email: Partial redaction of email addresses used in code
Domain credentials: Namecheap, GoDaddy, Route53 tokens

The filter is aggressive: if a line contains sensitive patterns, the entire line is replaced with [sensitive data redacted]. This prevents accidental exposure while maintaining readability for non-sensitive commands.
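
A whole-line redaction pass in this style can be sketched as follows. The three patterns are illustrative stand-ins, not the actual list in tech_blog_generator.py, and the names SENSITIVE_PATTERNS and redact_lines are hypothetical.

```python
import re

REDACTED = "[sensitive data redacted]"

# Illustrative patterns only; the real filter's list is not shown in this post.
SENSITIVE_PATTERNS = [
    # AWS access key ID: "AKIA" followed by 16 uppercase letters or digits.
    re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # Keyword match anywhere in the line, including inside names like API_TOKEN.
    re.compile(r"password|passwd|pwd|secret|token|key", re.IGNORECASE),
    # Long opaque alphanumeric runs that look like generated keys.
    re.compile(r"\b[A-Za-z0-9]{32,}\b"),
]


def redact_lines(text):
    """Replace every line that matches a sensitive pattern with a marker."""
    cleaned = []
    for line in text.splitlines():
        if any(pattern.search(line) for pattern in SENSITIVE_PATTERNS):
            cleaned.append(REDACTED)
        else:
            cleaned.append(line)
    return "\n".join(cleaned)
```

Because the keyword pattern matches substrings, harmless lines containing words such as "keyboard" are redacted too. That trade-off favors false positives over leaks, in line with the aggressive behavior described above.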

HTML Generation

Posts are generated with this structure:

<h2>[Specific action]</h2>
<h3>What Was Done</h3>
- Bulleted list of files modified/created
- Count: X files

<h3>Commands Executed</h3>
- Filtered command list with brief