Optimizing Claude Code Skills Architecture: Reducing Token Overhead by 55% Through On-Demand Workflow Loading
What Was Done
We refactored the project's Claude Code configuration to implement a skill-based architecture that reduces unnecessary token consumption. Specifically, we migrated ~67 lines of project-specific commands and architecture patterns from the always-loaded CLAUDE.md configuration file into an on-demand skill, reducing the base configuration footprint by 55% and eliminating token overhead on sessions that don't require build/deployment workflows.
Technical Details: The Token Efficiency Problem
The original CLAUDE.md file (~124 lines) contained two distinct sections that were being loaded and tokenized on every Claude Code session, regardless of context:
- Per-Project Commands (lines 56-115, ~60 lines): Build scripts, test runners, and deployment sequences
- Architecture Patterns (lines 117-123, ~7 lines): Design pattern references specific to project exploration
Analysis revealed that these sections are only relevant during specific workflows: approximately 45% of sessions never invoke build or deployment logic. In those sessions, roughly 67 lines of configuration were being parsed and tokenized without ever being used.
Implementation: Skill-Based Architecture
File Structure Changes
Created a new project-scoped skill at:
.claude/skills/project-commands/SKILL.md
The skill file uses YAML frontmatter with a descriptive trigger:
---
description: Build, test, and deploy project; explore architecture patterns
---
This descriptor enables two invocation patterns:
- Automatic triggering: Claude detects user intent matching "build," "test," "deploy," or "architecture" and loads the skill transparently
- Manual invocation: Users can explicitly trigger it with /project-commands when needed
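As a rough illustration of the file layout described above, the following Python sketch writes a SKILL.md with the same frontmatter into a project-scoped skill directory. The helper name write_skill and the placeholder body text are assumptions made for this example; the actual skill was not necessarily created this way.

```python
from pathlib import Path


def write_skill(repo_root: Path, name: str, description: str, body: str) -> Path:
    """Write <repo_root>/.claude/skills/<name>/SKILL.md and return its path."""
    skill_dir = repo_root / ".claude" / "skills" / name
    skill_dir.mkdir(parents=True, exist_ok=True)
    skill_file = skill_dir / "SKILL.md"
    # Frontmatter mirrors the description shown above; the body is free-form text.
    text = f"---\ndescription: {description}\n---\n\n{body.strip()}\n"
    skill_file.write_text(text, encoding="utf-8")
    return skill_file


if __name__ == "__main__":
    import tempfile

    # Demo in a throwaway directory so no real repository is touched.
    with tempfile.TemporaryDirectory() as tmp:
        path = write_skill(
            Path(tmp),
            "project-commands",
            "Build, test, and deploy project; explore architecture patterns",
            "# Per-Project Commands\n(placeholder: migrated content goes here)",
        )
        print(path.relative_to(tmp))
```

Pointing repo_root at the repository checkout would place the file at the project-scoped path given above.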
Configuration Consolidation
Trimmed CLAUDE.md from ~124 lines to ~57 lines, retaining only universally-needed content:
- Output formatting rules (critical for all sessions)
- Project overview and context
- Wiki/documentation pointers
- Required first-step procedures
- Git repository inventory
The removed sections now live exclusively in the skill, loaded only when relevant.
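The split itself can be sketched in a few lines of Python. This is a minimal illustration that assumes the line ranges quoted earlier (56-115 and 117-123, 1-indexed and inclusive); the function name split_config and the placeholder line contents are hypothetical, and the real migration may well have been done by hand.

```python
def split_config(lines, moved_ranges):
    """Split lines into (kept, moved) using 1-indexed, inclusive line ranges."""
    moved_numbers = set()
    for start, end in moved_ranges:
        moved_numbers.update(range(start, end + 1))
    kept = [ln for i, ln in enumerate(lines, start=1) if i not in moved_numbers]
    moved = [ln for i, ln in enumerate(lines, start=1) if i in moved_numbers]
    return kept, moved


# Stand-in for the original 124-line CLAUDE.md (real content not reproduced here).
original = [f"line {i}" for i in range(1, 125)]
kept, moved = split_config(original, [(56, 115), (117, 123)])
print(len(kept), len(moved))  # 57 67
```

The counts match the figures in this section: 57 lines stay in CLAUDE.md and 67 move into the skill.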
Key Architecture Decisions
1. Skill vs. Hook Distinction
We chose skills over hooks because:
- Skills are interpretive guidance that Claude processes intelligently—they support conditional loading and user choice
- Hooks are deterministic lifecycle triggers that always fire—appropriate for enforcement, not optional workflows
- Build/deploy commands are conditional: they're needed only in specific sessions, making skills the proper abstraction
2. Scope: Project-Level vs. Personal Skills
Placed the skill in .claude/skills/project-commands/SKILL.md (project scope) rather than ~/.claude/skills/project-commands/SKILL.md (personal scope) because:
- Build procedures are repository-specific and versioned with the codebase
- Project-scoped skills travel with the repository in git
- Team members automatically inherit correct procedures without manual setup
3. Descriptor-Driven Auto-Triggering
The skill's YAML frontmatter relies on a human-readable description rather than explicit keyword patterns, so Claude's semantic understanding determines relevance. This approach:
- Reduces false positives from keyword matching
- Understands intent ("I need to build this") not just keywords
- Gracefully handles linguistic variation
Token Economics: Quantified Impact
Before: Every session loaded 124 lines of configuration
After: Base sessions load 57 lines; build/deploy sessions load 57 + 67 = 124 lines (on-demand)
Savings on typical session mix: the ~45% of sessions that never load the skill each save ~200-300 tokens (the 67 migrated lines). Averaged over all sessions, that works out to 0.45 × 200-300 ≈ 90-135 tokens saved per session.
For teams running hundreds of Claude Code sessions monthly, this represents meaningful budget efficiency without sacrificing functionality.
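The arithmetic behind these estimates can be reproduced with a short Python sketch. The line counts, the 45% share, and the 200-300 token range come from the figures above; the monthly session count of 500 is a purely hypothetical value chosen to illustrate "hundreds of sessions".

```python
def average_savings(tokens_migrated: float, share_skipping_skill: float) -> float:
    """Expected tokens saved per session, averaged over all sessions."""
    return tokens_migrated * share_skipping_skill


TOTAL_LINES = 124            # original CLAUDE.md
MIGRATED_LINES = 67          # lines moved into the skill
SHARE_SKIPPING_SKILL = 0.45  # sessions that never load the skill
TOKEN_RANGE = (200, 300)     # estimated tokens for the migrated lines
SESSIONS_PER_MONTH = 500     # hypothetical team volume, not from the measurements

# About 0.54, quoted as roughly 55% in the text above.
line_reduction = MIGRATED_LINES / TOTAL_LINES
low, high = (average_savings(t, SHARE_SKIPPING_SKILL) for t in TOKEN_RANGE)

print(f"always-loaded lines removed: {line_reduction:.0%}")
print(f"average savings per session: {low:.0f}-{high:.0f} tokens")
print(
    f"monthly savings at {SESSIONS_PER_MONTH} sessions: "
    f"{low * SESSIONS_PER_MONTH:.0f}-{high * SESSIONS_PER_MONTH:.0f} tokens"
)
```

Changing SHARE_SKIPPING_SKILL shows how sensitive the result is to the session mix: the more often the skill is actually needed, the smaller the average saving.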
Session Workflow Integration
The refactoring was tested through a complete email integration workflow:
- Verified SES configuration in AWS account
- Generated branded HTML email templates
- Implemented PDF attachment handling via AWS SES API
- Executed multi-attempt delivery with encoding fixes (resolved em dash character encoding in From headers)
- Validated verified sender identities (bookings@queenofsandiego.com)
This real-world test confirmed that project-specific commands remain accessible without bloating every session's token budget.
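The write-up does not say how the em dash problem in the From header was fixed. One common approach, sketched here with Python's standard library only, is to encode the non-ASCII display name as an RFC 2047 encoded word so the header stays pure ASCII. The display name below is hypothetical, and nothing in this sketch is specific to the SES API.

```python
from email.header import decode_header, make_header
from email.utils import formataddr, parseaddr

SENDER = "bookings@queenofsandiego.com"      # verified identity named above
DISPLAY_NAME = "Bookings \u2014 Front Desk"  # hypothetical name containing an em dash


def build_from_header(display_name: str, address: str) -> str:
    """Return an ASCII-only From value; non-ASCII names become RFC 2047 encoded words."""
    return formataddr((display_name, address), charset="utf-8")


def decode_display_name(from_header: str) -> str:
    """Recover the human-readable display name from an encoded From value."""
    encoded_name, _ = parseaddr(from_header)
    return str(make_header(decode_header(encoded_name)))


header_value = build_from_header(DISPLAY_NAME, SENDER)
print(header_value)                       # ASCII-only, safe to place in a raw message
print(decode_display_name(header_value))  # round-trips to the original display name
```

Mail clients decode the encoded word back to the original display name, so recipients still see the em dash.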
What's Next
- Skill Inventory Review: Apply the same analysis to other personal skills in ~/.claude/skills/; candidates include language-specific guides, specialized API documentation, and infrequently-used workflows
- Team Skill Registry: Document shared skills and establish naming conventions so multiple projects can benefit from common patterns
- Monitoring: Track which skills are auto-triggered vs. manually invoked to validate the descriptor accuracy
- Deeper Refactoring: Consider whether remaining CLAUDE.md content could be further optimized, for example by condensing verbose explanations or moving reference material to linked wikis
Conclusion
By migrating project-specific workflows to on-demand skills, we've cut the always-loaded configuration by roughly 55% (from ~124 to ~57 lines) while maintaining full functionality. Sessions that never touch build or deployment logic no longer pay for those lines. This pattern of separating always-loaded configuration from conditional workflows is broadly applicable across Claude Code projects and is a useful practice for cost-conscious engineering teams.