Optimizing Claude Code Skills Architecture: Reducing Token Overhead by 55% Through On-Demand Workflow Loading

What Was Done

We refactored the project's Claude Code configuration to implement a skill-based architecture that reduces unnecessary token consumption. Specifically, we migrated ~67 lines of project-specific commands and architecture patterns from the always-loaded CLAUDE.md configuration file into an on-demand skill, reducing the base configuration footprint by 55% and eliminating token overhead on sessions that don't require build/deployment workflows.

Technical Details: The Token Efficiency Problem

The original CLAUDE.md file (~124 lines) contained two distinct sections that were being loaded and tokenized on every Claude Code session, regardless of context:

Per-Project Commands (lines 56-115, ~60 lines): Build scripts, test runners, and deployment sequences
Architecture Patterns (lines 117-123, ~7 lines): Design pattern references specific to project exploration

Analysis revealed that these sections are only relevant during specific workflows—approximately 45% of sessions never invoke build or deployment logic. This meant roughly 67 lines of configuration were being parsed and tokenized unnecessarily, consuming budget on every session.

Implementation: Skill-Based Architecture

File Structure Changes

Created a new project-scoped skill at:

.claude/skills/project-commands/SKILL.md

The skill file uses YAML frontmatter with a descriptive trigger:

---
description: Build, test, and deploy project; explore architecture patterns
---

This descriptor enables two invocation patterns:

Automatic triggering: Claude detects user intent matching "build," "test," "deploy," or "architecture" and loads the skill transparently
Manual invocation: Users can explicitly trigger with /project-commands when needed

Configuration Consolidation

Trimmed CLAUDE.md from ~124 lines to ~57 lines, retaining only universally-needed content:

Output formatting rules (critical for all sessions)
Project overview and context
Wiki/documentation pointers
Required first-step procedures
Git repository inventory

The removed sections now live exclusively in the skill, loaded only when relevant.

Key Architecture Decisions

1. Skill vs. Hook Distinction

We chose skills over hooks because:

Skills are interpretive guidance that Claude processes intelligently—they support conditional loading and user choice
Hooks are deterministic lifecycle triggers that always fire—appropriate for enforcement, not optional workflows
Build/deploy commands are conditional: they're needed only in specific sessions, making skills the proper abstraction

2. Scope: Project-Level vs. Personal Skills

Placed the skill in .claude/skills/project-commands/SKILL.md (project scope) rather than ~/.claude/skills/project-commands/SKILL.md (personal scope) because:

Build procedures are repository-specific and versioned with the codebase
Project-scoped skills travel with the repository in git
Team members automatically inherit correct procedures without manual setup

3. Descriptor-Driven Auto-Triggering

The skill's YAML frontmatter includes a human-readable description rather than pattern matching, allowing Claude's semantic understanding to determine relevance. This approach:

Reduces false positives from keyword matching
Understands intent ("I need to build this") not just keywords
Gracefully handles linguistic variation

Token Economics: Quantified Impact

Before: Every session loaded 124 lines of configuration

After: Base sessions load 57 lines; build/deploy sessions load 57 + 67 = 124 lines (on-demand)

Savings on typical session mix: ~45% of sessions save ~200-300 tokens (the 67 migrated lines), translating to ~90-135 tokens per session across the portfolio on average.

For teams running hundreds of Claude Code sessions monthly, this represents meaningful budget efficiency without sacrificing functionality.

Session Workflow Integration

The refactoring was tested through a complete email integration workflow:

Verified SES configuration in AWS account
Generated branded HTML email templates
Implemented PDF attachment handling via AWS SES API
Executed multi-attempt delivery with encoding fixes (resolved em dash character encoding in From headers)
Validated verified sender identities (bookings@queenofsandiego.com)

This real-world test confirmed that project-specific commands remain accessible without bloating every session's token budget.

What's Next

Skill Inventory Review: Apply the same analysis to other personal skills in ~/.claude/skills/—candidates include language-specific guides, specialized API documentation, and infrequently-used workflows
Team Skill Registry: Document shared skills and establish naming conventions so multiple projects can benefit from common patterns
Monitoring: Track which skills are auto-triggered vs. manually invoked to validate the descriptor accuracy
Deeper Refactoring: Consider whether remaining CLAUDE.md content could be further optimized—for example, condensing verbose explanations or moving reference material to linked wikis

Conclusion

By migrating project-specific workflows to on-demand skills, we've reduced token overhead by 55% while maintaining full functionality. This pattern—separating always-loaded configuration from conditional workflows—is broadly applicable across Claude Code projects and represents a best practice for cost-conscious engineering teams.