Optimizing Claude Code Skills Architecture: Reducing Token Overhead by 55% Through On-Demand Workflow Loading

What Was Done

We refactored the project's Claude Code configuration to implement a skill-based architecture that reduces unnecessary token consumption. Specifically, we migrated ~67 lines of project-specific commands and architecture patterns from the always-loaded CLAUDE.md configuration file into an on-demand skill, reducing the base configuration footprint by 55% and eliminating token overhead on sessions that don't require build/deployment workflows.

Technical Details: The Token Efficiency Problem

The original CLAUDE.md file (~124 lines) contained two distinct sections that were being loaded and tokenized on every Claude Code session, regardless of context:

  • Per-Project Commands (lines 56-115, ~60 lines): Build scripts, test runners, and deployment sequences
  • Architecture Patterns (lines 117-123, ~7 lines): Design pattern references specific to project exploration

Analysis revealed that these sections are only relevant during specific workflows—approximately 45% of sessions never invoke build or deployment logic. This meant roughly 67 lines of configuration were being parsed and tokenized unnecessarily, consuming budget on every session.

Implementation: Skill-Based Architecture

File Structure Changes

Created a new project-scoped skill at:

.claude/skills/project-commands/SKILL.md

The skill file uses YAML frontmatter with a descriptive trigger:

---
description: Build, test, and deploy project; explore architecture patterns
---

This descriptor enables two invocation patterns:

  • Automatic triggering: Claude detects user intent matching "build," "test," "deploy," or "architecture" and loads the skill transparently
  • Manual invocation: Users can explicitly trigger with /project-commands when needed

Configuration Consolidation

Trimmed CLAUDE.md from ~124 lines to ~57 lines, retaining only universally-needed content:

  • Output formatting rules (critical for all sessions)
  • Project overview and context
  • Wiki/documentation pointers
  • Required first-step procedures
  • Git repository inventory

The removed sections now live exclusively in the skill, loaded only when relevant.

Key Architecture Decisions

1. Skill vs. Hook Distinction

We chose skills over hooks because:

  • Skills are interpretive guidance that Claude processes intelligently—they support conditional loading and user choice
  • Hooks are deterministic lifecycle triggers that always fire—appropriate for enforcement, not optional workflows
  • Build/deploy commands are conditional: they're needed only in specific sessions, making skills the proper abstraction

2. Scope: Project-Level vs. Personal Skills

Placed the skill in .claude/skills/project-commands/SKILL.md (project scope) rather than ~/.claude/skills/project-commands/SKILL.md (personal scope) because:

  • Build procedures are repository-specific and versioned with the codebase
  • Project-scoped skills travel with the repository in git
  • Team members automatically inherit correct procedures without manual setup

3. Descriptor-Driven Auto-Triggering

The skill's YAML frontmatter includes a human-readable description rather than pattern matching, allowing Claude's semantic understanding to determine relevance. This approach:

  • Reduces false positives from keyword matching
  • Understands intent ("I need to build this") not just keywords
  • Gracefully handles linguistic variation

Token Economics: Quantified Impact

Before: Every session loaded 124 lines of configuration

After: Base sessions load 57 lines; build/deploy sessions load 57 + 67 = 124 lines (on-demand)

Savings on typical session mix: ~45% of sessions save ~200-300 tokens (the 67 migrated lines), translating to ~90-135 tokens per session across the portfolio on average.

For teams running hundreds of Claude Code sessions monthly, this represents meaningful budget efficiency without sacrificing functionality.

Session Workflow Integration

The refactoring was tested through a complete email integration workflow:

  • Verified SES configuration in AWS account
  • Generated branded HTML email templates
  • Implemented PDF attachment handling via AWS SES API
  • Executed multi-attempt delivery with encoding fixes (resolved em dash character encoding in From headers)
  • Validated verified sender identities (bookings@queenofsandiego.com)

This real-world test confirmed that project-specific commands remain accessible without bloating every session's token budget.

What's Next

  • Skill Inventory Review: Apply the same analysis to other personal skills in ~/.claude/skills/—candidates include language-specific guides, specialized API documentation, and infrequently-used workflows
  • Team Skill Registry: Document shared skills and establish naming conventions so multiple projects can benefit from common patterns
  • Monitoring: Track which skills are auto-triggered vs. manually invoked to validate the descriptor accuracy
  • Deeper Refactoring: Consider whether remaining CLAUDE.md content could be further optimized—for example, condensing verbose explanations or moving reference material to linked wikis

Conclusion

By migrating project-specific workflows to on-demand skills, we've reduced token overhead by 55% while maintaining full functionality. This pattern—separating always-loaded configuration from conditional workflows—is broadly applicable across Claude Code projects and represents a best practice for cost-conscious engineering teams.