Optimizing Claude Agent Performance: Migrating from Haiku 4.5 to Sonnet 4.6 in a Distributed Orchestrator Pattern

What Was Done

We upgraded the default Claude model configuration from Haiku 4.5 to Sonnet 4.6 across the JADA agent orchestration system, with particular focus on the orchestrator instance running on AWS Lightsail (jada-agent) in the us-east-1 region. This involved:

Updating ~/.claude/settings.json to specify claude-sonnet-4-6 as the default model
Verifying the orchestrator service status via systemctl status jada-agent.service
Confirming the Lightsail instance health and connectivity
Establishing resource limits to support increased token throughput

Technical Details: Model Selection and Architecture Impact

The decision to upgrade from Haiku 4.5 to Sonnet 4.6 centers on task decomposition capabilities in orchestrator patterns. Here's why this matters:

Haiku 4.5 is optimized for latency and cost, making it excellent for individual specialist agents that handle narrow, well-defined tasks. However, in an orchestrator pattern—where a single agent must break down complex workflows, route tasks to specialists, maintain context across multiple branches, and synthesize results—Haiku's reasoning depth becomes a bottleneck.

Sonnet 4.6 offers improvements in four areas that matter for orchestration:

Context Window Management: Better ability to maintain state across multi-step orchestration workflows without losing semantic coherence
Task Decomposition: More sophisticated breaking of complex requirements into appropriately-scoped subtasks for specialist agents
Conditional Logic: Superior handling of branching logic, error cases, and fallback routing in orchestration chains
Consistency: More reliable output formatting when generating structured task definitions for downstream agents
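
The role split described above can be expressed as a simple lookup. This is a minimal sketch, not the JADA implementation: the function name model_for_role and the role labels are hypothetical, and the Haiku identifier claude-haiku-4-5 is an assumption (the write-up only names claude-sonnet-4-6 explicitly).

```shell
#!/usr/bin/env bash
# Hypothetical helper: map an agent role to a model identifier.
# "claude-haiku-4-5" is an assumed identifier; substitute whatever your setup uses.
model_for_role() {
  case "$1" in
    orchestrator) echo "claude-sonnet-4-6" ;;  # decomposition, routing, synthesis
    specialist)   echo "claude-haiku-4-5" ;;   # narrow, well-defined subtasks
    *)            echo "unknown role: $1" >&2; return 1 ;;
  esac
}
```

Keeping the mapping in one place makes it easy to change a single tier later without touching the other.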

Infrastructure Verification and Configuration

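Before upgrading the model tier, we verified that the orchestrator instance was healthy. Model inference runs on Anthropic's API rather than on the instance, so the check targets instance state and connectivity, not extra compute for the model itself: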

aws lightsail get-instance --instance-name jada-agent --region us-east-1

This command confirmed the Lightsail instance was in the running state with sufficient allocated resources. The orchestrator instance is configured to auto-restart the jada-agent.service on failure. That matters more after the upgrade: Sonnet responses take longer, so the orchestrator holds requests in flight for longer and an unrecovered crash interrupts more work.
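
To turn the health check into something a script can act on, the state name can be extracted with the AWS CLI's --query option. The sketch below is an illustration under assumptions: the function names instance_state and require_running are hypothetical, and it presumes the AWS CLI is installed and has credentials for the account.

```shell
#!/usr/bin/env bash
# instance_state NAME REGION: print the Lightsail state name, e.g. "running".
instance_state() {
  aws lightsail get-instance \
    --instance-name "$1" \
    --region "$2" \
    --query 'instance.state.name' \
    --output text
}

# require_running NAME REGION: succeed only if the instance reports "running".
require_running() {
  local state
  state="$(instance_state "$1" "$2")" || return 1
  [ "$state" = "running" ]
}
```

A deployment script could then gate on it with: require_running jada-agent us-east-1 || exit 1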

Service status verification:

ssh -o ConnectTimeout=5 -o StrictHostKeyChecking=no ubuntu@34.239.233.28 "systemctl status jada-agent.service"

This confirms the orchestrator service is actively running and listening for task submissions from the CLI layer.

File Descriptor Limits and Performance Tuning

The command ulimit -n 2147483646 deserves specific attention in this context. Here's what it does and why it matters:

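ulimit is a shell builtin that modifies resource limits for the current shell and the processes it starts
-n flag specifically targets the open file descriptor limit
2147483646 is exactly 2^31 - 2, one below the largest 32-bit signed integer (2147483647). On Linux the kernel refuses an open-file limit above its fs.nr_open ceiling (readable at /proc/sys/fs/nr_open), so a value this large may be rejected unless that ceiling has been raised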

Why this is relevant for your orchestrator: when the orchestrator works through complex tasks with Sonnet, it may issue multiple concurrent API calls, maintain open connections to specialist agents, and handle streaming responses. Each socket and file handle consumes a file descriptor. The default soft limit on many Linux systems (often 1024) can be exhausted under that load. By raising the limit well above expected usage, you ensure the orchestrator can maintain many simultaneous connections without hitting resource exhaustion errors.

In practice, you'd set this in your systemd service file at /etc/systemd/system/jada-agent.service using the LimitNOFILE directive:

[Service]
LimitNOFILE=2147483646
ExecStart=/path/to/orchestrator/binary
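
Before choosing a limit, it helps to see how many descriptors the service actually uses and what ceilings apply. The helpers below are a sketch for Linux only (they read /proc), and every function name is illustrative and not part of any existing tooling.

```shell
#!/usr/bin/env bash
# Linux-only helpers that read /proc; all function names are illustrative.

# fd_count PID: number of file descriptors currently open in process PID.
fd_count() {
  ls "/proc/$1/fd" | wc -l
}

# fd_soft_limit PID: soft "Max open files" limit that applies to process PID.
fd_soft_limit() {
  awk '/^Max open files/ { print $4 }' "/proc/$1/limits"
}

# fd_kernel_ceiling: kernel ceiling for any per-process open-file limit.
fd_kernel_ceiling() {
  cat /proc/sys/fs/nr_open
}

# unit_nofile_limit UNIT: LimitNOFILE value systemd reports for a unit.
unit_nofile_limit() {
  systemctl show "$1" -p LimitNOFILE --value
}
```

On the instance, the service's PID can be obtained with systemctl show jada-agent.service -p MainPID --value and passed to fd_count and fd_soft_limit. Comparing the two numbers under load shows how much headroom the current limit leaves.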

Configuration Update Mechanism

The model configuration is persisted in two locations:

/Users/cb/.claude/settings.json — Local user configuration for interactive CLI sessions
/Users/cb/Documents/repos/.claude/settings.json — Repository-specific overrides for the claude --dangerously-skip-permissions workflow

The grep command used to verify the update:

grep -n '"model"' /Users/cb/.claude/settings.json

This confirms the model field now reads "model": "claude-sonnet-4-6" instead of the previous Haiku reference.
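
The same check can be made pass/fail so it is usable in a script. This is a minimal sketch: check_model is a hypothetical helper, and it matches the line textually instead of parsing JSON, so it assumes the key and value sit on one line as they do in the grep output above.

```shell
#!/usr/bin/env bash
# check_model FILE EXPECTED: succeed only if FILE contains "model": "EXPECTED".
# Textual match only (no JSON parsing); EXPECTED is used inside a regex, which
# is fine for identifiers made of letters, digits, and hyphens.
check_model() {
  local file="$1" expected="$2"
  grep -Eq "\"model\"[[:space:]]*:[[:space:]]*\"${expected}\"" "$file"
}
```

For example: check_model /Users/cb/.claude/settings.json claude-sonnet-4-6 && echo "model updated"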

Current Session Behavior and Next Steps

Important caveat: configuration changes do not apply to already-running sessions. The orchestrator service on the Lightsail instance must be restarted to pick up the new model setting. Execute:

ssh ubuntu@34.239.233.28 sudo systemctl restart jada-agent.service

Verify the restart completed:

ssh ubuntu@34.239.233.28 systemctl is-active jada-agent.service

This should return active.
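
A service can take a few seconds to come back after a restart, so a single is-active call may report activating. The sketch below polls until the unit is active or the attempts run out. wait_active and the POLL_INTERVAL variable are hypothetical names, and the retry count and interval are arbitrary defaults.

```shell
#!/usr/bin/env bash
# wait_active HOST UNIT [TRIES]: poll `systemctl is-active` over SSH until it
# prints "active" or TRIES attempts (default 10) have been used.
wait_active() {
  local host="$1" unit="$2" tries="${3:-10}" state
  while :; do
    # is-active exits non-zero for any state other than active, so ignore
    # the exit status and compare the printed state instead.
    state="$(ssh "$host" systemctl is-active "$unit" 2>/dev/null)" || true
    [ "$state" = "active" ] && return 0
    tries=$((tries - 1))
    [ "$tries" -le 0 ] && return 1
    sleep "${POLL_INTERVAL:-2}"
  done
}
```

Usage after the restart command: wait_active ubuntu@34.239.233.28 jada-agent.service || echo "service did not become active" >&2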

Cost and Performance Trade-offs

Sonnet 4.6 costs approximately 2-3x more per token than Haiku. For an orchestrator that may process dozens of tasks per session, token consumption compounds quickly. However, better task routing accuracy and context maintenance can offset the higher price through reduced error rates and fewer re-routes to specialist agents. Whether they do in this system is what the baseline measurements should establish.

The orchestration pattern itself helps control costs: only the orchestrator uses Sonnet; specialist agents can remain on Haiku for their focused subtasks. This creates a cost-effective hybrid where expensive reasoning is applied only where it matters most.
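
The effect of the hybrid split can be estimated with simple arithmetic. If r is the Sonnet-to-Haiku price ratio and s is the share of all tokens handled by the orchestrator, the hybrid cost in units of the all-Haiku cost is s*r + (1 - s), while an all-Sonnet setup costs r. The sketch below computes this. hybrid_cost is a hypothetical helper, the inputs are illustrative and not measured, and it simplifies by using a single ratio for input and output tokens.

```shell
#!/usr/bin/env bash
# hybrid_cost RATIO SHARE: cost relative to an all-Haiku setup when SHARE of
# the tokens (0..1) is billed at RATIO times the Haiku price and the rest at 1x.
hybrid_cost() {
  awk -v r="$1" -v s="$2" 'BEGIN { printf "%.2f\n", s * r + (1 - s) }'
}

# Illustrative inputs: ratio 3, orchestrator handles 20% of tokens.
hybrid_cost 3 0.2   # prints 1.40, versus 3.00 for an all-Sonnet setup
```

Under these assumed numbers the hybrid costs 40% more than all-Haiku, compared with 200% more for moving every agent to Sonnet.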

Monitoring and Validation

After deployment, monitor the orchestrator's performance:

Check service logs: journalctl -u jada-agent.service -f
Monitor CPU/memory usage on the Lightsail instance dashboard
Track response latency for orchestration tasks (Sonnet will be slower, but should be within acceptable bounds)
Measure task success rates and error-routing frequency

If latency becomes problematic, consider implementing request batching in the orchestrator or running additional instances in the specialist agent tier so that subtasks execute in parallel.
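
For the log check, a count of recent error-level entries is easier to track over time than a live tail. This is a sketch run on the instance itself: recent_error_count is a hypothetical helper, and it assumes the service logs failures to the journal at priority err or higher, which the write-up does not confirm.

```shell
#!/usr/bin/env bash
# recent_error_count UNIT SINCE: number of journal lines at priority "err" or
# more severe for UNIT since the given time (e.g. "1 hour ago").
recent_error_count() {
  journalctl -u "$1" --since "$2" -p err -q --no-pager -o cat | wc -l
}
```

Example: recent_error_count jada-agent.service "1 hour ago". Comparing the count before and after the model change gives a rough signal of whether error frequency moved.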

What's Next

The next phase involves establishing a performance baseline. Run a representative set of complex orchestration workflows through both configurations, measuring token consumption, latency, and task success rates. Use this data to determine whether Sonnet 4.6 is the right permanent choice, or whether a hybrid approach (using Sonnet for only the most complex orchestration subtasks) would be more cost-effective.
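
Once the runs are recorded, the comparison reduces to a per-configuration summary. The sketch below assumes a hypothetical CSV layout with a header row and the columns config,latency_ms,tokens,success (success is 1 or 0). Neither the file format nor the function name summarize_runs comes from the existing system.

```shell
#!/usr/bin/env bash
# summarize_runs FILE: per-configuration run count, mean latency, mean token
# count, and success rate. Assumed columns: config,latency_ms,tokens,success.
summarize_runs() {
  awk -F, 'NR > 1 {
      n[$1]++; lat[$1] += $2; tok[$1] += $3; ok[$1] += $4
    }
    END {
      for (c in n)
        printf "%s runs=%d mean_latency_ms=%.1f mean_tokens=%.1f success_rate=%.2f\n",
               c, n[c], lat[c] / n[c], tok[c] / n[c], ok[c] / n[c]
    }' "$1" | sort
}
```

Each output line covers one configuration, so the Haiku and Sonnet rows can be read side by side.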