Scaling Claude Agent Orchestration: Upgrading from Haiku to Sonnet 4.6 for Complex Task Decomposition

What Was Done

We upgraded the default model for the JADA orchestrator system from Claude Haiku 4.5 to Claude Sonnet 4.6, enabling more sophisticated task decomposition and reasoning for complex booking workflows. This change was made at the configuration level in ~/.claude/settings.json to ensure the orchestrator leverages stronger inference capabilities without requiring code changes across the distributed agent architecture.

The Problem: Haiku's Limitations at Orchestration Scale

The initial setup used Haiku 4.5 as the default model for the CLI invocation pattern:

cd ~/Documents/repos && claude --dangerously-skip-permissions

For simple, well-scoped tasks, Haiku performs well: it is fast, cost-effective, and sufficient for straightforward execution. However, when the orchestrator needs to:

  • Decompose ambiguous multi-step workflows (e.g., "book a venue and catering with budget constraints")
  • Reason about trade-offs between multiple constraint sets
  • Generate complex prompts for downstream specialist agents
  • Recover from partial failures by re-routing tasks intelligently

Haiku's shallower reasoning becomes a bottleneck. We observed Haiku linearizing problem spaces that should have been explored in parallel, which led to suboptimal orchestration decisions.

Technical Implementation

Configuration Update

The model upgrade was implemented by changing the "model" key in the user-level settings file:

/Users/cb/.claude/settings.json

We verified the change with:

grep -n '"model"' /Users/cb/.claude/settings.json

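With this in place, any subsequent invocation of claude, with or without --dangerously-skip-permissions, defaults to Sonnet 4.6 rather than Haiku 4.5 unless an explicit --model flag overrides it. The change persists across terminal sessions, so per-invocation --model flags are no longer needed. Note that the grep above only confirms that a "model" key is present; read the matched line to confirm its value.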
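The grep check can be tightened so that it fails when the file names a different model. The following is a minimal sketch, not the project's actual tooling: the function names configured_model and assert_model are hypothetical, and it assumes the setting is written as a single "model": "..." entry on one line. A real JSON parser would be more robust than this sed pattern.

```shell
# Print the value of the first "model" key found in a settings file.
# Assumes the entry sits on one line, e.g.:  "model": "claude-sonnet-4-6"
configured_model() {
  local settings_file=$1
  [ -r "$settings_file" ] || { echo "cannot read $settings_file" >&2; return 1; }
  sed -n 's/.*"model"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$settings_file" | head -n 1
}

# Return non-zero (with a message on stderr) if the file does not name
# the expected model.
assert_model() {
  local expected=$1 settings_file=$2 actual
  actual=$(configured_model "$settings_file") || return 1
  if [ "$actual" != "$expected" ]; then
    echo "expected model '$expected', found '${actual:-<none>}'" >&2
    return 1
  fi
}

# Example (not executed here):
#   assert_model claude-sonnet-4-6 "$HOME/.claude/settings.json"
```

Run before starting an orchestration session, a check like this surfaces an accidental rollback or a hand-edited settings file early.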

Why Not Inline Flags?

We considered passing --model claude-sonnet-4-6 on each command invocation, but rejected this approach because:

  • Orchestrator stability: The orchestrator spawns multiple Claude instances internally. Using a global configuration ensures consistency without modifying each spawn point.
  • Remote integration: The remote orchestrator running on the Lightsail instance (IP: 34.239.233.28, verified via the AWS API) uses the same configuration convention. Flags typed into a local shell session do not carry over to commands executed over SSH.
  • Developer experience: Requiring flags on every invocation increases cognitive load and introduces failure modes where someone accidentally uses the wrong model for a critical task.

File Descriptor Tuning for Concurrency

Separately, we configured the shell environment with:

ulimit -n 2147483646

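This command requests a limit of 2^31 - 2 open file descriptors, one below the maximum value of a signed 32-bit integer. In practice the shell can raise the soft limit only as far as the hard limit, and the kernel imposes its own ceiling (for example fs.nr_open on Linux or kern.maxfilesperproc on macOS), so the effective limit is usually far lower. Run ulimit -n afterwards to see what was actually applied. Raising the limit matters for the orchestrator pattern because: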

  • Socket-intensive work: Each agent connection to the remote instance consumes a file descriptor. With many concurrent agents, the default limit (often 1024) can be exhausted.
  • Streaming responses: The Claude API streams tokens as server-sent events, so each in-flight streaming request keeps a connection (or, when the client multiplexes, an HTTP/2 stream) open until the response completes. Complex orchestration workflows can have dozens of concurrent streams.
  • Avoiding opaque failures: Without this tuning, you can hit "too many open files" errors that surface as seemingly random task failures, which makes debugging extremely difficult.

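This setting applies only to the current shell session and the processes it starts. To make it persistent for the service on the remote instance, set LimitNOFILE= in the jada-agent.service unit file. /etc/security/limits.conf applies to PAM login sessions (such as interactive SSH shells), not to systemd services.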
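Because a request above the hard limit is rejected, it can help to clamp the requested value first. The sketch below is one way to do that in bash. The function name raise_nofile_soft_limit is hypothetical, and on systems where the hard limit reports "unlimited" the kernel ceiling may still reject a very large value.

```shell
# Raise (or lower) the soft open-file limit for the current shell,
# clamping the request to the hard limit when one is reported.
raise_nofile_soft_limit() {
  local desired=$1 hard
  hard=$(ulimit -Hn)
  if [ "$hard" != "unlimited" ] && [ "$desired" -gt "$hard" ]; then
    echo "clamping $desired to hard limit $hard" >&2
    desired=$hard
  fi
  # Fails with a non-zero status if the kernel still rejects the value.
  ulimit -Sn "$desired"
}

# Example (not executed here):
#   raise_nofile_soft_limit 65536 && ulimit -Sn
```

The function must be called directly rather than inside $(...), since a limit set in a subshell does not affect the calling shell.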

Infrastructure: Orchestrator Status Verification

To confirm the orchestrator on the remote Lightsail instance is running and healthy, we executed:

ssh -o ConnectTimeout=5 -o StrictHostKeyChecking=no ubuntu@34.239.233.28 \
  "systemctl status jada-agent.service 2>&1 | head -20"

And verified the instance state via AWS CLI:

aws lightsail get-instance --instance-name jada-agent --region us-east-1 | grep -A 5 '"state"'

The orchestrator service (jada-agent.service) runs as a systemd unit. Provided the unit defines a Restart= policy and is enabled, it restarts on failure and comes back after reboots. This architecture means:

  • Remote execution: Tasks issued via claude --dangerously-skip-permissions can be routed to 34.239.233.28 if the configuration specifies it, or execute locally and call out to the remote orchestrator via HTTP/gRPC.
  • State management: The service maintains connection pools and caches to avoid re-establishing credentials with the Claude API on each task.
  • Logging: Systemd captures stdout/stderr, allowing you to inspect orchestrator decisions via journalctl -u jada-agent.service -f.
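The two manual checks above can be combined into a single pass/fail probe. This is a sketch under assumptions: the function name orchestrator_healthy is hypothetical, and it assumes the instance state is exposed at instance.state.name in the get-instance response and that systemctl is-active is available on the remote host.

```shell
# Succeeds only if the Lightsail instance reports "running" and the
# systemd unit on the host reports "active".
orchestrator_healthy() {
  local host=$1 unit=$2 instance=$3 region=$4
  local state active

  state=$(aws lightsail get-instance --instance-name "$instance" --region "$region" \
    --query 'instance.state.name' --output text) || return 1
  if [ "$state" != "running" ]; then
    echo "instance state is '$state', expected 'running'" >&2
    return 1
  fi

  # systemctl is-active prints the unit state (active, inactive, failed, ...).
  active=$(ssh -o ConnectTimeout=5 "$host" "systemctl is-active $unit")
  if [ "$active" != "active" ]; then
    echo "unit $unit is '${active:-unknown}'" >&2
    return 1
  fi
}

# Example (not executed here):
#   orchestrator_healthy ubuntu@34.239.233.28 jada-agent.service jada-agent us-east-1
```

A single exit status like this is easier to wire into a pre-flight script than reading the first 20 lines of systemctl status by eye.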

Key Architectural Decisions

Model Upgrade Without Code Changes

The decision to upgrade at the configuration layer rather than in code keeps the agent orchestration pattern flexible. If you later determine that Haiku is sufficient for certain task classes and Sonnet should only be used for high-complexity workflows, you can:

  • Introduce conditional logic that checks task complexity and selects the model programmatically.
  • Implement A/B testing by randomizing model selection for certain task types.
  • Roll back to Haiku globally in seconds if cost becomes prohibitive.
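As an illustration of the first option, conditional selection can be as small as a lookup from a task class to a model name. The task classes below (low, simple) and the function name select_model are hypothetical, and the Haiku identifier claude-haiku-4-5 is an assumption that should be checked against the current model list.

```shell
# Map a coarse task-complexity label to a model name.
# Anything not explicitly marked as simple falls through to Sonnet,
# matching the new global default.
select_model() {
  case "$1" in
    low|simple) echo "claude-haiku-4-5" ;;   # assumed identifier
    *)          echo "claude-sonnet-4-6" ;;
  esac
}

# Example (not executed here):
#   claude --model "$(select_model simple)"
```

Defaulting unknown classes to the stronger model trades some cost for safety. Inverting the default would be the cheaper but riskier choice.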

Concurrency Primitives

The ulimit -n increase is necessary but not sufficient for robust concurrency. Consider also:

  • Setting ulimit -u (max processes) if the orchestrator spawns subprocesses for each agent.
  • Configuring TCP backlog sizes via /proc/sys/net/core/somaxconn if agents communicate via sockets.
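  • Monitoring actual file descriptor usage with lsof -p "$(pgrep -d, -f jada-agent)" to avoid hitting limits unexpectedly. The -d, option makes pgrep emit a comma-separated PID list, which is the form lsof -p expects when more than one process matches.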
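For ongoing monitoring, a small helper that counts descriptors and compares the count to a threshold is easier to script than reading lsof output. This is a sketch with hypothetical function names (fd_count, fd_below_threshold). It reads /proc/<pid>/fd where available and falls back to lsof otherwise, and it takes the limit as an argument rather than discovering the target process's own limit.

```shell
# Count open file descriptors for a PID.
fd_count() {
  local pid=$1
  if [ -d "/proc/$pid/fd" ]; then
    ls "/proc/$pid/fd" | wc -l | tr -d ' '
  else
    # Fallback: skip the lsof header line. This also counts non-descriptor
    # entries such as cwd and txt, so it slightly overestimates.
    lsof -p "$pid" 2>/dev/null | tail -n +2 | wc -l | tr -d ' '
  fi
}

# Succeeds while usage stays below threshold_pct percent of limit.
fd_below_threshold() {
  local pid=$1 limit=$2 threshold_pct=$3 used
  used=$(fd_count "$pid")
  [ $(( used * 100 )) -lt $(( limit * threshold_pct )) ]
}

# Example (not executed here):
#   fd_below_threshold "$(pgrep -o -f jada-agent)" 65536 80 || echo "fd usage high"
```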

What's Next

To fully validate this upgrade in your orchestration environment:

  • Benchmark task decomposition: Run a suite of complex booking workflows with Sonnet 4.6 and compare the number of steps, latency, and token costs against historical Haiku runs.
  • Monitor API costs: Sonnet is 2–3x more expensive per token than Haiku. Track actual spend and ROI in terms of task success rate and orchestration quality.
  • Test specialist agent cascade: Confirm that when the orchestrator spawns downstream agents (via the same CLI pattern), they inherit the Sonnet 4.6 configuration and don