Optimizing Claude Agent Performance in a Multi-Tier Orchestration System: Model Selection, File Descriptor Limits, and EC2 Integration

When scaling an agentic system across distributed infrastructure, seemingly small configuration decisions compound into significant performance and cost implications. This post documents a recent optimization cycle where we addressed model selection for our JADA orchestrator, tuned system resource limits, and evaluated the cost-benefit tradeoffs of upgrading from Claude Haiku 3.5 to Sonnet 4.6 in a multi-agent workflow running across EC2 instances in us-east-1.

The Problem: Inadequate Model Capacity in a Multi-Agent Orchestrator

Our JADA booking system uses a hierarchical agent architecture where:

A central orchestrator decomposes complex booking requests into subtasks
Specialist agents handle domain-specific logic (availability checking, payment processing, conflict resolution)
Results cascade back up through the orchestrator for synthesis and client response

The orchestrator was initialized with Claude Haiku 3.5 as a cost optimization, but field observation indicated task decomposition was too shallow for complex multi-step workflows. Haiku would occasionally fail to identify interdependencies between subtasks, leading to redundant API calls and incomplete context propagation to specialist agents.

System Resource Constraints: File Descriptor Limits

Before addressing model selection, we encountered a foundational infrastructure issue. During local development and testing, we discovered that the default file descriptor limit on our development machines was insufficient for agents managing multiple concurrent connections:

ulimit -n

This returned 256 on macOS and 1024 on our test EC2 instance—inadequate for an orchestrator spawning multiple agent processes, each maintaining persistent connections to Claude API endpoints, Redis cache layers, and PostgreSQL databases.
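Plain `ulimit -n` reports only the soft limit. A minimal sketch for inspecting both the soft and hard limits, plus the number of descriptors the current shell already holds, might look like the following. The /proc lookup is Linux-only, so that part is skipped on macOS.

```shell
# Soft limit: what the process is currently allowed to open.
soft=$(ulimit -S -n)
# Hard limit: the ceiling an unprivileged process may raise its soft limit to.
hard=$(ulimit -H -n)
echo "soft=$soft hard=$hard"

# Linux only: count the descriptors currently open by this shell.
if [ -d "/proc/$$/fd" ]; then
  open=$(ls "/proc/$$/fd" | wc -l)
  echo "open descriptors in this shell: $open"
fi
```

Comparing the open count against the soft limit under realistic load gives a rough idea of how much headroom is actually needed.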

We applied:

ulimit -n 2147483646

This requests a limit of 2,147,483,646 open file descriptors, which is exactly 2^31 - 2, one less than the largest 32-bit signed integer (2^31 - 1 = 2,147,483,647). The reasoning:

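Why this value? Rather than a conservative limit like 65535, we chose a value so large that descriptor exhaustion could not become a bottleneck during traffic spikes and no further tuning would be needed as we scale. Note that this is not the kernel maximum: on Linux the per-process limit cannot exceed the fs.nr_open sysctl (1,048,576 by default), and macOS bounds it by kern.maxfilesperproc, so the effective limit cannot reach this value unless those ceilings are raised first.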
Why file descriptors matter: Each agent process, API connection, socket, and pipe consumes a file descriptor. With 5–10 concurrent agent instances per orchestrator, plus Redis connections, this compounds quickly.
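Persistence: A ulimit change applies only to the current shell and the processes it starts. limits.conf is applied by PAM at login, so it covers SSH sessions but not services started by systemd, which take their limit from a LimitNOFILE= directive in the unit file. To persist the limit across reboots and new login sessions, we added the following to /etc/security/limits.conf on the EC2 instance running the orchestrator: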

*    soft  nofile  2147483646
*    hard  nofile  2147483646

And verified via:

ssh -o ConnectTimeout=5 -o StrictHostKeyChecking=no ubuntu@34.239.233.28 "ulimit -n"
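Linux will not accept a nofile limit above the fs.nr_open sysctl, so it can help to clamp the requested value before writing it anywhere. The sketch below is one way to do that; `clamp_nofile` is a hypothetical helper, not part of any tool, and the block only prints the result rather than changing any limit.

```shell
# Hypothetical helper: return the smaller of a requested limit and a ceiling.
clamp_nofile() {
  want=$1
  cap=$2
  if [ "$want" -gt "$cap" ]; then
    echo "$cap"
  else
    echo "$want"
  fi
}

# Linux exposes the per-process ceiling here; skip the check elsewhere.
if [ -r /proc/sys/fs/nr_open ]; then
  ceiling=$(cat /proc/sys/fs/nr_open)
  target=$(clamp_nofile 2147483646 "$ceiling")
  echo "fs.nr_open=$ceiling, largest usable nofile=$target"
fi
```

On a stock Linux kernel the ceiling is 1048576, so the value printed is typically far below the one requested above.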

Model Upgrade: From Haiku 3.5 to Sonnet 4.6

With resource limits established, we addressed the core issue: model capacity. We updated the default model in the Claude client configuration:

# Before
cat /Users/cb/.claude/settings.json | grep -i model
# Output: "model": "claude-3-5-haiku-latest"

# After
cat /Users/cb/.claude/settings.json | grep -i model
# Output: "model": "claude-sonnet-4-6"

The settings file lives at ~/.claude/settings.json. We also checked secondary locations such as ~/Documents/repos/.claude/settings.json for repo-specific overrides.
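Checking several settings files by hand is easy to get wrong, so a small loop can help. This is a sketch only: `show_model_settings` is a hypothetical helper, and it assumes the "model" key sits on a single line, as in the grep output above.

```shell
# Hypothetical helper: print the "model" line from each settings file that exists.
show_model_settings() {
  for f in "$@"; do
    if [ -f "$f" ]; then
      printf '%s: ' "$f"
      # Fall back to a marker when the file has no model key at all.
      grep -i '"model"' "$f" || echo '(no model key)'
    fi
  done
}

# The two locations mentioned in this post; missing files are skipped silently.
show_model_settings "$HOME/.claude/settings.json" \
                    "$HOME/Documents/repos/.claude/settings.json"
```

Any file that prints a different model than expected is a candidate for the override that actually takes effect.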

This affects all subsequent invocations of:

cd ~/Documents/repos && claude --dangerously-skip-permissions

Why Sonnet 4.6 over Opus 4.7?

Cost: Sonnet 4.6 is approximately 2–3x more expensive than Haiku but 30–40% cheaper than Opus on a per-token basis.
Task decomposition: Sonnet 4.6 excels at breaking down complex workflows into well-structured subtasks with clear dependencies—essential for orchestrator performance.
Latency: While slower than Haiku, Sonnet's latency profile (typically 2–4 seconds for complex prompts) is acceptable for orchestration workflows where decisions can be cached or batched.
Overkill avoidance: Opus 4.7 is optimized for extremely long-context reasoning and creative synthesis. Our orchestrator's bottleneck is task decomposition, not novelty—Sonnet is sufficient.

Integration with EC2-Based Orchestrator

The central question: does upgrading the local development model configuration affect the EC2 instance running the production orchestrator?

Answer: No, not directly. But here's the architecture:

# Development environment (local laptop)
/Users/cb/.claude/settings.json → claude CLI tool

# Production orchestrator (EC2)
/home/ubuntu/jada-orchestrator/.claude/settings.json → orchestrator's internal Claude client

We have two separate configurations. The development environment now defaults to Sonnet 4.6, but the EC2 instance still uses its own settings file. We verified the EC2 orchestrator status:

aws lightsail get-instance --instance-name jada-agent --region us-east-1 --query 'instance.state.name' --output text
# Output: running

And confirmed systemd service status:

ssh -o ConnectTimeout=5 -o StrictHostKeyChecking=no ubuntu@34.239.233.28 "systemctl status jada-agent.service 2>&1 | head -20"

To propagate the model upgrade to production, we need to:

Update /home/ubuntu/jada-orchestrator/.claude/settings.json on the EC2 instance to also specify Sonnet 4.6
Restart the orchestrator service: sudo systemctl restart jada-agent.service
Verify specialist agents are receiving properly decomposed tasks by inspecting logs in CloudWatch
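The first two steps above could be scripted roughly as follows. This is a sketch under stated assumptions: `set_model` is a hypothetical helper, it assumes the settings file already contains a one-line "model" key, and it uses sed rather than a JSON-aware tool, so it is only suitable for simple files.

```shell
# Hypothetical helper: rewrite the value of a one-line "model" key in a JSON file.
# Assumes the model name contains no characters special to sed (/, &, \).
set_model() {
  file=$1
  model=$2
  tmp=$(mktemp) || return 1
  sed 's/"model"[[:space:]]*:[[:space:]]*"[^"]*"/"model": "'"$model"'"/' "$file" > "$tmp" || return 1
  # Write back through the original file so its ownership and mode are preserved.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Intended use on the instance (path and service name as given in this post):
#   set_model /home/ubuntu/jada-orchestrator/.claude/settings.json claude-sonnet-4-6
#   sudo systemctl restart jada-agent.service
#   systemctl is-active jada-agent.service
```

Running the grep check from earlier against the file afterwards confirms the new value before the service is restarted.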

Key Decisions and Tradeoffs

Cost impact: Upgrading all orchestrator invocations from Haiku to Sonnet increases per-request costs by ~2.5x. However, improved task decomposition reduces redundant specialist agent calls by an estimated 30–40%, partially offsetting the model upgrade cost.
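Whether the offset is partial or complete depends on how much of total spend goes to the orchestrator in the first place. A back-of-the-envelope sketch, assuming specialist cost scales linearly with the number of specialist calls and the specialists' model is unchanged, follows. The 20% orchestrator share is a hypothetical input, not a measured figure; the 2.5x multiplier and the 35% reduction are taken from the estimates above.

```shell
# Relative total cost after the upgrade (1.000 = baseline).
# $1 = orchestrator share of baseline spend (hypothetical)
# $2 = orchestrator cost multiplier, $3 = fraction of specialist calls removed
relative_cost() {
  awk -v o="$1" -v m="$2" -v r="$3" \
    'BEGIN { printf "%.3f\n", o * m + (1 - o) * (1 - r) }'
}

# Orchestrator share at which the upgrade is cost-neutral: o = r / (m - 1 + r).
break_even_share() {
  awk -v m="$1" -v r="$2" 'BEGIN { printf "%.3f\n", r / (m - 1 + r) }'
}

relative_cost 0.20 2.5 0.35    # prints 1.020
break_even_share 2.5 0.35      # prints 0.189
```

Under these assumptions the upgrade roughly breaks even when the orchestrator accounts for about 19% of baseline spend, and costs more overall above that share.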

Session lifecycle: The claude CLI reads ~/.claude/settings.json when it starts, so the change applies to newly started claude sessions. Sessions that are already running continue using the previous model until they are restarted.

Environment variable overrides: We confirmed no environment variables override the settings file:

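# Match ANTHROPIC_* variables too (e.g. ANTHROPIC_MODEL), whose value may be a bare alias
env | grep -iE 'anthropic|claude' | grep -i model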
# Output: (empty—no overrides in place)

Verification and Next Steps

Before full production rollout, we recommend:
