```html

Implementing Credential Management and SMS Relay Infrastructure for Multi-Carrier Message Cascading

Context: The Problem

Queen of San Diego's Quick Dump Now (QDN) operation needed a way to cascade SMS notifications across multiple carriers when primary contact channels failed. The existing infrastructure couldn't handle carrier-level forwarding constraints, and manual credential management posed both security and operational risks. This post covers how we solved credential lifecycle management and built the foundation for a Twilio-based SMS relay system.

What Was Done

  • Centralized Twilio credential storage in environment configuration with proper filesystem permissions
  • Created reference documentation for credential rotation and future integration
  • Designed credential priority system (API keys vs. auth tokens) for different runtime contexts
  • Established patterns for cascading message relay: QDN primary → Sergio secondary → tertiary backup (858-335-4807)

Credential Management: Technical Implementation

Storage Architecture

Rather than hardcoding Twilio credentials into application source or passing them as CLI arguments, we adopted a centralized secrets file approach. All Twilio credentials are now stored in:

/Users/cb/Documents/repos/.secrets/repos.env

This file uses a flat KEY=VALUE format that shell scripts can source directly and Python code can parse at startup:

TWILIO_ACCOUNT_SID=ACxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
TWILIO_AUTH_TOKEN=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
TWILIO_API_KEY=SKxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
TWILIO_API_SECRET=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

The file is protected with restrictive permissions (mode 600) to prevent unauthorized read access by other processes or users on the system.
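As a rough illustration of the Python side, the loader below parses the flat KEY=VALUE format and refuses to read a file whose permissions are looser than owner-only. This is a minimal sketch, not the QDN implementation: the function name `load_env_file` is hypothetical, and it assumes a POSIX filesystem and no `export` prefixes or multi-line values.

```python
import stat
from pathlib import Path


def load_env_file(path):
    """Parse KEY=VALUE lines; refuse files accessible to group or others."""
    path = Path(path)
    mode = stat.S_IMODE(path.stat().st_mode)
    # Any group/other permission bit means the file is looser than mode 600.
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        raise PermissionError(f"{path} has mode {mode:o}; expected 600")

    values = {}
    for raw in path.read_text().splitlines():
        line = raw.strip()
        # Skip blanks, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values
```

Failing loudly on a permissions mismatch means an accidental `chmod 644` surfaces at startup rather than going unnoticed.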

Credential Types and Use Cases

Twilio provides multiple credential types, each optimized for different contexts:

  • Account SID + Auth Token: Account-wide credentials with full access to the REST API. Used in backend services that need to create or manage Twilio resources (e.g., provisioning phone numbers, updating webhook endpoints, viewing call/SMS logs).
  • API Key + Secret: Preferred for SDK integrations and server-side runtime environments. The key pair is used together with the Account SID, and each key can be revoked independently without rotating the account's Auth Token, which reduces the blast radius if a key is compromised.

For the QDN SMS relay, we prioritize the API Key pair in application code (better audit trail, granular revocation), but keep the Account SID + Auth Token available for administrative tasks and legacy integrations.
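The priority rule above can be sketched as a small selection function. Everything here is illustrative: `TwilioAuth` and `select_twilio_auth` are hypothetical names, and the `admin` flag is an assumption about how administrative callers would opt in to the Auth Token.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwilioAuth:
    username: str     # API Key SID, or Account SID when using the Auth Token
    password: str     # API Secret, or the Auth Token
    account_sid: str
    kind: str         # "api_key" or "auth_token"


def select_twilio_auth(env, admin=False):
    """Prefer the API Key pair for runtime code; use the Auth Token for
    admin tasks, or as a fallback when no key pair is configured."""
    sid = env.get("TWILIO_ACCOUNT_SID")
    if not sid:
        raise KeyError("TWILIO_ACCOUNT_SID is required")
    key, secret = env.get("TWILIO_API_KEY"), env.get("TWILIO_API_SECRET")
    token = env.get("TWILIO_AUTH_TOKEN")
    if not admin and key and secret:
        return TwilioAuth(key, secret, sid, "api_key")
    if token:
        return TwilioAuth(sid, token, sid, "auth_token")
    raise KeyError("no usable Twilio credentials found")
```

With the Twilio Python helper library, the result would typically be passed as `Client(auth.username, auth.password, auth.account_sid)`.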

Reference Documentation

To prevent future credential lookup bottlenecks, we created a reference file at:

/Users/cb/.claude/projects/-Users-cb-Documents-repos/memory/reference_twilio_credentials.md

This file documents credential ownership, rotation schedules, and which systems depend on which keys. That mapping matters most during incident response: if a key is compromised, we can immediately identify which services need new credentials instead of guessing.

Infrastructure: Message Cascading Design

The Three-Tier Relay Pattern

QDN dump notifications follow this cascade:

  • Tier 1 (Primary): Direct SMS to customer via Twilio (fastest, highest delivery guarantee)
  • Tier 2 (Secondary): Forward to Sergio's number if Tier 1 fails (manual dispatch)
  • Tier 3 (Tertiary): Fallback to backup contact 858-335-4807 if Sergio is unavailable

Previously, Tier 2→3 forwarding was impossible at the carrier level. With Twilio in the path, we can retry failed messages programmatically with exponential backoff and escalate to the next tier with conditional logic:

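from twilio.base.exceptions import TwilioRestException

# twilio_client, QDN_TWILIO_NUMBER, SERGIO_NUMBER, and the helpers
# format_dump_alert, should_cascade, notify_dispatcher, and log_delivery
# are assumed to be defined elsewhere in the QDN backend.

def cascade_notification(customer_number, dump_details):
    """
    Attempt delivery through escalating channels.

    messages.create() only raises when the API rejects the request;
    carrier-level delivery failures arrive later via status callbacks.
    """
    try:
        # Tier 1: Direct customer notification
        result = twilio_client.messages.create(
            body=format_dump_alert(dump_details),
            from_=QDN_TWILIO_NUMBER,
            to=customer_number,
        )
        log_delivery(customer_number, 'tier_1', sid=result.sid)
    except TwilioRestException as e:
        if not should_cascade(e):
            # Do not silently swallow errors that should not escalate
            raise
        # Tier 2: Forward to Sergio
        notify_dispatcher(SERGIO_NUMBER, dump_details)
        log_delivery(customer_number, 'tier_2_escalation', reason=str(e))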
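The exponential backoff mentioned above is not shown in the snippet, so here is one way it could be layered in front of the Tier 1 send. This is a sketch under assumptions: `send_with_backoff` and `is_transient` are hypothetical helpers, and treating HTTP 429 and 5xx responses as retryable is our guess at a sensible policy (it relies on the exception exposing an HTTP `status` attribute, as `TwilioRestException` does).

```python
import time


def is_transient(exc):
    """Treat rate limiting (429) and server errors (5xx) as retryable."""
    status = getattr(exc, "status", None)
    return isinstance(status, int) and (status == 429 or 500 <= status < 600)


def send_with_backoff(send, is_retryable, attempts=3, base_delay=1.0,
                      sleep=time.sleep):
    """Call send(); after a retryable error wait base_delay * 2**n seconds
    and try again. The last error is re-raised so the caller can cascade."""
    for attempt in range(attempts):
        try:
            return send()
        except Exception as exc:
            out_of_attempts = attempt == attempts - 1
            if out_of_attempts or not is_retryable(exc):
                raise
            sleep(base_delay * 2 ** attempt)
```

Once retries are exhausted the exception propagates, so the `except` branch in `cascade_notification` can still decide whether to escalate to Tier 2.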

Why Not a Third-Party Service?

We evaluated alternatives such as Twilio Verify and Amazon SNS, but QDN's use case has specific constraints:

  • Conditional routing: The dispatch logic is stateful—if Sergio doesn't acknowledge, we need human intervention (Yelp review escalation, local marketing)
  • Audit trail: Each cascade event must be logged with timestamps and failure reasons for customer disputes
  • Cost control: Verify is priced per verification and built for one-time passcodes rather than arbitrary notifications; Twilio's pay-as-you-go messaging is more flexible for our variable volume
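To make the audit-trail requirement concrete, each cascade step could be written as one structured log line. The sketch below is hypothetical: the function name `cascade_event` and the field names are assumptions rather than the QDN schema.

```python
import json
from datetime import datetime, timezone


def cascade_event(customer_number, tier, sid=None, reason=None, now=None):
    """Build one JSON audit line for a cascade step.

    `now` is injectable so tests (and replays) can use a fixed timestamp.
    """
    timestamp = (now or datetime.now(timezone.utc)).isoformat()
    record = {
        "ts": timestamp,           # UTC, ISO 8601
        "customer": customer_number,
        "tier": tier,              # e.g. "tier_1", "tier_2_escalation"
        "sid": sid,                # Twilio message SID when a send succeeded
        "reason": reason,          # failure reason when escalating
    }
    return json.dumps(record, sort_keys=True)
```

One JSON object per line keeps the log easy to search by customer number when a dispute comes in.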

Key Decisions

Why Centralize in repos.env Instead of AWS Secrets Manager?

Trade-off analysis: AWS Secrets Manager would provide better audit logging and automatic rotation, but it adds a network call to fetch secrets (one per message send unless values are cached) and requires an IAM role on every compute instance. For a monolithic backend deployed on a single Lightsail instance, the operational overhead wasn't justified. We use repos.env for development/staging; production will move to AWS Secrets Manager once we migrate to containerized deployment on ECS.
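The latency concern is commonly handled by caching fetched secrets for a short time, which also keeps the calling code unchanged if the backing store changes later. The wrapper below is a sketch only: `CachedSecrets` is a hypothetical class, the five-minute TTL is an arbitrary assumption, and `fetch` stands in for whatever reads repos.env today or calls a secrets service after the migration.

```python
import time


class CachedSecrets:
    """Wrap any fetch() callable and reuse its result for ttl seconds."""

    def __init__(self, fetch, ttl=300.0, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl
        self._clock = clock
        self._value = None
        self._expires = 0.0

    def get(self):
        now = self._clock()
        # Refetch on first use or once the cached copy has expired.
        if self._value is None or now >= self._expires:
            self._value = self._fetch()
            self._expires = now + self._ttl
        return self._value
```

A short TTL also bounds how long a rotated or revoked credential can linger in memory.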

API Key vs. Account Token Priority

Decision: SDK code prefers the API Key because it's independently revocable. If a developer accidentally commits a key to a public repo, we revoke just that key and the rest of the account keeps working. A compromised Auth Token, by contrast, has to be rotated account-wide, which forces an update in every integration that authenticates with it.

Three-Tier Cascade vs. Multi-Carrier Load Balancing

Future consideration: We could extend this to use Twilio's multiple phone numbers across different carriers (T-Mobile, AT&T, Verizon) to maximize delivery probability. For now, the Sergio/tertiary escalation is sufficient because failed direct-to-customer sends are rare (an estimated ~0.3%; note that Twilio's SLA covers API availability rather than message delivery, so this figure should be validated against our own delivery logs), and manual dispatch adds valuable context (customer relationship check, premium service offer).

What's Next

  • Build the relay handler: Implement the cascade logic in the QDN backend (likely in a new module: `/repos/qdn/sms_relay.py`)
  • Webhook setup: Configure Twilio to POST delivery status updates back to our backend at a new endpoint (e.g., `POST /api/qdn