Building a Local SMS Sync Bridge: Extracting Messages from macOS without External APIs
Over the past development session, I built a lightweight SMS synchronization tool that extracts message threads from macOS Messages.app and processes them locally—without relying on Twilio or other cloud-based SMS services. This post walks through the architecture, the problems we solved, and why we chose a local-first approach.
The Problem: SMS Access Without Cloud Dependencies
The original ask was simple: read SMS conversations from a specific business line and generate digests. The initial assumption was that we'd use Twilio's API to pull inbound messages. However, after investigating the codebase, we discovered that:
- Twilio credentials weren't persisted in .secrets/repos.env
- The voice agent infrastructure referenced SMS capabilities but didn't actively sync them
- Messages were already being exported locally from macOS Messages.app
- We had a complete SMS export file with threading and metadata
This led to an architectural shift: instead of adding another external dependency, we'd build a local sync bridge that reads the native macOS Messages database and processes conversations in memory.
Architecture: Three-Layer Approach
Layer 1: Data Source
macOS Messages stores conversations in a SQLite database at ~/Library/Messages/chat.db. This database contains:
- chat table: conversation metadata (identifiers, names, service type)
- message table: individual messages with timestamps, content, and sender info
- chat_message_join table: relationships between chats and messages
- handle table: phone numbers and contact identifiers
The database structure allows us to reconstruct full conversation threads without any external API calls. The Messages service uses different backends, including SMS (carrier text messages, which are separate from iMessage), iMessage (Apple's encrypted protocol), and Bonjour (local network), each with a distinct service identifier.
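As a minimal sketch of reading this data source, the snippet below opens a Messages-style database read-only and counts chats per service. It assumes the chat table exposes a service_name column (true of the schemas I have seen, but the schema is undocumented and may change between macOS releases). The helper names open_read_only and count_chats_by_service are hypothetical and are not taken from the actual script.

```python
import sqlite3
from pathlib import Path

# Default location of the Messages database on macOS.
CHAT_DB = Path.home() / "Library" / "Messages" / "chat.db"


def open_read_only(db_path):
    """Open a SQLite file read-only so the sync can never modify Messages data."""
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


def count_chats_by_service(conn):
    """Return {service_name: chat_count}. Assumes chat.service_name exists."""
    rows = conn.execute(
        "SELECT service_name, COUNT(*) FROM chat GROUP BY service_name"
    )
    return dict(rows.fetchall())
```

Calling count_chats_by_service(open_read_only(CHAT_DB)) against the real database generally requires granting Full Disk Access to the terminal or interpreter that runs it.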
Layer 2: Extraction Engine
The core script, /Users/cb/Documents/repos/tools/samsung_sms_sync.py, implements:
- Chat discovery: Queries the chat table for SMS-only conversations (filtering by service type)
- Message reconstruction: Joins messages with handles to build complete threads with proper attribution
- Timestamp normalization: Converts macOS internal timestamps (counted from 2001-01-01 UTC, in seconds on older releases and in nanoseconds on recent ones) to standard Unix epoch
- Thread grouping: Organizes messages by conversation ID and sorts chronologically
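The timestamp normalization step can be sketched as follows. This is an illustration, not the script's actual code: the function names are hypothetical, and the nanosecond detection is a heuristic that assumes any value of 10**12 or more cannot be a count of seconds.

```python
from datetime import datetime, timezone

# Seconds between the Unix epoch (1970-01-01) and Apple's epoch (2001-01-01), both UTC.
APPLE_EPOCH_OFFSET = 978_307_200

# Assumption: values this large can only be nanoseconds. 10**12 seconds
# would be roughly 31,700 years, so no real second-based timestamp reaches it.
NANOSECOND_THRESHOLD = 10**12


def apple_to_unix(raw):
    """Convert a message.date value to Unix seconds."""
    seconds = raw / 1e9 if abs(raw) >= NANOSECOND_THRESHOLD else raw
    return seconds + APPLE_EPOCH_OFFSET


def apple_to_datetime(raw):
    """Convert a message.date value to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(apple_to_unix(raw), tz=timezone.utc)
```

Returning a timezone-aware datetime keeps later sorting and digest formatting independent of the machine's local timezone.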
Example query pattern for extracting SMS threads:
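    SELECT
        c.guid,
        c.display_name,
        h.id AS phone_number,
        m.text,
        m.date,
        m.is_from_me,
        m.service
    FROM chat c
    JOIN chat_message_join cmj ON c.rowid = cmj.chat_id
    JOIN message m ON cmj.message_id = m.rowid
    LEFT JOIN handle h ON m.handle_id = h.rowid
    -- The chat table names its service column service_name; only message has a service column.
    WHERE c.service_name = 'SMS'
      -- US/Canada numbers only. This filter also drops rows with no matching handle.
      AND h.id LIKE '+1%'
    ORDER BY c.rowid, m.date ASC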
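A minimal Python sketch of running a query like this and grouping the rows into threads follows. It assumes the chat table's service column is named service_name, and it leaves out the '+1%' filter so that outgoing rows without a handle are kept. The load_threads name and the shape of the returned dicts are hypothetical.

```python
import sqlite3
from collections import defaultdict

THREAD_QUERY = """
SELECT c.guid, h.id, m.text, m.date, m.is_from_me
FROM chat c
JOIN chat_message_join cmj ON c.ROWID = cmj.chat_id
JOIN message m ON cmj.message_id = m.ROWID
LEFT JOIN handle h ON m.handle_id = h.ROWID
WHERE c.service_name = ?
ORDER BY c.ROWID, m.date ASC
"""


def load_threads(conn, service="SMS"):
    """Return {chat_guid: [message dicts in chronological order]}."""
    threads = defaultdict(list)
    for guid, phone, text, date, is_from_me in conn.execute(THREAD_QUERY, (service,)):
        threads[guid].append(
            {
                # Outgoing rows are attributed to "me" regardless of handle.
                "sender": "me" if is_from_me else phone,
                # text can be NULL, e.g. for attachment-only messages.
                "text": text,
                "date": date,
            }
        )
    return dict(threads)
```

Passing the service as a bound parameter rather than formatting it into the SQL string keeps the query reusable for other service types.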
Layer 3: Daemon Integration
We created a launchd plist at ~/Library/LaunchAgents/com.cb.samsung-sms-sync.plist to run the sync script periodically. This daemon:
- Runs every 5 minutes during business hours
- Checks the chat.db modification time to avoid redundant processing
- Outputs digest summaries for high-priority conversations
- Integrates with the existing SES email infrastructure to push summaries to c.b.ladd@gmail.com
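The modification-time check could look roughly like the sketch below. The state-file approach and both function names are assumptions made for illustration. The sketch also looks at the chat.db-wal sidecar file, on the assumption that the database runs in write-ahead-log mode and new messages may land there before chat.db itself changes.

```python
import json
from pathlib import Path


def latest_mtime(db_path):
    """Newest modification time across the database and its WAL sidecar, if present."""
    db_path = Path(db_path)
    candidates = [db_path, db_path.with_name(db_path.name + "-wal")]
    return max(p.stat().st_mtime for p in candidates if p.exists())


def has_changed(db_path, state_path):
    """Return True (and record the new mtime) if the database changed since the last run."""
    state_path = Path(state_path)
    current = latest_mtime(db_path)
    previous = None
    if state_path.exists():
        previous = json.loads(state_path.read_text()).get("mtime")
    if previous is not None and current <= previous:
        return False
    state_path.write_text(json.dumps({"mtime": current}))
    return True
```

A run that finds has_changed returning False can exit immediately, which keeps the five-minute cadence cheap when no new messages have arrived.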
The plist configuration uses StartInterval for regular scheduling and includes error logging to /tmp/samsung-sms-sync.log for debugging.
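For illustration, a job definition along these lines can be generated with Python's plistlib. The keys shown are standard launchd keys, but the interpreter path and the business-hours window are assumptions. StartInterval fires regardless of time of day, so this sketch assumes any business-hours restriction is enforced by a guard inside the script; within_business_hours is a hypothetical helper.

```python
import plistlib
from datetime import datetime

LABEL = "com.cb.samsung-sms-sync"
SCRIPT = "/Users/cb/Documents/repos/tools/samsung_sms_sync.py"
LOG = "/tmp/samsung-sms-sync.log"


def build_plist(interval_seconds=300):
    """Return launchd job definition bytes for a fixed-interval run."""
    job = {
        "Label": LABEL,
        # Assumption: python3 is available at this path on the machine.
        "ProgramArguments": ["/usr/bin/python3", SCRIPT],
        "StartInterval": interval_seconds,  # 300 seconds = every 5 minutes
        "StandardOutPath": LOG,
        "StandardErrorPath": LOG,
    }
    return plistlib.dumps(job)


def within_business_hours(now, start_hour=9, end_hour=17):
    """Hypothetical guard: weekdays from start_hour up to end_hour, local time."""
    return now.weekday() < 5 and start_hour <= now.hour < end_hour
```

The bytes returned by build_plist() would be written to the LaunchAgents path above and loaded with launchctl. The script can then call within_business_hours(datetime.now()) at startup and exit early outside the window.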
Key Technical Decisions
1. Why Not Use Twilio?
Twilio is excellent for inbound/outbound SMS automation, but in this case:
- Messages were already flowing through Messages.app (the business line was synced locally)
- Adding Twilio would require API key management, credential rotation, and ongoing API costs
- The macOS database is the system of record: it's already indexed and queryable, and reading it involves no network round trips
- Processing happens on the machine where the data exists, reducing attack surface
2. SQLite Direct vs. Messages Framework
We chose direct SQLite queries over the Cocoa Messages framework because:
- The framework is primarily designed for sending, not bulk message retrieval
- Direct queries are faster and don't require Objective-C bridging in Python
- We get full control over filtering and sorting logic
- Easier to debug by examining the database directly with standard SQLite tools
3. Local Processing Over Cloud
All digest generation happens locally rather than shipping raw messages to a Lambda or microservice:
- No PII transmitted outside the device
- No network dependency for core functionality
- Faster processing (local disk I/O vs. API calls)
- Messages stay in the Messages app—we're reading, not intercepting
Implementation Details
The script iterates over recent conversations (filtered by date range) and extracts key metadata:
- Conversation ID: Maps to phone numbers in the handle table
- Thread reconstruction: Groups messages by sender and timestamp to identify sequences
- Keyword flagging: Identifies action items and follow-ups based on keywords
- Digest formatting: Converts raw message threads into structured email summaries
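The keyword-based flagging step might be sketched as below. The keyword list is entirely hypothetical, since the post does not say which terms the script matches, and each message is assumed to be a dict with sender and text keys.

```python
import re

# Hypothetical keyword list; the real script's terms are not documented here.
ACTION_KEYWORDS = ("call me", "asap", "urgent", "quote", "invoice", "schedule", "follow up")

# Word boundaries keep "quote" from matching inside words like "unquoted".
_PATTERN = re.compile(
    "|".join(r"\b" + re.escape(k) + r"\b" for k in ACTION_KEYWORDS),
    re.IGNORECASE,
)


def flag_action_items(messages):
    """Return inbound messages whose text mentions any action keyword."""
    flagged = []
    for msg in messages:
        text = msg.get("text") or ""  # text may be None for attachment-only messages
        if msg.get("sender") != "me" and _PATTERN.search(text):
            flagged.append(msg)
    return flagged
```

Only inbound messages are considered, on the assumption that action items come from the contact rather than from our own replies.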
The output is a structured digest email sent via SES, with each conversation condensed into a few key bullet points highlighting:
- Who contacted us and when
- What they need or reported
- What action is pending
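A rough sketch of rendering those three points as a plain-text digest body is shown below. The format_digest name, the message dict shape (sender, text, and a UTC when), and the "reply needed" rule are all assumptions made for illustration. Sending the result through SES is left out.

```python
from datetime import datetime, timezone


def format_digest(threads):
    """Render {contact: [message dicts]} as a plain-text digest body.

    Each message dict is assumed to carry 'sender', 'text', and 'when'
    (a timezone-aware UTC datetime), in chronological order.
    """
    lines = []
    for contact, messages in sorted(threads.items()):
        inbound = [m for m in messages if m["sender"] != "me"]
        if not inbound:
            continue  # nothing was received in this thread
        last = inbound[-1]
        # Assumed rule: a thread is pending unless our message is the latest one.
        replied = messages[-1]["sender"] == "me"
        status = "none, we replied last" if replied else "reply needed"
        lines.append(f"{contact} ({last['when']:%Y-%m-%d %H:%M} UTC)")
        lines.append(f"  - Latest request: {last['text']}")
        lines.append(f"  - Pending: {status}")
    return "\n".join(lines)
```

Sorting by contact keeps the digest stable from one run to the next, which makes successive emails easier to compare.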
What's Next
Future iterations will focus on:
- Android bridge support: The Samsung device sync opens possibilities for reading Android Messages or SMS backups via ADB (Android Debug Bridge)
- Conversation threading: Implementing response-chain detection to group related messages across multiple contacts
- Scheduled exports: Creating a weekly archive of conversations with full-text search capability
- Alert escalation: Triggering real-time notifications for high-priority messages (e.g., from key contacts) while batching routine updates into daily digests
The approach validates a broader principle: before reaching for an external API, inspect what data is already available locally. macOS, iOS, and modern smartphones store messaging data in well-structured, queryable formats. A thoughtful sync bridge can often extract more value from that data with lower latency, cost, and complexity than cloud-based alternatives.