Building a Local SMS Ingestion Pipeline: Querying macOS Messages Database for System Integration
One of the persistent operational challenges in a distributed business system is capturing and centralizing communication data without introducing security complexity or dependency on third-party SMS aggregation services. This post documents the technical approach we implemented to read SMS messages from a local macOS Messages database and surface them to our system in a structured, queryable format.
The Problem: SMS as a System Blindspot
Our JADA booking and operations workflow relies on multiple communication channels—Google Calendar, Stripe webhooks, Google Sheets for booking requests, and email. However, SMS remained isolated: guests text phone number 619-986-7344, but those messages live only in the phone's native Messages app. There was no way to programmatically access or log this data without:
- Paying for a third-party SMS aggregation service (Twilio, Bandwidth, etc.)
- Forwarding all texts to email (adds friction for guests)
- Manually copying message content into our system (error-prone, non-scalable)
The operational cost was real: we couldn't quickly answer "did this guest text us?" or surface SMS context when reviewing a booking.
Solution Architecture: macOS Messages Database Query
macOS stores iMessage and SMS data in a local SQLite database at:
~/Library/Messages/chat.db
This database is accessible to scripts running with user-level permissions and contains complete message history with timestamps, sender handles, message bodies, and metadata. Our approach:
- Query the local SQLite database for messages to/from a target phone number
- Parse and structure the results into JSON
- Write to local storage (
/Users/cb/bin/read-sms) for easy invocation and caching - Make queryable by phone number, date range, or sender
Technical Implementation
Script: /Users/cb/bin/read-sms
We created an executable script that queries the Messages database and returns structured SMS data. The script:
#!/bin/bash
# Query macOS Messages database for SMS by phone number
# Usage: read-sms [phone_number] [optional: limit]
PHONE_NUMBER="${1:-619-986-7344}"
LIMIT="${2:-100}"
MESSAGES_DB="$HOME/Library/Messages/chat.db"
sqlite3 "$MESSAGES_DB" << EOF
.mode json
SELECT
message.date,
message.text,
handle.id AS sender_handle,
message.is_from_me,
chat.display_name
FROM message
JOIN handle ON message.handle_id = handle.ROWID
JOIN chat_message_join ON message.ROWID = chat_message_join.message_id
JOIN chat ON chat_message_join.chat_id = chat.ROWID
WHERE handle.id LIKE '%${PHONE_NUMBER}%'
ORDER BY message.date DESC
LIMIT ${LIMIT};
EOF
Key design decisions:
- SQLite mode: JSON — Output is already JSON-formatted for consumption by downstream tools (no manual parsing needed)
- Phone number fuzzy match — The
LIKEclause handles variations (with/without hyphens, country codes) - is_from_me field — Distinguishes inbound (guest SMS) from outbound (our responses)
- Timestamp in native macOS format — Stored as seconds since 2001-01-01; conversion to ISO-8601 can happen downstream if needed
- No authentication required — The database is readable by the logged-in user; no API keys or credentials needed
Database Schema Context
The Messages database schema (simplified for this use case):
message— Core message records (text, date, is_from_me flag, handle_id)handle— Phone numbers and email addresses; id field contains the actual phone numberchat— Conversation threads; display_name field shows the contact or group namechat_message_join— Junction table linking messages to chats
This multi-table join is necessary because macOS Messages tracks conversations separately from individual messages, allowing a single message to be part of multiple chats.
Operational Integration
Once the script was in place, we verified it against known test data:
# Query all SMS from phone number 619-986-7344 (last 50 messages)
/Users/cb/bin/read-sms 619-986-7344 50
# Result: JSON array of message objects with date, text, sender_handle, is_from_me
This allows manual inspection when needed and provides a foundation for automated ingestion:
- Integration with booking checks — During charter payment verification, we can now programmatically check if a guest texted us before payment
- Audit trail — Messages can be logged to a local file or database for compliance and dispute resolution
- Alerting — A cron job could periodically check for new SMS and trigger Slack notifications or calendar entries
Security and Limitations
Data residency: All data remains on the local macOS machine. No SMS content is transmitted to cloud services, Stripe, or third parties.
Permissions: The script requires read access to ~/Library/Messages/chat.db. On modern macOS versions, this may prompt for Full Disk Access permission once; after that, it runs without prompts.
Scope limitations:
- Only works on a Mac with the Messages app synced from an iPhone (iCloud Messages sync required)
- Does not capture SMS from non-iMessage sources on Android or standalone SMS gateways
- Historical message availability depends on iCloud sync scope
Key Decisions and Trade-offs
- Why not use a service like Twilio? Cost (~$1-2/SMS), vendor lock-in, and overkill for a single phone number. Our solution has zero recurring cost.
- Why SQLite directly instead of a wrapper API? Simplicity and auditability. The SQL query is transparent and can be modified without deploying code changes.
- Why JSON output? Compatibility with downstream tools (jq, Python json module, AWS Lambda, etc.). Makes it easy to pipe to other scripts.
- Why local storage only? Guest privacy and compliance. SMS is PII; keeping it local and under our control reduces attack surface and regulatory burden.
What's Next
Immediate next steps:
- Automated logging: Add a cron job that runs
read-smsevery 30 minutes and appends new messages to/Users/cb/logs/sms.jsonl - Integration with booking workflows: When verifying charter payments, query SMS to confirm guest contact before sending payment reminders
- Webhook trigger: