Building a Local SMS Sync Bridge: Extracting SMS Data from Samsung Devices Without Twilio
This post documents a practical solution for syncing SMS messages from Samsung Android devices to a macOS development environment, bypassing Twilio infrastructure entirely. The challenge: extract SMS threads from a Samsung device and integrate them into an existing conversation digest system without requiring cloud SMS APIs.
The Problem Statement
The existing infrastructure relied on Twilio for SMS handling—specifically the JADA business line (+16199867344). However, extracting historical SMS threads required access to device-level message databases rather than cloud API logs. The goal was to build a lightweight daemon that could:
- Read SMS data directly from a Samsung device's local storage
- Parse conversation threads and metadata
- Generate digest emails without Twilio credentials
- Run as a background service on macOS via LaunchAgent
Technical Architecture
Data Source: Samsung devices store SMS messages in the standard Android SQLite database, typically located at /data/data/com.android.providers.telephony/databases/mmssms.db (the exact path can vary by Android version). The two tables that matter most here are sms (text messages) and pdu (the header rows for multimedia messages, whose content lives in the related part table).
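Transport Layer: The solution uses ADB (Android Debug Bridge) over USB to connect to the Samsung device and pull the raw database file. Because the script works on a local copy of that file, every query runs on the macOS machine and no cloud infrastructure is involved. Note that /data/data is readable only by privileged processes, so pulling this path generally requires a rooted device or an adbd running as root; on a stock device the pull fails with a permission error.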
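Before pulling anything, it helps to confirm that a device is actually attached and authorized. The following is a minimal sketch, not part of the original script: the helper names parse_adb_devices and ready_devices are hypothetical, and the parser assumes the usual two-column output of `adb devices` (serial, then state).

```python
import subprocess


def parse_adb_devices(output):
    """Return {serial: state} parsed from the text printed by `adb devices`."""
    devices = {}
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines, the "List of devices attached" header,
        # and "* daemon ..." startup notices.
        if not line or line.startswith("List of devices") or line.startswith("*"):
            continue
        parts = line.split()
        if len(parts) >= 2:
            devices[parts[0]] = parts[1]
    return devices


def ready_devices():
    """Serials of devices in the 'device' state (online and authorized)."""
    result = subprocess.run(
        ["adb", "devices"], capture_output=True, text=True, check=True
    )
    parsed = parse_adb_devices(result.stdout)
    return [serial for serial, state in parsed.items() if state == "device"]
```

A serial returned by ready_devices() can be passed to ADB as `adb -s <serial> pull ...` when more than one device is connected.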
Processing Pipeline: The extracted data flows through three stages:
- Extraction: Pull mmssms.db from device via ADB
- Parsing: Query conversations by date range, phone number, and message type
- Delivery: Generate plaintext digest and send via AWS SES
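The parsing stage returns flat message rows, while a digest usually reads better when messages are grouped per conversation. As a rough sketch of that step, the hypothetical helper below groups row dictionaries by their address field and sorts each group oldest first. Grouping by address is a simplifying assumption; the real sms table also carries a thread_id column that could be used instead.

```python
from collections import defaultdict


def group_into_threads(messages):
    """Group message dicts by 'address', each thread sorted oldest first."""
    threads = defaultdict(list)
    for msg in messages:
        threads[msg["address"]].append(msg)
    for thread in threads.values():
        # 'date' is milliseconds since the epoch, so a numeric sort is chronological.
        thread.sort(key=lambda m: m["date"])
    return dict(threads)
```

The input is expected to look like the list of dictionaries produced by the query step, with at least address and date keys on every row.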
Implementation Details
File: /Users/cb/Documents/repos/tools/samsung_sms_sync.py
The main script handles the full workflow:
#!/usr/bin/env python3
"""
SMS sync daemon for Samsung devices.
Pulls conversation threads and generates digests.
"""
import sqlite3
import subprocess
import json
from datetime import datetime, timedelta
from pathlib import Path
class SamsungSMSSync:
    def __init__(self, device_id=None):
        self.device_id = device_id
        self.db_path = Path.home() / ".cache" / "samsung_sms_sync" / "mmssms.db"
        self.db_path.parent.mkdir(parents=True, exist_ok=True)

    def pull_database(self):
        """Fetch mmssms.db from connected Samsung device via ADB."""
        adb_cmd = ["adb"]
        if self.device_id:
            # Target one specific device when several are attached.
            adb_cmd += ["-s", self.device_id]
        adb_cmd += [
            "pull",
            "/data/data/com.android.providers.telephony/databases/mmssms.db",
            str(self.db_path),
        ]
        subprocess.run(adb_cmd, check=True)

    def query_conversations(self, phone_number=None, days_back=7):
        """Query SMS threads by phone number and date range."""
        conn = sqlite3.connect(self.db_path)
        conn.row_factory = sqlite3.Row
        try:
            cursor = conn.cursor()
            # sms.date is stored in milliseconds since the epoch.
            cutoff_time = int((datetime.now() - timedelta(days=days_back)).timestamp() * 1000)
            if phone_number:
                cursor.execute("""
                    SELECT address, date, body, type, read, error_code
                    FROM sms
                    WHERE address = ? AND date > ?
                    ORDER BY date DESC
                """, (phone_number, cutoff_time))
            else:
                cursor.execute("""
                    SELECT address, date, body, type, read
                    FROM sms
                    WHERE date > ?
                    ORDER BY date DESC, address
                """, (cutoff_time,))
            rows = cursor.fetchall()
        finally:
            # Close the connection even if the query raises.
            conn.close()
        return [dict(row) for row in rows]

    def format_digest(self, conversations):
        """Convert raw SMS data into readable digest format."""
        digest = []
        for conv in conversations:
            timestamp = datetime.fromtimestamp(conv['date'] / 1000)
            # Android stores type 1 for received (inbox) and type 2 for sent.
            direction = "→" if conv['type'] == 2 else "←"
            digest.append(f"{timestamp.strftime('%Y-%m-%d %H:%M')} {direction} {conv['address']}\n{conv['body']}")
        return "\n\n".join(digest)
Key Database Schema Notes:
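- sms.address: Phone number of the other party (stored as the device recorded it, so formats can differ between rows)
- sms.date: Timestamp in milliseconds since epoch
- sms.type: 1 = received (inbox), 2 = sent, 3 = draft, etc.
- sms.read: 0 = unread, 1 = read
- sms.body: Message text (long messages that the device converts to MMS may be stored in the MMS tables instead)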
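To make the column meanings concrete, here is a small self-contained sketch that builds an in-memory table with a reduced version of these columns and maps sms.type to a readable label. The table definition and sample rows are illustrative assumptions, not the full Android schema; the type values follow Android's Telephony.TextBasedSmsColumns constants.

```python
import sqlite3

# Labels for sms.type, following Android's message type constants.
SMS_TYPE_LABELS = {
    1: "received",
    2: "sent",
    3: "draft",
    4: "outbox",
    5: "failed",
    6: "queued",
}


def build_sample_db():
    """Create an in-memory database with a reduced, illustrative sms table."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row
    conn.execute(
        "CREATE TABLE sms (address TEXT, date INTEGER, body TEXT, type INTEGER, read INTEGER)"
    )
    conn.executemany(
        "INSERT INTO sms VALUES (?, ?, ?, ?, ?)",
        [
            ("+15550100", 1700000000000, "Are you free at 3?", 1, 1),
            ("+15550100", 1700000060000, "Yes, see you then.", 2, 1),
        ],
    )
    return conn


def label_rows(conn):
    """Return (label, body) pairs ordered oldest first."""
    rows = conn.execute("SELECT type, body FROM sms ORDER BY date").fetchall()
    return [(SMS_TYPE_LABELS.get(row["type"], "unknown"), row["body"]) for row in rows]
```

Running label_rows(build_sample_db()) yields one received message followed by one sent message, which is the distinction the digest's direction arrow relies on.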
LaunchAgent Daemon Configuration
File: /Users/cb/Library/LaunchAgents/com.cb.samsung-sms-sync.plist
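<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.cb.samsung-sms-sync</string>
    <key>ProgramArguments</key>
    <array>
        <string>/usr/local/bin/python3</string>
        <string>/Users/cb/Documents/repos/tools/samsung_sms_sync.py</string>
        <string>--daemon</string>
        <string>--interval</string>
        <string>3600</string>
    </array>
    <key>StartInterval</key>
    <integer>3600</integer>
    <key>StandardOutPath</key>
    <string>/var/log/samsung_sms_sync.log</string>
    <key>StandardErrorPath</key>
    <string>/var/log/samsung_sms_sync.error.log</string>
    <key>RunAtLoad</key>
    <true/>
</dict>
</plist>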
The StartInterval key schedules the daemon to run every 3600 seconds (1 hour). This interval balances freshness against battery drain on the connected device.
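Hand-editing plist XML is error-prone, so one option is to generate the file from Python's standard plistlib module. The sketch below mirrors the keys listed above. The function name build_launch_agent is hypothetical, and setting RunAtLoad to True is an assumption about the intended value.

```python
import plistlib


def build_launch_agent(script_path, interval_seconds=3600,
                       python_path="/usr/local/bin/python3"):
    """Return LaunchAgent plist XML (as bytes) for the sync job."""
    job = {
        "Label": "com.cb.samsung-sms-sync",
        "ProgramArguments": [
            python_path,
            script_path,
            "--daemon",
            "--interval",
            str(interval_seconds),  # command-line arguments must be strings
        ],
        "StartInterval": interval_seconds,  # launchd expects an integer here
        "StandardOutPath": "/var/log/samsung_sms_sync.log",
        "StandardErrorPath": "/var/log/samsung_sms_sync.error.log",
        "RunAtLoad": True,  # assumed value; serialized as <true/>
    }
    return plistlib.dumps(job)
```

The returned bytes can be written to the .plist path under ~/Library/LaunchAgents and then registered with `launchctl load`.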
Integration with Existing Infrastructure
The SMS digest is delivered via AWS SES (Simple Email Service), maintaining consistency with the existing notification pipeline. The script queries the local SMS database, formats conversation threads, and invokes the SES API using cached AWS credentials from ~/.aws/credentials.
Email delivery:
def send_digest_email(phone_number, digest_text, recipient_email):
    """Send SMS digest via AWS SES."""
    import boto3
    client = boto3.client('ses', region_name='us-west-2')
    response = client.send_email(
        Source='alerts@sailjada.com',
        Destination={'ToAddresses': [recipient_email]},
        Message={
            'Subject': {'Data': f'SMS Digest: