Building Real-Time Maintenance Task Notifications for Queen of San Diego: A Full-Stack Architecture

When Travis added new maintenance tasks to the maintenance.queenofsandiego.com tool via SMS, there was no way for the operations team to discover them—they existed in a database with no surfacing mechanism. This post details how we built a complete notification pipeline to surface new tasks, route them based on criticality, and keep Sergio informed without notification fatigue.

The Problem Statement

The maintenance tool at maintenance.queenofsandiego.com (CloudFront distribution backed by S3 at s3://queen-maintenance-tool/) was collecting task data, but had zero visibility into newly-added items. The requirements were:

  • Surface new tasks to the team immediately
  • Notify Sergio of critical tasks in real-time, aggregate non-critical tasks daily
  • Test all changes in staging before production
  • Integrate with Google Calendar for the Jada Maintenance calendar associated with jadasailing@gmail.com

Architecture Overview: The Three-Layer Approach

We implemented a three-layer notification system:

  • Layer 1 (Frontend): Modified staging HTML at /Users/cb/Documents/repos/sites/queenofsandiego.com/tools/maintenance/staging-index.html to surface newly-added tasks with visual indicators
  • Layer 2 (GAS Middleware): New route handlers in BookingAutomation.gs and standalone persistence module MaintenancePersistence.gs
  • Layer 3 (AWS Lambda): Serverless function for notification orchestration, respecting the existing Lambda deployment pattern used by tips-box

Infrastructure Components

1. Google Apps Script Modules

Created MaintenancePersistence.gs to handle task storage and retrieval:

// File: MaintenancePersistence.gs
function logNewMaintenanceTask(taskData) {
  const sheet = SpreadsheetApp.getActive()
    .getSheetByName('Tasks');
  const timestamp = new Date();
  
  sheet.appendRow([
    timestamp,
    taskData.title,
    taskData.criticality, // 'critical', 'high', 'medium', 'low'
    taskData.assignee,
    taskData.description,
    false // notified flag
  ]);
  
  return { success: true, timestamp: timestamp };
}

function getUnnotifiedTasks() {
  const sheet = SpreadsheetApp.getActive()
    .getSheetByName('Tasks');
  const data = sheet.getDataRange().getValues();
  
  return data.filter(row => row[5] === false)
    .map(row => ({
      timestamp: row[0],
      title: row[1],
      criticality: row[2],
      assignee: row[3],
      description: row[4]
    }));
}

Modified BookingAutomation.gs to add maintenance routing:

// In existing doPost handler
function doPost(e) {
  const action = e.parameter.action;
  
  if (action === 'log_maintenance') {
    return handleMaintenanceLog(e);
  }
  // ... existing routes
}

function handleMaintenanceLog(e) {
  const taskData = JSON.parse(e.postData.contents);
  const result = logNewMaintenanceTask(taskData);
  
  // Trigger notification Lambda asynchronously
  triggerNotificationLambda(taskData);
  
  return ContentService.createTextOutput(
    JSON.stringify(result)
  ).setMimeType(ContentService.MimeType.JSON);
}

2. Lambda Notification Function

Deployed a Lambda function (following the existing tips-box pattern) with environment variables:

  • Function Name: jada-maintenance-notifier
  • Runtime: Python 3.11
  • IAM Role: Reused the existing jada-lambda-execution-role with SES and Calendar API permissions added
  • Timeout: 60 seconds
  • Memory: 512 MB

The Lambda function implements two notification strategies:

# Pseudo-code for notification logic
def lambda_handler(event, context):
    task = event['task_data']
    criticality = task['criticality']
    
    if criticality == 'critical':
        # Send immediately via SES
        send_email_notification(task, 'immediate')
        # Create calendar event
        create_calendar_event(task, 'urgent')
    else:
        # Queue for daily digest at 5 PM PT
        store_in_digest_queue(task)
    
    return { 'statusCode': 200 }

3. Frontend Surface Changes

Added a new "New Tasks" section to the staging HTML with live refresh:

// Key additions to staging-index.html
const tasksDiv = document.getElementById('new-tasks-panel');

function refreshNewTasks() {
  fetch('/api/tasks?status=new')
    .then(r => r.json())
    .then(tasks => {
      tasksDiv.innerHTML = tasks.map(t => `
        <div class="task-card priority-${t.criticality}">
          <span class="badge">${t.criticality.toUpperCase()}</span>
          <h4>${t.title}</h4>
          <p>${t.description}</p>
          <small>Added ${formatTime(t.timestamp)}</small>
        </div>
      `).join('');
    });
}

setInterval(refreshNewTasks, 30000); // Poll every 30 seconds

Key Architectural Decisions & Rationale

Decision: Criticality-Based Notification Routing

Why: Research from high-performing ops teams (Google SRE, Etsy) shows that notification fatigue reduces response time to actual critical issues. By batching non-critical alerts into a daily digest, we reduce Sergio's interrupt rate while ensuring urgent maintenance gets immediate attention.

Implementation: Lambda checks the criticality field—only critical tasks trigger immediate email+calendar events; high, medium, and low are queued for the 5 PM PT digest.

Decision: Google Sheets as Backing Store

Why: The team already uses Google Sheets for task tracking. Using the same Google account (jadasailing@gmail.com) means no new authentication system and built-in audit trail. The sheet integrates with Calendar API for event creation.

Decision: Staging/Production Separation via Email Alias

Why: We can't easily fork the maintenance tool into separate environments. The pragmatic solution: in staging, all notifications route to jadasailing@