Deploying a Receipt Management Portal and Automating Port Sheet Data Entry via Google Sheets API

This session involved two parallel infrastructure projects: standing up a receipt upload portal for a trailer rental business at quickdumpnow.com/books, and automating charter booking entry into a Google Sheets-based port log system. Both required careful handling of S3 static hosting, CloudFront distribution configuration, and Google Sheets API integrations.

Problem Statement

The quickdumpnow.com books section existed as a local HTML file but hadn't been deployed to production. Additionally, manual entry of charter bookings into the JADA Port Log spreadsheet was creating a bottleneck—each booking required multiple manual steps to transcribe into a standardized Excel format maintained on Google Drive.

What Was Done

  • Deployed a receipt management portal to https://quickdumpnow.com/books via S3 and CloudFront
  • Fixed S3 object model IDs and implemented pretty URL routing
  • Updated robots.txt to block the books section from search indexing
  • Invalidated CloudFront cache for affected paths
  • Automated port sheet entry via Python script using Google Sheets API
  • Implemented row-level data synchronization to preserve existing spreadsheet structure

Infrastructure: S3 and CloudFront Configuration

S3 Bucket Structure

The quickdumpnow.com site is hosted as a static website on S3. The file structure looks like:

s3://quickdumpnow-site/
├── index.html
├── robots.txt
├── books/
│   └── index.html
└── [other assets]

The deployment process required uploading to two S3 keys to support pretty URLs:

  • s3://quickdumpnow-site/books/index.html — the canonical path for browser navigation
  • s3://quickdumpnow-site/books — a bare key object containing the same content for direct URL access

This dual-key approach ensures that CloudFront can serve the content whether the request comes in as /books/ or /books. Many static site generators handle this automatically, but manual S3 deployments require explicit handling.

CloudFront Distribution Configuration

The distribution has a critical custom error response configuration: all 404 responses from the origin (S3) are redirected to the homepage. This prevents users from seeing raw S3 error pages but can mask deployment issues. When the books object didn't exist yet, CloudFront was returning the homepage instead of a 404, making it impossible to detect that the deployment hadn't completed.

The distribution was invalidated for these paths:

/books
/books/*
/robots.txt

CloudFront caches are typically invalidated with patterns matching the affected paths. The wildcard pattern ensures all sub-paths under /books/ are cleared, though in this case the directory structure is minimal.

Robots.txt Management

The updated robots.txt was modified to block the books section from search engine indexing since it's an internal utility for the business, not public-facing content. The disallow rule prevents crawlers from accessing the receipt portal.

Google Sheets API Integration: Port Sheet Automation

Architecture Overview

The port sheet workflow involves two systems:

  • Google Apps Script — legacy Apps Script file at /Users/cb/Documents/repos/sites/queenofsandiego.com/PortSheetReporter.gs managing spreadsheet operations
  • Python automation layer — new scripts at /Users/cb/Documents/repos/tools/jada_port_sheet.py and /Users/cb/Documents/repos/tools/reauth_jada_calendar.py handling authentication and data entry

The JADA Port Log 2026 spreadsheet on Google Drive serves as the source of truth. It contains tabs for each month (March, April, etc.) plus a Template tab defining the data structure. The spreadsheet ID was identified through Drive API queries searching for files matching "Port Sheet JADA".

Data Structure and Format Preservation

The port log uses a specific row-based format that had to be reverse-engineered from an existing Excel template. Initial attempts to read the file via Google Sheets API failed due to token scope limitations. The solution involved:

  • Downloading the .xls file directly as .xlsx
  • Parsing it with Python's openpyxl library to inspect row heights, column widths, and cell formatting
  • Reading the Template tab to understand the expected data layout
  • Building a new .xlsx file that preserves existing data while adding new entries in the correct format

The openpyxl approach was necessary because the Google Sheets API doesn't expose formatting metadata like row heights or merged cells, which are critical for matching the original template design.

Entry Addition: The Zurek Charter

A charter booking for "Joseph Zurek" with revenue of $1,845.72 needed to be added to the April 2026 tab. The process involved:

# Pseudocode flow
1. Authenticate to Google Drive using service account credentials
2. Download the existing JADA Port Log 2026 file
3. Parse the Template tab to identify column mappings
4. Create an April tab if it doesn't exist
5. Set the month label ("April 2026")
6. Append the Zurek entry with:
   - Charter name: Joseph Zurek
   - Amount: $1,845.72
   - Date: charter date from handoff
   - Status: completed
7. Upload the modified .xlsx back to the same Drive file ID

Multiple iterations were required to get the sheetId correct—Google Sheets tab IDs are different from the file ID and must be queried via the Sheets API before attempting row operations.

Key Decisions and Rationale

Why dual S3 key uploads? Pretty URLs in static hosting require redundant object creation. S3's "index.html" directory behavior only applies when the path ends in a slash, so /books/ works but /books doesn't without a separate bare object.

Why invalidate CloudFront? S3 objects are the origin, but CloudFront caches responses at edge locations. Without invalidation, clients would see stale content for up to 24 hours (the default TTL). Explicit invalidation forces immediate cache refresh.

Why use openpyxl instead of Google Sheets API? The Google Sheets API is designed for data operations, not formatting metadata. Formatting requirements (row heights, merged cells, font styles) are critical for matching business document templates. openpyxl provides granular control over these attributes at the cost of downloading and re-uploading the file.

Why Python script instead of pure Apps Script? Python provides better local development experience, dependency management via pip, and easier version control in the repo. Apps Script runs server-side and requires browser-based editing or complex clasp workflows.

What's Next

Future work should include:

  • Building an HTTP endpoint (Lambda + API Gateway, or