Deploying a Receipt Management Portal and Automating Port Sheet Data Entry via Google Sheets API
This session involved two parallel infrastructure projects: standing up a receipt upload portal for a trailer rental business at quickdumpnow.com/books, and automating charter booking entry into a Google Sheets-based port log system. Both required careful handling of S3 static hosting, CloudFront distribution configuration, and Google Sheets API integrations.
Problem Statement
The quickdumpnow.com books section existed as a local HTML file but hadn't been deployed to production. Additionally, manual entry of charter bookings into the JADA Port Log spreadsheet was creating a bottleneck—each booking required multiple manual steps to transcribe into a standardized Excel format maintained on Google Drive.
What Was Done
- Deployed a receipt management portal to
https://quickdumpnow.com/booksvia S3 and CloudFront - Fixed S3 object model IDs and implemented pretty URL routing
- Updated
robots.txtto block the books section from search indexing - Invalidated CloudFront cache for affected paths
- Automated port sheet entry via Python script using Google Sheets API
- Implemented row-level data synchronization to preserve existing spreadsheet structure
Infrastructure: S3 and CloudFront Configuration
S3 Bucket Structure
The quickdumpnow.com site is hosted as a static website on S3. The file structure looks like:
s3://quickdumpnow-site/
├── index.html
├── robots.txt
├── books/
│ └── index.html
└── [other assets]
The deployment process required uploading to two S3 keys to support pretty URLs:
s3://quickdumpnow-site/books/index.html— the canonical path for browser navigations3://quickdumpnow-site/books— a bare key object containing the same content for direct URL access
This dual-key approach ensures that CloudFront can serve the content whether the request comes in as /books/ or /books. Many static site generators handle this automatically, but manual S3 deployments require explicit handling.
CloudFront Distribution Configuration
The distribution has a critical custom error response configuration: all 404 responses from the origin (S3) are redirected to the homepage. This prevents users from seeing raw S3 error pages but can mask deployment issues. When the books object didn't exist yet, CloudFront was returning the homepage instead of a 404, making it impossible to detect that the deployment hadn't completed.
The distribution was invalidated for these paths:
/books
/books/*
/robots.txt
CloudFront caches are typically invalidated with patterns matching the affected paths. The wildcard pattern ensures all sub-paths under /books/ are cleared, though in this case the directory structure is minimal.
Robots.txt Management
The updated robots.txt was modified to block the books section from search engine indexing since it's an internal utility for the business, not public-facing content. The disallow rule prevents crawlers from accessing the receipt portal.
Google Sheets API Integration: Port Sheet Automation
Architecture Overview
The port sheet workflow involves two systems:
- Google Apps Script — legacy Apps Script file at
/Users/cb/Documents/repos/sites/queenofsandiego.com/PortSheetReporter.gsmanaging spreadsheet operations - Python automation layer — new scripts at
/Users/cb/Documents/repos/tools/jada_port_sheet.pyand/Users/cb/Documents/repos/tools/reauth_jada_calendar.pyhandling authentication and data entry
The JADA Port Log 2026 spreadsheet on Google Drive serves as the source of truth. It contains tabs for each month (March, April, etc.) plus a Template tab defining the data structure. The spreadsheet ID was identified through Drive API queries searching for files matching "Port Sheet JADA".
Data Structure and Format Preservation
The port log uses a specific row-based format that had to be reverse-engineered from an existing Excel template. Initial attempts to read the file via Google Sheets API failed due to token scope limitations. The solution involved:
- Downloading the .xls file directly as .xlsx
- Parsing it with Python's
openpyxllibrary to inspect row heights, column widths, and cell formatting - Reading the Template tab to understand the expected data layout
- Building a new .xlsx file that preserves existing data while adding new entries in the correct format
The openpyxl approach was necessary because the Google Sheets API doesn't expose formatting metadata like row heights or merged cells, which are critical for matching the original template design.
Entry Addition: The Zurek Charter
A charter booking for "Joseph Zurek" with revenue of $1,845.72 needed to be added to the April 2026 tab. The process involved:
# Pseudocode flow
1. Authenticate to Google Drive using service account credentials
2. Download the existing JADA Port Log 2026 file
3. Parse the Template tab to identify column mappings
4. Create an April tab if it doesn't exist
5. Set the month label ("April 2026")
6. Append the Zurek entry with:
- Charter name: Joseph Zurek
- Amount: $1,845.72
- Date: charter date from handoff
- Status: completed
7. Upload the modified .xlsx back to the same Drive file ID
Multiple iterations were required to get the sheetId correct—Google Sheets tab IDs are different from the file ID and must be queried via the Sheets API before attempting row operations.
Key Decisions and Rationale
Why dual S3 key uploads? Pretty URLs in static hosting require redundant object creation. S3's "index.html" directory behavior only applies when the path ends in a slash, so /books/ works but /books doesn't without a separate bare object.
Why invalidate CloudFront? S3 objects are the origin, but CloudFront caches responses at edge locations. Without invalidation, clients would see stale content for up to 24 hours (the default TTL). Explicit invalidation forces immediate cache refresh.
Why use openpyxl instead of Google Sheets API? The Google Sheets API is designed for data operations, not formatting metadata. Formatting requirements (row heights, merged cells, font styles) are critical for matching business document templates. openpyxl provides granular control over these attributes at the cost of downloading and re-uploading the file.
Why Python script instead of pure Apps Script? Python provides better local development experience, dependency management via pip, and easier version control in the repo. Apps Script runs server-side and requires browser-based editing or complex clasp workflows.
What's Next
Future work should include:
- Building an HTTP endpoint (Lambda + API Gateway, or