Debugging a Cascading Deployment Failure: Race Conditions, Template Escaping, and S3 Staging Recovery
This post documents how an attempted fix for a booking calendar race condition on sailjada.com caused a broader deployment failure across multiple properties, and the systematic debugging approach used to identify and contain the damage.
The Initial Problem Statement
An earlier session identified a race condition in the booking modal on sailjada.com where jadaOpenBook() was opening the availability calendar before the async data fetch completed. The fix appeared straightforward: add a loading state check before opening the modal.
However, during deployment and testing, multiple issues emerged:
- Python format-string placeholders ({{variable}}) remained in deployed JavaScript contexts
- 22 pages received modifications with inconsistent testing
- Staging deployments were made to s3://queenofsandiego.com/_staging/ without clear rollback procedures
- No verification that the fix actually resolved the original race condition
- Related properties (queenofsandiego.com) received untested deployments
Technical Root Causes Identified
Template Escaping Confusion
The sailjada.com site uses Python Jinja2 templates with double-brace syntax: {{ variable }} for template variable substitution. The codebase contains legitimate CSS uses:
/* Valid CSS grid template areas */
grid-template-areas: "{{ grid_area_name }}";
However, the fix introduced JavaScript with similar syntax:
// Invalid JavaScript - this is template syntax, not JS
if ({{ isLoading: false }}) { ... }
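The distinction is critical. Jinja2 renders the whole file on the server, script blocks included, before the browser runs any JavaScript. A double-brace expression in CSS or static HTML is simply replaced with its value. Inside a script block, the expression must be a valid template expression and its rendered output must also be valid JavaScript; {{ isLoading: false }} is neither, and if the file is deployed without being rendered, the literal braces reach the browser as a syntax error. The fix needed to either: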
- Use properly escaped syntax: <script>var data = '{{ json_variable }}';</script>
- Move the state check to an already-rendered JavaScript variable
- Or, conditionally render the entire script block server-side
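The options above share one idea: the value is serialized on the server, so the browser only ever sees finished JavaScript. The following is a minimal sketch of that idea using only the Python standard library, not the site's actual rendering code. The helper names and the jadaIsLoading variable are hypothetical and chosen for illustration.

```python
import json


def js_literal(value):
    """Serialize a Python value as a literal that is safe inside <script>."""
    text = json.dumps(value)
    # Escape characters that could close the script element or start markup.
    return (
        text.replace("<", "\\u003c")
        .replace(">", "\\u003e")
        .replace("&", "\\u0026")
    )


def render_booking_script(is_loading):
    """Build the script block server-side so no template braces reach the browser."""
    # jadaIsLoading is a hypothetical variable name, not one from the codebase.
    return "<script>var jadaIsLoading = " + js_literal(is_loading) + ";</script>"


snippet = render_booking_script(False)
print(snippet)
```

Because the output is plain JavaScript, a check such as "{{" not in snippet can be run before anything is uploaded.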
Cascade Effect Across Multiple Properties
The 4.5 session made changes across these locations:
- /Users/cb/Documents/repos/sites/sailjada.com/index.html (11 edits)
- /Users/cb/Documents/repos/sites/sailjada.com/releases/rc1/index.html (2 edits)
- /Users/cb/Documents/repos/sites/queenofsandiego.com/tools/charter_price_scraper.py (6 edits)
- /Users/cb/Documents/repos/sites/queenofsandiego.com/charters/REFERRAL_PROMPT.txt (1 write)
The cross-property changes suggest the session was exploring booking system internals and may have over-applied fixes to unrelated code paths.
Diagnostic Approach
Step 1: Git History and Production Comparison
We established a baseline by fetching the production version from CloudFront/S3:
# Fetch current production state
aws s3 cp s3://sailjada.com/index.html ./prod_index.html
# Compare line counts
wc -l prod_index.html local_index.html
Combined with the git history, this showed 23+ locally modified files, and the line-count differences suggested that entire sections had been rewritten rather than surgically fixed.
Step 2: Pattern Analysis for Broken Code
We searched for all occurrences of the broken patterns:
grep -r "jadaBookingState" /Users/cb/Documents/repos/sites/
grep -r "{{" /Users/cb/Documents/repos/sites/sailjada.com/ --include="*.html"
Results showed jadaBookingState (a state variable that doesn't exist in production) in multiple files, and double-brace patterns that were either legitimate CSS or broken JavaScript.
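A plain grep cannot tell a double brace in CSS from one inside a script block, so each hit had to be classified by hand. As a rough sketch of how that classification could be automated, the script below uses the standard library's HTML parser to report only the double braces that sit inside script elements. The class and function names are illustrative and not part of the codebase.

```python
from html.parser import HTMLParser


class ScriptBraceFinder(HTMLParser):
    """Collect lines inside <script> elements that still contain '{{'."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.hits = []  # (line_number, stripped_line) pairs

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        if not (self.in_script and "{{" in data):
            return
        # getpos() reports where this chunk of data starts.
        first_line = self.getpos()[0]
        for offset, line in enumerate(data.splitlines()):
            if "{{" in line:
                self.hits.append((first_line + offset, line.strip()))


def find_script_braces(html_text):
    """Return (line, text) pairs for double braces found in script blocks."""
    finder = ScriptBraceFinder()
    finder.feed(html_text)
    finder.close()
    return finder.hits


sample = """<style>.hero { grid-template-areas: "{{ grid_area_name }}"; }</style>
<script>
if ({{ isLoading: false }}) { openModal(); }
</script>
"""

for line_number, text in find_script_braces(sample):
    print(line_number, text)
```

The CSS occurrence in the sample is ignored and only the line inside the script element is reported.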
Step 3: Staging Inventory and Diff Analysis
We identified staged files across multiple paths:
# List all staged files
aws s3 ls s3://queenofsandiego.com/_staging/ --recursive
aws s3 ls s3://queenofsandiego.com/sailjada/ --recursive
For each staged file, we performed targeted diffs to understand scope:
diff -u prod_index.html staged_index.html | head -100
This revealed the staged version contained the broken jadaBookingState code and incomplete escaping.
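Reading the first hundred lines of a diff works for one file but scales poorly across many staged pages. A small summary per file can help decide which diffs deserve a full read. The sketch below uses difflib from the standard library; the function name and the choice of jadaBookingState as the marker to flag are illustrative assumptions.

```python
import difflib


def summarize_diff(prod_text, staged_text, marker="jadaBookingState"):
    """Count added/removed lines and list added lines that contain marker."""
    diff = difflib.unified_diff(
        prod_text.splitlines(),
        staged_text.splitlines(),
        fromfile="prod_index.html",
        tofile="staged_index.html",
        lineterm="",
    )
    added = removed = 0
    flagged = []
    for index, line in enumerate(diff):
        if index < 2:
            continue  # the first two lines are the ---/+++ file headers
        if line.startswith("+"):
            added += 1
            if marker in line:
                flagged.append(line[1:].strip())
        elif line.startswith("-"):
            removed += 1
    return added, removed, flagged


prod = "<script>\nfunction jadaOpenBook() {}\n</script>"
staged = "<script>\nvar jadaBookingState = {};\nfunction jadaOpenBook() {}\n</script>"
print(summarize_diff(prod, staged))
```

A file whose summary shows a handful of changed lines and no flagged entries is a candidate for a quick review; anything else gets the full diff.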
Remediation Strategy
Restore from Production
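Since production was verified as stable, we restored all 23 locally-modified sailjada.com files from the S3 production bucket. Because --delete also removes local files that are absent from the bucket, previewing the sync with --dryrun first is a sensible precaution: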
aws s3 sync s3://sailjada.com/ /Users/cb/Documents/repos/sites/sailjada.com/ --delete
This restored the production booking code built around jadaOpenBook() and removed all jadaBookingState references. The original race condition is still present until the fix is re-applied.
Cleanup of Staging Deployments
We deleted the problematic staging deployment:
aws s3 rm s3://queenofsandiego.com/_staging/sailjada/ --recursive
And verified no other staging artifacts remained that contained the broken code.
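One way to make that verification repeatable is to mirror the staging prefix to a local directory and scan it for the broken identifier. The sketch below assumes such a local mirror already exists and stands in a temporary directory for it; the function name and the file suffixes it checks are assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def files_containing(root, marker, suffixes=(".html", ".js")):
    """Return sorted relative paths under root whose text contains marker."""
    root = Path(root)
    matches = []
    for path in root.rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(encoding="utf-8", errors="replace")
            if marker in text:
                matches.append(path.relative_to(root).as_posix())
    return sorted(matches)


# Demo: a throwaway directory stands in for a local mirror of the staging prefix.
with tempfile.TemporaryDirectory() as mirror:
    Path(mirror, "sailjada").mkdir()
    Path(mirror, "sailjada", "index.html").write_text(
        "<script>var jadaBookingState = {};</script>", encoding="utf-8"
    )
    Path(mirror, "clean.html").write_text("<p>ok</p>", encoding="utf-8")
    leftovers = files_containing(mirror, "jadaBookingState")

print(leftovers)
```

An empty result is the signal that the cleanup is complete; any listed path still carries the broken code.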
Related Files Review
We examined the queenofsandiego.com changes (charter_price_scraper.py, etc.) to determine if they were independent improvements or part of the cascading fix. Given their unrelated purpose, we preserved those changes pending separate review.
Key Decisions and Rationale
- Restore Over Merge: Attempting to cherry-pick the "good parts" of the fix would have been error-prone. Since production was stable, a full restore followed by a planned, tested re-fix was safer.
- No Direct Production Push: We did not push any changes directly to production CloudFront/S3. The local working copy was restored to match production, ready for proper testing and code review.
- Separate Concerns: Changes to queenofsandiego.com tools were preserved separately, as they are unrelated to the sailjada booking system.
- Template Escaping Documentation: This issue highlights the need for clear standards on how template variables should be rendered in different contexts (CSS, HTML attributes, JavaScript).
What's Next
- Re-apply the race condition fix in a controlled manner, with explicit testing of the loading state before modal open
- Add unit tests for the booking modal to verify async behavior
- Establish a staging review checkpoint before any multi-file deployments
- Document the template escaping patterns used across sailjada.com, queenofsandiego.com, and other properties
- Review and test the queenofsandiego.com charter scraper changes independently
Status: