A few months after a layoff, I was at an inflection point: keep looking for a full-time role, or go all-in on freelancing. A conversation with a client helped me decide, but not in the way I expected.
Her partner, Nicole Garon, had recently taken ownership of Beech Street Squash in Ontario, a club her family has run since 1988, when her parents founded it as the first venue to bring squash to the region. Nicole grew up on those courts as a player, coach, and cheerleader. As a competitor she captured the Ontario Masters Championship without dropping a game and captained her team to an undefeated 4-0 record at the Women's Masters Team Championships. When she took over ownership, she brought that same competitive instinct with her: a place where, as she puts it, competition, courage, and camaraderie show up every day. The idea came up casually: could something be built to bring friendly competition to the club's forty-odd members? A ladder system, somewhere to track challenges, record results, keep the standings honest.
When Nicole introduced the concept to her membership, she described it in three words that turned out to be a better product brief than anything I could have written myself:
"Simple. Competitive. Mildly dangerous to friendships."
Nicole Garon, Beech Street Squash
Before any formal commitment, I had two motivations running in parallel. The first was the client: build something good enough to surprise them, and let the product speak for what I could deliver as a freelancer. The second was personal: how far had AI-assisted development actually come? Could a production-quality system be built solo, in days, with Claude Code as a genuine collaborator, not just for boilerplate, but for architecture decisions, UX choices, the whole thing?
The squash ladder was the right-sized test. Real constraints, real users, a domain with actual rules, and a client with real expectations.
The Constraints That Shaped Every Decision
Four constraints defined the project before a line of code was written.
Forty members isn't a lot, but when you're ranked 12th and you challenge the 10th seed, it matters to you. The UX had to respect that.
No staging environment to babysit. Infrastructure had to be set-and-forget from the start.
Members weren't going to install an app. They'd tap a link from a chat message and expect it to just work in Safari or Chrome.
Production infrastructure from day two: Railway, PostgreSQL, a custom domain, and transactional email actually landing in real inboxes. Test accounts stood in for real members, but the deployment was genuine. No localhost-only prototype phase, no "we'll fix it before launch" safety net.
How It Was Built
The first day went to the identity layer. ASP.NET Identity handles password hashing, claims, and refresh tokens out of the box. SSO (Google, Apple, Facebook) and TOTP multi-factor auth went in immediately. Auth is the hardest thing to bolt on later, and getting it right early meant the rest of the project could move fast.
A dark theme shipped on day one too, as a deliberate baseline rather than an afterthought.
Day two had the least visible user output but the highest structural impact.
- SQLite → PostgreSQL. Railway's ephemeral filesystem made SQLite a liability. EF Core migrations made the switch straightforward.
- Custom Dockerfile over nixpacks. Railway's auto-detection was convenient but opaque. A custom Dockerfile gave full control over build artifacts and the nginx/API split.
- Separate deployments for frontend and API. Static nginx frontend, separate ASP.NET Core API, independently deployable and independently scalable.
- Email via Resend with Google Workspace SMTP as a fallback.
Day three was the most productive by commit count, and it was where the product started to feel like itself.
Player profiles got rank history charts and a match log, the two things a competitive player actually wants to see. The challenge mechanics follow traditional ladder rules: position-range limits, auto-cancel of conflicting challenges, and traditional position-shift on result. Implementing these correctly earned trust from members who knew how ladders were supposed to work.
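The three ladder rules named above can be sketched in a few lines. This is an illustration in JavaScript, not code from the project's ASP.NET Core backend. The three-place challenge range, the data shapes, and the function names (canChallenge, applyResult, cancelConflicting) are all assumptions made for the example.

```javascript
// Hypothetical rule: a player may challenge at most this many places above.
const MAX_CHALLENGE_RANGE = 3;

// ladder: array of player ids ordered by rank, index 0 = rank 1.
function canChallenge(ladder, challengerId, opponentId, maxRange = MAX_CHALLENGE_RANGE) {
  const from = ladder.indexOf(challengerId);
  const to = ladder.indexOf(opponentId);
  if (from === -1 || to === -1) return false; // unknown player
  const gap = from - to; // positive when the opponent is ranked higher
  return gap >= 1 && gap <= maxRange;
}

// Returns a new ladder after a match. If the higher-ranked player wins,
// nothing moves. Otherwise the winner takes the loser's slot and everyone
// from that slot down to the winner's old slot shifts down one place.
function applyResult(ladder, winnerId, loserId) {
  const w = ladder.indexOf(winnerId);
  const l = ladder.indexOf(loserId);
  if (w === -1 || l === -1) throw new Error("Unknown player");
  if (w < l) return ladder.slice();
  const next = ladder.slice();
  next.splice(w, 1);           // remove the winner from the old slot
  next.splice(l, 0, winnerId); // insert at the loser's slot
  return next;
}

// Marks every other pending challenge involving either player as cancelled.
function cancelConflicting(challenges, accepted) {
  const busy = new Set([accepted.challengerId, accepted.opponentId]);
  return challenges.map((c) => {
    const conflicts =
      c.id !== accepted.id &&
      c.status === "pending" &&
      (busy.has(c.challengerId) || busy.has(c.opponentId));
    return conflicts ? { ...c, status: "cancelled" } : c;
  });
}
```

With a ladder of ["a", "b", "c", "d", "e"], a win by "d" over "b" yields ["a", "d", "b", "c", "e"]: the winner takes the loser's place and the players in between each drop one position.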
The challenge popup. When you challenge someone, the UI shows a face-off between the two player avatars, styled like a fighting game matchup screen. A small detail, but one that makes the system feel alive rather than transactional.
Day four shifted from building features to making the system feel finished, and thinking about the network effect. When a match result is recorded, a public card is generated. Members can share it directly to a chat or social profile. This is how the ladder gets organic visibility outside the app.
The match result card wasn't the only public-facing surface added on day four. Each player also got a permanent public profile page, a shareable URL showing their rank history, win/loss record, and full match log. No login required to view it.
A player can link it in a bio, share it after a satisfying climb up the ladder, or hand it to a prospective opponent as a calling card. Alongside the match result card, it gives the ladder a presence outside the app itself, something that exists in the world, not just behind a login screen.
Two-step result confirmation was added to prevent both good-faith errors and bad-faith score inflation. Before a result is finalised, the opponent can confirm or dispute it. Email notifications go out across the full lifecycle.
Mobile compaction. This is where the mobile debt got repaid. The login page had three SSO providers displayed as full-width pills, which felt heavy on a 390px screen. Six commits later, they were icon-only circles.
The final piece of the trust layer: allowing the losing player to submit their own version of a disputed result, rather than just flagging disagreement. The admin sees both accounts side by side.
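The confirm-or-dispute flow amounts to a small state machine. The sketch below is one way to model it, assuming three hypothetical states (reported, confirmed, disputed) and a plain string for the score. None of these names come from the project itself.

```javascript
// A result starts as "reported" and only the opponent may move it on.
function reportResult(reporterId, opponentId, score) {
  return { status: "reported", reporterId, opponentId, reported: score, counter: null };
}

function assertOpponentCanAct(result, actorId) {
  if (result.status !== "reported") {
    throw new Error("Result is already " + result.status);
  }
  if (actorId !== result.opponentId) {
    throw new Error("Only the opponent can confirm or dispute a result");
  }
}

// Confirmation is the only path to a finalised result in this sketch.
function confirmResult(result, actorId) {
  assertOpponentCanAct(result, actorId);
  return { ...result, status: "confirmed" };
}

// A dispute carries the opponent's own version of the score.
function disputeResult(result, actorId, counterScore) {
  assertOpponentCanAct(result, actorId);
  return { ...result, status: "disputed", counter: counterScore };
}

// What an admin would compare side by side.
function adminView(result) {
  return { reported: result.reported, counter: result.counter };
}
```

Keeping rank changes behind the "confirmed" state means a disputed result cannot move anyone on the ladder until someone resolves it.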
Once the feature set stabilised, the question became: what does this codebase look like to a developer who didn't build it? Five days of velocity had left some structural debt worth addressing before the system grew further. For this pass I used Opus 4.6 rather than Sonnet. The task required tracing ownership boundaries across 43 files and restructuring them without touching behaviour. That kind of sustained architectural reasoning is where the stronger model earns its keep.
On the backend, business logic had accumulated in the wrong layers. Validation rules lived inline in endpoint handlers. Domain logic had leaked into repositories. Program.cs had grown to 861 lines. The fix was a proper service layer: StatisticsService replaced three identical win/loss/streak calculations copy-pasted across endpoints. ChallengeService extracted the validation rules so they could be tested in isolation. RankingService pulled rank-shift orchestration out of the repository, where it never belonged. Every endpoint that was querying the database directly was migrated back to the repository interfaces it was supposed to be using.
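To show the kind of calculation that is worth writing once rather than three times, here is a minimal win/loss/streak summary. It is a JavaScript illustration, not the project's StatisticsService. The input shape (a chronological array of "W" and "L") and the function name summarize are assumptions.

```javascript
// results: chronological array of "W" / "L" for one player.
function summarize(results) {
  let wins = 0;
  let losses = 0;
  let longestWinStreak = 0;
  let run = 0; // length of the win streak currently being counted

  for (const r of results) {
    if (r === "W") {
      wins += 1;
      run += 1;
      longestWinStreak = Math.max(longestWinStreak, run);
    } else if (r === "L") {
      losses += 1;
      run = 0;
    } else {
      throw new Error("Unexpected result: " + r);
    }
  }

  // Current streak: how many of the most recent results share the same outcome.
  const last = results[results.length - 1];
  let length = 0;
  for (let i = results.length - 1; i >= 0 && results[i] === last; i -= 1) {
    length += 1;
  }

  return {
    wins,
    losses,
    longestWinStreak,
    currentStreak: last ? { type: last, length } : null,
  };
}
```

The edge cases are where copy-pasted versions tend to drift apart: an empty history, a history made up of a single outcome, and a streak that ends on the most recent match.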
The rank-shifting algorithm also got a fix: it had been calling SaveChangesAsync() once per player shifted inside a loop, which meant N+2 database round-trips per confirmed match. A single ExecuteSqlRawAsync statement cut that to 3 calls regardless of how many positions shifted.
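The shape of that change can be sketched as a fixed list of statements whose length does not depend on how far the winner climbs. What the three calls actually are in the project is not stated here, so the split below (park the winner, bulk-shift the range, place the winner) is a guess. The players table, the rank column, and the function name are hypothetical too.

```javascript
// Builds the parameterised statements for "winner takes the loser's rank".
// Ranks are 1-based; a lower number is a better position.
function buildRankShiftStatements(winnerRank, loserRank) {
  if (!Number.isInteger(winnerRank) || !Number.isInteger(loserRank)) {
    throw new TypeError("Ranks must be integers");
  }
  // The winner was already ranked higher, so nothing moves.
  if (winnerRank <= loserRank) return [];

  return [
    // 1. Park the winner on rank 0 (assumed unused) to free their slot.
    { text: "UPDATE players SET rank = 0 WHERE rank = $1", values: [winnerRank] },
    // 2. One set-based update shifts everyone in between down a place.
    //    Note: if rank has a unique constraint, PostgreSQL may need it
    //    declared DEFERRABLE for this bulk update to succeed.
    {
      text: "UPDATE players SET rank = rank + 1 WHERE rank >= $1 AND rank < $2",
      values: [loserRank, winnerRank],
    },
    // 3. Drop the winner into the loser's old rank.
    { text: "UPDATE players SET rank = $1 WHERE rank = 0", values: [loserRank] },
  ];
}
```

Whether the winner climbs two places or twenty, the list stays at three statements, which is the property the per-row loop lacked.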
A test project was added for the first time: 36 tests covering the extracted services, including every challenge validation rule and streak calculation edge case.
On the frontend, the same DRY problems existed. Three near-identical Chart.js implementations were collapsed into a single renderRankChart() utility. Eight identical modal close patterns were replaced with setupModalClose(). All localStorage key strings were centralised into a StorageKeys constants file. An incomplete local escape helper in admin.js, which was missing & and <> escaping, was replaced with the full shared function.
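For reference, a complete HTML escape helper is only a few lines, which is part of why a partial copy is easy to miss. The version below is a sketch, not the project's shared function. The name escapeHtml and the example StorageKeys entries are assumptions.

```javascript
// Characters that must be escaped before inserting text into HTML.
const HTML_ESCAPES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Escapes every special character in a single pass, so "&" is never
// double-escaped by a later replacement.
function escapeHtml(value) {
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

// Hypothetical example of centralised localStorage keys: one frozen object
// instead of string literals scattered across files.
const StorageKeys = Object.freeze({
  authToken: "ladder.authToken",
  theme: "ladder.theme",
});
```

A helper that skips & or the angle brackets still looks like it works on ordinary names, which is how an incomplete copy can survive unnoticed until someone enters markup in a text field.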
43 files changed. Zero behaviour changes.
The refactor also exposed several pre-existing edge cases in the ranking logic that were fixed over the following two days: a unique-constraint collision in the rank-shift algorithm, a missing state guard on the Challenge button, and a JWT leak in the developer error page. None were regressions introduced by the refactor; better isolation and test coverage just made them easier to surface and pin down.
What Went Wrong
Mobile responsiveness was retrofitted, not designed in. By day four, over six commits were dedicated to fixing mobile layout issues. A single early session testing on an actual device would have prevented most of them. The audience profile, casual mobile browser users, was known from the start. The gap was execution, not planning.
Docker build strategy had to change mid-stream. Railway's nixpacks seemed like a zero-config win until deployment failures started and the generated build was opaque to debug. Switching to a custom Dockerfile added a few hours of work on day two but gave back full control over build artifacts, nginx config, and the API separation.
A committed config file took down production. The frontend config.js held the API URL. In local dev, serve.ps1 was designed to swap in localhost:5217 before starting the server and restore the placeholder when it exited. The problem: a deployment was pushed without closing the local server first. The finally block never ran. localhost:5217 shipped to production. Every API call failed.
The root cause was structural: a generated file was being tracked in git, which made it possible to commit a locally-mutated version by accident. The fix was to remove config.js from git tracking entirely, have the Dockerfile always generate it from a template at build time, and add CI guards that fail if the template ever loses its placeholder. If a file's correct state depends on a cleanup script running, it should not be tracked in source control.
Production restored in under 10 minutes.
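The template-plus-guard idea can be sketched as a single function that a build step might call. This is an illustration under assumptions: the placeholder token __API_URL__, the function name renderConfig, and the extra check that rejects local URLs are not taken from the project.

```javascript
// Hypothetical placeholder token that the tracked template must contain.
const PLACEHOLDER = "__API_URL__";

// Renders config.js content from the template, failing loudly if the
// template has been mutated or the target URL points at a local machine.
function renderConfig(template, apiUrl) {
  if (!template.includes(PLACEHOLDER)) {
    throw new Error("Config template has lost its placeholder");
  }
  if (/localhost|127\.0\.0\.1/.test(apiUrl)) {
    throw new Error("Refusing to build with a local API URL: " + apiUrl);
  }
  // split/join replaces every occurrence without regex escaping concerns.
  return template.split(PLACEHOLDER).join(apiUrl);
}
```

Called with the template 'window.API_URL = "__API_URL__";' and https://api.example.com, it returns the finished config line. Called with a template that already contains a real URL, it throws, which is the behaviour a CI guard needs.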
What Worked Well
Issue-driven development from day one. Every feature, every bug, every chore had a GitHub issue before a branch was cut. Every branch produced exactly one squash-merged PR that closed exactly one issue.
Looking back at why something was built the way it was is a matter of finding the issue. That discipline isn't automatic. It was a deliberate choice to maintain even when moving fast, and it's what makes this project navigable as a case study rather than an opaque artifact.
Production from day two created urgency around correctness and forced infrastructure decisions to be made early rather than deferred. Running in production also generated real feedback much faster than any amount of internal testing.
Domain-specific UX decisions. The system uses the terminology squash players actually use. The challenge popup borrows from fighting games. These choices required knowing the domain, and they're what separate a generic CRUD app from something that feels made for its community.
The AI-Assisted Development Angle
Claude Code was in the loop for the full five days, not just for boilerplate generation, but as a thinking partner for architecture decisions, debugging Railway deployment failures, UX discussions, and writing the issue descriptions that the whole workflow depended on.
Two numbers on that screenshot are worth clarifying. The "longest session: 1d 11h 25m" is not what it looks like. That was an idle Claude Code session left open overnight with the computer on. It is not representative of actual active coding time, which ran in multi-hour focused blocks. The stat measures wall-clock time, not engagement.
The "Favorite model: Sonnet 4.6" reflects how most of the build went. The exception was the code quality pass, which ran on Opus 4.6. That was a deliberate choice for a task requiring the full ownership structure of 43 files held in context simultaneously. For everything else, Sonnet handled it. Token cost is higher on Opus, and most of the day-to-day work didn't justify that overhead.
The 3.3M token figure also warrants honest context. I was new to Claude Code when this project started, and my usage wasn't optimized. Context windows accumulated longer than they needed to. Sessions stayed open across multi-hour blocks when shorter, more focused conversations would have served better. I hadn't yet developed habits around keeping prompts lean, pruning context proactively, or knowing when to start a fresh session versus continuing an existing one. A developer who has already internalized those patterns would likely ship the same result with meaningfully fewer tokens. The count reflects the learning curve as much as the project's actual complexity.
The honest assessment: AI-assisted development at this level of integration is genuinely different from using autocomplete or asking for code snippets. The friction points are real: context management, knowing when to trust the output, knowing when to push back. But the ceiling is higher than I expected going in. Five days for a system with this feature set wouldn't have been possible otherwise.
What's Next
The POC isn't just about squash. If the club embraces it (and the early signs are encouraging), the architecture is already positioned for what comes next.
Other sports, other clubs, a whitelabeled offering. The ladder is the proof of concept. The real question is what it can become.
The App Today (May 2026)
193 pull requests later, here is what it looks like.