00 — The Meta-Story
Who Built This
Matthew is a technology leader who manages engineering teams at a SaaS company. He started this platform on February 22, 2026. Today it runs 62 Lambdas, 121 MCP tools, a 72-page website, and a CI/CD pipeline. How? Every single conversation was with Claude.
This is what happens when a domain expert — someone who knows their health data, their goals, and their constraints — pairs with an AI that can write production code. The human sets the architecture. The AI writes the implementation. The human reviews the output. The AI iterates. Zero Stack Overflow.
00b — The Partnership
What Claude Did vs. What Matt Did
Claude wrote
- Every Lambda function (62 and counting)
- The full CDK infrastructure (8 stacks)
- The observatory CSS design system
- The MCP tool registry (121 tools)
- The correlation engine
- All 68 site pages
Matt defined
- Every architecture decision (45 ADRs)
- The editorial design language
- The data model and source priorities
- The Board of Directors system (34 personas)
- The Henning Brandt evidence standard
- What questions to ask the data
01 — Audience
Who This Is For
You're a developer, hobbyist, or technical leader who wants to build a personal data system —
health tracking, quantified self, home automation, or any domain where you're collecting data from multiple
sources and want AI to reason about it. You don't need a team. You don't need a budget.
You need a pattern.
This page documents the architectural pattern I followed: what I chose, what I avoided, what broke,
and what I'd do differently if I started over tomorrow.
02 — Architecture decisions
What I Chose (and What I Didn't)
Every decision here was optimized for a single operator. No team coordination overhead, no multi-tenant complexity,
no premature abstraction. The philosophy: the simplest thing that works, run by one person, at near-zero cost.
// chose
DynamoDB single-table, no GSIs
One table. PK = USER#matthew#SOURCE#{source}, SK = DATE#YYYY-MM-DD. Every query is a known key pattern. No GSIs means no extra cost and no index propagation lag.
// trade-off: ad-hoc queries are impossible. You must know your access patterns upfront. For N=1 data, you always do.
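Concretely, every access is a string-building exercise against that key scheme. A stdlib sketch (the key layout follows this page; the function names and the raw Query shape are illustrative, not the platform's actual code):

```python
def keys_for(source: str, day: str) -> dict:
    """Single-table key pair: PK partitions by user+source, SK sorts by day."""
    return {"PK": f"USER#matthew#SOURCE#{source}", "SK": f"DATE#{day}"}

def range_query_args(source: str, start: str, end: str) -> dict:
    """Arguments for a DynamoDB Query over a date range. Every read is a
    known key pattern, so no GSI is ever consulted."""
    return {
        "KeyConditionExpression": "PK = :pk AND SK BETWEEN :lo AND :hi",
        "ExpressionAttributeValues": {
            ":pk": {"S": f"USER#matthew#SOURCE#{source}"},
            ":lo": {"S": f"DATE#{start}"},
            ":hi": {"S": f"DATE#{end}"},
        },
    }
```

Pass the returned dict straight to a boto3 `client.query(TableName=..., **args)` call; nothing else is needed.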
// avoided
PostgreSQL / RDS
RDS starts at ~$15/month even for the smallest instance, runs 24/7, and requires patching. For write-once-read-many time-series health data, it's overkill. DynamoDB on-demand: $0.48/month.
// would reconsider if: ad-hoc analytical queries became critical, or multi-user support was needed
// chose
Lambda + EventBridge (no containers)
Every function is event-driven. Ingestion runs on cron (06:45–11:00 AM PT). Compute runs in sequence. Zero idle cost. Cold starts are <2s and irrelevant for batch processing.
// trade-off: 15-min max timeout, no long-running jobs, package size limits. All manageable for this workload.
// avoided
ECS / Fargate / EC2
Always-on compute makes no sense for a system that runs ~100 invocations/day. Even the smallest Fargate task is ~$10/month. The platform's entire Lambda bill is $0.12/month.
// would reconsider if: real-time streaming ingestion or websocket features were needed
// chose
CDK (TypeScript) for IaC
8 CDK stacks define all infrastructure. IAM roles are least-privilege per Lambda. EventBridge schedules, DynamoDB table, S3 buckets, CloudFront — all in code, all version-controlled.
// trade-off: CDK learning curve is steep. But "infrastructure as code" means rebuilding from scratch takes minutes, not days.
// avoided
Terraform / SAM / manual console
Terraform is cloud-agnostic but adds a state file management burden. SAM is fine but CDK gives you real programming constructs (loops, conditionals). Manual console is how incidents happen.
// would reconsider if: multi-cloud was a requirement (it never will be for a personal project)
// chose
MCP (Model Context Protocol) for AI
121 tools exposed via MCP. Claude calls them in natural language. No SQL, no dashboards, no manual queries. The AI is the interface. OAuth 2.1 + HMAC auth on the Lambda Function URL.
// trade-off: vendor lock-in to Anthropic's protocol. But MCP is open spec, and the tool implementations are just Python functions.
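Because the tool implementations are just Python functions, the registry itself can be a dict. A sketch under that assumption (names like `tool` and `dispatch` are hypothetical, not the platform's actual registry):

```python
TOOL_REGISTRY: dict = {}

def tool(name: str):
    """Decorator: register a plain Python function under an MCP tool name."""
    def wrap(fn):
        TOOL_REGISTRY[name] = fn
        return fn
    return wrap

@tool("get_sleep_summary")
def get_sleep_summary(start: str, end: str) -> dict:
    # In the real system this would read pre-computed results from DynamoDB.
    return {"range": [start, end], "source": "whoop"}

def dispatch(name: str, args: dict):
    """What the MCP server does with an incoming tools/call request."""
    return TOOL_REGISTRY[name](**args)
```

A dict keyed by tool name is also what makes the registry-integrity CI gate in section 04 a simple set comparison.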
// avoided
Custom dashboard / Grafana / Retool
Dashboards require maintenance, are hard to make flexible, and can't answer questions you didn't anticipate. "What correlates with my bad sleep last week?" is a natural language question, not a dashboard filter.
// exception: the public website (averagejoematt.com) is a read-only dashboard for visitors. But I query via Claude.
03 — The stack
The Exact Stack
No frameworks. No ORMs. No dependency trees. The entire platform runs on Python stdlib + boto3 + one Claude API call per digest.
Language
Python 3.12 — stdlib only for all Lambdas. No pip dependencies. urllib for HTTP, json for parsing, boto3 from Lambda runtime.
Infrastructure
AWS CDK (TypeScript) — 8 stacks: Compute, Data, Web, Alarms, IAM, Shared Layer, Site, Monitoring.
Database
DynamoDB — single table, on-demand, PK+SK only, no GSIs, KMS encrypted, PITR 35-day, deletion protection.
Object store
S3 — raw JSON archive (raw/{source}/{type}/{Y}/{M}/{D}.json), config files, static site hosting.
AI
Claude (Anthropic) — Sonnet for analysis, Haiku for classification. ~$3/month. MCP for tool calls.
CI/CD
GitHub Actions — OIDC federation (no static keys), lint → test → plan → deploy with manual approval gate, auto-rollback on smoke test failure.
Monitoring
CloudWatch + X-Ray — 66 alarms, synthetic canary every 4h, dead-letter queues on all async invocations.
Auth
Secrets Manager + KMS — 10 secrets, in-memory Lambda caching, OIDC for CI/CD, OAuth 2.1 for MCP.
Frontend
Vanilla HTML/CSS/JS — no React, no build step, no bundler. S3 + CloudFront. Site API via Lambda Function URL.
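The in-memory secret caching in the Auth row can be sketched as follows. The `fetch` injection point is mine, added so the pattern runs without AWS; the real helper presumably calls Secrets Manager directly on every cold start and hits the cache on warm invocations:

```python
import json

_CACHE: dict = {}  # survives across warm invocations of the same container

def get_secret(name: str, fetch=None) -> dict:
    """Fetch a JSON secret once per container, then serve it from memory."""
    if name not in _CACHE:
        if fetch is None:
            import boto3  # provided by the Lambda runtime
            client = boto3.client("secretsmanager")
            fetch = lambda n: client.get_secret_value(SecretId=n)["SecretString"]
        _CACHE[name] = json.loads(fetch(name))
    return _CACHE[name]
```

Warm invocations never touch the Secrets Manager API, which keeps both latency and per-call cost near zero.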
04 — Lessons learned the hard way
What Broke (and What I Learned)
These aren't hypothetical. Each lesson cost at least one incident, one late-night debug session, or one "how did I miss that" moment.
01
MCP registration integrity requires automated validation
MCP is a new protocol with little mature tooling, so nothing stops a tool from being registered without ever being implemented. A CI gate (test_mcp_registry.py) cross-references every registered tool name against its implementing function. Zero tolerance for registration without implementation.
// pattern: automated registry integrity test runs on every deploy
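The cross-reference the CI gate performs is a set comparison. A sketch of the idea (not the actual test_mcp_registry.py):

```python
def registry_violations(registered: dict, implementations: dict) -> list:
    """Names registered without an implementation, plus implementations
    that were never registered. CI fails if this list is non-empty."""
    missing_impl = sorted(set(registered) - set(implementations))
    unregistered = sorted(set(implementations) - set(registered))
    return missing_impl + unregistered
```

Running this on every deploy turns "I forgot to wire up the function" from a runtime surprise into a red build.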
02
Mixed-ownership S3 prefixes require deployment boundaries
When static site files and Lambda-generated files coexist in the same S3 bucket, sync --delete creates a mixed-ownership problem — deployment removes files it didn't create. ADR-032 established deployment boundaries: separate prefixes per owner, bucket policy blocks DeleteObject on protected paths, and a safe_sync wrapper enforces the rules.
// pattern: ADR-032 safe_sync.sh wrapper — never sync --delete to bucket root
03
CI/CD is non-optional, even for solo projects
I chose manual deploys in weeks 1–2, accepting the risk in exchange for faster iteration. By week 3, the error rate made it clear: even for a single engineer, CI/CD isn't optional. Architecture Review #13 formalized this as the top finding. The pipeline now enforces lint, test, plan, manual approval, deploy, smoke test, and auto-rollback.
// pattern: GitHub Actions with OIDC federation, manual approval gate, auto-rollback
04
Lambda@Edge deploys to us-east-1 regardless of your home region
Spent 3 hours debugging why Lambda@Edge couldn't read secrets. Everything was in us-west-2. Lambda@Edge runs in us-east-1. Secrets must be there too.
// also: CloudWatch billing alarms must use SNS in us-east-1
05
macOS ships bash 3.2 — no associative arrays
Wrote a deploy script using declare -A for a Lambda mapping. Worked on Linux, crashed on macOS. Bash 3.2 doesn't support associative arrays (Apple ships ancient bash due to GPL v3 licensing).
// fix: use parallel indexed arrays or inline Python blocks in bash scripts
06
Secrets governance requires dependency mapping
In any Lambda-based system, the relationship between secrets and their consumers isn't visible from the AWS console. ADR-014 established the governance pattern: document which Lambdas consume which secrets, enforce via automated cross-reference, and never bundle secrets unless consumed by the same Lambda set.
// pattern: ADR-014 secrets dependency mapping — automated validation in CI
07
Correlation ≠ Causation (and your AI will forget this)
Early daily briefs said things like "your high HRV caused better sleep quality." The correlation engine found a relationship. Claude narrated it as causal. Had to add explicit correlational framing instructions to every AI prompt.
// fix: system prompt mandates "correlates with" / "is associated with" — never "causes" or "leads to"
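One cheap way to enforce that framing is to lint the generated brief before it goes out. A sketch (the prompt wording and regex are illustrative, not the platform's actual rule text):

```python
import re

FRAMING_RULE = (
    "Describe relationships as correlational: say 'correlates with' or "
    "'is associated with'. Never say 'causes', 'caused', or 'leads to'."
)

CAUSAL = re.compile(r"\b(causes?|caused|leads? to|led to)\b", re.IGNORECASE)

def causal_claims(brief: str) -> list:
    """Flag causal phrasing in AI-generated text before it is emailed."""
    return CAUSAL.findall(brief)
```

The prompt instruction does the heavy lifting; the regex is a backstop for the days the model forgets anyway.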
08
Personal data systems need explicit domain boundaries
Health data, behavioral data, and productivity data each belong in separate partitions — even when all three are yours. Cross-contamination creates governance complexity that's trivial to prevent upfront and painful to untangle later. The platform enforces strict data domain boundaries: each source writes to its own partition, and no employer or third-party system is ever ingested.
// pattern: single-table DynamoDB with source-prefixed partition keys — no cross-domain writes
05 — Build timeline
How Fast It Grew
From zero to 62 Lambdas in six weeks. Every version was built with Claude as the sole engineering partner.
Feb 22
Day 1 — First Lambda, first DynamoDB write
Whoop ingestion. Single Lambda, single table. 15 minutes from idea to working code.
Week 1
8 data sources online
Whoop, Withings, Strava, Apple Health, MacroFactor, Habitify, Garmin, Eight Sleep. All on EventBridge crons.
Week 2
Daily Brief email + MCP server
Claude synthesizes all data into a coaching email every morning. MCP tools let me query data in natural language.
Week 3
Character Sheet engine + public website
7-pillar scoring system with EMA smoothing, level/tier transitions. averagejoematt.com goes live with live data.
Week 4
Intelligence layer: correlations + hypothesis engine
23-pair Pearson correlation matrix with BH-FDR correction. Weekly AI hypothesis generation from data patterns.
Week 5
CI/CD, architecture reviews, 103 MCP tools
GitHub Actions pipeline with OIDC. 17 architecture reviews by 14-member AI board. Challenge system with XP gamification.
Week 6+
Observatory editorial design, 67-page site, launch prep
Full editorial observatory pattern across 6 health domains. Usability study with 15 simulated participants. Site-api split for isolated AI endpoints. 62 Lambdas, 121 MCP tools, 26 data sources. $19/month.
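The week-4 correlation math fits comfortably in stdlib Python. A sketch of Pearson r plus Benjamini-Hochberg selection (illustrative only; the platform's actual engine is not shown here):

```python
from math import sqrt

def pearson(xs: list, ys: list) -> float:
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def bh_reject(pvals: list, q: float = 0.05) -> list:
    """Benjamini-Hochberg FDR: find the largest rank k with
    p_(k) <= (k/m) * q, then reject the k smallest p-values."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank / m * q:
            k = rank
    rejected = set(order[:k])
    return [i in rejected for i in range(m)]
```

BH correction matters here: with 23 pairs tested every week, uncorrected p < 0.05 would hand you a spurious "finding" roughly every run.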
06 — If I started over tomorrow
What I'd Do Differently
01
CI/CD from day one. Not day 30. The 8 deployment incidents were entirely preventable. GitHub Actions + OIDC takes 2 hours to set up and saves hundreds of hours.
02
Start with 3 data sources, not 8. Whoop + Habitify + one nutrition tracker is enough to build the full pipeline pattern. Adding sources later is trivial once the pattern exists.
03
Design the MCP tools as the primary interface from the start. I built dashboards first, then realized Claude was a better interface. Would skip straight to MCP and build the website as a secondary output.
04
Write Architecture Decision Records from day one. ADRs are 5-minute documents that save hours of "why did I do it this way?" 3 months later. The platform now has 45 ADRs and they're invaluable.
05
Pre-compute everything, query nothing live. The Compute → Store → Read pattern means MCP tools read from pre-calculated results, never from raw data. This makes the system fast, cheap, and predictable.
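The Compute → Store → Read split can be sketched with a dict standing in for DynamoDB (all names here are illustrative):

```python
STORE: dict = {}  # stands in for the DynamoDB table in this sketch

def compute_daily_summary(day: str, raw_samples: list) -> None:
    """Compute step: runs on a schedule, writes a finished result item."""
    STORE[f"SUMMARY#{day}"] = {
        "day": day,
        "avg": sum(raw_samples) / len(raw_samples),
        "n": len(raw_samples),
    }

def tool_get_summary(day: str) -> dict:
    """Read step: the MCP tool fetches the pre-computed item only.
    No aggregation ever happens at query time."""
    return STORE[f"SUMMARY#{day}"]
```

Because the read path is a single key lookup, query latency and cost are flat no matter how much raw data accumulates.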
07 — Start here
Your First Weekend
You can have a working personal data system in one weekend. Here's the minimum viable path:
1
Saturday morning
npx cdk init. One DynamoDB table, one Lambda, one EventBridge cron.
Pick your most interesting data source (Whoop, Oura, Garmin).
Write the ingestion Lambda. Deploy. Verify data in DynamoDB console.
2
Saturday afternoon
Add a second data source. Write an MCP tool that reads both.
Test with Claude Desktop. Ask it: "How did I sleep this week?"
The moment Claude answers from your own data is when it clicks.
3
Sunday
Write a Daily Brief Lambda that reads DynamoDB and calls Claude to synthesize an email.
Schedule it with EventBridge. Tomorrow morning, your AI sends you a coaching brief.
You now have a platform. Everything else is iteration.
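The whole weekend plan condenses into one handler shape. A sketch with a placeholder endpoint and field names (the real Whoop/Oura/Garmin APIs differ, and the put_item call is commented out so the sketch stays self-contained):

```python
import json
import urllib.request
from datetime import date

def to_item(source: str, day: str, payload: dict) -> dict:
    """Map an API payload onto the single-table key scheme."""
    return {
        "PK": f"USER#matthew#SOURCE#{source}",
        "SK": f"DATE#{day}",
        "data": payload,
    }

def handler(event, context):
    """EventBridge-cron Lambda: pull today's data, write one item."""
    req = urllib.request.Request(
        "https://api.example.com/v1/recovery",  # placeholder endpoint
        headers={"Authorization": "Bearer <token>"},
    )
    with urllib.request.urlopen(req) as resp:
        payload = json.load(resp)

    item = to_item("whoop", date.today().isoformat(), payload)
    # boto3.resource("dynamodb").Table("health").put_item(Item=item)
    return {"statusCode": 200, "body": json.dumps({"wrote": item["SK"]})}
```

Every additional source is the same handler with a different endpoint and a different `source` string, which is why adding sources later is trivial.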
Want to see exactly how each piece works? The full architecture diagram, all 62 Lambda definitions,
the MCP tool catalog, and every EventBridge schedule are documented on the platform page.
The cost page shows exactly what you'll spend.
08 — Show me the code
Why the Repo Is Private
The GitHub repo is private. Not because the code is proprietary, but because the git history contains API keys, personal health data references, and infrastructure identifiers that would be irresponsible to expose publicly.
What I can show you instead: the full architecture diagram, every architecture review grade (19 reviews by a 12-member AI board), the exact cost breakdown, and every decision documented in 45 ADRs. The code patterns are described in detail on this page — enough to reproduce the system yourself.
If you're building something similar and want to compare notes, reach out. I'm happy to share specific implementation details one-on-one.