
The Methodology

One subject. Full data transparency. A reproducible framework for N=1 science — because the right experiment on the right person can still tell you something true.

“My numbers won't tell you much about your body. But the framework might tell you something about yours.”
— Matthew
// 01 — foundation

Why N=1?

Population studies are built to find average effects across groups. The signal they surface is real — but it describes a statistical composite that may not resemble you. Average protein recommendations were derived from people who are not you. Average sleep targets were measured on people with different chronotypes, genetics, and stress loads.

The N=1 approach accepts this limitation and turns it into a feature. One subject means every measurement is directly relevant. One subject means no between-person variance to average out. One subject means you can run interventions and observe effects on a body you actually have to live in.

The trade-off is external validity — you cannot generalize my results to you. That’s honest, and it’s fine. The value is the framework, not the conclusions.

1 Subject
26 Data Sources
100+ Daily Metrics
Days Tracked
// 02 — analysis engine

Correlation Engine

The core analytical layer is a rolling correlation engine that continuously calculates Pearson r coefficients across metric pairs. Rather than reporting static relationships, every correlation is recalculated over a 90-day sliding window — capturing how relationships evolve as the body and behaviors change.

Running 23 metric pairs simultaneously creates a multiple comparisons problem: the more tests you run, the more likely one produces a spurious positive by chance alone. The engine applies Benjamini-Hochberg false discovery rate correction to every batch — a method that controls the expected proportion of false positives among all reported discoveries, rather than protecting against any single false positive as Bonferroni does. Benjamini-Hochberg is less conservative than Bonferroni and better suited to exploratory research, where the goal is to surface candidate signals rather than certify any single claim.

Window
90-day rolling — recalculated nightly
Metric pairs
23 pairs across sleep, recovery, nutrition, training, glucose
Correction method
Benjamini-Hochberg FDR — controls false discovery rate across all simultaneous comparisons
Minimum observations
10 paired data points required before a correlation is reported
Signal threshold
|r| > 0.4 = meaningful signal worth surfacing to the user
Significance
p-value reported alongside r; FDR-adjusted q-value used for filtering
Storage
Results written nightly to DynamoDB; served via /explorer/
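The rolling window, minimum-observation filter, and BH correction described above can be sketched in pure Python. This is a minimal stand-in, not the platform's actual engine: the function names, and the choice to return `None` for under-filled windows, are my assumptions.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired observations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def rolling_r(xs, ys, window=90, min_obs=10):
    """Recalculate r over a sliding window, skipping under-filled windows."""
    out = []
    for end in range(window, len(xs) + 1):
        wx, wy = xs[end - window:end], ys[end - window:end]
        out.append(pearson_r(wx, wy) if len(wx) >= min_obs else None)
    return out

def bh_qvalues(pvals):
    """Benjamini-Hochberg step-up: q-value for each p-value, in input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    q = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p down to the smallest
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        q[i] = running_min
    return q
```

A pair would then be surfaced only when |r| > 0.4 and its q-value clears the chosen FDR level.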
Case study

Methodology in Action

Here’s how one finding moved through the pipeline — from raw signal to validated insight.

01 — Raw signal
Correlation engine flags
Sleep hours ↔ next-day HRV: r=+0.58, p=0.003, n=47 paired days. BH-FDR adjusted q=0.014 — survives multiple comparison correction.
02 — Hypothesis
AI generates testable claim
“Nights with ≥7.5h sleep produce next-morning HRV readings ≥15% above the 30-day baseline in this individual.” Protocol: track for 21 days, binary classification.
03 — Observation
Data accumulates automatically
21 days tracked: 13 nights ≥7.5h, 8 nights <7.5h. Of the 13 long-sleep nights, 10 produced HRV ≥15% above baseline (77% hit rate). Of the 8 short nights, only 2 did (25%).
04 — Action
Becomes a protocol change
Hypothesis confirmed. Sleep protocol updated: 7.5h minimum becomes Tier 0 habit. Eight Sleep bedtime alarm set to 10:15 PM. Character sheet “sleep” pillar weight increased.

// This is N=1 data. The threshold (7.5h) and effect size (77% vs 25%) are specific to this physiology. Your numbers will differ. The process is what’s portable.
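The step-03 tally reduces to a few lines of code. The records below are abbreviated synthetic stand-ins with the same shape as the study's data (hours slept paired with next-morning HRV uplift over the 30-day baseline), not the actual 21-day log.

```python
def hit_rates(records, sleep_cutoff=7.5, uplift_cutoff=0.15):
    """Split nights at the sleep cutoff and return the fraction of each
    group whose next-morning HRV uplift met the threshold."""
    long_nights = [u for s, u in records if s >= sleep_cutoff]
    short_nights = [u for s, u in records if s < sleep_cutoff]
    rate = lambda xs: sum(u >= uplift_cutoff for u in xs) / len(xs)
    return rate(long_nights), rate(short_nights)

# Abbreviated synthetic log: (hours slept, HRV uplift over baseline)
nights = [(8.0, 0.20), (7.6, 0.18), (6.9, 0.05), (8.1, 0.22),
          (7.0, 0.16), (7.8, 0.12), (6.5, 0.02), (7.9, 0.19)]
long_rate, short_rate = hit_rates(nights)
```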

// 03 — data sources

What We Track

26 data sources. Each ingested daily by a dedicated AWS Lambda function on a fixed UTC cron schedule. Raw JSON archived to S3 permanently; normalized metrics written to DynamoDB single-table with gap-aware backfill.
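A minimal sketch of the normalization step, assuming a hypothetical single-table key schema of `PK="SOURCE#<name>"` / `SK="DATE#<iso>"`; the real schema and the Lambda/S3 wiring around it are not documented here.

```python
def normalize(source, iso_date, raw):
    """Flatten one day's raw payload into a single-table item:
    numeric fields become attributes; everything else stays in the
    raw S3 archive."""
    item = {"PK": f"SOURCE#{source}", "SK": f"DATE#{iso_date}"}
    item.update({k: v for k, v in raw.items()
                 if isinstance(v, (int, float)) and not isinstance(v, bool)})
    return item
```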

Wearable
Whoop
Recovery score · HRV · Resting HR · Sleep score · Sleep stages · Strain
Scale
Withings
Weight · Body fat % · Lean mass · Fat mass · Bone mass · Blood pressure
Habits
Habitify
65 habits across 5 pillars · Tier 0–2 streaks · Vice resistance rate · P40 score
Nutrition
MacroFactor
Calories · Protein · Carbs · Fat · Adaptive TDEE · Deficit sustainability
Training
Garmin
Training load · Zone 2 minutes · ACWR · VO2max estimate · Steps · Active calories
Sleep
Eight Sleep
Bed temperature · Sleep timing · Sleep stages (supplemental) · HRV (supplemental)
CGM
Stelo (CGM)
Continuous glucose · Postprandial spikes · Fasting levels · Time in range
Phone
Apple Health
Steps · Active energy · Stand hours · HRV (secondary) · Mindful minutes
Manual
Labs
Lipids · Metabolic panel · Inflammation · Hormones · 40+ biomarkers per draw
Manual
Genome
110 SNPs analyzed · Methylation variants · Vitamin metabolism · Absorption predispositions
Manual
Supplements
Daily stack log · Genome-informed decisions · Compliance streak · Timing notes
Manual
Mood & Journal
Morning + evening entries · AI mood classification · Avoidance flags · Stress scores
Manual
Decisions
Protocol adherence decisions · Subjective ratings · Narrative context for anomalies
Training
Strava
Activity uploads · Route data · Heart rate zones · Effort analysis · Training log
Productivity
Todoist
Task completion rate · Daily throughput · Project progress · Cognitive load proxy
Schedule
Google Calendar
Meeting load · Free-block ratio · Travel days · Schedule density · Recovery windows
Journal
Day One
Journal entries · AI sentiment analysis · Reflection frequency · Narrative context
Environment
Weather
Temperature · Humidity · UV index · Air quality · Barometric pressure
Body Comp
DEXA
Bone density · Regional body fat · Visceral fat · Lean mass distribution · Trend tracking
// 04 — limitations

Honest Limitations

Every framework has edges. These are the places where this one is weakest, stated plainly rather than footnoted away.

01
Sample size of one. No findings here are statistically generalizable. Correlation patterns that hold in this data may not hold for any other human body. Weigh conclusions accordingly.
02
Correlation is not causation. The engine surfaces relationships, not mechanisms. An r=0.6 between sleep and recovery tells you the two move together. It does not tell you which drives which, or whether both are driven by a third variable not yet in the model.
03
Observer effect. Tracking a behavior changes it. The act of monitoring sleep quality, logging food, and scoring habits introduces feedback loops that make it difficult to establish true baselines. The experiment cannot be cleanly isolated from the experimenter.
04
Regression to the mean. An extreme value (very poor sleep, very high stress) is statistically likely to be followed by a more average value regardless of any intervention. Any metric tracked immediately after a bad stretch will appear to improve. This platform attempts to account for this by using rolling windows rather than point-to-point comparisons.
05
Hawthorne effect. Behavior changes when it is observed and reported publicly. The existence of this website, the weekly journal, and the subscriber audience creates performance pressure that may not reflect how the subject would behave absent an audience.
06
The subject is also the engineer. Confirmation bias in data interpretation is mitigated by the Board of Directors framework, independent editorial review (Elena Voss), and FDR-corrected statistical methods — but cannot be fully eliminated.
// 05 — AI governance

AI Advisory System

Every architecture decision, content strategy call, and health protocol change is reviewed by a system of 34 AI-generated personas organized into three boards:

Health Board
6 personas. Advises on protocols, supplements, and health interpretation. Includes deliberate tension pairs (conservative vs aggressive, clinical vs holistic).
Technical Board
12 personas. Reviews every architecture decision, deploy, and data model change. 19 formal reviews completed. The friction is the quality control.
Product Board
8 personas. Advises on UX, audience, content engine, and growth. Personas fight about priorities so decisions aren’t made by default.

When personas disagree, the throughline tiebreaker applies: “Does this help a visitor connect the story from any page to any other page?” If yes, it ships. If not, it waits.

See the full advisory roster →

// 06 — evidence badges

Evidence Badge System

Every claim on this site carries a badge indicating how much data backs it. The thresholds follow the Henning Brandt standard: with N=1 research, observation count determines confidence.

Observations | Confidence | Badge
<12 | Preliminary | N=1 · PRELIMINARY
12–29 | Low confidence | N=1 · EMERGING
30–59 | Moderate confidence | N=1 · CONFIRMED
60+ | High confidence | N=1 · ESTABLISHED
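The thresholds translate directly into a lookup. The function name is mine; the cutoffs and badge labels are taken verbatim from the table above.

```python
def badge(n_observations):
    """Map an observation count to its evidence badge."""
    if n_observations < 12:
        return "N=1 · PRELIMINARY"
    if n_observations < 30:
        return "N=1 · EMERGING"
    if n_observations < 60:
        return "N=1 · CONFIRMED"
    return "N=1 · ESTABLISHED"
```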

Badges appear on pull-quotes, discovery cards, and protocol findings across the site. They’re a constant reminder that all findings on this platform are from a single person’s data.

// 07 — the point

The Value Isn’t in the Conclusions

If you’re reading this hoping to find a protocol you can copy, you will be disappointed — or at least you should be. My sleep responses are not yours. My glucose is not yours. My correlation between protein intake and HRV tells you almost nothing about your own body.

But the framework is portable. The question “what does my data actually show about my sleep?” is one you can ask about yourself. The practice of tracking something for 90 days before drawing conclusions, running FDR correction to avoid over-interpreting noise, requiring a minimum observation count before reporting — these habits of statistical hygiene cost nothing and apply universally.

See the engine running — live correlation data, updated nightly.
Open Explorer →