// 01 — foundation
Why N=1?
Population studies are built to find average effects across groups. The signal they surface
is real — but it describes a statistical composite that may not resemble you. Average
protein recommendations were derived from people who are not you. Average sleep targets
were measured on people with different chronotypes, genetics, and stress loads.
The N=1 approach accepts this limitation and turns it into a feature. One subject means
every measurement is directly relevant. One subject means no between-person variance to
average out. One subject means you can run interventions and observe effects on a body
you actually have to live in.
The trade-off is external validity — you cannot generalize my results to you.
That’s honest, and it’s fine. The value is the framework, not the conclusions.
// 02 — analysis engine
Correlation Engine
The core analytical layer is a rolling correlation engine that continuously calculates
Pearson r coefficients across metric pairs. Rather than reporting static relationships,
every correlation is recalculated over a 90-day sliding window — capturing how
relationships evolve as the body and behaviors change.
Running 23 metric pairs simultaneously creates a multiple comparisons problem: the more
tests you run, the more likely one produces a spurious positive by chance alone. The engine
applies Benjamini-Hochberg false discovery rate correction to every batch, a method
that controls the expected proportion of false positives among all reported discoveries,
rather than guarding against even one false positive anywhere in the family, as
Bonferroni does. That makes it less conservative than Bonferroni and better suited to
exploratory research.
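The correction step can be sketched in a few lines. This is a minimal pure-Python version of the standard Benjamini-Hochberg step-up procedure, not the engine's actual code:

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg step-up procedure.

    Returns FDR-adjusted q-values in the original input order; the
    engine would then filter on the adjusted q-value, not the raw p.
    """
    m = len(p_values)
    # Rank p-values ascending, remembering original positions.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    prev = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        prev = min(prev, p_values[i] * m / rank)
        q_values[i] = prev
    return q_values
```

Note that a q-value depends on the whole batch: the same p=0.003 adjusts differently in a batch of 3 tests than in a batch of 23.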
Window
90-day rolling — recalculated nightly
Metric pairs
23 pairs across sleep, recovery, nutrition, training, glucose
Correction method
Benjamini-Hochberg FDR — controls false discovery rate across all simultaneous comparisons
Minimum observations
10 paired data points required before a correlation is reported
Signal threshold
|r| > 0.4 = meaningful signal worth surfacing to the user
Significance
p-value reported alongside r; FDR-adjusted q-value used for filtering
Storage
Results written nightly to DynamoDB; served via /explorer/
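Under the thresholds above, the nightly recalculation for one metric pair reduces to a windowed Pearson r with two gates. This is an illustrative sketch, assuming a simple date-keyed dict interface, not the production Lambda code:

```python
from datetime import date, timedelta
from math import sqrt

WINDOW_DAYS = 90       # rolling window from the spec above
MIN_OBSERVATIONS = 10  # paired points required before reporting
SIGNAL_THRESHOLD = 0.4 # |r| must exceed this to surface

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None  # a constant series has no defined correlation
    return sxy / sqrt(sxx * syy)

def windowed_signal(series_a, series_b, as_of):
    """series_a, series_b: dicts of date -> value.
    Returns r if the pair clears both gates, else None."""
    start = as_of - timedelta(days=WINDOW_DAYS)
    days = [d for d in series_a if start < d <= as_of and d in series_b]
    if len(days) < MIN_OBSERVATIONS:
        return None
    r = pearson_r([series_a[d] for d in days], [series_b[d] for d in days])
    if r is None or abs(r) <= SIGNAL_THRESHOLD:
        return None
    return r
```

Only days present in both series count toward the observation gate, which is what makes the minimum-observations rule meaningful for sparsely logged sources.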
Case study
Methodology in Action
Here’s how one finding moved through the pipeline — from raw signal to validated insight.
01 — Raw signal
Correlation engine flags
Sleep hours ↔ next-day HRV: r=+0.58, p=0.003, n=47 paired days. BH-FDR adjusted q=0.014 — survives multiple comparison correction.
02 — Hypothesis
AI generates testable claim
“Nights with ≥7.5h sleep produce next-morning HRV readings ≥15% above the 30-day baseline in this individual.” Protocol: track for 21 days, binary classification.
03 — Observation
Data accumulates automatically
21 days tracked: 13 nights ≥7.5h, 8 nights <7.5h. Of the 13 long-sleep nights, 10 produced HRV ≥15% above baseline (77% hit rate). Of the 8 short nights, only 2 did (25%).
04 — Action
Becomes a protocol change
Hypothesis confirmed. Sleep protocol updated: 7.5h minimum becomes Tier 0 habit. Eight Sleep bedtime alarm set to 10:15 PM. Character sheet “sleep” pillar weight increased.
// This is N=1 data. The threshold (7.5h) and effect size (77% vs 25%) are specific to this physiology. Your numbers will differ. The process is what’s portable.
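The binary classification in steps 02–03 is a small hit-rate computation. A sketch, with the threshold and HRV lift exposed as parameters since those numbers are specific to one physiology:

```python
def evaluate_sleep_hypothesis(days, sleep_threshold=7.5, hrv_lift=1.15):
    """days: list of (sleep_hours, hrv, baseline_hrv) tuples, one per night.
    Returns (long_sleep_hit_rate, short_sleep_hit_rate), where a "hit"
    is a morning HRV at or above hrv_lift times the rolling baseline."""
    long_hits = long_n = short_hits = short_n = 0
    for sleep_hours, hrv, baseline in days:
        hit = hrv >= hrv_lift * baseline
        if sleep_hours >= sleep_threshold:
            long_n += 1
            long_hits += hit
        else:
            short_n += 1
            short_hits += hit
    return (long_hits / long_n if long_n else None,
            short_hits / short_n if short_n else None)
```

A wide gap between the two rates (like the 77% vs 25% above) is what graduates a correlation into a protocol change; similar rates would falsify the hypothesis.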
// 03 — data sources
What We Track
26 data sources. Each ingested daily by a dedicated AWS Lambda function on a
fixed UTC cron schedule. Raw JSON archived to S3 permanently; normalized metrics
written to DynamoDB single-table with gap-aware backfill.
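The gap-aware part of the backfill can be illustrated with a short sketch. The function name and the date-set interface are assumptions for illustration, not the actual Lambda code:

```python
from datetime import date, timedelta

def missing_dates(stored_dates, start, end):
    """Gap-aware backfill: given the dates already present in the
    store, return every date in [start, end] that still needs a fetch,
    so a failed cron run is repaired on the next one."""
    stored = set(stored_dates)
    gaps = []
    d = start
    while d <= end:
        if d not in stored:
            gaps.append(d)
        d += timedelta(days=1)
    return gaps
```

Each source's ingest function would run this against its own partition before fetching, so one flaky API never leaves a silent hole in the history.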
Wearable
Whoop
Recovery score · HRV · Resting HR · Sleep score · Sleep stages · Strain
Scale
Withings
Weight · Body fat % · Lean mass · Fat mass · Bone mass · Blood pressure
Habits
Habitify
65 habits across 5 pillars · Tier 0–2 streaks · Vice resistance rate · P40 score
Nutrition
MacroFactor
Calories · Protein · Carbs · Fat · Adaptive TDEE · Deficit sustainability
Training
Garmin
Training load · Zone 2 minutes · ACWR · VO2max estimate · Steps · Active calories
Sleep
Eight Sleep
Bed temperature · Sleep timing · Sleep stages (supplemental) · HRV (supplemental)
CGM
Stelo (CGM)
Continuous glucose · Postprandial spikes · Fasting levels · Time in range
Phone
Apple Health
Steps · Active energy · Stand hours · HRV (secondary) · Mindful minutes
Manual
Labs
Lipids · Metabolic panel · Inflammation · Hormones · 40+ biomarkers per draw
Manual
Genome
110 SNPs analyzed · Methylation variants · Vitamin metabolism · Absorption predispositions
Manual
Supplements
Daily stack log · Genome-informed decisions · Compliance streak · Timing notes
Manual
Mood & Journal
Morning + evening entries · AI mood classification · Avoidance flags · Stress scores
Manual
Decisions
Protocol adherence decisions · Subjective ratings · Narrative context for anomalies
Training
Strava
Activity uploads · Route data · Heart rate zones · Effort analysis · Training log
Productivity
Todoist
Task completion rate · Daily throughput · Project progress · Cognitive load proxy
Schedule
Google Calendar
Meeting load · Free-block ratio · Travel days · Schedule density · Recovery windows
Journal
Day One
Journal entries · AI sentiment analysis · Reflection frequency · Narrative context
Environment
Weather
Temperature · Humidity · UV index · Air quality · Barometric pressure
Body Comp
DEXA
Bone density · Regional body fat · Visceral fat · Lean mass distribution · Trend tracking
// 04 — limitations
Honest Limitations
Every framework has edges. These are the places where this one is weakest,
stated plainly rather than footnoted away.
01
Sample size of one. No findings here are statistically
generalizable. Correlation patterns that hold in this data may not hold for
any other human body. Report conclusions accordingly.
02
Correlation is not causation. The engine surfaces relationships,
not mechanisms. An r=0.6 between sleep and recovery tells you the two move together.
It does not tell you which drives which, or whether both are driven by a third
variable not yet in the model.
03
Observer effect. Tracking a behavior changes it. The act of
monitoring sleep quality, logging food, and scoring habits introduces feedback loops
that make it difficult to establish true baselines. The experiment cannot be
cleanly isolated from the experimenter.
04
Regression to the mean. An extreme value (very poor sleep, very
high stress) is statistically likely to be followed by a more average value regardless
of any intervention. Any metric tracked immediately after a bad stretch will appear
to improve. The platform mitigates this by using rolling windows rather than
point-to-point comparisons.
05
Hawthorne effect. Behavior changes when it is observed and reported
publicly. The existence of this website, the weekly journal, and the subscriber
audience creates performance pressure that may not reflect how the subject would
behave absent an audience.
06
The subject is also the engineer. Confirmation bias in data interpretation
is mitigated by the Board of Directors framework, independent editorial review (Elena Voss),
and FDR-corrected statistical methods — but cannot be fully eliminated.
// 05 — AI governance
AI Advisory System
Every architecture decision, content strategy call, and health protocol change is reviewed
by a system of 34 AI-generated personas organized into three boards:
Health Board
6 personas. Advises on protocols, supplements, and health interpretation. Includes deliberate tension pairs (conservative vs aggressive, clinical vs holistic).
Technical Board
12 personas. Reviews every architecture decision, deploy, and data model change. 19 formal reviews completed. The friction is the quality control.
Product Board
8 personas. Advises on UX, audience, content engine, and growth. Personas fight about priorities so decisions aren’t made by default.
When personas disagree, the throughline tiebreaker applies: “Does this help a visitor connect the story from any page to any other page?” If yes, it ships. If not, it waits.
See the full advisory roster →
// 06 — evidence badges
Evidence Badge System
Every claim on this site carries a badge indicating how much data backs it. The thresholds
follow the Henning Brandt standard: with N=1 research, observation count determines confidence.
| Observations | Confidence | Badge |
| --- | --- | --- |
| <12 | Preliminary | N=1 · PRELIMINARY |
| 12–29 | Low confidence | N=1 · EMERGING |
| 30–59 | Moderate confidence | N=1 · CONFIRMED |
| 60+ | High confidence | N=1 · ESTABLISHED |
Badges appear on pull-quotes, discovery cards, and protocol findings across the site. They’re
a constant reminder that all findings on this platform are from a single person’s data.
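The threshold table maps directly to a small lookup. A sketch, assuming the badge strings render exactly as shown above:

```python
def evidence_badge(n_observations):
    """Map an observation count to its badge per the thresholds above."""
    if n_observations >= 60:
        return "N=1 · ESTABLISHED"
    if n_observations >= 30:
        return "N=1 · CONFIRMED"
    if n_observations >= 12:
        return "N=1 · EMERGING"
    return "N=1 · PRELIMINARY"
```

Because the badge is pure function of observation count, it can be recomputed on every nightly run and a claim's badge upgrades itself as data accumulates.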
// 07 — the point
The Value Isn’t in the Conclusions
If you’re reading this hoping to find a protocol you can copy, you will be
disappointed — or at least you should be. My sleep responses are not yours.
My glucose is not yours. My correlation between protein intake and HRV tells you
almost nothing about your own body.
But the framework is portable. The question “what does my data actually show
about my sleep?” is one you can ask about yourself. The practice of tracking
something for 90 days before drawing conclusions, running FDR correction to avoid
over-interpreting noise, requiring a minimum observation count before reporting —
these habits of statistical hygiene cost nothing and apply universally.
See the engine running live: correlation data updated nightly.
Open Explorer →