The Learning Algorithm v2
How to build permanent knowledge in ~9 minutes per fact — and why most education gets it backwards. Now with FSRS adaptive scheduling, real-world evidence, a 5-minute diagnostic, and the math behind product tiers.
The Premise
Most education is built on a lie: that more time in a seat equals more learning. Six hours a day, five days a week, nine months a year — and kids still forget most of what they "learned" by September.
The problem isn't effort. It's architecture.
What if you could learn anything permanently in about 9 minutes of actual work per concept — spread across 60 days using a precise algorithm? What if the same system that helps you learn 20 nutrition facts could scale to 10,000 medical school concepts, or an entire K-12 curriculum?
That's not hypothetical. That's what the research says. And now we have real-world data that quantifies the problem it solves.
Part 1: The Evidence — Why This Matters Now
The 99.4% Illusion
Here's a number that should alarm every educator: students achieve 99.4% accuracy inside their learning app. They look like they're mastering the material. Teachers see green dashboards. Parents see "A's."
Then the standardized test arrives. Only 12-26% pass.
99.4% in the app. 12-26% on the test. That's not a gap — that's a chasm. And it's not a fluke. It's the predictable result of an architecture that confuses performance with learning.
The Three Root Causes
When a student fails a test they "passed" in practice, one of three things happened:
| Root Cause | What It Means | Frequency |
|---|---|---|
| Forgot | They learned it, but the memory decayed | ~73% |
| Not fluent | They could do it slowly but not under test conditions | ~3% |
| Never learned | The instruction never produced real understanding | ~24% |
73% of failure is forgetting. Not bad teaching. Not lazy students. Just... no one scheduled the review. The instruction worked. The memory decayed. And nobody came back to reinforce it.
The Doom Loop
Without a retention system, education creates a doom loop:
Student passes skill → Never reviewed → Forgets →
Test reveals gap → Reassigned same skill → Student passes again →
Never reviewed → Forgets again → Repeat
In one real system, 79% of students were stuck in this exact loop — attempting the same skills 3+ times, earning zero credit each time. One student completed 742 practice problems in 28 days and still couldn't pass the test. That's not a student problem. That's a system design failure.
The total cost: 860 hours of student time spent doing work, answering questions, getting nothing. Not because they weren't trying — 99.4% accuracy proves they were trying. The system just didn't do anything with their effort.
The Waste Decomposition
Of all completions that earn zero credit:
| Category | % | Description |
|---|---|---|
| Genuine struggle | 56% | Students truly don't know the material yet |
| Near-miss threshold failures | 25% | Got 3 of 5 right instead of 4 — close but no credit |
| Tech bugs | 19% | System errors, broken integrations |
44% of the waste is fixable without changing the curriculum at all — just by fixing the threshold and the tech bugs. The other 56% requires the learning algorithm.
Part 2: The Atom of Knowledge
Knowledge Components (KCs)
Everything you can learn can be broken into Knowledge Components — atomic, testable pieces of knowledge. A KC passes three tests:
- One testable piece — Can you write a single question that tests this and only this?
- Psychologically real — Do learners' behaviors cluster around this specific knowledge?
- Portable — Does mastery "travel" across different contexts?
"Understand chemistry" is not a KC. That's a vague aspiration. "Atoms contain protons, neutrons, and electrons" is a KC. You can test it. Students either know it or they don't. And knowing it helps you in any chemistry context.
The distinction matters because the KC is the atom of everything downstream — scheduling, assessment, mastery tracking, and adaptive learning all depend on well-formed KCs.
Five Types of Knowledge
Not all knowledge is the same. Different KC types require different teaching and testing strategies:
| Type | What It Is | Example | Why It Matters |
|---|---|---|---|
| Fact | A single piece of declarative knowledge | "Atoms are the smallest unit of an element" | The building blocks |
| Concept | A category with defining features | "What makes something a compound vs. a mixture?" | Lets you classify new things |
| Procedure | A sequence of steps | "How to balance a chemical equation" | Lets you DO things |
| Discrimination | Distinguishing confusable cases | "Mitosis vs. Meiosis — which is which?" | Prevents the most common errors |
| Integrative | Coordinating multiple sub-skills | "Explain how the water cycle works" | Proves deep understanding |
The undervalued type is discrimination. Most education treats confusable pairs as an afterthought. But research shows that gateway distinctions — "Is this A or B?" — carry the most learning value for novices and slash downstream confusion. In our system, discrimination KCs are first-class citizens: 15% of every curriculum and treated as hard prerequisites.
Changing one word can completely change a problem. "5 times as many" means multiply. "5 more than" means add. A student who can't discriminate between those phrases will get every comparison problem wrong — not because they can't multiply, but because they can't tell when to.
At Scale: 1,392 KCs and 12,894 Questions
This isn't theoretical. We've decomposed an entire Language Arts curriculum (grades 3-8) into 1,392 Knowledge Components with 12,894 validated questions. Each KC has been typed, sequenced by prerequisite, and mapped to assessment patterns. Quality control validation shows:
- 95.3% of KCs are fully playable (render correctly, have valid answers)
- 99.96% question renderer pass rate across all question types
- 34/34 engine unit tests passing for the scheduling algorithm
The point: this isn't a framework. It's a validated system with real content and real quality guarantees.
Part 3: The Knowledge Graph — Learning as a Video Game Map
Why Structure Matters
KCs don't exist in isolation. They have prerequisite relationships. You can't learn to balance chemical equations until you understand what atoms and molecules are. Knowledge has structure, and that structure determines the optimal learning order.
We visualize this as a video game world map:
- Worlds = Thematic clusters of related KCs (8-20 each)
- Chapters = Sub-groups within worlds (2-5 KCs each)
- Paths = Prerequisite relationships (what must be learned before what)
- Mastery Gates = Achieve 85%+ to unlock the next area
- Mastery Castle = Final assessment over everything
┌─────────────────┐
│ MASTERY │
│ CASTLE │
└────────┬────────┘
│
┌──────────────────┼──────────────────┐
│ │ │
┌─────▼─────┐ ┌─────▼──────┐ ┌─────▼──────┐
│ World 5 │ │ World 6 │ │ World 7 │
│ Advanced │ │ Advanced │ │ Advanced │
└─────┬─────┘ └─────┬──────┘ └─────┬──────┘
│ │ │
└─────────────────┼──────────────────┘
│
┌───────────────────────┼───────────────────────┐
│ │ │
┌─────▼─────┐ ┌─────▼──────┐ ┌──────▼─────┐
│ World 2 │ │ World 3 │ │ World 4 │
│Foundation │ │ Foundation │ │ Foundation │
└─────┬─────┘ └─────┬──────┘ └─────┬──────┘
│ │ │
└─────────────────┼──────────────────┘
│
┌────────▼────────┐
│ World 1 │
│ START HERE │
└─────────────────┘
Why the Map Model Works
- Leverages spatial memory — The brain has powerful navigational memory systems. Placing knowledge in a "space" you can mentally travel through aids retention.
- Makes prerequisites visible — You can see what depends on what. You can't reach World 4 without passing through the foundations.
- Creates progression signals — Like a game, you can see where you've been (mastered), where you are, where you're going (unlocked), and what's still locked.
- Chunks appropriately — 3-6 worlds with 3-5 chapters each stays within cognitive load limits (~4 items in working memory).
- Builds competence, which builds motivation — Each world completion is a visible achievement. Competence is the strongest predictor of self-determined motivation in educational settings. Success builds motivation, not the reverse.
The Always-Visible Destination
Here's where most adaptive learning systems go wrong. They pick the "next best step" for you — but you never see the map. You just follow instructions, one step at a time, like a social media feed.
The difference between an algorithm that serves you and one that uses you is simple: Who chose the destination?
Social media algorithms optimize for "next best content" — but best for whom? Each piece feels relevant, but after 10,000 steps, you've been driven somewhere you never chose to go.
A learning system should work like GPS:
- You set the destination — "I want to master 6th grade science"
- The system calculates the route — Knowledge graph, prerequisites, spacing schedule
- The system gives turn-by-turn directions — Today's KCs, today's reviews
- The journey is visible — You can see the whole map, not just the next step
- Trust is earned — The system's incentives are aligned with getting you there
GPS makers succeed when you arrive. Social platforms succeed when you never leave. We want learners to arrive.
Part 4: The 5-Minute Diagnostic — Where Do You Actually Start?
The Cold Start Problem
Before V2, the algorithm had no answer for a simple question: where does a new student begin? Without knowing what they already know, you either waste time re-teaching mastered material or skip prerequisites they need.
The Binary Probe Algorithm
The solution exploits the prerequisite DAG. Instead of testing every KC (which would take hours), you test the hardest KC in each group and infer everything below it.
Step 1: World Sweep (8 questions, ~2 minutes)
Test the capstone KC of each world.
PASS → entire world = known. Skip it.
FAIL → flag world for drill.
Step 2: Chapter Binary Search (flagged worlds only, ~2-3 minutes)
For each flagged world, binary search its chapters.
Test the last KC of the middle chapter.
PASS → lower chapters = known. Probe upper half.
FAIL → upper chapters = unknown. Probe lower half.
2-3 questions per flagged world → chapter-level placement.
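The two steps above can be sketched as a small routine over a world → chapter structure. This is a sketch, not the production API: `answer(kc)` stands in for presenting a probe question to the student, and the data layout (hardest KC last in each chapter) is an assumption drawn from the description above.

```python
def diagnose(worlds, answer):
    """Binary-probe placement over a prerequisite-ordered curriculum.

    worlds: list of worlds, each a list of chapters in prerequisite order;
            each chapter is a list of KC ids with the hardest KC last.
    answer: callable(kc_id) -> bool, True if the student answers correctly.
    Returns {world_index: first unknown chapter index, or None if world is known}.
    """
    placement = {}
    for w, chapters in enumerate(worlds):
        # Step 1: world sweep - probe the capstone KC (last KC of last chapter).
        if answer(chapters[-1][-1]):
            placement[w] = None          # whole world inferred known; skip it
            continue
        # Step 2: binary search the chapters of a flagged world.
        lo, hi = 0, len(chapters) - 1    # hi already failed its capstone probe
        while lo < hi:
            mid = (lo + hi) // 2
            if answer(chapters[mid][-1]):
                lo = mid + 1             # lower chapters inferred known
            else:
                hi = mid                 # this chapter and above are unknown
        placement[w] = lo                # first chapter needing instruction
    return placement
```

A student who knows the first two chapters of a four-chapter world costs one sweep question plus two binary-search probes, exactly the 2-3 questions per flagged world described above.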
The Math
| Student Type | World Sweep | Worlds Flagged | Chapter Drill | Total Questions | Time at 15 sec/question |
|---|---|---|---|---|---|
| Strong | 8 questions | 2 | 4 questions | 12 | ~3 min |
| Typical | 8 questions | 4 | 8-10 questions | 16-18 | ~4-4.5 min |
| Weak | 8 questions | 7 | 14 questions | 22 | ~5.5 min |
276 KCs diagnosed from ~18 questions. That's a 15:1 information ratio.
Why Discrimination KCs Are the Perfect Diagnostic Probe
This is where the KC typology pays off. A discrimination question — "Is this a simile or a metaphor?" — tests two underlying fact KCs in one question. A diagnostic built on discrimination KCs is roughly 2x more efficient than one built on random sampling.
- "Fragment or complete sentence?" → tests sentence structure knowledge
- "Regular or irregular past tense?" → tests both verb formation rules
- "Times as many vs. more than?" → tests both operations
One question, two inferences. The diagnostic is the ignition. The reps are the ongoing calibration.
Three Diagnostic Modes
| Mode | Questions | Time | Precision | Best For |
|---|---|---|---|---|
| Quick Scan | 8-16 | 3-5 min | World-level | Placement, triage |
| Prerequisite Probe | 16-22 | 4-6 min | Chapter-level | Cold students |
| Retention Check | 15-30 | 8-12 min | KC-level (targeted) | Warm students with history |
The Retention Check is uniquely powerful: it takes a student's completion history, samples from their oldest completions (most likely forgotten), and separates "genuinely retained" from "passed and forgot" — exactly the 73% retention problem.
Part 5: The Depth of Knowledge Ladder
Hard Does Not Mean Complex
Memorizing 100 dates is hard. Analyzing why an event happened is complex. More volume doesn't mean deeper learning. Rigor is about depth of thinking, not amount of suffering.
Webb's Depth of Knowledge (DOK) gives us four levels:
| Level | Name | What It Tests | Example |
|---|---|---|---|
| DOK 1 | Recall | "What is X?" | Multiple choice, flashcards |
| DOK 2 | Apply | "How does X relate to Y?" | Classification, comparison |
| DOK 3 | Analyze | "Why does X matter for Z?" | Case studies, design challenges |
| DOK 4 | Teach | "Explain X to someone who doesn't know it" | Content creation, peer teaching |
Most education stops at DOK 1-2. Students can recognize the right answer on a multiple-choice test but can't actually explain the concept or apply it in a new context. True mastery requires reaching DOK 3-4.
The "You Teach" Principle
The ultimate test of understanding is teaching. When you teach, you must:
- Retrieve — Pull knowledge from memory
- Organize — Structure it for an audience
- Synthesize — Connect pieces coherently
- Transfer — Apply to a new context (teaching)
- Create — Produce original content
If you can explain it simply to someone who doesn't know it, you truly understand it. That's DOK 4. That's mastery.
The Five-Phase Learning Journey
Every world in the knowledge graph progresses through five phases, each building cognitive depth:
ENCOUNTER  →  RECOGNIZE  →  RECALL GATE  →  APPLY  →  MASTERY GATE
  DOK 0         DOK 1        DOK 1-2       DOK 2-3       DOK 4
                                 ↓                         ↓
                          "Can you say it?"        "Can you teach it?"
Phase 1: Encounter (DOK 0→1) — Read or watch the fact with explanation. Initial exposure. No test — just absorb.
Phase 2: Recognize (DOK 1) — Multiple choice, true/false, matching. Can you identify the correct answer when shown options?
Phase 3: Recall Gate (DOK 1-2) — Say it back WITHOUT hints or options. "What do you remember about X?" This is the first real test — can you retrieve from memory, not just recognize? Pass at 70% to unlock application content.
Phase 4: Apply (DOK 2-3) — Scenario-based decisions, constraint solving. Use knowledge in realistic context. "Design a meal plan with 30g protein." "Choose the best option for this situation."
Phase 5: Mastery Gate (DOK 4) — Explain it to someone who doesn't know it. Articulate, organize, transfer. "Teach me about protein timing as if I've never heard of it." Pass at 85% — that's true mastery.
Part 6: The Spacing Algorithm — The Engine
The Forgetting Curve
In 1885, Hermann Ebbinghaus showed that without intervention, we forget at a predictable rate:
| Time After Learning | Memory Remaining |
|---|---|
| 20 minutes | 58% |
| 1 hour | 44% |
| 1 day | 33% |
| 1 week | 25% |
| 1 month | 21% |
But Ebbinghaus also discovered the cure: each retrieval at the right moment resets and flattens the curve. After 5-6 well-spaced retrievals, memory becomes durable for years.
FSRS: The Adaptive Engine
Traditional spaced repetition uses fixed interval tables (Day 1, 3, 7, 14, 30, 60). Every student, every concept, same schedule. That wastes time — easy material gets over-reviewed and hard material gets under-reviewed.
We use FSRS (Free Spaced Repetition Scheduler), a modern algorithm developed by Jarrett Ye at MaiMemo Inc. and published at ACM KDD. It tracks three variables for every KC:
- Stability (S): How strong the memory is. Specifically, how many days until you'd have a 90% chance of recalling it. S = 45 means you'd still remember it 45 days from now with 90% probability.
- Difficulty (D): How inherently hard this KC is for you. Easy concepts build stability quickly. Hard ones build slowly.
- Retrievability (R): The probability you can recall it right now. This decays over time — the forgetting curve, personalized to each KC.
After each review, FSRS updates S and D based on how well you did, then schedules the next review at the moment your retrievability would drop to 90%. The result: every KC gets its own adaptive schedule.
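The relationship between S and R can be sketched with the classic FSRS forgetting curve (newer FSRS versions use a learned curve with trained weights; the 9·S constant here is the classic form, chosen because it pins R = 90% exactly at t = S):

```python
def retrievability(t_days, stability):
    """Probability of recall t_days after the last review (classic FSRS curve).
    Calibrated so that R == 0.9 exactly when t_days == stability."""
    return (1 + t_days / (9 * stability)) ** -1

def next_interval(stability, target_r=0.9):
    """Days until retrievability decays to target_r - i.e., when to schedule
    the next review under the same simplified curve."""
    return 9 * stability * (1 / target_r - 1)
```

With S = 45, `retrievability(45, 45)` is exactly 0.90, matching the S = 45 example above, and `next_interval(45)` schedules the review 45 days out.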
Easy KC: Introduction → Review Day 3 → Day 10 → Day 30 → Day 70 → Graduated
Hard KC: Introduction → Review Day 2 → Day 5 → Day 11 → Day 22 → Day 45 → Day 85 → Graduated
Benchmarks show FSRS produces 20-30% fewer reviews than fixed schedules for the same retention. That's 20-30% more time for new learning.
Four-Point Rating (Richer Signal)
Instead of binary pass/fail, each review produces a 4-point signal:
| Rating | When | What FSRS Does |
|---|---|---|
| Again | Got it wrong | Stability drops sharply (lapse). Short interval to re-learn. |
| Hard | Got it right, but slowly | Stability grows a little. Moderate interval. |
| Good | Got it right, normal speed | Stability grows normally. Standard interval. |
| Easy | Got it right, instantly | Stability grows fast. Long interval — skip ahead. |
This gives the algorithm twice the information per interaction of binary pass/fail (four outcomes carry two bits instead of one), which is how it can schedule more precisely with fewer total reviews.
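The rating-to-stability mapping can be sketched with toy multipliers. These constants are illustrative placeholders, not trained FSRS weights — the real update is a learned function of S, D, and R — but they reproduce the qualitative behavior in the table:

```python
def update_stability(stability, rating):
    """Toy stability update per 4-point rating. The multipliers are made-up
    illustrative values, not real FSRS parameters."""
    growth = {
        "hard": 1.2,   # slow success: small growth, moderate interval
        "good": 2.5,   # normal success: standard growth
        "easy": 4.0,   # instant success: big jump, skip ahead
    }
    if rating == "again":
        # lapse: stability drops sharply rather than growing
        return max(1.0, stability * 0.2)
    return stability * growth[rating]
```

Because the next interval is proportional to stability, an "Easy" run of answers pushes reviews apart quickly, while a single "Again" pulls the KC back to a short re-learning interval.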
Three DOK Tracks Running in Parallel
Each KC runs through three separate FSRS memory tracks simultaneously, one per DOK level:
| Track | Gate | Pass Criteria | Graduate When |
|---|---|---|---|
| DOK 1-2 | Recall Gate | 70% of checklist | Stability ≥ 90 days |
| DOK 2-3 | Apply | Correct reasoning | Stability ≥ 90 days |
| DOK 4 | Mastery Gate | 85% + explanations | Stability ≥ 90 days |
Each track has its own (S, D, R) state and follows FSRS independently. But here's the cascading credit rule: passing DOK 4 updates all lower levels too. If you can teach it, you obviously can recall it and apply it. This means advanced learners build stability across all levels simultaneously.
Graduation means: your stability has reached 90 days — even if you never review again for three months, you'd still have a 90% chance of recalling it. That's durable long-term memory.
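The cascading credit rule and the graduation check can be sketched together; the track names and the state shape are illustrative, not the production schema:

```python
TRACKS = ["dok12", "dok23", "dok4"]   # ordered lowest -> highest DOK level

def apply_cascading_credit(stabilities, passed_track, new_stability):
    """Passing a higher DOK track credits every lower track too.
    stabilities: {track: stability_days}. Returns an updated copy."""
    updated = dict(stabilities)
    for track in TRACKS[: TRACKS.index(passed_track) + 1]:
        # credit flows down, but never lowers a track that is already stronger
        updated[track] = max(updated[track], new_stability)
    return updated

def graduated(stabilities, threshold_days=90):
    """A KC graduates when every track's stability reaches the 90-day bar."""
    return all(s >= threshold_days for s in stabilities.values())
```

One successful DOK 4 explanation can therefore graduate all three tracks at once, which is exactly how advanced learners build stability across levels simultaneously.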
Failure Handling
FSRS handles failures with more nuance than "drop back one level":
- Again (wrong): FSRS computes a new post-lapse stability. If you had high stability before (you knew it but forgot), recovery is faster. If stability was low (it never really stuck), recovery starts nearly from scratch.
- Hard (slow but correct): Still counts as a success — stability grows, just more slowly than Good.
The system is always calibrating to the edge of your memory — scheduling the next review at the moment retrieval is hard but possible. That productive struggle is the signal that learning is happening.
Part 7: The Math of Learning — Including Product Tiers
Time Per KC
Time per KC = Introduction + (~4 adaptive reviews)
= 3 min + (4 × 1.5 min)
= ~9 minutes total work
Each KC requires about 9 minutes of actual engagement, spread across ~5 FSRS-scheduled sessions over 60-90 days. FSRS reduces the average from ~6 sessions (fixed schedule) to ~5 because it skips unnecessary reviews for well-learned material.
The Core Equation
New KCs per day ≈ Daily minutes ÷ 9.0
The 9.0 denominator comes from: 3 min introduction + ~4 expected review waves × 1.5 min each = 9.0 min per KC at steady state. With FSRS, this is an expected value — fast learners do better, struggling learners do more reviews.
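The core equation reproduces the tier tables below directly. A minimal sketch, using the 9.0-minute steady-state cost from above:

```python
COST_PER_KC_MIN = 3 + 4 * 1.5   # 3-min introduction + ~4 review waves = 9.0 min

def new_kcs_per_day(daily_minutes):
    """Expected new KCs introduced per day at steady state."""
    return daily_minutes / COST_PER_KC_MIN

def days_to_finish(total_kcs, daily_minutes):
    """Study days to introduce (not graduate) a curriculum of total_kcs."""
    return total_kcs / new_kcs_per_day(daily_minutes)
```

At 25 min/day this gives ~2.8 new KCs/day and ~99 study days for 276 KCs, which stretches to roughly a semester of calendar time at five study days a week.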
Product Tiers: The Scalpel vs. The Chainsaw
Not every student needs the same dose. The math reveals natural product tiers:
| Tier | Daily Time | New KCs/Day | 276 KCs Takes | Pace | What It Is |
|---|---|---|---|---|---|
| Full | ~25 min | ~2.8 | ~130 days (1 semester) | 2x | Full grade-level remediation |
| Rx | ~5 min | ~0.6 | ~90 days (targeted 20-40 KCs) | Scalpel | Diagnostic-identified gaps only |
| Maintain | ~5 min | Reviews only | Indefinite | Maintenance | Post-mastery review to keep graduated skills durable |
Full is the chainsaw — one grade level per semester, the core product for students who are behind.
Rx is the scalpel — a diagnostic says "this kid has 30 specific gaps in apostrophes, prefixes, and subject-verb agreement." 30 KCs at 0.6 new/day = a semester-long prescription. "Take this 5-min daily pill for your specific gaps."
Maintain is the lock — after graduating a grade level at 25 min/day, 5 min/day keeps it locked in while the student starts the next grade. FSRS makes Maintain especially efficient: it only schedules reviews for KCs whose retrievability has actually dropped near the threshold, rather than cycling through everything on a fixed schedule.
The 5-minute pitch works — but for a scalpel, not a chainsaw. The question is whether a kid with Grade 3 gaps needs the scalpel (a few specific holes) or the chainsaw (the whole grade). The diagnostic answers that question.
What This Means in Practice
| Daily Time | New KCs/Day | KCs/Year | Equivalent |
|---|---|---|---|
| 5 min | 0.6 | ~219 | Targeted gaps only |
| 12 min | 1.3 | ~475 | 1 grade level per year |
| 25 min | 2.8 | ~1,022 | 1 grade in a semester |
| 36 min | 4 | ~1,460 | ~9 courses per year |
The 1/5 Rule
At steady state, about 1/5 of your time goes to new learning. The other 4/5 is review (maintaining what you've already learned). FSRS improves this ratio from 1/6 (fixed schedule) to 1/5 by eliminating unnecessary reviews, but the fundamental truth remains: most of learning is remembering. Without the review time, knowledge decays. With it, knowledge lasts years.
Same Time, Different Distribution
Here's the comparison that should end every debate about cramming:
| Strategy | Total Study Time | Test Day Recall | 1 Week Later |
|---|---|---|---|
| Cram (40 min night before) | 40 min | 70% | 25% |
| 2 sessions (20 min each) | 40 min | 78% | 45% |
| 5 sessions (8 min each, FSRS-scheduled) | 40 min | 90% | 78% |
The FSRS-scheduled approach beats cramming even though total time is identical. The difference is in the distribution — and FSRS puts each session at the optimal moment for that specific KC's difficulty.
Part 8: One KC, Infinite Questions
Assessment Design Patterns
A well-formed KC isn't just a sentence — it's defined by its outcome space (what mastery looks like) and evidence (how you test it). We formalize this with Assessment Design Patterns from Mislevy's Evidence-Centered Design:
Claim: What mastery looks like (one sentence)
Evidence: Observable behaviors that prove mastery
Task Variables: What CAN change between questions (numbers, context, format)
Forbidden Shortcuts: What MUST NOT appear (kills validity)
Common Errors: What mistakes learners actually make (targeted by distractors)
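The five fields above map naturally onto a small record type. The schema and field names here are illustrative, not a production data model:

```python
from dataclasses import dataclass, field

@dataclass
class AssessmentPattern:
    """Evidence-Centered Design pattern attached to one KC (illustrative schema)."""
    claim: str                      # what mastery looks like, one sentence
    evidence: list[str]             # observable behaviors that prove mastery
    task_variables: list[str]       # what MAY vary between questions
    forbidden_shortcuts: list[str]  # what MUST NOT appear (kills validity)
    common_errors: list[str] = field(default_factory=list)  # distractor targets

    def violates(self, question_text: str) -> bool:
        """A generated question is invalid if it leaks a forbidden shortcut."""
        return any(s.lower() in question_text.lower()
                   for s in self.forbidden_shortcuts)
```

A generator can then be rejected automatically when, say, a "times as many vs. more than" question leaks the word "multiply" into the stem.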
Why This Matters
A KC without a pattern can produce questions that mention the topic but don't actually test the right thing. The pattern is the guardrail.
Take a discrimination KC: "Students can distinguish 'times as many' (multiply) from 'more than' (add)." Without a pattern, an AI might generate "What does 'times as many' mean?" — technically about the topic, but it doesn't test the discrimination. With a pattern, the AI knows it must generate side-by-side comparisons where both operations give plausible answers.
From one well-formed KC, you can generate unlimited questions by varying:
- Numbers — Change 3x5 to 7x8 to 12x4
- Context — Sports, cooking, animals, Iowa State Fair
- Format — Multiple choice, error analysis, "spot the difference," open response
- DOK Level — Recognition → Application → Teaching
The KC is the invariant. Everything else is a variable. That's what makes KCs the right unit of curriculum design — and what makes question variety possible at scale.
The Key Insight
LLMs are stochastic UIs, not pedagogies. The AI generates the surface form of a question. The KC + assessment pattern is the pedagogy. Without the pattern, the AI is just producing plausible-looking educational content. With the pattern, it's producing valid assessment items that actually test the right thing.
Part 9: Quality Control — Defending the System
Why QC Matters
A learning algorithm is only as good as its content. If questions are broken, answers are wrong, or the renderer can't display them, the entire system fails silently — students get frustrated, data becomes unreliable, and trust evaporates.
The Validation Pipeline
We run three layers of quality control:
Layer 1: Content Validation — Every question is automatically checked for:
- Valid correct answer that matches an available option
- Non-empty choice sets
- Questions that match their declared type
- Answers extractable from feedback text (16 regex patterns rescue questions with embedded answers)
Layer 2: Engine Tests — 34 unit tests verify the scheduling algorithm:
- Correct spacing intervals
- Pass/fail state transitions
- Cascading credit across DOK levels
- Drop-back behavior on consecutive failures
- Session building with the right mix of reviews and new KCs
Layer 3: Renderer Tests — Every question is rendered through the actual display pipeline:
- Question text renders without errors
- All options are visible and selectable
- Multi-select questions handle compound answers (MC1|MC2|MC3)
- Labeled answers (subject##MC1) parse correctly
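Layer 1's checks amount to a predicate per question. A sketch, assuming an illustrative dict shape for questions (the real pipeline's schema and the 16 rescue regexes are not shown):

```python
def validate_question(q):
    """Layer-1 content validation. Returns a list of failure reasons;
    an empty list means the question is playable."""
    failures = []
    if not q.get("text", "").strip():
        failures.append("empty question text")
    choices = q.get("choices", [])
    if q.get("type") in ("multiple_choice", "multi_select") and not choices:
        failures.append("empty choice set")
    answer = q.get("answer", "")
    if not answer:
        failures.append("missing answer")
    elif choices:
        # multi-select answers are compound ("MC1|MC2|MC3"); every part
        # must match an available option
        if not all(part in choices for part in answer.split("|")):
            failures.append("answer not among choices")
    return failures
```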
The Numbers
| Metric | Result |
|---|---|
| KC playability rate | 95.3% (1,326 of 1,392) |
| Question render pass rate | 99.96% (12,889 of 12,894) |
| Engine test pass rate | 100% (34/34) |
| Answer auto-extraction rescue | 2,389 questions rescued from unplayable |
These numbers aren't aspirational. They're measured. And they give us the confidence to say: if a student fails a review, it's because they don't know the material — not because the question was broken.
Part 10: The Three High-Utility Strategies
The learning algorithm sits on top of three strategies that Dunlosky's 2013 meta-analysis rated as the highest-utility learning techniques across hundreds of studies:
1. Retrieval Practice (The Testing Effect)
Actively pulling information from memory via self-testing strengthens memory traces more than restudying. Students who test themselves retain 50%+ more after 1 week than those who reread their notes (Roediger & Karpicke, 2006).
Every review in our system is a test, not a re-read. You attempt to recall FIRST, then check the answer. The retrieval attempt itself is what builds the memory.
2. Spaced Practice (The Spacing Effect)
Spreading learning over time rather than massing it together produces dramatically better long-term retention. This is what FSRS implements adaptively. The same 40 minutes of study time, distributed across ~5 optimally-timed sessions, produces 3x the retention of 40 minutes crammed the night before. FSRS goes further than fixed schedules by personalizing the intervals to each KC's difficulty and the learner's history.
3. Interleaving
Mixing different problem types rather than blocking by type forces learners to discriminate — to identify WHICH strategy to use. Taylor & Rohrer (2010) found interleaved groups scored 77% vs 38% for blocked groups.
Our cumulative review pattern implements all three simultaneously:
Day 1: Learn World 1
Day 2: Review W1 → Learn World 2
Day 3: Review W1+W2 → Learn World 3
Day 4: Review W1+W2+W3 → Learn World 4
Day ~7: FSRS schedules W1 reviews (R approaching 0.90 threshold)
Day ~14: FSRS schedules W1+W2 reviews
Day ~30: FSRS schedules reviews across all worlds
Each session mixes old and new material (interleaving), FSRS schedules reviews at the optimal moment for each KC (spacing), and "review" means testing yourself with a 4-point rating (retrieval practice). One algorithm activates all three strategies.
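The cumulative pattern reduces to a session builder: due reviews first, then new KCs with whatever budget remains. A sketch using the per-item costs from Part 7 (the function name and signature are illustrative):

```python
def build_session(due_reviews, new_queue, budget_min,
                  review_cost=1.5, intro_cost=3.0):
    """Fill a daily session: all FSRS-due reviews first (spacing + retrieval
    practice, naturally interleaved across old worlds), then new KCs from the
    prerequisite-ordered frontier. Returns (reviews, new_kcs)."""
    session_reviews, spent = [], 0.0
    for kc in due_reviews:
        if spent + review_cost > budget_min:
            break
        session_reviews.append(kc)
        spent += review_cost
    session_new = []
    for kc in new_queue:
        if spent + intro_cost > budget_min:
            break
        session_new.append(kc)
        spent += intro_cost
    return session_reviews, session_new
```

A 9-minute budget with three due reviews leaves room for exactly one new introduction, which is the 1/5-ish split between new learning and review described in Part 7.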
Part 11: What NOT to Do
The same research that identifies high-utility strategies also identifies low-utility ones that most students rely on:
| Low-Utility Technique | Why It Fails |
|---|---|
| Rereading | Passive. Creates illusion of knowing without actual retention. |
| Highlighting | No generation or retrieval. False sense of productivity. |
| Summarization | Only effective if trained. Most people do it poorly. |
| Cramming | Same time investment produces 3x worse retention than spacing. |
The uncomfortable truth: The study techniques that FEEL the most productive (rereading, highlighting) are among the LEAST effective. And the technique that FEELS the hardest (testing yourself when you're not sure of the answer) is the MOST effective.
The uncertain feeling is the feature, not the bug. Hard retrieval = stronger memory.
Part 12: Event-Driven Learning — The Thin Client
Results Should Drive Content
V2 introduces a key architectural insight: you don't need to rebuild the entire curriculum to fix learning. You need to add the retention layer that makes existing curriculum stick.
When an existing learning platform reports that a student completed a lesson, the FSRS algorithm creates a memory state for that KC and begins scheduling adaptive reviews. That's it. The original instruction was adequate — 99.4% in-app accuracy proves the teaching works. What's missing is what happens AFTER.
External platform completion → Event → FSRS creates memory state (S₀, D₀) →
Adaptive reviews scheduled by retrievability threshold →
~5 reviews over 60-90 days → S ≥ 90 days → GRADUATED
The learning algorithm runs as a thin layer on top of existing education infrastructure. Only two things are custom:
- The spacing scheduler — When to review what
- The event bridge — Listening for completion signals and queuing reviews
Everything else — content, authentication, enrollment, grading — comes from the existing platform. The algorithm is the missing 5-minute daily retention layer that makes instruction stick.
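The event bridge is thin enough to sketch in a few lines. Handler name, event fields, and the initial (S₀, D₀) defaults are all illustrative placeholders, not tuned production values:

```python
def on_completion_event(event, memory_store, s0=3.0, d0=5.0):
    """An external platform reports a lesson completion -> seed an FSRS-style
    memory state for that KC and schedule the first review. s0 and d0 are
    placeholder defaults for initial stability and difficulty."""
    kc_id = event["kc_id"]
    if kc_id in memory_store:
        return memory_store[kc_id]   # already tracked; reviews continue as scheduled
    state = {
        "stability": s0,             # S0: days until recall probability hits 90%
        "difficulty": d0,            # D0: mid-scale until review evidence arrives
        "due_in_days": s0,           # first review when R would decay to 0.9
    }
    memory_store[kc_id] = state
    return state
```

Everything before the event (instruction, grading, enrollment) stays on the existing platform; everything after it is the retention layer.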
Stop Testing, Start Predicting
Each review isn't just practice — it calibrates the FSRS memory model. After 3-4 reviews, the system knows each KC's stability and difficulty with high confidence. It can compute retrievability at any future date — including test day.
The Day 14 review is the prediction inflection point. By Day 39 of a grade, FSRS has enough data on every KC to predict who will pass and who needs intervention — 16 days before the test. That changes everything: instead of discovering gaps on test day, you prevent them.
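Because stability is known per KC, predicted test-day recall is just the forgetting curve evaluated at a future date. A sketch using the same simplified classic-FSRS curve as in Part 6 (the 0.8 intervention cutoff is an illustrative threshold, not a spec value):

```python
def predicted_recall(stability, days_until_test):
    """Retrievability on test day under the classic FSRS curve
    (R == 0.9 exactly when days_until_test == stability)."""
    return (1 + days_until_test / (9 * stability)) ** -1

def needs_intervention(kc_states, days_until_test, threshold=0.8):
    """Flag KCs whose predicted test-day recall falls below the threshold.
    kc_states: {kc_id: stability_days}."""
    return [kc for kc, s in kc_states.items()
            if predicted_recall(s, days_until_test) < threshold]
```

Sixteen days out, a KC with S = 5 predicts roughly 74% recall and gets flagged; one with S = 90 predicts about 98% and is left alone. That is the difference between discovering gaps on test day and preventing them.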
Part 13: Learning Through People — The 4 E's
The Problem with Traditional Learning
Schools treat knowledge as content to transfer. Read chapter 5. Watch the lecture. Take the test. But that's not how humans naturally learn anything that matters.
You didn't learn to shoot a basketball from a rulebook — you watched someone you admired. Nobody learns leadership from a dictionary definition — they study leaders. We learn through people.
This is persona-based learning: pick someone who's done what you want to do, then learn through them. Their routines become your templates. Their mistakes become your warnings. Their mindset becomes your model.
The neuroscience backs this up. Research on "Jennifer Aniston Neurons" (Caltech) shows individual neurons fire only for specific people, not abstract concepts. Your brain uses people as its primary indexing system. Character-driven stories trigger the release of oxytocin, a bonding chemical that helps learning stick. Pure facts don't.
The 4 E's Framework
The problem with persona-based learning is that most people stop at admiration. They consume documentaries and biographies and then nothing changes. Watching isn't learning. Admiration isn't acquisition.
The 4 E's turn passive consumption into active skill-building:
E1: Experiment — Try what they did. Not "be disciplined like them" — too vague. Something concrete: Dan Gable woke up at 6am. Your experiment: wake up at 6am for 14 days. The word "experiment" changes everything. Scientists don't "fail" experiments. They run them. Either way, you learned something.
E2: Explain — Answer four questions: Why does this work? How is it going? How did it go? Will you continue? This is where copying becomes understanding.
E3: Expense — Count the real cost. Not just money — time, energy, opportunity cost. The Expense phase forces honesty and prevents cargo-culting successful people.
E4: External — Share what you learned. Write a post, make a video, tell a friend. Teaching others deepens your own understanding (the Protege Effect).
The Double Benefit
Each E trains both the domain skill AND a critical meta-skill:
| Phase | Domain Learning | Meta-Skill |
|---|---|---|
| Experiment | Whatever you're testing | Agency — acting without permission |
| Explain | Why it works | Metacognition — reflecting on your own thinking |
| Expense | Whether it's worth it | Financial literacy — ROI thinking |
| External | How to share it | Public communication — building in public |
Part 14: The Content Flywheel and Creator Economy
The KC Is the Invariant. Content Is the Variable.
The learning algorithm already establishes that one KC can produce infinite questions (via assessment design patterns). The same principle applies to instruction: one KC can have infinite content — different videos, articles, explanations — and the algorithm should serve whichever piece produces the best learning outcomes.
This creates a content creator economy where creators earn money based on how well their content teaches, not how many people click on it. Like TikTok's algorithm, but with learning outcomes as the objective function instead of engagement. Every piece of content gets a cold start (~100 student exposures), the system measures KC pass rates, and winners get more distribution. Revenue comes from school tuition, and creators earn proportional to learning produced. Existing content (YouTube, Khan Academy) can be wrapped without the creator opting in — money accrues until they claim it.
For the full specification of the content creator economy, see The Content Creator Economy.
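The cold-start flow above can be sketched as a simple explore-then-exploit rule. This is a minimal illustration, not the production algorithm: the 100-exposure budget comes from the text, but the variant data structure and scoring rule are assumptions.

```python
import random

COLD_START_EXPOSURES = 100  # cold-start budget per content variant (from the text)

def pick_variant(variants):
    """Pick which content variant to serve for a KC.

    variants: list of dicts with 'id', 'exposures', and 'passes' counts
    (a hypothetical shape for illustration).
    """
    # Explore: any variant still inside its cold-start window gets served.
    cold = [v for v in variants if v["exposures"] < COLD_START_EXPOSURES]
    if cold:
        return random.choice(cold)
    # Exploit: serve the variant with the best measured KC pass rate —
    # learning outcomes, not clicks, as the objective function.
    return max(variants, key=lambda v: v["passes"] / v["exposures"])

variants = [
    {"id": "video_a", "exposures": 120, "passes": 84},    # 70% pass rate
    {"id": "article_b", "exposures": 150, "passes": 120},  # 80% pass rate
]
best = pick_variant(variants)  # article_b wins on measured pass rate
```

In practice a production system would likely use a bandit with confidence intervals rather than a hard threshold, but the shape is the same: measure learning per variant, then shift distribution toward what teaches best.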
Curriculum IS Content
Most schools create content for marketing. A learning algorithm built on personas and the "You Teach" mastery standard creates something different: educational content that is inherently shareable.
The Flywheel
The most powerful feature of the "You Teach" requirement is that student content feeds back into the system:
1. Student studies persona (LEARN)
2. Student applies lessons to their life (APPLY)
3. Student creates "You Teach" content (PUBLISH)
4. Best student content becomes teaching material for next cohort (CONTRIBUTE)
5. Next cohort sees relatable peer content, not just polished curriculum
6. Repeat — content library grows with every student
Mastery Badges Drive Contribution
| Badge | Requirement |
|---|---|
| Student | Complete research on persona |
| Practitioner | Apply 3 lessons to your life |
| Creator | Publish 1 piece of "You Teach" content |
| Contributor | Content gets used in the curriculum |
The Contributor badge is the highest achievement — your teaching becomes part of the learning system itself.
Part 15: The Ultimate Goal — Making the System Unnecessary
The Autotelic Vision
The word "autotelic" comes from Greek: auto (self) + telos (goal). An autotelic person pursues an activity for its own sake. They carry a "Game Engine" in their pocket and can create challenge and feedback from any situation.
Progressive Autonomy Transfer
| Phase | Who Designs the Game | What Happens |
|---|---|---|
| High Scaffolding (Beginner) | System selects content, provides feedback, calibrates difficulty | Learner's job: show up and engage |
| Shared Control (Intermediate) | System suggests options, learner chooses | Explicit teaching of self-evaluation |
| Learner Control (Advanced) | Learner selects what to learn, system delivers instantly | System available as tool, not driver |
| Autotelic (Mastery) | Learner designs own games, system becomes optional | Learner can teach others |
Why This Matters Now
When information is free (AI can explain anything), the premium shifts to the human capacity for self-directed action:
- Identify what you want to learn
- Design your own learning game
- Execute without being told
- Iterate based on feedback
AI makes the dependent learner less valuable — AI can follow instructions better. AI makes the autotelic learner more valuable — they use AI as a tool, not a crutch.
The perfect adaptive learning system is one that eventually becomes unnecessary. We start by being their Game Master. We end by teaching them to be their own.
Part 16: The Complete System
The Architecture
LAYER 1: KNOWLEDGE GRAPH (KCs)
1,392+ KCs across a full K-8 curriculum
Connected by prerequisites
Each KC carries an assessment design pattern
12,894 validated questions at 99.96% render rate
↓
LAYER 2: THE DIAGNOSTIC
5-minute binary probe through the DAG
276 KCs placed from ~18 questions
Discrimination KCs as 2x-efficient probes
Three modes: Quick Scan, Prerequisite Probe, Retention Check
↓
LAYER 3: PERSONA LENSES
Each persona has their own take on a KC
Same knowledge, different doors
Multiple perspectives = DOK 3 thinking
↓
LAYER 4: COURSES
Curated, sequenced views into the knowledge graph
Organized as video game world maps
Three product tiers: Full (25 min), Rx (5 min), Maintain (5 min)
↓
LAYER 5: THE ALGORITHM (FSRS)
Adaptive spacing via Stability, Difficulty, Retrievability
Three FSRS memory tracks per KC (Recall, Apply, Teach)
Graduate when Stability ≥ 90 days (~5 reviews)
4-point rating (Again/Hard/Good/Easy) for richer signal
Event-driven integration with existing platforms
↓
LAYER 6: QC & PREDICTION
95.3% KC playability, 99.96% render rate, 34/34 engine tests
Day 14 = prediction inflection point
System knows who will pass 16 days before the test
↓
LAYER 7: THE LEARNING JOURNEY
Encounter → Recognize → Recall Gate → Apply → Mastery Gate
The 4 E's: Experiment → Explain → Expense → External
Contextual feeds at moment of need
↓
LAYER 8: THE CONTENT FLYWHEEL
Student creates "You Teach" content (DOK 4)
Best content becomes teaching material for next cohort
System gets better with every student
↓
LAYER 9: AUTOTELIC DEVELOPMENT
Progressive autonomy transfer
Student learns to be their own Game Master
System becomes optional
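Layer 2's leverage — 276 KCs placed from ~18 questions — comes from propagating each binary probe result through the prerequisite DAG. A minimal sketch, assuming a simplified inference rule (passing a KC implies its prerequisites are known; failing implies its dependents are not); the example graph and function names are illustrative:

```python
from collections import deque

def place(prereqs, probe_kc, passed, status):
    """Propagate one probe result through the prerequisite DAG.

    prereqs maps each KC to the KCs it depends on. A pass marks the
    probed KC and all its ancestors "known"; a fail marks it and all
    its dependents "unknown" — one question places many KCs.
    """
    # Reverse edges, so a fail can propagate downward to dependents.
    dependents = {}
    for kc, reqs in prereqs.items():
        for r in reqs:
            dependents.setdefault(r, set()).add(kc)

    frontier, seen = deque([probe_kc]), set()
    while frontier:
        kc = frontier.popleft()
        if kc in seen:
            continue
        seen.add(kc)
        status[kc] = "known" if passed else "unknown"
        frontier.extend(prereqs.get(kc, []) if passed else dependents.get(kc, set()))
    return status

# Tiny chain: counting -> addition -> multiplication -> fractions
prereqs = {
    "addition": ["counting"],
    "multiplication": ["addition"],
    "fractions": ["multiplication"],
}
status = place(prereqs, "multiplication", passed=True, status={})
# One correct answer places three KCs: multiplication plus both ancestors.
```

Discrimination KCs are 2x-efficient probes for the same reason: they sit at junctions in the graph, so one result constrains more of the DAG per question.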
The Numbers
| Metric | Value |
|---|---|
| Time per KC | ~7.5 minutes total |
| Expected days to graduate | 60-90 (adaptive per KC) |
| Expected reviews per DOK track | ~5 (range 3-8) |
| DOK tracks per KC | 3 |
| Steady-state review ratio | ~4/5 review, ~1/5 new |
| Sessions for permanent memory | 5-6 |
| Review efficiency vs fixed schedule | ~25% fewer reviews |
| Diagnostic time (typical) | 4-5 minutes |
| Diagnostic questions (typical) | 16-18 |
| KCs diagnosed per question | ~15 (276 KCs from ~18 questions) |
| KC playability rate | 95.3% |
| Question render pass rate | 99.96% |
| Engine test pass rate | 100% (34/34) |
| Optimal daily time (Full tier) | 25 minutes |
| KCs per year at 25 min/day | ~1,022 |
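The graduation numbers in the table fall out of FSRS-style multiplicative stability growth. Here is a toy simulation — the initial stability, growth factor, and scheduling rule are illustrative assumptions, not the fitted FSRS model — showing how a handful of successful reviews can push one track past the 90-day graduation bar:

```python
def simulate_track(initial_stability=2.0, growth=3.0, graduate_at=90.0):
    """Simulate one DOK track with toy parameters (assumptions).

    Each review is scheduled one 'stability' interval out; each
    successful recall multiplies stability. Graduate when stability
    covers 90 days.
    """
    stability, day, reviews = initial_stability, 0.0, 0
    while stability < graduate_at:
        day += stability     # next review lands when retrievability is due
        stability *= growth  # successful retrieval strengthens the memory
        reviews += 1
    return reviews, day

reviews, days = simulate_track()
# With these toy parameters: 4 reviews, graduating around day 80 —
# inside the table's 3-8 review and 60-90 day ranges.
```

The real scheduler adapts the growth per learner and per KC (that is the point of the Difficulty parameter), so individual tracks land anywhere in the 3-8 review range; the toy constants here just show why the averages come out where they do.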
The Research Foundation
This isn't invented from thin air. Every component maps to validated research:
| Component | Research |
|---|---|
| FSRS algorithm | Ye (2022, 2023) — ACM KDD, IEEE TKDE — 20-30% fewer reviews than SM-2 |
| Spacing schedule | Cepeda et al. (2006) — meta-analysis of 317 experiments |
| Retrieval practice | Roediger & Karpicke (2006) — the testing effect |
| Interleaving | Taylor & Rohrer (2010) — discrimination training |
| Cognitive load | Sweller (1988, 2011) — working memory limits |
| Mastery learning | Bloom (1968, 1984) — the 2 Sigma Problem |
| Knowledge structure | Doignon & Falmagne — Knowledge Space Theory |
| DOK framework | Webb (1997) — Depth of Knowledge |
| Self-determination | Deci & Ryan — competence drives motivation |
| Autotelic development | Csikszentmihalyi (1993) — Flow and self-direction |
| Assessment design | Mislevy — Evidence-Centered Design |
| KC modeling | Koedinger — cognitive tutoring systems |
| Forgetting curve | Ebbinghaus (1885) — memory decay |
| Long-term spacing | Bahrick et al. (1993) — 9-year family study |
| Desirable difficulty | Bjork (1994) — hard-but-possible retrieval = strongest encoding |
| Near-permanent retention | Bahrick (1993) — 5-7 spaced retrievals produce decade-long retention |
The Punchline
Education doesn't have a time problem. It has an architecture problem.
99.4% in-app accuracy and 12-26% test pass rates prove it. 860 hours of wasted student time proves it. 79% of students stuck in doom loops prove it. The teaching works. The forgetting is what kills it.
The algorithm isn't complicated:
- Break knowledge into atoms (KCs — 1,392 and counting)
- Diagnose in 5 minutes (binary probe through the prerequisite DAG)
- Arrange them in a map (knowledge graph with prerequisites)
- Build depth, not just breadth (DOK 1 → DOK 4)
- Space the practice adaptively (FSRS: Stability, Difficulty, Retrievability — 25% fewer reviews)
- Test, don't re-read (retrieval practice at every review, 4-point rating for richer signal)
- Choose the right dose (Full 25 min, Rx 5 min, or Maintain 5 min)
- Validate relentlessly (99.96% render rate, 34/34 engine tests)
- Predict before the test (Day 14 = inflection point)
- Learn through people (persona-based, 4 E's)
- Teach to master ("You Teach" = DOK 4 = true mastery)
- Make it visible (destination-first, always-visible map)
- Make it unnecessary (progressive autonomy → autotelic learners)
The goal of education shouldn't be to teach content. It should be to develop people who can teach themselves content — forever, without external prompting.
The goal of education is to develop autotelics. And the learning algorithm is how you get there.
Built on: FSRS (Ye, 2022-2023 — ACM KDD, IEEE TKDE), Cepeda et al. (2006), Roediger & Karpicke (2006), Dunlosky et al. (2013), Ebbinghaus (1885), Bahrick et al. (1993), Bjork (1994), Sweller (1988, 2011), Bloom (1968, 1984), Webb (1997), Mislevy (Evidence-Centered Design), Koedinger (KC modeling), Csikszentmihalyi (1993), Deci & Ryan (Self-Determination Theory), Doignon & Falmagne (Knowledge Space Theory), Taylor & Rohrer (2010), and 50+ years of cognitive science research.
V2 additions informed by real-world data from Language 3-8 analysis: 1,392 KCs, 12,894 questions, 775 student journeys, 501 students, and 1,829 hole-filling assignments across grades 3-8.