The Learning Algorithm — Technical Specification v2
The complete mathematical and engineering reference for an FSRS-powered mastery system with DOK progression, adaptive scheduling, binary diagnostic, product tiers, event-driven integration, and QC validation.
1. Variables & Constants
T = daily time budget (minutes)
n = new KCs introduced per day
t_i = time per new KC introduction (minutes), default 3.0
t_r = time per review interaction (minutes), default 1.5
E[k] = expected reviews to graduate one KC, ~5 (range 3-8)
N = total KCs in a curriculum
W = number of worlds in the curriculum
C_w = number of chapters per world (2-5)
R_target = desired retention probability, default 0.90
S_grad = stability threshold for graduation (days), default 90
FSRS Memory Model (DSR)
Every KC (at each DOK level) carries an independent memory state with three variables:
S (Stability): time in days for retrievability to decline from 100% to 90%.
S = 45 means 90% recall probability after 45 days without review.
D (Difficulty): inherent complexity of the material (range 1-10).
Higher D → slower stability growth after each review.
R (Retrievability): probability of successful recall right now (0.0-1.0).
Decays over time since last review as a function of S.
Intervals are computed dynamically: after each review, FSRS updates S and D based on the rating, then schedules the next review at the day when R would drop to R_target. This replaces fixed interval tables with a per-KC adaptive schedule.
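The forgetting curve and the dynamic interval can be sketched directly from these definitions. This is a minimal TypeScript sketch using the FSRS-6 power-law formulas given in Section 3; the decay parameter W20 is a representative value, not the fitted one used in production.

```typescript
// Sketch of the FSRS forgetting curve and dynamic interval (Section 3 formulas).
// W20 is a representative decay parameter (assumption — fitted in production).
const W20 = 0.1542;
const DECAY = -W20;
const FACTOR = Math.pow(0.9, 1 / DECAY) - 1; // chosen so that R(S, S) = 0.9

// R(t, S): recall probability after t days with stability S
function retrievability(t: number, S: number): number {
  return Math.pow(1 + (FACTOR * t) / S, DECAY);
}

// I(r, S): days until R decays to the target retention r
function nextInterval(r: number, S: number): number {
  return (S / FACTOR) * (Math.pow(r, 1 / DECAY) - 1);
}

// By construction, R(S, S) = 0.90 and the interval at r = 0.90 equals S.
console.log(retrievability(45, 45).toFixed(2)); // "0.90"
console.log(nextInterval(0.9, 45).toFixed(1));  // "45.0"
```

Scheduling is then just: after each review, recompute S and D, and set the next due date `nextInterval(R_target, S)` days out.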
Rating Scale
Each review produces a 4-point rating mapped from learner behavior:
Grade 1 (Again): Incorrect response → post-lapse stability
Grade 2 (Hard): Correct but slow (>5s response) → stability grows slowly
Grade 3 (Good): Correct at normal speed (2-5s) → stability grows normally
Grade 4 (Easy): Correct instantly (<2s, automatic) → stability grows fast
Graduation Criteria
A KC graduates a DOK level when its stability exceeds the graduation threshold:
Graduated when: S ≥ S_grad (default 90 days)
This means: even after 90 days without review, the learner still has
≥90% probability of successful recall. That's durable long-term memory.
With default FSRS parameters and consistent Good ratings, a typical KC reaches S_grad ≈ 90 in 5-6 reviews over ~60-90 days. Fast learners (Easy ratings, low D) can graduate in 3-4 reviews. Hard material (high D, mixed ratings) may take 7-8 reviews.
Why FSRS Over Fixed Intervals
The previous specification used a fixed expanding schedule (Day 0, 3, 7, 14, 30, 60). FSRS improves on this in three ways:
- Adaptive per-KC intervals. Easy material gets longer gaps sooner. Hard material gets shorter gaps. Fixed schedules waste time over-reviewing easy items and under-reviewing hard ones.
- Richer signal. Four rating grades (vs binary pass/fail) give the algorithm more information per interaction, producing better-calibrated predictions.
- ~25% fewer reviews for equivalent retention. FSRS benchmarks show 20-30% reduction in review volume compared to fixed-interval algorithms (SM-2, Leitner) at the same retention level.
FSRS is based on the DSR model from MaiMemo (Ye, 2022; published at ACM KDD), trained on several hundred million reviews. The latest version (FSRS-6) uses 21 optimizable parameters.
2. Core Throughput Equation
At steady state, each KC consumes one introduction (t_i) plus an expected E[k]-1 review interactions (t_r each) over its lifetime. Amortized per day, the time budget T must cover both the n new introductions and the review waves they generate:
T = n × t_i + n × (E[k]-1) × t_r
Factored:
┌──────────────────────────────┐
│              T               │
│  n = ─────────────────────   │
│       t_i + (E[k]-1) × t_r   │
└──────────────────────────────┘
With defaults (t_i = 3, t_r = 1.5, E[k] ≈ 5):
n = T / (3 + 4 × 1.5) = T / 9.0
FSRS reduces the expected review count from ~6 (fixed schedule) to ~5 because adaptive intervals skip unnecessary reviews for easy material and focus effort on hard material. Benchmarks show ~25% fewer reviews for equivalent retention.
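The throughput equation in code, using the default parameters above; a quick check that the tier figures in the next table follow from it:

```typescript
// n = T / (t_i + (E[k]-1) * t_r), with the document's defaults:
// t_i = 3 min per introduction, t_r = 1.5 min per review, E[k] ≈ 5.
function newKcsPerDay(T: number, ti = 3.0, tr = 1.5, Ek = 5): number {
  return T / (ti + (Ek - 1) * tr);
}

console.log(newKcsPerDay(25).toFixed(1));  // "2.8"  (Full tier)
console.log(newKcsPerDay(12).toFixed(1));  // "1.3"  (Standard tier)
console.log(newKcsPerDay(120).toFixed(1)); // "13.3"
```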
Product Tier Throughput
| Tier | Daily Time (T) | New KCs/day (n) | 276 KCs Takes | Pace | Use Case |
|---|---|---|---|---|---|
| Full | 25 min | ~2.8 | ~130 days (1 semester) | 2x | Full grade-level remediation |
| Standard | 12 min | ~1.3 | ~260 days (1 school yr) | 1x | Grade-level pacing |
| Rx | 5 min | ~0.6 | ~90 days (20-40 KCs) | Scalpel | Diagnostic-identified gaps |
| Maintain | 5 min | 0 (reviews only) | Indefinite | Lock | Post-graduation durability |
General Throughput Table
| Daily Time (T) | New KCs/day (n) | KCs/month | KCs/quarter | KCs/year |
|---|---|---|---|---|
| 5 min | 0.6 | 18 | 54 | 219 |
| 12 min | 1.3 | 39 | 117 | 475 |
| 15 min | 1.7 | 51 | 153 | 621 |
| 25 min | 2.8 | 84 | 252 | 1,022 |
| 30 min | 3.3 | 99 | 297 | 1,205 |
| 60 min | 6.7 | 201 | 603 | 2,446 |
| 120 min | 13.3 | 399 | 1,197 | 4,855 |
These are expected values. FSRS's adaptive scheduling means actual throughput varies per learner — fast learners exceed these estimates, struggling learners fall below them.
3. Complete Formula Set
┌───────────────────────────────────────────────────────────────────┐
│ │
│ FSRS MEMORY MODEL: │
│ R(t, S) = (1 + factor · t/S)^(-w₂₀) │
│ factor = 0.9^(-1/w₂₀) - 1 │
│ I(r, S) = (S / factor) · (r^(1/(-w₂₀)) - 1) │
│ │
│ THROUGHPUT (expected new KCs per day): │
│ n = T / (t_i + (E[k]-1) × t_r) │
│ where E[k] ≈ 5 reviews to graduate (FSRS adaptive) │
│ │
│ EXPECTED TIME PER KC: │
│ t_kc = t_i + (E[k]-1) × t_r ≈ 9.0 min │
│ │
│ DAYS TO COMPLETE CURRICULUM: │
│ days = ⌈N / n⌉ + E[D_grad] │
│ where E[D_grad] ≈ 75 days (expected time to S ≥ S_grad) │
│ │
│ KCs PER PERIOD: │
│ KCs_month = n × 30 │
│ KCs_quarter = n × 90 │
│ KCs_year = n × 365 │
│ │
│ PIPELINE SIZE (KCs in active review): │
│ pipeline = n × E[D_grad] │
│ │
│ REVIEW FRACTION: │
│ review% = (E[k]-1) × t_r / (t_i + (E[k]-1) × t_r) │
│ │
│ DOK 4 EFFECTIVE REVIEW TIME: │
│ t_r_dok4 = t_dok4_session / KCs_tested_per_session │
│ │
│ GRADUATION CONDITION (per DOK level): │
│ graduated when S ≥ S_grad (default 90 days) │
│ │
│ NEXT REVIEW INTERVAL: │
│ I = (S / factor) · (R_target^(1/(-w₂₀)) - 1) │
│ At R_target = 0.90: I ≈ S │
│ │
│ DIAGNOSTIC QUESTIONS (binary probe): │
│ world_sweep = W (one per world) │
│ chapter_drill ≈ flagged_worlds × log₂(C_w) │
│ total ≈ W + flagged × log₂(C_w) │
│ information_ratio = N / total (KCs per question) │
│ │
│ ADAPTIVE TEST LENGTH (general graph-based): │
│ questions ≈ log₂(N) × 2 (full frontier mapping) │
│ questions ≈ log₂(N) (single gap finding) │
│ │
│ RX TIER DURATION (targeted gaps): │
│ days_rx = ⌈target_kcs / n_rx⌉ + E[D_grad] │
│ where n_rx = 5 / 9.0 ≈ 0.6 new/day │
│ │
└───────────────────────────────────────────────────────────────────┘
4. The Review Fraction and Sensitivity Analysis
At steady state, roughly two-thirds of daily time goes to reviews:
review% = (E[k]-1) × t_r / (t_i + (E[k]-1) × t_r)
= 4 × 1.5 / (3 + 6.0)
= 6.0 / 9.0
= 66.7%
FSRS reduces the review fraction from ~71% to ~67% compared to fixed schedules by eliminating unnecessary reviews for well-learned material.
Partial Derivatives (Sensitivity)
∂n/∂t_r = -n × (E[k]-1) / (t_i + (E[k]-1) × t_r) ← HIGH impact
∂n/∂t_i = -n / (t_i + (E[k]-1) × t_r) ← LOWER impact
Reducing review time by 1 second saves 4x more daily time than reducing introduction time by 1 second, because there are E[k]-1 ≈ 4 expected review waves vs 1 introduction. Review time remains the dominant lever.
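The sensitivity claim can be verified numerically with a finite-difference check against the throughput function:

```typescript
// Numeric check: at the defaults, a small change in t_r moves n about
// (E[k]-1) = 4x as much as the same change in t_i.
const n = (ti: number, tr: number, T = 120, Ek = 5) => T / (ti + (Ek - 1) * tr);

const eps = 1e-6;
const dn_dtr = (n(3, 1.5 + eps) - n(3, 1.5)) / eps; // ∂n/∂t_r
const dn_dti = (n(3 + eps, 1.5) - n(3, 1.5)) / eps; // ∂n/∂t_i

console.log((dn_dtr / dn_dti).toFixed(2)); // "4.00"
```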
Review Time Optimization (at T = 120 min)
| t_r (review) | n (new/day) | KCs/quarter | vs baseline |
|---|---|---|---|
| 2.0 min | 10.9 | 981 | 0.82x |
| 1.5 min | 13.3 | 1,197 | baseline |
| 1.0 min | 17.1 | 1,539 | 1.29x |
| 0.75 min | 20.0 | 1,800 | 1.50x |
| 0.5 min | 24.0 | 2,160 | 1.80x |
Introduction Time Optimization (at T = 120 min)
| t_i (intro) | n (new/day) | KCs/quarter |
|---|---|---|
| 5.0 min | 10.9 | 981 |
| 3.0 min | 13.3 | 1,197 |
| 2.0 min | 15.0 | 1,350 |
| 1.0 min | 17.1 | 1,539 |
Combined Optimization
| t_i | t_r | n at 2 hrs/day | KCs/quarter |
|---|---|---|---|
| 3.0 | 1.5 | 13.3 | 1,197 |
| 2.0 | 1.0 | 20.0 | 1,800 |
| 1.5 | 0.75 | 26.7 | 2,403 |
| 1.0 | 0.5 | 40.0 | 3,600 |
5. Desirable Difficulty and Retention Targeting
Reviews cannot be arbitrarily fast. There is a floor below which retrieval becomes recognition (pattern-matching without genuine memory recall):
FLOOR: t_r ≥ 30 seconds (genuine retrieval effort)
CEILING: t_r ≤ 120 seconds (diminishing returns)
SWEET SPOT: 30–90 seconds per review interaction
Target Retention Tradeoff
FSRS allows tuning R_target to balance review volume vs retention. This is the primary system-level lever:
| R_target | Expected Reviews to Graduate | Review Volume | Retention at 90 Days |
|---|---|---|---|
| 0.97 | ~7-8 | Very high | ~97% |
| 0.90 | ~5 | Balanced | ~90% |
| 0.85 | ~4 | Lower | ~85% |
| 0.80 | ~3-4 | Minimal | ~80% |
| 0.70 | ~3 | Very low | ~70% |
We default to R_target = 0.90 as the sweet spot for education: high enough for reliable test performance, low enough to avoid review fatigue.
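The tradeoff in the table falls out of the interval formula I(r, S) from Section 3: lowering R_target stretches every interval, which is why the review count drops. A sketch, again with a representative (not fitted) decay parameter:

```typescript
// How the next interval stretches as R_target drops, for a KC at S = 30 days.
// W20 ≈ 0.1542 is a representative FSRS-6 decay parameter (assumption).
const W20 = 0.1542;
const DECAY = -W20;
const FACTOR = Math.pow(0.9, 1 / DECAY) - 1;

const interval = (r: number, S: number) =>
  (S / FACTOR) * (Math.pow(r, 1 / DECAY) - 1);

for (const r of [0.97, 0.9, 0.85, 0.8, 0.7]) {
  console.log(`R_target=${r} → interval ≈ ${interval(r, 30).toFixed(0)} days`);
}
// Lower targets yield longer gaps (fewer reviews) at the cost of retention.
```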
Theoretical vs Practical Maximums (at T = 120 min)
Theoretical max (t_r = 0.5 min, t_i = 1.0 min, E[k] ≈ 4):
n_max = 120 / (1.0 + 3 × 0.5) = 120 / 2.5 = 48 KCs/day
≈ 17,520 KCs/year
Practical max (t_r = 1.0 min, t_i = 2.0 min, E[k] ≈ 5):
n_practical = 120 / (2.0 + 4 × 1.0) = 120 / 6.0 = 20 KCs/day
≈ 7,300 KCs/year
6. DOK 4 as a Review Accelerator
A single DOK 4 "teach me about X" prompt tests multiple KCs simultaneously:
Standard review (DOK 1-2):
5 KCs × 1.5 min each = 7.5 minutes
DOK 4 compound review:
"Teach me about spacing and interleaving"
Tests KCs 16–23 (8 KCs) in ~3 minutes
Effective t_r = 3 min / 8 KCs = 0.375 min per KC
DOK 4 reviews are 4x more efficient per KC than individual DOK 1-2 reviews.
Constraint: DOK 4 reviews are only available for KCs the learner has already reached DOK 4 mastery on. For climbing KCs, individual reviews are required.
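The effective review time arithmetic, using the figures from the example above:

```typescript
// Effective per-KC review time for a compound DOK 4 prompt.
const sessionMinutes = 3; // one "teach me about X" session (from the example)
const kcsCovered = 8;     // KCs 16-23
const tRdok4 = sessionMinutes / kcsCovered;

console.log(tRdok4);       // 0.375 min per KC
console.log(1.5 / tRdok4); // 4 — efficiency vs a 1.5-min individual review
```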
Mature Learner Throughput (most KCs at DOK 4)
Using the core throughput equation with t_i = 2.0 and (E[k]-1) = 4 review waves at the DOK 4 effective rate:
n_dok4 = 120 / (2.0 + 4 × 0.375) = 120 / 3.5 ≈ 34 new KCs/day
≈ 3,090 KCs/quarter
≈ 12,500 KCs/year
7. Binary Probe Diagnostic Algorithm
Overview
The diagnostic exploits the prerequisite DAG to place students across an entire curriculum in roughly five minutes. Instead of testing every KC, it tests the hardest KC in each group and infers everything below it.
Algorithm: Top-Down Binary Probe
function runDiagnostic(student, curriculum):
  known_kcs ← []
  unknown_kcs ← []
  probe_worlds ← []

  // Phase 1: Grade Gate (2 questions, ~30 sec)
  mid_question ← getMidDifficultyKC(curriculum.grade)
  if student.fails(mid_question):
    curriculum ← dropToGrade(curriculum.grade - 1)
  else:
    hard_question ← getHardKC(curriculum.grade + 1)
    if student.passes(hard_question):
      flag_for_potential_bump()

  // Phase 2: World Sweep (W questions, ~2 min)
  for each world in curriculum.worlds:
    capstone ← getCapstoneKC(world)       // most advanced KC
    if student.passes(capstone):
      known_kcs.addAll(world.all_kcs)     // infer entire world
    else:
      probe_worlds.append(world)          // flag for drill

  // Phase 3: Chapter Binary Search (flagged worlds, ~2-3 min)
  for each flagged_world in probe_worlds:
    chapters ← flagged_world.chapters
    lo ← 0
    hi ← len(chapters) - 1
    while lo <= hi:
      mid ← ⌊(lo + hi) / 2⌋
      test_kc ← getLastKC(chapters[mid])
      if student.passes(test_kc):
        known_kcs.addAll(chapters[lo..mid])      // prefix inferred known
        lo ← mid + 1
      else:
        unknown_kcs.addAll(chapters[mid..hi])    // failed chapter + suffix unknown
        hi ← mid - 1
    // Loop exits with lo > hi; every chapter has been classified

  return DiagnosticResult {
    known: known_kcs,
    unknown: unknown_kcs,
    recommended_start: unknown_kcs[0],
    estimated_gap_size: len(unknown_kcs)
  }
Complexity Analysis
Best case (strong student):
Grade gate: 2 questions
World sweep: W questions, most pass → few flagged
Chapter drill: ~4 questions (2 flagged × log₂(3))
Total: ~12 questions, ~3 min
Typical case:
Grade gate: 2 questions
World sweep: W questions, ~half flagged
Chapter drill: ~8-10 questions
Total: ~16-18 questions, ~4-4.5 min
Worst case (weak student):
Grade gate: 2 questions
World sweep: W questions, most fail
Chapter drill: ~14 questions
Total: ~22 questions, ~5.5 min
Information ratio (typical): N / 18 ≈ 15:1 (for N = 276)
Discrimination KC Optimization
Discrimination KCs test two underlying concepts in one question:
"Is this a simile or a metaphor?"
PASS → knows both "simile" and "metaphor" concepts
FAIL → doesn't distinguish them (flag both for review)
Information gain per question:
Standard KC: 1 concept tested
Discrimination KC: 2 concepts tested
Efficiency ratio: 2x
The diagnostic preferentially selects discrimination KCs as probes, doubling the information density of each question.
Diagnostic Output → Scheduler Integration
function integrateWithScheduler(result, student):
  for each kc in result.known:
    // Seed FSRS state with moderate stability (skip encounter)
    student.setKCState(kc, {
      dok1: { stability: 3.0, difficulty: 5.0, nextDue: now + 3 days, reps: 1, unlocked: true },
      dok2: { stability: 0, difficulty: 5.0, nextDue: null, reps: 0, unlocked: true },
      dok4: { stability: 0, difficulty: 5.0, nextDue: null, reps: 0, unlocked: false }
    })

  for each kc in result.unknown:
    // Queue for full encounter on Day 1
    student.setKCState(kc, {
      dok1: { stability: 0, difficulty: 5.0, nextDue: now, reps: 0, unlocked: true },
      dok2: { stability: 0, difficulty: 5.0, nextDue: null, reps: 0, unlocked: false },
      dok4: { stability: 0, difficulty: 5.0, nextDue: null, reps: 0, unlocked: false }
    })

  for each kc in result.fragile:  // Retention Check mode
    // Low stability — schedule immediate review (skip encounter)
    student.setKCState(kc, {
      dok1: { stability: 1.0, difficulty: 5.0, nextDue: now, reps: 0, unlocked: true },
      dok2: { stability: 0, difficulty: 5.0, nextDue: null, reps: 0, unlocked: false },
      dok4: { stability: 0, difficulty: 5.0, nextDue: null, reps: 0, unlocked: false },
      skipEncounter: true
    })
8. The Rolling Pipeline
At steady state, the system maintains a roughly constant-size pipeline:
pipeline_size ≈ n × E[D_grad] (E[D_grad] ≈ 75 days expected graduation time)
KCs graduate out the back at approximately the same rate new ones enter the front:
At n = 2.8 (Full tier): pipeline ≈ 210 KCs in flight
At n = 13.3: pipeline ≈ 998 KCs in flight
At n = 20: pipeline ≈ 1,500 KCs
Day 1: [KC 1–3] introduced, FSRS assigns S₀ based on first rating
Day 2: [KC 4–6] introduced, [KC 1–3] first reviews due (S₀ ≈ 1-2 days)
...
Day ~75: [KC 1–3] reach S ≥ S_grad → graduate (varies per KC)
...forever (steady state)
With FSRS, graduation timing is per-KC rather than fixed at Day 60. Easy material (low D, consistent Good/Easy ratings) may graduate in ~50 days. Hard material (high D, mixed ratings) may take ~100+ days.
Ramp-Up Phase
Before steady state, daily load increases as the review pipeline fills without graduations. FSRS makes this smoother than fixed schedules because early reviews are tightly spaced (S is small) and gradually widen:
| Phase | Days | Typical Load Profile |
|---|---|---|
| Intro + first reviews | 1–7 | Light: new KCs + very short interval reviews |
| Growing pipeline | 7–30 | Building: review waves from first two weeks |
| Near-peak | 30–60 | Heaviest: many KCs in mid-pipeline, no graduations yet |
| Steady state | 60–90+ | Balanced: graduations begin offsetting new introductions |
Peak Load
Peak occurs around Days 30–60 when the pipeline is full but graduations haven't started:
peak_time ≈ n × (t_i + (E[k]-2) × t_r)
= n × (3 + 3 × 1.5)
= n × 7.5 min
At n = 2.8 (Full tier): peak ≈ 21 min
At n = 4: peak ≈ 30 min
At n = 8: peak ≈ 60 min
9. DOK State Machine (FSRS-Powered)
Each KC runs three parallel FSRS memory tracks — one per DOK level. Instead of counting passes through a Leitner box, each track maintains its own (S, D, R) memory state and graduates when stability crosses the threshold.
┌───────────────────────────────────────────────────────────────┐
│ KC STATE MACHINE (FSRS) │
│ │
│ DOK 1-2 (Recall Gate) │
│ ┌────────────┐ review ┌────────────┐ ┌──────────┐ │
│ │ S=0, D=5 │─────────▶│ S grows │── ··· ──▶│ S≥S_grad │ │
│ │ (new card) │ (G=1-4) │ adaptively │ FSRS │ GRADUATED│ │
│ └────────────┘ └────────────┘ └──────────┘ │
│ ▲ G=1 (Again): S resets via post-lapse formula │
│ └────────────────────────────────────── │
│ │
│ DOK 2-3 (Apply) [unlocked after first DOK 1 pass] │
│ ┌────────────┐ review ┌────────────┐ ┌──────────┐ │
│ │ S=0, D=5 │─────────▶│ S grows │── ··· ──▶│ S≥S_grad │ │
│ │ (new card) │ (G=1-4) │ adaptively │ FSRS │ GRADUATED│ │
│ └────────────┘ └────────────┘ └──────────┘ │
│ ▲ G=1 (Again): S resets via post-lapse formula │
│ └────────────────────────────────────── │
│ │
│ DOK 4 (Mastery Gate) [unlocked after first DOK 2 pass] │
│ ┌────────────┐ review ┌────────────┐ ┌──────────┐ │
│ │ S=0, D=5 │─────────▶│ S grows │── ··· ──▶│ S≥S_grad │ │
│ │ (new card) │ (G=1-4) │ adaptively │ FSRS │ GRADUATED│ │
│ └────────────┘ └────────────┘ └──────────┘ │
│ ▲ G=1 (Again): S resets via post-lapse formula │
│ └────────────────────────────────────── │
│ │
│ TRUE MASTERY = DOK 4 track graduated (S ≥ S_grad) │
│ World complete when ALL KCs in world reach true mastery │
│ Mastery Castle unlocks when ALL worlds complete │
└───────────────────────────────────────────────────────────────┘
Cascading Credit Rules
DOK 4 pass → update FSRS state for DOK 4 AND DOK 2 AND DOK 1
DOK 2 pass → update FSRS state for DOK 2 AND DOK 1
DOK 1 pass → update FSRS state for DOK 1 only
The same rating (G) is applied to all cascaded tracks.
Rationale: if you can teach it (DOK 4), you can obviously recall it (DOK 1) and apply it (DOK 2). Cascading the FSRS update accelerates stability growth across all levels.
Failure Handling (FSRS Post-Lapse Stability)
When a student presses Again (G=1), FSRS computes post-lapse stability rather than simply dropping back a level:
S'_f(D, S, R) = w₁₁ · D^(-w₁₂) · ((S+1)^w₁₃ - 1) · e^(w₁₄ · (1-R))
This is more nuanced than the old "two misses = drop back" rule:
- High prior stability + low R (overdue): post-lapse S is moderate — the learner knew it but forgot, so recovery is faster
- Low prior stability + high R (recent): post-lapse S is very small — the material never stuck, needs intensive re-teaching
- Difficulty modulates recovery: easy material bounces back faster than hard material
Hard (G=2) does not trigger a lapse — it still increases S, just more slowly than Good or Easy.
DOK Unlock Rules
DOK 1 track: always unlocked (entry point)
DOK 2 track: unlocks after first DOK 1 pass (G ≥ 2)
DOK 4 track: unlocks after first DOK 2 pass (G ≥ 2)
World Unlock Rules
World 1: always unlocked
World k (k > 1): unlocked when World (k-1) has at least one DOK 1 pass
Mastery Castle: unlocked when ALL worlds have DOK 4 graduated (S ≥ S_grad)
Completion Metric
Per-DOK progress: min(1.0, S / S_grad)
Per-KC completion% = (dok1_progress + dok2_progress + dok4_progress) / 3 × 100
world_completion% = mean per-KC completion across all KCs in the world
Overall: (completed_worlds + mastery_castle_bonus) / (total_worlds + 1) × 100
This gives smooth progress instead of discrete jumps. A KC at S = 45 (halfway to S_grad = 90) shows 50% progress for that DOK level.
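A quick worked example of the smooth progress metric for a single KC, averaging its three DOK tracks:

```typescript
// Smooth per-DOK progress, per the completion metric above.
const S_GRAD = 90;
const dokProgress = (S: number) => Math.min(1.0, S / S_GRAD);

// A KC with S = 45 on DOK 1, S = 18 on DOK 2, S = 0 on DOK 4:
const pct = ((dokProgress(45) + dokProgress(18) + dokProgress(0)) / 3) * 100;
console.log(pct.toFixed(0)); // "23" — partial credit on every track counts
```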
10. FSRS Memory Model (Replaces KCBelief)
Per-KC Memory State
interface FSRSMemoryState {
kcCode: string; // e.g., "LANG3-042"
dokLevel: 'dok1' | 'dok2' | 'dok4';
stability: number; // S: days until R drops to 90%
difficulty: number; // D: 1.0-10.0, how hard S is to grow
retrievability: number; // R: 0.0-1.0, current recall probability
lastReview: string; // ISO 8601 timestamp
reps: number; // total successful reviews
lapses: number; // total times G=1 (Again)
graduated: boolean; // true when S ≥ S_grad
}
FSRS Core Formulas (v6, 21 parameters)
Retrievability (forgetting curve):
R(t, S) = (1 + factor · t/S)^(-w₂₀)
where factor = 0.9^(-1/w₂₀) - 1 (ensures R(S, S) = 90%)
Next interval (solving for t at target retention):
I(r, S) = (S / factor) · (r^(1/(-w₂₀)) - 1)
At R_target = 0.90: I = S (by definition — stability IS the 90% interval)
Initial stability after first rating:
S₀(G) = w[G-1] // w₀ through w₃ map to Again, Hard, Good, Easy
Typical defaults: S₀(1)≈0.2, S₀(2)≈1.3, S₀(3)≈2.3, S₀(4)≈8.3 days
Stability after successful recall (G ≥ 2):
S'ᵣ(D, S, R, G) = S · (e^w₈ · (11-D) · S^(-w₉) · (e^(w₁₀·(1-R)) - 1)
· w₁₅(if G=2) · w₁₆(if G=4) + 1)
Key properties of S'ᵣ:
- Lower R (more overdue) → larger stability increase (spacing effect)
- Higher S → smaller relative increase (diminishing returns on strong memories)
- Higher D → smaller increase (hard material grows S more slowly)
- Easy rating (G=4) boosts growth; Hard (G=2) dampens it
Post-lapse stability (G = 1, Again):
S'f(D, S, R) = w₁₁ · D^(-w₁₂) · ((S+1)^w₁₃ - 1) · e^(w₁₄·(1-R))
Difficulty update after review:
D'(D, G) = w₇ · D₀(4) + (1 - w₇) · (D + ΔD · (10-D)/9)
where ΔD = -w₆ · (G - 3)
Mean reversion toward D₀(4) prevents "difficulty hell."
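The three update formulas above can be sketched directly in TypeScript. The weights below are representative FSRS default values and D0_EASY is an assumed initial-difficulty constant; production would use the fitted 21-parameter FSRS-6 vector.

```typescript
// Sketch of the stability and difficulty updates (weights are assumptions).
const w = {
  w6: 0.5345, w7: 0.0046, w8: 1.54575, w9: 0.1192, w10: 1.01925,
  w11: 1.9395, w12: 0.11, w13: 0.29605, w14: 2.2698, w15: 0.2315, w16: 2.9898,
};
const D0_EASY = 3.22; // D₀(4): initial difficulty after an Easy first rating (assumed)

// S'_r: stability after successful recall (G >= 2)
function recallStability(D: number, S: number, R: number, G: 2 | 3 | 4): number {
  const hardPenalty = G === 2 ? w.w15 : 1; // w15 < 1 dampens growth
  const easyBonus = G === 4 ? w.w16 : 1;   // w16 > 1 boosts growth
  return S * (Math.exp(w.w8) * (11 - D) * Math.pow(S, -w.w9) *
    (Math.exp(w.w10 * (1 - R)) - 1) * hardPenalty * easyBonus + 1);
}

// S'_f: post-lapse stability (G = 1, Again)
function postLapseStability(D: number, S: number, R: number): number {
  return w.w11 * Math.pow(D, -w.w12) *
    (Math.pow(S + 1, w.w13) - 1) * Math.exp(w.w14 * (1 - R));
}

// D': difficulty update with mean reversion toward D₀(4)
function updateDifficulty(D: number, G: 1 | 2 | 3 | 4): number {
  const deltaD = -w.w6 * (G - 3);
  const next = w.w7 * D0_EASY + (1 - w.w7) * (D + (deltaD * (10 - D)) / 9);
  return Math.min(10, Math.max(1, next)); // clamp to the 1-10 range
}

// The qualitative properties claimed in the text:
const grew = recallStability(5, 10, 0.9, 3);      // Good at R=0.9 → S grows
const grewMore = recallStability(5, 10, 0.7, 3);  // more overdue → grows more
const collapsed = postLapseStability(5, 10, 0.9); // lapse → S collapses
console.log(grew > 10, grewMore > grew, collapsed < 10); // true true true
```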
Prediction and Confidence
FSRS is inherently a prediction model — R is the predicted probability of recall at any moment. No separate prediction layer is needed.
Prediction confidence improves with data:
Reps 1-2: Default parameters dominate — predictions are population-level
Reps 3-4: Per-KC D and S are well-calibrated — predictions become personal
Reps 5+: High-confidence individual predictions
Day 14 review ≈ Rep 3-4 for KCs introduced Day 1
→ prediction inflection point for earliest KCs
For a grade (276 KCs at n ≈ 2.5/day):
Day 39: all KCs from first 2 weeks have 3-4+ reps
→ system can predict test outcome ~16 days before test
Rating Derivation from Response Data
function deriveGrade(correct: boolean, responseTimeMs: number): Grade
  if not correct: return 1            // Again
  if responseTimeMs > 5000: return 2  // Hard
  if responseTimeMs < 2000: return 4  // Easy
  return 3                            // Good
This maps our binary correct/incorrect + response time into FSRS's 4-point scale, giving the algorithm richer signal than pass/fail alone.
11. Event-Driven Integration (Thin Client Architecture)
System Architecture
┌──────────────────────────────────────────────────────────────────┐
│ │
│ EXTERNAL PLATFORM (e.g., MobyMax, Freckle) │
│ ┌────────────────────────────────────────┐ │
│ │ Student completes lesson │ │
│ │ → Platform reports via Caliper event │ │
│ └──────────────┬─────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────┐ │
│ │ EVENT BRIDGE │ │
│ │ Caliper ActivityEvent → │ │
│ │ Extract: kcCode, passed, timestamp │ │
│ │ Map: platform_skill → internal KC │ │
│ └──────────────┬──────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────┐ │
│ │ FSRS SCHEDULER (the algorithm) │ │
│ │ Input: (student, kc, grade, timestamp) │ │
│ │ Output: next_due date per KC per DOK │ │
│ │ Schedule: adaptive via FSRS (S,D,R) │ │
│ └──────────────┬──────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────┐ │
│ │ REVIEW UI (thin client) │ │
│ │ getDailySession() → render questions │ │
│ │ Student answers → recordPass/Fail │ │
│ │ → Emit Caliper event back to platform │ │
│ └─────────────────────────────────────────┘ │
│ │
│ WHAT'S CUSTOM: WHAT'S PLATFORM-PROVIDED: │
│ - Spacing scheduler - Content (lessons, videos) │
│ - Event bridge - Authentication (SSO) │
│ - Review UI - Enrollment (OneRoster) │
│ - KC registry - Grading (gradebook) │
│ - Diagnostic - Reporting (analytics) │
│ │
└──────────────────────────────────────────────────────────────────┘
Caliper Event Schema (Inbound)
interface CaliperActivityEvent {
type: 'ActivityEvent';
profile: 'TimebackProfile';
actor: {
id: string; // student UUID
type: 'TimebackUser';
};
object: {
id: string; // assessment item ID
type: 'AssessmentItem';
name: string; // skill/lesson name
extensions: {
kc_code?: string; // if available from platform
skill_name: string;
};
};
result: {
score: number; // 0-100
success: boolean; // passed threshold
completion: boolean;
extensions: {
xp_earned: number;
attempts: number;
};
};
eventTime: string; // ISO 8601
}
KC Mapping Table
Platform Skill Name → Internal KC Code
"Regular and Irregular Plurals" → LANG3-006
"Commas in Addresses" → LANG3-027
"Subject-Verb Agreement" → LANG4-015
Maintained in: kc_platform_mappings table
platform: TEXT -- 'mobymax' | 'freckle' | 'ixl'
platform_skill: TEXT -- platform's skill identifier
kc_code: TEXT -- internal KC code (references kcs.code)
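The bridge's translation step can be sketched as below. The function and table names are illustrative, not a confirmed API; note in particular that external platforms report only pass/fail without response times, so the grade collapses to Again/Good (an assumption about how the bridge degrades gracefully).

```typescript
// Sketch of the event-bridge translation: Caliper event → scheduler input.
interface SchedulerInput {
  studentId: string;
  kcCode: string;
  grade: 1 | 2 | 3 | 4;
  timestamp: string;
}

// Illustrative in-memory stand-in for the kc_platform_mappings table
const skillToKc: Record<string, string> = {
  "Regular and Irregular Plurals": "LANG3-006",
  "Commas in Addresses": "LANG3-027",
};

function bridgeEvent(event: {
  actor: { id: string };
  object: { name: string; extensions?: { kc_code?: string } };
  result: { success: boolean };
  eventTime: string;
}): SchedulerInput | null {
  // Prefer an explicit kc_code; fall back to the mapping table
  const kcCode = event.object.extensions?.kc_code ?? skillToKc[event.object.name];
  if (!kcCode) return null; // unmapped skill — queue for manual mapping
  // External events carry no response time → collapse to Again (1) / Good (3)
  const grade = event.result.success ? 3 : 1;
  return { studentId: event.actor.id, kcCode, grade, timestamp: event.eventTime };
}
```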
12. Data Structures
DOK Track (FSRS Memory State)
interface DOKTrack {
stability: number; // S: days until R drops to 90% (graduate when ≥ S_grad)
difficulty: number; // D: 1.0-10.0, inherent complexity
lastReview: string | null; // ISO 8601 timestamp
nextDue: string | null; // ISO 8601 timestamp, null if graduated
reps: number; // total successful reviews
lapses: number; // total G=1 (Again) events
unlocked: boolean; // whether this DOK level is accessible
graduated: boolean; // true when S ≥ S_grad
}
Per-KC Mastery State
interface KCMasteryState {
kcId: number;
kcUuid?: string; // references kcs.id in knowledge graph DB
kcCode?: string; // e.g., "LANG3-042"
dokLevel: 'dok1' | 'dok2' | 'dok4';
stability: number; // FSRS S value
difficulty: number; // FSRS D value
reps: number; // successful review count
lapses: number; // lapse count (G=1)
nextDue: string; // ISO 8601, empty string if graduated
graduated: boolean; // true when S ≥ S_grad
skipEncounter: boolean; // true if placed by diagnostic
history: Array<{
date: string;
grade: 1 | 2 | 3 | 4; // FSRS rating: Again/Hard/Good/Easy
dok: string;
source: 'diagnostic' | 'review' | 'external';
responseTimeMs?: number;
stabilityAfter: number; // S after this review
retrievabilityBefore: number; // R at time of review
}>;
}
Diagnostic Result
interface DiagnosticResult {
studentId: string;
curriculumSlug: string;
gradeLevel: number;
gradeAdjusted: boolean; // true if grade gate shifted grade
mode: 'quick_scan' | 'prerequisite_probe' | 'retention_check';
totalQuestions: number;
totalTimeMs: number;
worlds: Array<{
worldId: number;
status: 'known' | 'partial' | 'unknown';
knownChapters: number[];
unknownChapters: number[];
}>;
knownKCs: string[]; // KC codes inferred as known
unknownKCs: string[]; // KC codes requiring instruction
fragileKCs: string[]; // KC codes passed but likely forgotten
recommendedTier: 'full' | 'rx' | 'maintain';
estimatedGapSize: number; // total unknown KCs
estimatedCompletionDays: number;
}
World Progress
interface WorldProgress {
worldId: number;
factsLearned: number[];
factProgress: Record<number, {
dok1Stability: number; // FSRS S for DOK 1
dok2Stability: number; // FSRS S for DOK 2
dok4Stability: number; // FSRS S for DOK 4
lastReviewed: string | null;
}>;
dok1Track: DOKTrack; // Recall Gate (DOK 1-2)
dok2Track: DOKTrack; // Apply (DOK 2-3)
dok4Track: DOKTrack; // Mastery Gate (DOK 4)
}
User Progress (Top-Level)
interface UserProgress {
worlds: WorldProgress[];
masteryCastleAttempts: number;
masteryCastleCompleted: boolean;
currentStreak: number;
lastActivityDate: string | null;
diagnosticResult?: DiagnosticResult;
productTier: 'full' | 'rx' | 'maintain';
}
13. Scheduling Algorithm (FSRS-Powered Pseudocode)
FSRS Constants
S_GRAD = 90 // days — graduation threshold
R_TARGET = 0.90 // desired retention at next review
FSRS_PARAMS = [...] // 21 FSRS-6 parameters (default or optimized per-user)
DECAY = -FSRS_PARAMS[20]
FACTOR = 0.9^(1/DECAY) - 1
Compute Retrievability
function getRetrievability(track, now):
  if track.lastReview is null: return 0.0
  elapsed_days ← daysBetween(track.lastReview, now)
  return (1 + FACTOR × elapsed_days / track.stability)^DECAY
Compute Next Interval
function nextInterval(stability):
  return (stability / FACTOR) × (R_TARGET^(1/DECAY) - 1)
  // At R_TARGET = 0.90, this simplifies to exactly: stability
Is Review Due?
function isDue(track, now):
  if not track.unlocked: return false
  if track.graduated: return false
  if track.nextDue is null: return true  // never attempted → due now
  return now >= track.nextDue
Record Review (with FSRS Update + Cascading Credit)
function recordReview(kc, dokLevel, correct, responseTimeMs, now, grade = null):
  // Derive the grade from the response if not explicitly provided
  if grade is null:
    grade ← deriveGrade(correct, responseTimeMs)

  // Determine which DOK tracks to update (cascading credit)
  tracks_to_update ← []
  if dokLevel == 'dok4':
    tracks_to_update ← [kc.dok4Track, kc.dok2Track, kc.dok1Track]
  else if dokLevel == 'dok2':
    tracks_to_update ← [kc.dok2Track, kc.dok1Track]
    if grade >= 2 and not kc.dok4Track.unlocked:  // unlock requires a pass
      kc.dok4Track.unlocked ← true
  else:
    tracks_to_update ← [kc.dok1Track]
    if grade >= 2 and not kc.dok2Track.unlocked:  // unlock requires a pass
      kc.dok2Track.unlocked ← true

  // Apply FSRS update to each cascaded track
  for each track in tracks_to_update:
    updateFSRS(track, grade, now)

function updateFSRS(track, grade, now):
  if track.reps == 0 and track.lapses == 0:
    // First review: initialize from FSRS defaults
    track.stability ← FSRS_PARAMS[grade - 1]  // S₀(G) = w[G-1]
    track.difficulty ← computeInitialDifficulty(grade)
    if grade >= 2: track.reps ← 1
    else: track.lapses ← 1
  else:
    R ← getRetrievability(track, now)
    if grade == 1:  // Again — lapse
      track.stability ← computePostLapseStability(track.difficulty, track.stability, R)
      track.lapses ← track.lapses + 1
    else:           // Hard, Good, or Easy — successful recall
      track.stability ← computeRecallStability(track.difficulty, track.stability, R, grade)
      track.reps ← track.reps + 1
    track.difficulty ← updateDifficulty(track.difficulty, grade)
  track.lastReview ← now
  track.graduated ← track.stability >= S_GRAD
  track.nextDue ← track.graduated ? null : now + nextInterval(track.stability)
Daily Session Scheduler (Tier-Aware)
function getDailySession(user, curriculum, tier):
  timeBudget ← TIER_BUDGETS[tier]  // full=25, rx=5, maintain=5
  now ← timestamp()
  due_reviews ← []

  // Phase 1: Collect all due reviews, sorted by retrievability (lowest R first)
  for each kc in user.active_kcs:
    for each dok in ['dok1', 'dok2', 'dok4']:
      track ← kc.getTrack(dok)
      if isDue(track, now):
        R ← getRetrievability(track, now)
        due_reviews.append((kc, dok, R))
  sort due_reviews by R ASC  // lowest retrievability first (most at-risk)

  // Phase 2: Allocate time
  time_remaining ← timeBudget
  session ← []

  // Reviews first (prioritized by forgetting risk)
  for each (kc, dok, _) in due_reviews:
    if time_remaining < t_r: break
    session.append({ type: 'review', kc, dok })
    time_remaining -= t_r

  // New KCs with remaining time (skip for 'maintain' tier)
  if tier != 'maintain':
    available_new ← tier == 'rx'
      ? getDiagnosticGapKCs(curriculum, user)
      : getNextUnlocked(curriculum, user)
    while time_remaining >= t_i AND available_new.hasNext():
      kc ← available_new.next()
      session.append({ type: 'introduction', kc })
      time_remaining -= t_i

  return session
Key Difference from Fixed Scheduling
The old scheduler sorted by "most overdue" (days past due date). FSRS sorts by "lowest retrievability" (highest forgetting risk). This is superior because two items both 3 days overdue can be at very different risk: an item with S = 5 is 60% past its scheduled interval and has decayed well below the retention target, while an item with S = 30 is only 10% past its interval and still sits near 90% recall. FSRS prioritizes the right reviews.
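The ordering argument can be checked numerically with the Section 3 forgetting curve (W20 here is a representative value, not the fitted one):

```typescript
// Why retrievability ordering beats overdue-days ordering: two items,
// both 3 days overdue, under the FSRS power-law forgetting curve.
const W20 = 0.1542;
const DECAY = -W20;
const FACTOR = Math.pow(0.9, 1 / DECAY) - 1;
const R = (t: number, S: number) => Math.pow(1 + (FACTOR * t) / S, DECAY);

// At R_target = 0.90 the scheduled interval equals S, so "3 days overdue"
// means t = S + 3 days have elapsed since the last review.
const weak = R(5 + 3, 5);     // S = 5  → well past its interval
const strong = R(30 + 3, 30); // S = 30 → barely past its interval
console.log(weak < strong);   // true — the weak item is at greater risk
```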
14. QC Validation Pipeline
Three-Layer Quality Control
┌────────────────────────────────────────────────────────────────┐
│ QC VALIDATION PIPELINE │
│ │
│ Layer 1: CONTENT VALIDATION │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ For each question: │ │
│ │ ✓ Valid correct answer matches an available option │ │
│ │ ✓ Non-empty choice set │ │
│ │ ✓ Question type matches declared type │ │
│ │ ✓ Auto-extract answers from feedback (16 regex patterns)│ │
│ │ ✓ Multi-select answers parse correctly (MC1|MC2|MC3) │ │
│ │ ✓ Labeled answers parse correctly (subject##MC1) │ │
│ │ Result: KC playability rate │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
│ Layer 2: ENGINE TESTS (34 unit tests) │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ Scheduling (FSRS): │ │
│ │ ✓ FSRS stability updates correctly per grade (1-4) │ │
│ │ ✓ Good/Easy advances stability, Again triggers lapse │ │
│ │ ✓ Post-lapse stability computed correctly │ │
│ │ ✓ Next interval matches R_target via FSRS formula │ │
│ │ ✓ Graduate when S ≥ S_grad │ │
│ │ │ │
│ │ Cascading Credit: │ │
│ │ ✓ DOK 4 pass updates all three DOK tracks │ │
│ │ ✓ DOK 2 pass updates DOK 2 + DOK 1 only │ │
│ │ ✓ DOK 1 pass updates DOK 1 only │ │
│ │ │ │
│ │ Session Building: │ │
│ │ ✓ Reviews before new KCs │ │
│ │ ✓ Most overdue reviews first │ │
│ │ ✓ Respects time budget │ │
│ │ ✓ Tier-aware (Full/Rx/Maintain) │ │
│ │ ✓ Diagnostic-placed KCs skip encounter │ │
│ │ │ │
│ │ Unlock Logic: │ │
│ │ ✓ DOK 2 unlocks after first DOK 1 pass │ │
│ │ ✓ DOK 4 unlocks after first DOK 2 pass │ │
│ │ ✓ World N unlocks after World N-1 pass │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
│ Layer 3: RENDERER TESTS │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ For each question: │ │
│ │ ✓ Question text renders without errors │ │
│ │ ✓ All options visible and selectable │ │
│ │ ✓ Correct answer highlights properly │ │
│ │ ✓ Multi-select handles compound answers │ │
│ │ ✓ Labeled answers parse and display correctly │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
│ VALIDATION RESULTS (current): │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ KC playability: 95.3% (1,326/1,392) │ │
│ │ Question render: 99.96% (12,889/12,894) │ │
│ │ Engine tests: 100% (34/34) │ │
│ │ Auto-fix rescued: 2,389 questions │ │
│ └─────────────────────────────────────────────────────────┘ │
└────────────────────────────────────────────────────────────────┘
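The Layer 1 playability checks can be sketched as follows. This is an illustrative reduction, not the production validator: the `Question` shape and function name are assumptions, and only the answer-matches-option and non-empty-choice-set checks are shown.

```typescript
// Simplified question shape for the sketch; the real schema has more fields.
interface Question {
  type: "multiple_choice" | "true_false" | "multi_select";
  options: string[];
  answer: string; // multi-select answers are pipe-delimited, e.g. "MC1|MC2"
}

// Layer 1 core checks: non-empty choice set, and every declared answer
// must match an available option.
function isPlayable(q: Question): boolean {
  if (q.options.length === 0) return false;
  const answers = q.type === "multi_select" ? q.answer.split("|") : [q.answer];
  return answers.every((a) => q.options.includes(a));
}
```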
Answer Auto-Extraction Patterns
16 regex patterns extract answers from feedback text:
Pattern 1: "The correct answer is X"
Pattern 2: "X is correct"
Pattern 3: "<b>X</b>" (bold in HTML feedback)
Pattern 4: "Answer: X"
...
Pattern 16: Contextual extraction from explanation text
Multi-select handling:
Raw: "MC1|MC2|MC3"
Parsed: ["MC1", "MC2", "MC3"]
All must be selected for correct answer
Labeled answers:
Raw: "subject##MC1"
Parsed: { label: "subject", value: "MC1" }
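The multi-select and labeled-answer parsing above can be sketched as below. The raw-string formats (`MC1|MC2|MC3`, `subject##MC1`) come from this spec; the function names are illustrative.

```typescript
// "MC1|MC2|MC3" → ["MC1", "MC2", "MC3"]
function parseMultiSelect(raw: string): string[] {
  return raw.split("|").map((s) => s.trim()).filter((s) => s.length > 0);
}

// All parts must be selected (in any order) for the response to be correct.
function isMultiSelectCorrect(raw: string, selected: string[]): boolean {
  const expected = parseMultiSelect(raw);
  return (
    expected.length === selected.length &&
    expected.every((e) => selected.includes(e))
  );
}

// "subject##MC1" → { label: "subject", value: "MC1" }
function parseLabeledAnswer(raw: string): { label: string; value: string } {
  const [label, value] = raw.split("##");
  return { label, value };
}
```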
15. Database Schema
Knowledge Graph Layer
kcs -- atomic knowledge components
id UUID PRIMARY KEY
code TEXT UNIQUE -- e.g., "LANG3-042"
text TEXT NOT NULL
explanation TEXT
kc_type TEXT -- fact|concept|procedure|discrimination|integrative
status TEXT DEFAULT 'active'
kc_prerequisites -- DAG edges
kc_id UUID REFERENCES kcs(id)
prerequisite_kc_id UUID REFERENCES kcs(id)
kc_platform_mappings -- external platform skill → internal KC
platform TEXT -- 'mobymax' | 'freckle' | 'ixl'
platform_skill TEXT -- platform's skill ID or name
kc_code TEXT REFERENCES kcs(code)
UNIQUE(platform, platform_skill)
assessment_patterns -- Mislevy contracts, one per KC
kc_id UUID UNIQUE REFERENCES kcs(id)
claim TEXT
evidence TEXT[]
task_variables JSONB
forbidden_shortcuts TEXT[]
common_errors JSONB
kg_questions -- pre-built question bank
kc_id UUID REFERENCES kcs(id)
type TEXT -- multiple_choice|true_false|open_response|error_analysis
question TEXT
options TEXT[]
answer TEXT
dok_level INTEGER -- 1, 2, 3, or 4
Course Layer
courses
id UUID PRIMARY KEY
slug TEXT UNIQUE
name TEXT
grade TEXT
subject TEXT
worlds
id UUID PRIMARY KEY
course_id UUID REFERENCES courses(id)
slug TEXT
name TEXT
anchor TEXT -- connects to prior knowledge
preview TEXT -- 2-sentence trailer
sort_order INTEGER
chapters
id UUID PRIMARY KEY
world_id UUID REFERENCES worlds(id)
name TEXT
icon TEXT -- emoji
sort_order INTEGER
course_kcs -- the curation join
course_id UUID REFERENCES courses(id)
world_id UUID REFERENCES worlds(id)
chapter_id UUID REFERENCES chapters(id)
kc_id UUID REFERENCES kcs(id)
sort_order INTEGER
apply_scenarios -- DOK 2-3 scenarios per world
world_id UUID REFERENCES worlds(id)
prompt TEXT
task TEXT
key_points TEXT[]
related_kc_ids UUID[]
Learner Layer
user_progress -- overall curriculum progress
user_id UUID
curriculum_id TEXT
world_progress JSONB -- serialized WorldProgress[]
product_tier TEXT DEFAULT 'full' -- full|rx|maintain
streak INTEGER
last_activity TIMESTAMPTZ
UNIQUE(user_id, curriculum_id)
kc_mastery -- per-KC, per-DOK mastery tracking (FSRS-powered)
user_id UUID
kc_id INTEGER
kc_uuid UUID -- optional, references kcs.id
kc_code TEXT -- e.g., "LANG3-042"
curriculum_id TEXT
dok_level TEXT -- dok1|dok2|dok4
stability FLOAT DEFAULT 0 -- FSRS S: days until R=90%
difficulty FLOAT DEFAULT 5.0 -- FSRS D: 1.0-10.0
reps INTEGER DEFAULT 0 -- successful review count
lapses INTEGER DEFAULT 0 -- G=1 (Again) count
next_due TIMESTAMPTZ
graduated BOOLEAN DEFAULT false -- true when S ≥ S_grad
skip_encounter BOOLEAN DEFAULT false
source TEXT DEFAULT 'review' -- review|diagnostic|external
history JSONB -- [{date, grade, dok, source, responseTimeMs, stabilityAfter, retrievabilityBefore}]
UNIQUE(user_id, kc_id, curriculum_id, dok_level)
diagnostic_results -- stored diagnostic outcomes
user_id UUID
curriculum_id TEXT
mode TEXT -- quick_scan|prerequisite_probe|retention_check
grade_level INTEGER
grade_adjusted BOOLEAN DEFAULT false
total_questions INTEGER
total_time_ms INTEGER
known_kcs TEXT[] -- KC codes
unknown_kcs TEXT[]
fragile_kcs TEXT[]
recommended_tier TEXT
estimated_gap INTEGER
created_at TIMESTAMPTZ DEFAULT now()
question_history -- every question attempt
user_id UUID
kc_id INTEGER
kc_uuid UUID
curriculum_id TEXT
question_json JSONB
correct BOOLEAN
user_response TEXT
response_time_ms INTEGER
dok_level TEXT
source TEXT DEFAULT 'review' -- review|diagnostic|external
created_at TIMESTAMPTZ DEFAULT now()
content_versions -- versioned KC content for A/B testing
kc_id INTEGER
curriculum_id TEXT
version INTEGER
content_json JSONB
active BOOLEAN DEFAULT true
total_attempts INTEGER DEFAULT 0
successful_attempts INTEGER DEFAULT 0
instant_recall_rate FLOAT DEFAULT 0
16. Worked Example: 276 KCs with Full Tier
Grade 3 Language curriculum. N = 276, n ≈ 2.8 new KCs/day (Full tier, 25 min/day).
Calendar days = ⌈276 / 2.8⌉ + 75 = 99 + 75 = 174 days (~1 semester + buffer)
Total time = 276 × 7.5 min / 60 = 34.5 hours
Daily average = 34.5 hrs × 60 / 174 = 12 min/day average
Peak load = Day 30–99 ≈ 21 min
With FSRS's adaptive scheduling, easy KCs graduate faster (~50 days) and hard KCs take longer (~100 days), so the pipeline is smoother than a fixed schedule. The 75-day E[D_grad] is a weighted average.
With Diagnostic Pre-Placement
Diagnostic: 5 min, 18 questions → 276 KCs placed
Strong student: 180 known, 96 unknown
→ calendar days = ⌈96 / 2.8⌉ + 75 = 35 + 75 = 110 days
→ 37% time savings
Typical student: 100 known, 176 unknown
→ calendar days = ⌈176 / 2.8⌉ + 75 = 63 + 75 = 138 days
→ 21% time savings
Weak student: 40 known, 236 unknown
→ calendar days = ⌈236 / 2.8⌉ + 75 = 85 + 75 = 160 days
→ 8% time savings
With Rx Tier (Targeted Gaps)
Diagnostic: 5 min → identifies 30 targeted gaps
Rx tier: n ≈ 0.6 new/day, 5 min/day
Calendar days = ⌈30 / 0.6⌉ + 75 = 50 + 75 = 125 days (~1 semester)
Total time = 30 × 7.5 / 60 = 3.75 hours over the semester
Daily average = 3.75 hrs × 60 / 125 = 1.8 min/day average
17. Reverse Engineering: Fixed Deadline Schedules
When a test is on a fixed date, FSRS can optimize the schedule to maximize recall on the exact test day, placing the final review so that retrievability on the deadline itself meets R_target.
FSRS Deadline Mode
Instead of fixed reverse-engineered intervals, FSRS computes the optimal schedule dynamically:
function scheduleForDeadline(kc, deadline_date):
    days_remaining ← daysBetween(now, deadline_date)
    current_S ← kc.stability

    // Target R on test day = R_target.
    // If the current schedule already achieves this, no change is needed.
    predicted_R_on_test_day ← getRetrievability(kc, deadline_date)
    if predicted_R_on_test_day >= R_target:
        return current_schedule  // already on track

    // Otherwise, schedule a review before the test to boost S.
    // Place the final review so that R(deadline) = R_target.
    optimal_last_review ← deadline_date - nextInterval(current_S)
    return adjustedSchedule(kc, optimal_last_review)
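The `nextInterval` helper used above can be derived by inverting the power-law forgetting curve R(t,S) = (1 + factor·t/S)^decay and solving for the day t at which R falls to R_target. The constants are the assumed FSRS-4.5/5 defaults; with R_target = 0.90 the interval equals S by construction.

```typescript
// Assumed FSRS default constants (replace with trained parameters).
const FACTOR = 19 / 81;
const DECAY = -0.5;

// Solve (1 + FACTOR * t / S)^DECAY = rTarget for t.
function nextInterval(stability: number, rTarget: number = 0.9): number {
  return (stability / FACTOR) * (Math.pow(rTarget, 1 / DECAY) - 1);
}
```

Lowering R_target lengthens the interval (fewer reviews, weaker per-KC retention), which is the trade-off the R_target-tuning row in the t_r table describes.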
The 10-20% Rule (Cepeda et al.) — FSRS Version
Cepeda's guideline (optimal_gap ≈ 0.10 to 0.20 × retention_interval) is already baked into FSRS's stability growth formulas. FSRS goes further by personalizing the intervals based on actual per-KC difficulty and review history.
Example: 40-Day Deadline
With FSRS (R_target = 0.90, average difficulty):
Review 1: Day 1 (Introduction, S₀ ≈ 2.3)
Review 2: Day 3 (S ≈ 7, scheduled when R ≈ 0.90)
Review 3: Day 10 (S ≈ 20)
Review 4: Day 28 (S ≈ 50)
→ R on Day 40 ≈ 0.93 ✓
Easy material (D=3): Only 3 reviews needed — S grows faster
Hard material (D=8): 5 reviews needed — S grows slowly, reviews packed tighter
18. Knowledge Graph Topology
KC Types (5-type taxonomy)
enum KCType {
FACT // declarative, single testable claim
CONCEPT // category with defining features
PROCEDURE // ordered sequence of steps
DISCRIMINATION // distinguishing confusable cases (A vs B)
INTEGRATIVE // coordinates multiple sub-KCs
}
Curriculum composition target: ~15% discrimination KCs. These are treated as hard prerequisites because gateway distinctions prevent entire categories of downstream errors.
Prerequisite DAG
kc_prerequisites: (kc_id, prerequisite_kc_id)
Directed acyclic graph. A KC cannot be introduced until all its prerequisites have been encountered. The graph determines:
- Topological ordering for introduction sequence
- Unlock gating — world/chapter unlock requires prerequisite satisfaction
- Diagnostic testing — binary search on the DAG finds the knowledge frontier in O(log N) questions
- Inference — if a student passes a KC, all prerequisites are inferred as known
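The inference rule in the last bullet is a transitive closure walk over the prerequisite edges. A minimal sketch, assuming a `Map`-based edge representation mirroring `kc_prerequisites` rows (the function name is illustrative):

```typescript
type KCId = string;

// prereqs: kc → its direct prerequisites.
// When a learner passes `passed`, every transitive prerequisite is
// inferred as known (iterative DFS; the DAG guarantees termination).
function inferKnown(passed: KCId, prereqs: Map<KCId, KCId[]>): Set<KCId> {
  const known = new Set<KCId>();
  const stack: KCId[] = [passed];
  while (stack.length > 0) {
    const kc = stack.pop()!;
    if (known.has(kc)) continue;
    known.add(kc);
    for (const p of prereqs.get(kc) ?? []) stack.push(p);
  }
  return known;
}
```

The same walk in reverse (successors instead of prerequisites) marks KCs as unknown when a probe fails, which is what lets the binary diagnostic place many KCs per question.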
Assessment Design Pattern (per KC)
interface AssessmentPattern {
claim: string; // what mastery looks like
evidence: string[]; // observable behaviors proving mastery
taskVariables: {
numbers?: string; // what values can change
context?: string; // what scenarios are valid
format?: string; // valid question formats
};
forbiddenShortcuts: string[]; // surface features that bypass the KC
commonErrors: Array<{
error: string;
why: string;
}>;
}
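A filled-in instance makes the contract concrete. The KC below ("its vs. it's") is invented for illustration only; the interface is repeated from above so the sketch stands alone.

```typescript
// Repeated from the spec for self-containment.
interface AssessmentPattern {
  claim: string;
  evidence: string[];
  taskVariables: {
    numbers?: string;
    context?: string;
    format?: string;
  };
  forbiddenShortcuts: string[];
  commonErrors: Array<{ error: string; why: string }>;
}

// Hypothetical pattern for a discrimination KC.
const itsVsIts: AssessmentPattern = {
  claim: "Learner selects 'its' vs. \"it's\" correctly in novel sentences",
  evidence: [
    "chooses the correct form in cloze sentences",
    "rejects the confusable form when both are offered",
  ],
  taskVariables: {
    context: "any sentence with a possessive or contraction slot",
    format: "multiple_choice | cloze",
  },
  forbiddenShortcuts: ["always placing the correct form as the first option"],
  commonErrors: [
    {
      error: "uses \"it's\" for possession",
      why: "overgeneralizes apostrophe-s as a possessive marker",
    },
  ],
};
```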
Question Generation Formula
INPUT: KC.text + KC.explanation + KC.kcType + KC.assessmentPattern
+ target DOK level + context theme
OUTPUT: Novel question testing that specific KC at that DOK level
Generation varies:
- Numbers (from taskVariables.numbers)
- Context (from taskVariables.context)
- Format (from taskVariables.format)
- DOK level (Recognition → Discrimination → Application → Teaching)
KC is the invariant. Everything else is a variable.
19. Real-World Validation Data
Curriculum Scale
| Metric | Value |
|---|---|
| Total KCs | 1,392 |
| Total questions | 12,894 |
| Grades covered | 3-8 |
| KCs per grade | ~232 average |
| Questions per KC | ~9.3 average |
| KC types | 5 (fact, concept, procedure, discrimination, integrative) |
Quality Metrics
| Metric | Result |
|---|---|
| KC playability rate | 95.3% (1,326/1,392) |
| Question render pass rate | 99.96% (12,889/12,894) |
| Engine unit tests | 100% (34/34) |
| Auto-extracted answers | 2,389 rescued |
| Multi-select fix | 331 additional questions rescued |
Problem Evidence
| Metric | Value | Source |
|---|---|---|
| In-app accuracy | 99.4% | MobyMax completion data |
| Standardized test pass rate | 12-26% | By grade (G8 worst, G4 best) |
| Students in doom loops | 79% (352/445) | 3+ attempts same skill, 0 XP |
| Zero-XP completion rate | 42% | All Language completions |
| Wasted student hours | 860 | Session 3 analysis |
| Near-miss students (80-89%) | 34% of first attempts | 262/775 journeys |
| Weak skills already practiced | 99.7% | 350/351 identifiable pairs |
20. Methods to Reduce t_r
| Method | Mechanism | Risk |
|---|---|---|
| Cued recall (vs open) | Faster response, still retrieval | Slightly weaker encoding |
| Voice input | Speaking faster than typing | Transcription accuracy |
| Timed responses | Forces quick retrieval | Anxiety if too tight |
| Flashcard-style | Show → flip → self-grade (4-point FSRS rating) | Depends on honest self-grading |
| DOK 4 compound | Test 5–8 KCs in one 3-min session | Only at mastery level |
| FSRS R_target tuning | Lower R_target → fewer reviews → faster sessions | Lower retention per-KC |
| FSRS difficulty routing | Auto-skip Easy material, focus on Hard | Over-reliance on D estimates early on |
21. Curriculum Benchmarks
| Curriculum | KCs | Full Tier (2.8/day) | Standard (1.3/day) | Total Hours |
|---|---|---|---|---|
| Language G3 | 276 | 174 days | 288 days | 34.5 hrs |
| Language G3-8 | 1,392 | 573 days | 1,146 days | 174 hrs |
| Single subject (science) | 159 | 132 days | 198 days | 20 hrs |
| One school year (1 subject) | ~300 | 183 days | 306 days | 37.5 hrs |
| Foreign language (basic) | ~2,000 | 790 days | 1,614 days | 250 hrs |
FSRS reduces total hours by ~7% compared to fixed schedules due to fewer expected reviews per KC.
Quick Estimator
"How long to learn X KCs at Y minutes/day?"
n = Y / 9.0 // KCs per day (FSRS expected)
calendar_days = ⌈X / n⌉ + 75 // days to complete (E[D_grad] ≈ 75)
total_hours = X × 7.5 / 60 // total hours of work
daily_avg = total_hours × 60 / calendar_days // average min/day
"With diagnostic, how much faster?"
diagnostic_time = 5 min
known_kcs = diagnostic_result.known.length
effective_N = X - known_kcs
calendar_days_with_diagnostic = ⌈effective_N / n⌉ + 75
time_savings = 1 - (calendar_days_with_diagnostic / calendar_days)
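Both estimators above can be combined into one runnable sketch. The 9.0 min/day-per-new-KC, 7.5 min-per-KC, and 75-day E[D_grad] constants come from this spec; the function names are illustrative. `newPerDay` is rounded to one decimal so the results line up with the worked examples (e.g., 276 KCs at 25 min/day → 174 calendar days).

```typescript
interface Estimate {
  newPerDay: number;
  calendarDays: number;
  totalHours: number;
  dailyAvgMinutes: number;
}

const E_D_GRAD = 75;        // expected days from introduction to graduation
const MIN_PER_NEW_KC = 9.0; // steady-state min/day of load per new KC/day
const MIN_PER_KC = 7.5;     // total work minutes per KC over its lifetime

function estimate(totalKCs: number, minutesPerDay: number): Estimate {
  // Rounded to one decimal to match the spec's worked examples.
  const newPerDay = Math.round((minutesPerDay / MIN_PER_NEW_KC) * 10) / 10;
  const calendarDays = Math.ceil(totalKCs / newPerDay) + E_D_GRAD;
  const totalHours = (totalKCs * MIN_PER_KC) / 60;
  return {
    newPerDay,
    calendarDays,
    totalHours,
    dailyAvgMinutes: (totalHours * 60) / calendarDays,
  };
}

// Fractional calendar-time savings from a diagnostic that marks
// `known` KCs as already mastered.
function diagnosticSavings(totalKCs: number, minutesPerDay: number, known: number): number {
  const base = estimate(totalKCs, minutesPerDay).calendarDays;
  const withDiag = estimate(totalKCs - known, minutesPerDay).calendarDays;
  return 1 - withDiag / base;
}
```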
22. Comparison: Same Time, Different Distribution
Empirical retention outcomes for 40 min total study time with a 40-day deadline:
| Strategy | Distribution | Day 40 Recall | Day 50 Recall |
|---|---|---|---|
| Cram | 40 min on Day 39 | 70% | 25% |
| 2 sessions | 20 min × 2 | 78% | 45% |
| 6 sessions | ~7 min × 6 | 88% | 72% |
Forgetting Curve Data (Ebbinghaus, 1885)
| Time After Learning | Memory Remaining |
|---|---|
| 20 minutes | 58% |
| 1 hour | 44% |
| 1 day | 33% |
| 1 week | 25% |
| 1 month | 21% |
Each retrieval at the right moment resets and flattens the curve. After 6 well-spaced retrievals, memory becomes durable for years (Bahrick, 1993: 13 sessions at 56-day intervals = equivalent retention to 26 sessions at 14-day intervals).
23. Research References
| Component | Source | Key Finding |
|---|---|---|
| FSRS algorithm | Ye (2022), ACM KDD; Ye (2023), IEEE TKDE | DSR model: Difficulty, Stability, Retrievability — 20-30% fewer reviews than SM-2 at equivalent retention |
| FSRS forgetting curve | Ye et al. — power-law decay | R(t,S) = (1 + factor·t/S)^decay fits human memory data better than exponential |
| FSRS training data | open-spaced-repetition (2024-2026) | Default parameters trained on several hundred million reviews from ~10k users |
| Spacing intervals | Cepeda et al. (2006), meta-analysis of 317 experiments | Optimal ISI scales with retention interval; expanding intervals outperform fixed |
| Long-term spacing | Bahrick et al. (1993), 9-year study | 13 sessions at 56-day intervals = 26 sessions at 14-day intervals |
| Retrieval practice | Roediger & Karpicke (2006) | Testing produces 50%+ more retention at 1 week vs restudying |
| Interleaving | Taylor & Rohrer (2010) | Interleaved group: 77% vs blocked group: 38% |
| Forgetting curve | Ebbinghaus (1885) | Exponential decay; each retrieval flattens the curve |
| Working memory | Cowan (2001) | ~4 items (not 7) when controlling for chunking |
| Cognitive load | Sweller (1988, 2011) | Intrinsic + extraneous + germane must not exceed WM capacity |
| Mastery learning | Bloom (1968, 1984) | 1-on-1 tutoring + mastery = 2σ improvement (98th percentile) |
| DOK framework | Webb (1997) | Depth of Knowledge: 4 levels of cognitive complexity |
| Knowledge spaces | Doignon & Falmagne (1985, 1999) | Prerequisite structures formalize learnable-next (ZPD) |
| Assessment design | Mislevy — Evidence-Centered Design | Claim, evidence, task variables, forbidden shortcuts |
| KC modeling | Koedinger et al. | Knowledge components as atoms of cognitive tutoring |
| Desirable difficulty | Bjork (1994) | Hard-but-possible retrieval = strongest encoding |
| Performance ≠ learning | Bjork (1994) | High practice performance is a poor predictor of long-term retention |
| Near-permanent retention | Bahrick (1993) | 5-7 spaced retrievals produce decade-long retention |
| Spacing meta-analysis | Latimier et al. (2021) | g = 0.74 benefit for spaced vs massed practice |
This document specifies the complete FSRS-powered algorithm. For the conceptual overview and learning science rationale, see The Learning Algorithm v2. For v1 specifications without FSRS, diagnostic, product tiers, or QC validation, see The Learning Algorithm Technical. FSRS algorithm reference: open-spaced-repetition/fsrs4anki.