feat(personal-tutor): add desirable-difficulty hooks #22
Merged
minsoo-web merged 4 commits into main on May 1, 2026
Captures the brainstorm output for adding desirable-difficulty hooks to the personal-tutor skill. Defines R1-R12 covering Generation-first teaching, warm/cold quiz split, hint follow-through, session capacity policy, and Iron Rule #6. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Implementation plan for the desirable-difficulty hooks. 4 units (U1-U4): schema extension, Phase 3 Generation-first, Phase 4-5 warm semantics + Iron Rule, Phase 2.0 cold sweep + capacity policy. Plan was reviewed through ce-doc-review (4 personas, 19 actionable findings) and updated to reflect 15 Apply decisions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ration, Iron Rule #6)
Restructures the learning protocol so that "fluency illusion" (passing a quiz with the answer still in working memory) can no longer upgrade a node to understood. Implements three desirable-difficulty hooks across the session cycle:
* Phase 3 Generation-first turn: predict before explain, with strategic hints (no answer leak) when the learner is stuck.
* Phase 4 renamed to Warm Quiz, capped at gap→partial. Every new-concept warm pass schedules a cold quiz for the next session (R5 sole scheduler).
* New Phase 2.0 Cold Quiz Sweep at session start (when pending exists): no re-teaching, deterministic format rotation Feynman→Apply→Analyze→Apply, escalation prevented on cold-fail. partial→understood gates on cold no-hint pass (path A) or review-slot escape valve (path B).
Plus: Iron Rule #6 (no same-session understood), Phase 5 self-audit hook, 3-strike downgrade for hint-pass loops, streak counter spanning warm/cold mix, capacity policy (cap=1 when cold-pending exists), 6-line knowledge-graph schema with read-time backward-compat and write-time gradual upgrade.
README + Session Flow diagram + Applied Learning Science table synced.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…nd status docs
- Phase 4 now runs warm quiz on review-slot nodes (path B was unreachable before: the review flow ended at Socratic Q&A with no quiz step).
- Review-slot warm passes do NOT trigger R5 cold-pending; only new-concept warm passes do. Review-slot uses Phase 2.0 rotation rule for format.
- Phase 3 review branch now explicitly hands off to Phase 4.
- Phase 2.0 cap-decision wording clarified: cap is fixed at session start, remaining pending count only feeds the agenda announcement.
- README status table for `understood` now lists both path A (cold) and path B (review-slot escape valve).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
minsoo-web added a commit that referenced this pull request on May 1, 2026
) The desirable-difficulty hooks shipped in #22 add new user-facing capability (cross-session retrieval verification, Generation Effect, mechanical Iron Rule #6 enforcement) while remaining backward-compatible with existing knowledge graphs (read-time defaults + touched-node-only migration). Minor bump per SemVer. Marketplace bumps to 1.10.0 to mirror the plugin's minor change. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Cross-session retrieval is now the only path to `understood`. Previously the personal-tutor skill let a learner pass a quiz with the answer still echoing in working memory ("fluency illusion") and mark a concept `understood` in the same session it was first taught, which is not really retention. This PR introduces Bjork's "desirable difficulty" hooks: same-session warm quizzes cap at `partial`, a `Cold quiz pending: yes` flag schedules retrieval verification for the next session, and a Phase 5 self-audit drift guard mechanically enforces the rule.

What changed
- … `Cold quiz pending: yes`. Runs before any new teaching, with no re-explanation of the concept
- … `understood` within the same session a node was first taught." Phase 5 self-audit reverts violations back to `partial`
- … `Cold quiz pending: yes`. Cold quiz is the only path to `understood` for first-taught nodes
- `feynman → apply → analyze → apply` (no return to feynman). `Last quiz format` unchanged on fail, so failure never escalates the learner into a higher Bloom level
- … `failed` or `passed (hint used)` entries with no `passed (no hint)` between → `partial → gap`, breaks indefinite hint-pass loops
- … `partial → understood`. Rescues nodes whose cold quiz was attempted-but-not-cleanly-passed

Schema additions: `Cold quiz pending` and `Last quiz format` lines per node. Read-time defaults + write-time touched-node-only migration, with no batch rewrite of existing graphs.

Session flow
flowchart TB
    A[Session Start] --> B{Cold quiz<br/>pending?}
    B -->|yes| C[Phase 2.0<br/>Cold Quiz Sweep]
    B -->|no| D[Phase 2 Agenda]
    C --> D
    D --> E[Phase 3<br/>Predict → Explain → Q&A → Check]
    E --> F[Phase 4 Warm Quiz<br/>caps at gap→partial]
    F --> G[Phase 5 Archive<br/>+ self-audit]
    G --> H{Path A: cold<br/>no-hint pass?}
    G --> I{Path B: review-slot pass<br/>+ prior cold attempt?}
    H -->|yes| J[partial → understood]
    I -->|yes| J
    G --> K{3-strike or<br/>2-fail streak?}
    K -->|yes| L[partial → gap]

Validation
Post-implementation eval against the pre-improvement skill across 4 protocol-compliance scenarios (cold-pending routing, Iron Rule #6 enforcement, path B escape valve, 3-strike streak detection):
The new version is faster despite carrying more rules — deterministic format rotation and explicit cap rules eliminate the reasoning ambiguity the old skill burned cycles on.
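To make "deterministic format rotation" concrete, here is a minimal Python sketch of the rule as the PR describes it (`feynman → apply → analyze → apply`, no return to feynman, `Last quiz format` unchanged on fail). It is not code from this PR. The function names, the `None` starting value, and the reading that apply and analyze keep alternating after the first pass are assumptions for illustration. How a hint-assisted pass affects the rotation is not specified here, so the sketch only distinguishes pass from fail.

```python
from typing import Optional

# Assumed successor table, read off the rotation "feynman -> apply -> analyze -> apply".
NEXT_FORMAT = {
    None: "feynman",      # assumption: a node with no recorded format starts at feynman
    "feynman": "apply",
    "apply": "analyze",
    "analyze": "apply",   # no return to feynman
}


def next_quiz_format(last_format: Optional[str]) -> str:
    """Pick the next quiz format from the stored `Last quiz format` value."""
    return NEXT_FORMAT[last_format]


def record_result(last_format: Optional[str], asked_format: str, passed: bool) -> Optional[str]:
    """Return the value to store as `Last quiz format` after a quiz.

    On a fail the stored value is left unchanged, so the same format is
    asked again next time instead of escalating to a higher Bloom level.
    """
    return asked_format if passed else last_format
```

Because the next format depends only on one stored field, there is nothing left to deliberate at quiz time, which is the kind of ambiguity the paragraph above says the old skill spent cycles on.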
The eval set (`plugins/personal-tutor/evals/evals.json`) lands in a follow-up commit so reviewers can verify methodology against future regressions; per-run outputs and the benchmark viewer are local-only.

Why these ship together
The three feature commits (brainstorm, plan, impl) describe one cognitive intervention. Separating impl from its requirements-and-plan would force a reviewer to reconstruct the desirable-difficulty rationale from code alone — which is exactly the failure mode the rule changes are trying to fix in the learner. Keeping the why and the how in one PR means future edits to the skill can re-read the reason a rule exists before changing it.
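For anyone re-reading the rules before editing them, the status gates described under "What changed" can be summarized in a small Python sketch. It is an illustration under assumptions, not code from this PR: the `Node` fields, the function names, and the choice to clear the pending flag on a cold pass are all hypothetical. Path B and the 2-fail streak are left out because their exact conditions are not spelled out here.

```python
from dataclasses import dataclass, field
from typing import List

NO_HINT = "passed (no hint)"
HINT = "passed (hint used)"
FAILED = "failed"


@dataclass
class Node:
    # Hypothetical in-memory view of one knowledge-graph node.
    status: str = "gap"                  # "gap" | "partial" | "understood"
    first_taught_session: int = 0
    cold_quiz_pending: bool = False
    history: List[str] = field(default_factory=list)


def warm_pass(node: Node, new_concept: bool) -> None:
    """Warm quiz pass: caps at gap -> partial; new concepts schedule a cold quiz."""
    if node.status == "gap":
        node.status = "partial"
    if new_concept:
        node.cold_quiz_pending = True


def cold_no_hint_pass(node: Node, session: int) -> None:
    """Path A: a cold no-hint pass in a later session upgrades partial -> understood."""
    node.history.append(NO_HINT)
    node.cold_quiz_pending = False  # assumption: a clean cold pass clears the flag
    if node.status == "partial" and session > node.first_taught_session:
        node.status = "understood"


def self_audit(node: Node, session: int) -> None:
    """Iron Rule #6 drift guard: revert a same-session `understood` to partial."""
    if node.status == "understood" and node.first_taught_session == session:
        node.status = "partial"


def three_strike(node: Node) -> None:
    """Three entries in a row without a no-hint pass: partial -> gap."""
    recent = node.history[-3:]
    if node.status == "partial" and len(recent) == 3 and NO_HINT not in recent:
        node.status = "gap"
```

A short usage example: a node first taught in session 1 reaches `partial` after its warm pass, is reverted by the audit if something marks it `understood` in that same session, and only becomes `understood` after a cold no-hint pass in session 2.

```python
n = Node(first_taught_session=1)
warm_pass(n, new_concept=True)      # status: partial, cold quiz pending
n.status = "understood"             # simulated drift within session 1
self_audit(n, session=1)            # reverted to partial
cold_no_hint_pass(n, session=2)     # path A: partial -> understood
```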