docs: add a diagram-heavy developer guide for the research codebase by levon003 · Pull Request #1 · levon003/HealthBlogRec

levon003 · 2026-06-13T02:48:14Z

What

Adds a docs/ folder that documents HealthBlogRec at the architecture level rather than the docstring level — the goal being to make the choices, layout, and shortcomings of this 2021 research code easy to understand for someone reading the paper and the source side by side.

Per the brief: fewer docstrings, more Mermaid diagrams in Markdown summaries, with candid notes throughout about what more modern approaches would recommend.

New files

| File | Contents |
| --- | --- |
| `docs/README.md` | Orientation, repo map, suggested reading order, one-paragraph summary |
| `docs/glossary.md` | The project's vocabulary (USP, initiation, eligible/existing/active, triple, coverage), with a state diagram. Read this first; nothing else parses without it. |
| `docs/architecture.md` | The whole system on one page (5-stage flow), the two entry points, the end-to-end offline experiment loop, and a package map |
| `docs/data-pipeline.md` | The heart of the project: the timestamp-ordered "replay history one interaction at a time" simulation that produces training triples and test contexts, the stateful machinery (eligibility/graph/activity), the async writer, and feature dedup |
| `docs/modeling.md` | The 1563-d feature vector layout, the model zoo (LinearNet/SimNet/ConcatNet/LearnedSimNet/InteractionNet), the training loop, baselines, and the cached offline-evaluation trick used for hyperparameter sweeps |
| `docs/modernization.md` | Consolidated, candid "what would you do today?" notes, plus a short "what aged well" section |

Also links the guide from the top-level README.md.
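
For reviewers who have not opened the code yet, the "replay history one interaction at a time" idea that `docs/data-pipeline.md` covers can be sketched roughly as below. This is a minimal illustration, not the repository's implementation: `Interaction`, `replay`, and the simplified "target must already have been seen" rule are hypothetical stand-ins for the project's eligibility/graph/activity machinery, and the emitted record is not the project's actual triple format.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Set, Tuple


@dataclass(frozen=True)
class Interaction:
    """Hypothetical record: `source` interacts with `target` at `timestamp`."""
    timestamp: int
    source: str
    target: str


def replay(
    interactions: Iterable[Interaction],
) -> Iterator[Tuple[int, str, str, List[str]]]:
    """Walk interactions in timestamp order, emitting one record per
    first-time source->target pair, built only from state seen so far."""
    seen_users: Set[str] = set()
    seen_pairs: Set[Tuple[str, str]] = set()
    for event in sorted(interactions, key=lambda e: e.timestamp):
        pair = (event.source, event.target)
        # Assumed simplification: a pair counts only the first time it
        # occurs, and only if the target was already known at that moment.
        if pair not in seen_pairs and event.target in seen_users:
            # Candidate negatives come from users seen strictly earlier,
            # so nothing from the future leaks into the record.
            candidates = sorted(seen_users - {event.source, event.target})
            yield event.timestamp, event.source, event.target, candidates
        # Update state only after the record is emitted.
        seen_pairs.add(pair)
        seen_users.update(pair)


events = [
    Interaction(3, "a", "b"),
    Interaction(1, "b", "c"),
    Interaction(2, "c", "a"),
    Interaction(4, "a", "b"),  # repeat pair, so it is skipped
]
records = list(replay(events))
print(records)  # [(3, 'a', 'b', ['c'])]
```

The point of the ordering is that state is updated only after each record is emitted, so every record reflects what was knowable at that timestamp.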

Approach & accuracy

- Documentation is grounded in the actual source, with file/line references throughout. Examples: the test-time target fallback whose own comment admits "random might literally be better" (reccontext.py:165), the cache resize marked "FIXME is this reasonable?", the reconstructed amp timestamps, and the pointwise-BCE-vs-ranking-metrics mismatch.
- The "🕰️ Modern take" call-outs are framed as orientation, not bug reports; they explicitly acknowledge the original constraints (fixed dataset, single Slurm cluster, paper deadline, 2021 tooling).
- All 14 Mermaid diagrams were validated against the mermaid parser (with a JSDOM backend), with 0 parse failures.
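
To make the pointwise-BCE-vs-ranking-metrics mismatch noted above concrete, here is a small toy sketch. The scores are invented for illustration and are not results from this project; `mean_bce` and `reciprocal_rank` are hypothetical helper names. It shows that a model can achieve lower pointwise binary cross-entropy while ranking the single positive candidate worse.

```python
import math
from typing import Sequence


def mean_bce(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Mean binary cross-entropy over (label, predicted probability) pairs."""
    total = 0.0
    for y, p in zip(labels, scores):
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)


def reciprocal_rank(labels: Sequence[int], scores: Sequence[float]) -> float:
    """1 / rank of the first positive when candidates are sorted by score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    for rank, idx in enumerate(order, start=1):
        if labels[idx] == 1:
            return 1.0 / rank
    return 0.0


# One query with one positive candidate followed by three negatives (toy data).
labels = [1, 0, 0, 0]
model_a = [0.60, 0.70, 0.05, 0.05]  # confident, but one negative outranks the positive
model_b = [0.40, 0.35, 0.35, 0.35]  # poorly separated, but the positive is on top

bce_a, bce_b = mean_bce(labels, model_a), mean_bce(labels, model_b)
rr_a, rr_b = reciprocal_rank(labels, model_a), reciprocal_rank(labels, model_b)

print(f"model A: BCE={bce_a:.3f}  RR={rr_a:.2f}")
print(f"model B: BCE={bce_b:.3f}  RR={rr_b:.2f}")
```

In this toy case model A wins on BCE and model B wins on reciprocal rank, which is why a pointwise training loss and a ranking-based evaluation metric can disagree about which model is better.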

Docs-only change; no code is touched.

https://claude.ai/code/session_018fRrzqPsGMHL3roZ2E3gVq

Generated by Claude Code

Add a docs/ folder that documents the system at the architecture level rather than the docstring level: how the pieces fit together, the project-specific vocabulary, the streaming data-generation pipeline, the model zoo and offline evaluation, and candid notes on what modern practice would recommend.

- docs/README.md: orientation + repo map + reading order
- docs/glossary.md: USP / initiation / eligible-existing-active / triple / coverage, with a state diagram
- docs/architecture.md: whole-system overview, entry points, experiment loop, package map (Mermaid)
- docs/data-pipeline.md: the timestamp-ordered replay that produces training triples and test contexts, plus the async writer and feature dedup
- docs/modeling.md: 1563-d feature vector, model zoo, training loop, baselines, cached offline evaluation, sweeps
- docs/modernization.md: consolidated "what would you do today" notes

All 14 Mermaid diagrams validated with the mermaid parser. Link the new guide from the top-level README.

https://claude.ai/code/session_018fRrzqPsGMHL3roZ2E3gVq

levon003 wants to merge 1 commit into main from claude/research-docs-diagrams-9p9xw4
