Closed
282 commits
f46bf03
phase-3: small_model 78% coverage — features now match upstream
BenjaminDEMAILLE Apr 26, 2026
e86666e
phase-3 milestone: 91.8% exact-match VCF + 78% small_model coverage
BenjaminDEMAILLE Apr 26, 2026
8e6c287
phase-3: native realigner — recovers 220 upstream-matching sites
BenjaminDEMAILLE Apr 26, 2026
1aae7c1
phase-3: PORT_LOG — realigner state, remaining 100% parity gaps
BenjaminDEMAILLE Apr 26, 2026
8e3383a
phase-3: multi-allelic merge — port upstream's "product" combiner
BenjaminDEMAILLE Apr 26, 2026
93cd166
phase-3: realigner_native.{h,cc} (was untracked from earlier commit)
BenjaminDEMAILLE Apr 26, 2026
b7d2538
phase-2: batched Core ML prediction → 1.8× speedup, GPU now wins
BenjaminDEMAILLE Apr 26, 2026
58fd685
phase-2/3: final PORT_LOG — batched GPU 1.06s + 84% upstream match
BenjaminDEMAILLE Apr 26, 2026
637293f
phase-3 honest assessment + 23 .mlpackage models converted
BenjaminDEMAILLE Apr 26, 2026
4d8ff50
phase-3: postprocess at 99.93 % bit-parity vs upstream on identical CVOs
BenjaminDEMAILLE Apr 26, 2026
2688d62
phase-5/6/4: release + Homebrew + validation scaffolding
BenjaminDEMAILLE Apr 26, 2026
d6843b2
realigner: extend ref window using reads' actual alignment span
BenjaminDEMAILLE Apr 26, 2026
dfab1d2
dev: add dump_cvo for parity diagnostics
BenjaminDEMAILLE Apr 26, 2026
971a6ed
dev: add dump_allele_counts for per-position parity diffs
BenjaminDEMAILLE Apr 26, 2026
633603f
log: realigner read_span port + per-position diagnostics findings
BenjaminDEMAILLE Apr 26, 2026
8f46277
realigner: dedicated WindowSelector AlleleCounter + region expansion
BenjaminDEMAILLE Apr 26, 2026
68a9c77
postprocess: bit-parity fixes for QUAL, PL, GQ saturation
BenjaminDEMAILLE Apr 26, 2026
884b299
postprocess: QUAL = phred(1 - sum(alt_probs)), not phred(p_ref)
BenjaminDEMAILLE Apr 26, 2026
7cf147e
postprocess: reindex AD/VAF/MF/MD when pruning alt alleles
BenjaminDEMAILLE Apr 26, 2026
0df6e83
log: parity push 86.5% → 98.75% key match, 0 → 81% byte-identical
BenjaminDEMAILLE Apr 26, 2026
cc77cb7
postprocess: GQ banker's rounding + 1.25e-10 phred floor
BenjaminDEMAILLE Apr 26, 2026
e6975ae
realigner: assign reads to max-overlap region, not first-overlap
BenjaminDEMAILLE Apr 26, 2026
9c4a23a
realigner: only check ref_end ≤ region.end, not also ref_start
BenjaminDEMAILLE Apr 26, 2026
78b31aa
make_examples: small_model GQ threshold uses truncation, not std::round
BenjaminDEMAILLE Apr 27, 2026
6cfafc8
log: final parity stats — 83.9% byte / 98.95% key on chr20:5M-6M
BenjaminDEMAILLE Apr 27, 2026
d740721
make_examples: partition calling region into 1000bp chunks
BenjaminDEMAILLE Apr 27, 2026
4a22595
realigner: extend diag CSV with hap_hash + DV_REALIGNER_DIAG_HAP
BenjaminDEMAILLE Apr 27, 2026
2a5b46a
log: partition_size=1000 fix → DBG bit-parity confirmed (89.83% byte)
BenjaminDEMAILLE Apr 27, 2026
81ce1d2
make_examples: --min_mapping_quality default 5 (mirror upstream)
BenjaminDEMAILLE Apr 27, 2026
c714a4b
log: min_mapping_quality fix → 100% candidate-set parity reached
BenjaminDEMAILLE Apr 27, 2026
1bd4e3b
phase-5.5: extract_weights.py — pack TF SavedModel to .dvw
BenjaminDEMAILLE Apr 27, 2026
c4b84d9
phase-5.5: dv_weights — C++ mmap loader for .dvw files
BenjaminDEMAILLE Apr 27, 2026
0e6fcb0
phase-4: GIAB F1 validation on HG002 chr20 + run_giab.sh fix
BenjaminDEMAILLE Apr 27, 2026
9f31f5c
phase-4: gate PASSED — chr20 F1 vs upstream Docker
BenjaminDEMAILLE Apr 27, 2026
16fa749
phase-5.5: metal_inference — MPSGraph Inception-v3 builder
BenjaminDEMAILLE Apr 27, 2026
3a923d8
phase-5.5: bnns_finalize — deterministic CPU dense + softmax
BenjaminDEMAILLE Apr 27, 2026
03139f1
phase-5.5: call_variants --inference_backend=metal wiring (WIP)
BenjaminDEMAILLE Apr 27, 2026
7f5f717
log: Phase 4 PASS + Phase 5.5 progress (5 deliverables, 1 known issue)
BenjaminDEMAILLE Apr 27, 2026
7836f60
phase-5.5 debug: metal_inference taps + debug_metal walker
BenjaminDEMAILLE Apr 27, 2026
bce0415
phase-5.5a: per-layer TF reference + debug_metal --compare + epsilon fix
BenjaminDEMAILLE Apr 28, 2026
0022050
log: CLAUDE.md update — strict FILTER gate + Phase 5.5 status + pitfalls
BenjaminDEMAILLE Apr 28, 2026
cadfcb2
phase-5.5c WIP: imToCol + matmul + Level0 — same channel permutation
BenjaminDEMAILLE Apr 28, 2026
b84d364
phase-5.5a: stem CBR matches TF reference within 1 ULP
BenjaminDEMAILLE Apr 28, 2026
0b445c1
phase-5.5a: fix InceptionA/B/C (conv_n, bn_n) pairs — 19/19 taps matc…
BenjaminDEMAILLE Apr 28, 2026
f1e9798
log: CLAUDE.md updated for Phase 5.5a fix + dump_authoritative_pairs.py
BenjaminDEMAILLE Apr 28, 2026
a2a2748
log: PORT_LOG Phase 5.5a + 5.5b — root cause + 100% PASS parity
BenjaminDEMAILLE Apr 28, 2026
0957a94
cli: parallelize make_examples shards via posix_spawn (workaround)
BenjaminDEMAILLE Apr 28, 2026
00264e0
cli: per-shard examples files + propagate --inference_backend / --che…
BenjaminDEMAILLE Apr 28, 2026
9cf4415
log: CLAUDE.md — Phase 5.5b sub-region results + parallel sharding note
BenjaminDEMAILLE Apr 28, 2026
16d4f2e
log: PORT_LOG full chr20 result — 4:11 wall-time, 1.13% FILTER drift
BenjaminDEMAILLE Apr 28, 2026
0d83c4c
make_examples: true intra-process threading (1 proc → ~1400 % CPU)
BenjaminDEMAILLE Apr 28, 2026
8435f8d
phase-5.5c: deterministic Metal compute kernel + per-layer drift prof…
BenjaminDEMAILLE Apr 28, 2026
41ce11a
phase-5.5d/1: portable libstdc++-compatible std::shuffle for pileup-i…
BenjaminDEMAILLE Apr 28, 2026
78aa3df
phase-5.5d/2: postprocess multi-allelic — skip CVOs touching pruned alts
BenjaminDEMAILLE Apr 28, 2026
2b90761
log: CLAUDE.md — Phase 5.5d full chr20 status with both fixes
BenjaminDEMAILLE Apr 28, 2026
334bf41
phase-5.5d/3: NumPy-compatible reservoir sampling — chr20 FILTER drif…
BenjaminDEMAILLE Apr 28, 2026
1a83940
log: CLAUDE.md — Phase 5.5d/3 final chr20 status (29 FILTER flips of …
BenjaminDEMAILLE Apr 28, 2026
2388f93
phase-5.5d/4: haplotype resolution — chr20 FILTER drift 0.014 % → 0.0…
BenjaminDEMAILLE Apr 29, 2026
3e79a0e
log: CLAUDE.md — Phase 5.5d/4 final state (4 FILTER mismatches of 209…
BenjaminDEMAILLE Apr 29, 2026
9bedc71
phase-5.5d/5: simplify_variant_alleles — chr20 FILTER drift 0.002 % →…
BenjaminDEMAILLE Apr 29, 2026
aed8eac
small_model: force MLComputeUnitsCPUOnly for FP32 determinism
BenjaminDEMAILLE Apr 29, 2026
95b2abd
phase-5.5d/7+8: BNNS-CPU small_model + per-alt-set dispatch — chr20 0…
BenjaminDEMAILLE Apr 29, 2026
b5fcde0
log: CLAUDE.md — Phase 5.5d FINAL — 100 % FILTER parity vs Docker on …
BenjaminDEMAILLE Apr 29, 2026
2ac66d1
log: confirm chr20 phase-5.5d/8 — 14/14 site-set diffs root-caused
BenjaminDEMAILLE Apr 29, 2026
9c6a735
phase-5.5d/9: AltAlleleQual = phred(1-sum_alt) rounded to 7 decimals …
BenjaminDEMAILLE Apr 29, 2026
e3a657a
phase-5.5d/10: PL log-space + truncation (matches upstream vcf_writer…
BenjaminDEMAILLE Apr 29, 2026
6a194e5
phase-6/step-1.1+1.2: DeepTrio flags + 3-sample SampleOptions builder
BenjaminDEMAILLE Apr 29, 2026
950e4a3
phase-6/step-1.3: DeepTrio multi_sample::VariantCaller wiring (v0)
BenjaminDEMAILLE Apr 29, 2026
19eb271
phase-6/step-1.5: cli.cc trio dispatch — 3x call_variants + 3 VCFs
BenjaminDEMAILLE Apr 29, 2026
5670940
phase-6/step-1.6+1.7-v0: DeepTrio end-to-end runs natively on Apple S…
BenjaminDEMAILLE Apr 29, 2026
4f9bc3d
phase-6/step-1.3-bis: per-sample realigner for trio — closes only_doc…
BenjaminDEMAILLE Apr 29, 2026
658282e
log: CLAUDE.md — Phase 6 status (trio in progress, somatic + pangenom…
BenjaminDEMAILLE Apr 29, 2026
1541a46
phase-6/step-1.7-v1: vsc_min_fraction_multiplier=0.67 for trio — clos…
BenjaminDEMAILLE Apr 29, 2026
d4eb7d1
phase-6/step-1.7-v2: trio small_model with per-sample features (106 dim)
BenjaminDEMAILLE Apr 29, 2026
65e0a07
phase-6/step-1.7-v3: total_depth must be unfiltered in per-sample fea…
BenjaminDEMAILLE Apr 29, 2026
a5d9fe9
phase-5.5d/11: trio small_model GQ clamp + DeepTrio thresholds
BenjaminDEMAILLE Apr 29, 2026
4e7cc7a
phase-5.5d/12: trio per-sample candidate_positions (not union)
BenjaminDEMAILLE Apr 30, 2026
e5bd918
phase-5.5d/13: parameterize Metal Inception-v3 input shape (H, C)
BenjaminDEMAILLE Apr 30, 2026
1d08e0c
phase-6/step-2: DeepSomatic orchestration — end-to-end working
BenjaminDEMAILLE Apr 30, 2026
b3a629d
phase-6/step-2-v2: somatic GERMLINE filter — PASS set matches Docker
BenjaminDEMAILLE Apr 30, 2026
4cd464a
phase-6/step-2-v3: somatic threshold overrides — 99.28% FILTER parity
BenjaminDEMAILLE Apr 30, 2026
5d69738
phase-6/step-2-v4: somatic sort_by_alt_allele_support — 100% FILTER p…
BenjaminDEMAILLE Apr 30, 2026
756609e
log: CLAUDE.md updated for Step 1 + Step 2 100% FILTER parity
BenjaminDEMAILLE Apr 30, 2026
67fe48d
phase-6/step-3-v1: pangenome-aware DV orchestration end-to-end
BenjaminDEMAILLE Apr 30, 2026
97aeb00
phase-6/step-3-v2: pangenome --min_mapping_quality=0 + apples-to-appl…
BenjaminDEMAILLE Apr 30, 2026
f0f0998
phase-6/step-3-v3: pangenome keep_legacy_behavior + keep_supplementar…
BenjaminDEMAILLE Apr 30, 2026
35342cd
log: CLAUDE.md updated for Phase 6 Step 3 (pangenome-aware DV in prog…
BenjaminDEMAILLE Apr 30, 2026
59fb198
phase-6/step-3-v4: pangenome aln_* + dbg_disable_graph_pruning
BenjaminDEMAILLE Apr 30, 2026
953e5ca
phase-6/step-3: GBZ→BAM extraction tooling for pangenome testing
BenjaminDEMAILLE Apr 30, 2026
be1aab1
log: CLAUDE.md updated for Phase 6 Step 3-v5 (better pangenome BAM)
BenjaminDEMAILLE Apr 30, 2026
d4393ad
phase-6/step-3-v6: dbg_disable_graph_pruning — kept min_edge_weight=0…
BenjaminDEMAILLE Apr 30, 2026
f5bff5d
phase-6/perf: auto-detect num_shards (hw_concurrency - 2)
BenjaminDEMAILLE Apr 30, 2026
31ce77d
phase-6/perf: metal as default inference_backend (was coreml)
BenjaminDEMAILLE Apr 30, 2026
da2a881
phase-6/step-3-v7+v8: pangenome 99.69% site-set parity (321/322)
BenjaminDEMAILLE Apr 30, 2026
e077348
log: CLAUDE.md updated for Phase 6 Step 3 v8 (99.69% pangenome parity)
BenjaminDEMAILLE Apr 30, 2026
bae3fab
phase-6/step-3-v9: pangenome 100% Docker parity (322/322)
BenjaminDEMAILLE Apr 30, 2026
93e153d
log: CLAUDE.md updated for Phase 6 Step 3 v9 (100% pangenome parity)
BenjaminDEMAILLE Apr 30, 2026
3824ac4
phase-5.5e: deterministic AvgPool / Concat / GlobalAvgPool Metal kernels
BenjaminDEMAILLE Apr 30, 2026
e8e5c2d
phase-5.5f: unfolded conv→BN→ReLU kernel infra (research, no FILTER win)
BenjaminDEMAILLE Apr 30, 2026
ffedb5a
phase-8/tier-6.0: Kahan PoC + extended conv_serial validation
BenjaminDEMAILLE Apr 30, 2026
124c6f9
phase-8/tier-6.0: DetMixedBlock infrastructure + Mixed_5b builder
BenjaminDEMAILLE Apr 30, 2026
71a1787
phase-8/tier-6.0: microtest_det_mixed5b — Mixed_5b PoC validation
BenjaminDEMAILLE Apr 30, 2026
07712aa
phase-8/tier-6.0: all 11 Inception blocks (5b-7c) det dispatch — SOTA…
BenjaminDEMAILLE Apr 30, 2026
f46e3e8
phase-8/tier-6.0: wire DV_METAL_SERIAL_FULL into MetalInception::Predict
BenjaminDEMAILLE Apr 30, 2026
c84b973
phase-8/tier-6.0: folded BN path in det blocks + relaxed nil checks
BenjaminDEMAILLE Apr 30, 2026
8a98697
log: CLAUDE.md updated for Phase 8 / Tier 6.0 final decision
BenjaminDEMAILLE Apr 30, 2026
183c122
tier-5: GLnexus 1.4.1 Mac ARM build script + Homebrew formula draft
BenjaminDEMAILLE May 1, 2026
54b9137
tier-4: temperature scaling (Guo et al. ICML 2017) — opt-in flag
BenjaminDEMAILLE May 1, 2026
6405a90
tier-5: GLnexus build script — document remaining build issues
BenjaminDEMAILLE May 1, 2026
9502449
tier-1: validation/diff_filter_classes.sh — standardize Docker FILTER…
BenjaminDEMAILLE May 1, 2026
67d1c15
tier-2: multi-seed TTA — --tta_seed_offset flag + run_tta.sh orchestr…
BenjaminDEMAILLE May 1, 2026
a0e5e3b
tier-1: validation/download_giab_strats.sh — GIAB strats v3.6 downloader
BenjaminDEMAILLE May 1, 2026
4f796f1
tier-5: GLnexus build — htslib + yaml-cpp patches (Mac ARM)
BenjaminDEMAILLE May 1, 2026
e31f10d
tier-5: GLnexus build — document fcmm upstream-deletion blocker
BenjaminDEMAILLE May 1, 2026
40a955a
log: CLAUDE.md — Phase 8 / Tier 1, 2, 4, 5 session summary
BenjaminDEMAILLE May 1, 2026
3d651b1
phase-9/step-1: --alt_aligned_pileup flag wiring (PacBio/ONT support)
BenjaminDEMAILLE May 1, 2026
cb38de0
phase-9/step-2a: methylation calling — make_examples flags + channel
BenjaminDEMAILLE May 1, 2026
6291ffd
phase-9/step-5: run_giab.sh — whole-genome mode (empty region arg)
BenjaminDEMAILLE May 1, 2026
e2d77f6
log: CLAUDE.md updated for Phase 9 (Steps 1, 2a, 5a done; 3, 4, 5b de…
BenjaminDEMAILLE May 1, 2026
236ae03
phase-9/step-4a: link dv_direct_phasing + --use_direct_phasing flag
BenjaminDEMAILLE May 1, 2026
e31547b
log: CLAUDE.md — Step 4a (DirectPhasing link + flag) done
BenjaminDEMAILLE May 1, 2026
35d1e1f
phase-9/step-4b: DirectPhasing per-region orchestration (single-sample)
BenjaminDEMAILLE May 1, 2026
54eddc6
log: CLAUDE.md — Step 2b (no-op, already done) + Step 4b (single-sample)
BenjaminDEMAILLE May 1, 2026
e6ac09d
phase-9/step-3: --gvcf_outfile warning stub + detailed TODO
BenjaminDEMAILLE May 1, 2026
46b367d
log: CLAUDE.md — Step 3 status (stub + warning + TODO)
BenjaminDEMAILLE May 1, 2026
ec98002
phase-9/step-5b: whole-genome trio validation scripts
BenjaminDEMAILLE May 1, 2026
65d6f3c
log: CLAUDE.md — Step 5b scripts done; gVCF impl remains deferred
BenjaminDEMAILLE May 1, 2026
2bdf046
phase-9/step-3: gVCF block emission — full native implementation
BenjaminDEMAILLE May 1, 2026
a3d7247
phase-9/step-3 v2: gVCF Docker parity — _quantize_gq, FORMAT order, PL
BenjaminDEMAILLE May 1, 2026
e0c5fcd
phase-9/step-5b v2: publication-ready trio benchmark infrastructure
BenjaminDEMAILLE May 1, 2026
9bdcfe9
docs: add scientific_report.md — FM biological-impact analysis + WG b…
BenjaminDEMAILLE May 1, 2026
0be5eea
phase-9/step-5b v3: shard-count-independence guard for reservoir samp…
BenjaminDEMAILLE May 1, 2026
a7b29e3
docs: replace README.md with port-specific overview, preserve upstream
BenjaminDEMAILLE May 1, 2026
3bcca88
perf: NEON normalization, RAM-tiered AutoBatchSize, hoisted buffer al…
BenjaminDEMAILLE May 1, 2026
71d7256
perf+docs: TFRecordWriter F_NOCACHE (Jetsam fix) + WG validation plan
BenjaminDEMAILLE May 1, 2026
12b4713
perf: P1+P2 pipeline parallelism — async writer + pre-fetch reader
BenjaminDEMAILLE May 1, 2026
467d1a7
perf: A5 — os_signpost profiling infrastructure for Instruments
BenjaminDEMAILLE May 1, 2026
ea07b03
perf: A2.1 — NEON base-color kernel + bit-equivalence microtest
BenjaminDEMAILLE May 2, 2026
352c89c
perf: A2.2 — NEON M-block byte classifier + bit-equivalence microtest
BenjaminDEMAILLE May 2, 2026
f9364c2
fix: WG chunked script was missing --small_model_path
BenjaminDEMAILLE May 2, 2026
e40f854
poc: scalar BNNS-CPU stem A/B vs TF Docker reference
BenjaminDEMAILLE May 2, 2026
a70fd15
perf: DV_METAL_GPU_FINALIZE — final 2048→3 dense + softmax on GPU
BenjaminDEMAILLE May 2, 2026
b6a164c
perf: --inference_backend=ane_speculate — ANE FP16 + GPU FP32 rerun
BenjaminDEMAILLE May 2, 2026
40c5266
perf: extend ane_speculate to trio / somatic / pangenome dispatches
BenjaminDEMAILLE May 2, 2026
de6be37
fix: ane_speculate — soften shape mismatch from fatal error to warning
BenjaminDEMAILLE May 2, 2026
8c1d5f3
docs: ane_speculate cross-mode validation + trio mlpackage shape fix
BenjaminDEMAILLE May 2, 2026
7456fc1
docs: pangenome ane_speculate completes the 4-mode validation matrix
BenjaminDEMAILLE May 2, 2026
70dfa10
perf: ane_speculate — add min(softmax) borderline trigger
BenjaminDEMAILLE May 2, 2026
e37f64b
docs: WG benchmark post-fix — F1 bit-identical to Docker, 1.84× speed…
BenjaminDEMAILLE May 2, 2026
1b79c31
feat: per-model flags for all DeepVariant model types (WES/PacBio/ONT…
BenjaminDEMAILLE May 2, 2026
eef07de
feat: complete multi-mode dispatch — trio/somatic/pangenome as top-le…
BenjaminDEMAILLE May 2, 2026
18e1209
fix: restore WGS realigner + remove erroneous vaf_context_window from…
BenjaminDEMAILLE May 2, 2026
413b3a3
fix: small_model_vaf_context_window_size=51 for WGS/WES — closes PASS…
BenjaminDEMAILLE May 2, 2026
2b57460
docs: 4-mode Docker parity + WG benchmark update
BenjaminDEMAILLE May 3, 2026
b0117f3
perf: A5 — os_signpost markers for make_examples hot phases
BenjaminDEMAILLE May 3, 2026
6334b10
docs: PORT_LOG.md — 2026-05-03 per-model flags + vaf51 fix + A5 signp…
BenjaminDEMAILLE May 3, 2026
7db6229
docs: PORT_LOG.md — correct vaf51 analysis: 4146 FM is FP32 drift non…
BenjaminDEMAILLE May 3, 2026
e241d7f
perf: A2.1 — NEON vqtbl1q_u8 base-color batch in CalculateRefRows
BenjaminDEMAILLE May 3, 2026
ca35031
feat: DeepSomatic tumor-only mode (WGS/WES/FFPE) + PON allele_frequen…
BenjaminDEMAILLE May 5, 2026
a884102
fix: capture script — correct model_type + .vcf.gz output for tumor-only
BenjaminDEMAILLE May 5, 2026
744969c
perf: A2.2 — NEON M-block classifier wired into AlleleCounter::Add
BenjaminDEMAILLE May 5, 2026
a798188
docs: tumor-only 100% FILTER parity — WGS_TO + FFPE_WGS_TO on chr20:1…
BenjaminDEMAILLE May 5, 2026
2f0da5c
fix: FFPE_WGS vsc_max_fraction_for_non_target_sample — 100% FILTER pa…
BenjaminDEMAILLE May 5, 2026
91624cd
feat: complete DeepSomatic 8-mode parity — WES/FFPE_WES TN + tumor-only
BenjaminDEMAILLE May 5, 2026
d746d99
fix: DeepTrio WES/ONT heights + sort_by_alt_allele_support somatic scope
BenjaminDEMAILLE May 5, 2026
7081da2
fix: PacBio/ONT germline pipeline — 3 bugs fixed, all modes runnable
BenjaminDEMAILLE May 5, 2026
6fffa32
docs: session 2026-05-05 — WES/FFPE somatic + DeepTrio WES + germline…
BenjaminDEMAILLE May 5, 2026
74ce090
docs: design spec for PacBio SM 106-features + WGS FM + Homebrew
BenjaminDEMAILLE May 5, 2026
a6c688a
feat: PacBio/ONT 106-feature haplotype-expanded small model
BenjaminDEMAILLE May 5, 2026
4654e1e
feat: Homebrew formulas v1.10.0 — full model inventory + DVW + PON
BenjaminDEMAILLE May 5, 2026
7a8974c
fix: DeepTrio PacBio/ONT pipeline — correct shape (140/300,199,9)
BenjaminDEMAILLE May 5, 2026
49b8a81
docs: DeepTrio PacBio/ONT shape fix + WGS temperature scan conclusion
BenjaminDEMAILLE May 5, 2026
b30aa7b
fix: parameterize input_width in MetalInception — fixes PacBio/ONT so…
BenjaminDEMAILLE May 5, 2026
5cceebd
docs: full proxy test matrix — 23/23 modes crash-free, 14/23 Docker-p…
BenjaminDEMAILLE May 5, 2026
17945e2
docs: clarify WGS 0 FM gate — small model required; T-scan was withou…
BenjaminDEMAILLE May 5, 2026
0908b7e
docs: full chr20 FM root-cause + revised Homebrew gate
BenjaminDEMAILLE May 6, 2026
1521a4e
docs: confirm 428 FM is genuine (matched Docker = same result) + upda…
BenjaminDEMAILLE May 6, 2026
b2c0284
docs: update CLAUDE.md project status table + release gates
BenjaminDEMAILLE May 6, 2026
6990455
fix: model flag audit — PacBio min_base_quality, ONT vsc_max_fraction…
BenjaminDEMAILLE May 6, 2026
3bdfbc9
docs: update Homebrew formula — dual PON (Illumina + PacBio/ONT), aut…
BenjaminDEMAILLE May 6, 2026
0808638
fix: somatic flag audit — WGS/WES TO vsc_max_fraction + FFPE_WGS dead…
BenjaminDEMAILLE May 6, 2026
34f966e
fix: DeepTrio flag audit — PacBio/ONT max_reads + vaf_context_window …
BenjaminDEMAILLE May 6, 2026
6d2302a
feat: declare discard_non_dna_regions ABSL_FLAG + restore trio override
BenjaminDEMAILLE May 6, 2026
6278510
feat: postprocess --pon_filtering for somatic — full DeepSomatic parity
BenjaminDEMAILLE May 6, 2026
0b3c50f
docs: PORT_LOG — final FILTER parity matrix + 6 bug fixes + 2 new fea…
BenjaminDEMAILLE May 6, 2026
11412c7
fix: CRITICAL — CVO merge silently produced empty file when small_cvo…
BenjaminDEMAILLE May 7, 2026
6da5b18
docs: 18 modes at scientific FILTER parity (14 short-read 0 FM + 4 lo…
BenjaminDEMAILLE May 7, 2026
877ba01
docs: WG regression check — chr20 byte-identical to 2026-05-02; F1 pr…
BenjaminDEMAILLE May 7, 2026
3fcc6b8
docs: PASS-flip root-cause analysis — sse2neon vs Rosetta SSW drift i…
BenjaminDEMAILLE May 7, 2026
9bc1d9e
chore: upgrade vendored sse2neon.h to modern DLTcollab fork (11744 li…
BenjaminDEMAILLE May 7, 2026
0f8470c
docs: PASS-flip deep dive — sse2neon ruled out, AlleleCounter localized
BenjaminDEMAILLE May 7, 2026
9cedd3a
docs: read-by-read trace — htslib parity confirmed, divergence in VC …
BenjaminDEMAILLE May 7, 2026
05cab51
defensive: sort proto-map iteration in CreateCombinedAllelesSupport
BenjaminDEMAILLE May 7, 2026
90ff83d
docs: deeper trace — divergence isolated to make_examples cvo (pre-ca…
BenjaminDEMAILLE May 7, 2026
e346b52
docs: deepest C++ trace — bq=11 boundary identified, root cause multi…
BenjaminDEMAILLE May 7, 2026
fbead42
feat: Phase 9 / Step 4c.1 — wire PS info field per-region for DirectP…
BenjaminDEMAILLE May 7, 2026
9fedf24
docs: PORT_LOG — Step 4c PS info field complete + cross-region stitch…
BenjaminDEMAILLE May 7, 2026
3e6a732
docs: B1+B2 — first real-data F1 validation for PacBio + ONT chr20:1M-2M
BenjaminDEMAILLE May 7, 2026
51b56d0
docs: B1+B2 fixed — long-read divergence was missing --small_model_path
BenjaminDEMAILLE May 7, 2026
94f41f0
feat: warn when --small_model_path empty + bundle has trained_small_m…
BenjaminDEMAILLE May 7, 2026
e78531c
feat: auto-discover small_model dir from --checkpoint sibling
BenjaminDEMAILLE May 7, 2026
c166c1f
docs: refresh stale --use_direct_phasing help text
BenjaminDEMAILLE May 7, 2026
5af4374
docs: refresh CLAUDE.md — B1+B2 done, Step 4b-trio+4c done
BenjaminDEMAILLE May 7, 2026
d6b9506
feat: early-fail validation for --reads / --ref / --checkpoint paths
BenjaminDEMAILLE May 7, 2026
35d7f87
feat: top-level help — list all subcommands, handle -h/--help/help
BenjaminDEMAILLE May 7, 2026
a55d00f
docs: PORT_LOG — mark stale 70-feature TODO as resolved
BenjaminDEMAILLE May 7, 2026
e7d45be
fix(5.5d/14): wire DirectPhasing per-read output into small_model dis…
BenjaminDEMAILLE May 7, 2026
264901f
fix(5.5d/15): propagate keep_supplementary_alignments to SamReader
BenjaminDEMAILLE May 7, 2026
4a68f6b
feat: per-subcommand --help now lists our flags (was: 'No flags match…
BenjaminDEMAILLE May 8, 2026
06689bc
feat: multi-call binary — deeptrio/deepsomatic/pangenome-aware-deepva…
BenjaminDEMAILLE May 8, 2026
44de9ae
feat: --version flag (and -v, version subcommand) across all entry po…
BenjaminDEMAILLE May 8, 2026
7fc84ab
chore: Homebrew formula — install multi-call symlinks + version test
BenjaminDEMAILLE May 8, 2026
c8ad950
docs: PORT_LOG — biological characterization of PacBio chr20 FILTER m…
BenjaminDEMAILLE May 8, 2026
2235aae
docs: PORT_LOG — cross-mode biological survey (13 hap.py-annotated runs)
BenjaminDEMAILLE May 8, 2026
224ac32
docs: PORT_LOG — comparative FILTER-vs-Docker on 4 cached baselines
BenjaminDEMAILLE May 8, 2026
0e15ddb
docs: PORT_LOG — root-cause diagnosis of chr20:23.97-23.99M PacBio bug
BenjaminDEMAILLE May 8, 2026
2c04874
docs: PORT_LOG — DeepSomatic tumor-only ALL 4 modes at 100% FILTER pa…
BenjaminDEMAILLE May 8, 2026
dd57364
docs: PORT_LOG — DeepSomatic T+N modes also at 100% FILTER parity
BenjaminDEMAILLE May 8, 2026
b41326a
fix(5.5d/16): DirectPhasing region padding — 20% match upstream
BenjaminDEMAILLE May 8, 2026
b91d5cb
docs: PORT_LOG — short-read Illumina trio also at 100% FILTER parity
BenjaminDEMAILLE May 8, 2026
65abb23
docs: PORT_LOG — interim WG biology while HG002 BAM downloads
BenjaminDEMAILLE May 8, 2026
26b55df
fix(critical): TFRecordReader — tolerate truncated last-record per shard
BenjaminDEMAILLE May 10, 2026
c841448
docs: PORT_LOG — WG FILTER parity vs Docker (99.96 %, single-commit r…
BenjaminDEMAILLE May 10, 2026
0e9d868
docs: PORT_LOG — WG F1 + biological residual characterization
BenjaminDEMAILLE May 10, 2026
0aeb00c
fix(critical): TFRecordWriter — F_NOCACHE silently truncates partial …
BenjaminDEMAILLE May 10, 2026
05ec75c
feat(WG-parity): canonical-contig filter — match Docker's default
BenjaminDEMAILLE May 10, 2026
a5f92ca
docs: PORT_LOG — WG re-run with all 3 fixes hits 99.91 % FILTER parity
BenjaminDEMAILLE May 10, 2026
044d850
fix(WG-parity): remove pre-reservoir-sort — Docker uses BAM-natural o…
BenjaminDEMAILLE May 10, 2026
ed4f7fd
feat(WG-parity): Path B — wire Kahan-compensated Conv2D into det path
BenjaminDEMAILLE May 10, 2026
d8c5df2
docs: PORT_LOG — no-sort fix lands: 99.91 % → 99.9993 % FILTER parity
BenjaminDEMAILLE May 10, 2026
a4b65a9
docs: PORT_LOG — Path B Kahan WG result: didn't reduce FM, runtime 8.…
BenjaminDEMAILLE May 11, 2026
3001dcc
docs: PORT_LOG — session-end consolidated state at 99.9993 % FM parity
BenjaminDEMAILLE May 11, 2026
8eba4cd
docs: PORT_LOG — Path D investigation, 2/24 different-DP FM sites
BenjaminDEMAILLE May 23, 2026
e9861cc
docs: PORT_LOG — Path D deep-dive, per-read evidence via BAM stream +…
BenjaminDEMAILLE May 23, 2026
ff2a6f7
docs: PORT_LOG — Path D Site 1 hypothesis BIT-CONFIRMED by Docker run
BenjaminDEMAILLE May 23, 2026
96629a4
fix(Path D Site 1): propagate normalize_reads onto FastPassAligner
BenjaminDEMAILLE May 23, 2026
f4d0eba
docs: PORT_LOG — Path D fix chr20-full validation, 87 % FM reduction
BenjaminDEMAILLE May 23, 2026
ed2dbbe
docs: CLAUDE.md — chr20-full FM gate updated to 0.027 % post Path D fix
BenjaminDEMAILLE May 23, 2026
6bb6d1e
docs: PORT_LOG — chr22 generalization check confirms ~0.03% FM floor
BenjaminDEMAILLE May 23, 2026
02f5f5e
docs: PORT_LOG — full multi-mode chr20 validation post Path D fix
BenjaminDEMAILLE May 24, 2026
15a1c82
fix(WES chr20-full): canonicalize bare contig names in EffectiveRegions
BenjaminDEMAILLE May 24, 2026
b05ad01
docs: PORT_LOG — all-mode chr20-full F1 table + FM categorization
BenjaminDEMAILLE May 24, 2026
ed2438f
docs: PORT_LOG — CoreML inference-backend comparison (Metal wins)
BenjaminDEMAILLE May 24, 2026
b9344ef
fix(coreml): 9 (conv,bn) pair swaps + BN epsilon 1e-4→1e-3
BenjaminDEMAILLE May 24, 2026
3487a73
docs: PORT_LOG — Phase B chr20-full WGS backend matrix (5 backends)
BenjaminDEMAILLE May 24, 2026
4b63938
docs: PORT_LOG — Phase C: HG002 WG (full whole-genome) F1 = Docker
BenjaminDEMAILLE May 25, 2026
e2f94d5
docs: PORT_LOG — Phase C: HG003 + HG004 WG ours F1 against own truths
BenjaminDEMAILLE May 25, 2026
cc1d35d
fix(pangenome): partition_size 25000→1000 — reservoir over-downsampli…
BenjaminDEMAILLE Jun 21, 2026
af59d3d
fix(rnaseq): implement split_skip_reads (split spliced reads on N CIGAR)
BenjaminDEMAILLE Jun 21, 2026
0b7b9c5
docs: full all-mode matrix on public data + RNASEQ/pangenome fixes (P…
BenjaminDEMAILLE Jun 21, 2026
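
Several of the postprocess commits above describe their scoring rules only in the commit subject (884b299: "QUAL = phred(1 - sum(alt_probs)), not phred(p_ref)"; cc77cb7: "GQ banker's rounding + 1.25e-10 phred floor"). The sketch below is one reading of those two subjects, not the code from this PR. The function names are hypothetical, and treating 1.25e-10 as a lower clamp on the error probability before the log is an assumption.

```python
import math

# Floor value quoted in commit cc77cb7. Applying it as a lower clamp on the
# error probability is an assumption made for this sketch.
PHRED_FLOOR = 1.25e-10


def phred(p_error, floor=PHRED_FLOOR):
    """Phred-scale an error probability: -10 * log10(p), with p clamped from below."""
    return -10.0 * math.log10(max(p_error, floor))


def qual_from_alt_probs(alt_probs):
    """QUAL computed as phred(1 - sum(alt_probs)), as stated in commit 884b299."""
    return phred(1.0 - sum(alt_probs))


def round_gq(value):
    """Round half to even (banker's rounding); Python's built-in round() does this."""
    return int(round(value))


if __name__ == "__main__":
    print(qual_from_alt_probs([0.5, 0.4]))
    print(round_gq(2.5), round_gq(3.5))
```

Summing the alternate probabilities and subtracting from one is not guaranteed to give the same floating-point value as reading the reference probability directly, which may be why the commit distinguishes the two forms.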
5 changes: 5 additions & 0 deletions .claude/settings.json
@@ -0,0 +1,5 @@
{
  "env": {
    "ECC_DISABLED_HOOKS": "pre:bash:gateguard-fact-force,pre:edit-write:gateguard-fact-force"
  }
}
30 changes: 30 additions & 0 deletions .gitignore
@@ -10,3 +10,33 @@
bazel-*

**/.ipynb_checkpoints

# v2 — Apple Silicon native port
build/
build-*/
.cache/
**/__pycache__/
tools/conversion/venv-*/
tools/conversion/.cache/
tools/conversion/models/
tools/conversion/Generated/
tools/reference/cache/
tools/reference/output/
benchmarks/runs/
benchmarks/*.log
testdata/reference/large/
*.mlpackage
*.mlmodelc
*.tfrecord
*.bam
*.bai
*.fa
*.fai
*.fa.gz
*.vcf
*.vcf.gz
*.tbi
*.bed
.DS_Store
validation/work/
validation/output/
380 changes: 380 additions & 0 deletions CLAUDE.md

Large diffs are not rendered by default.

76 changes: 76 additions & 0 deletions CMakeLists.txt
@@ -0,0 +1,76 @@
cmake_minimum_required(VERSION 3.27)
project(deepvariant VERSION 1.10.0 LANGUAGES CXX OBJCXX C)

# ---------------------------------------------------------------------------
# Guards: macOS arm64 only.
# ---------------------------------------------------------------------------
if(NOT APPLE OR NOT CMAKE_SYSTEM_PROCESSOR STREQUAL "arm64")
  message(FATAL_ERROR "This build targets macOS arm64 only.")
endif()
if(CMAKE_SYSTEM_VERSION VERSION_LESS "23") # macOS 14 = Darwin 23.x
  message(FATAL_ERROR "macOS 14 (Sonoma) or newer required.")
endif()

# ---------------------------------------------------------------------------
# Language standards
# ---------------------------------------------------------------------------
set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED ON)
set(CMAKE_CXX_EXTENSIONS OFF)
set(CMAKE_OBJCXX_STANDARD 17)
set(CMAKE_OBJCXX_STANDARD_REQUIRED ON)

# Visibility: match TF convention — default hidden.
set(CMAKE_C_VISIBILITY_PRESET hidden)
set(CMAKE_CXX_VISIBILITY_PRESET hidden)
set(CMAKE_VISIBILITY_INLINES_HIDDEN ON)

# All dependencies built as STATIC.
set(BUILD_SHARED_LIBS OFF)

# Default build type.
if(NOT CMAKE_BUILD_TYPE)
  set(CMAKE_BUILD_TYPE Release CACHE STRING "" FORCE)
endif()

# Build output goes to a single directory for easy inspection.
set(CMAKE_ARCHIVE_OUTPUT_DIRECTORY "${CMAKE_BINARY_DIR}/lib")
set(CMAKE_LIBRARY_OUTPUT_DIRECTORY "${CMAKE_BINARY_DIR}/lib")
set(CMAKE_RUNTIME_OUTPUT_DIRECTORY "${CMAKE_BINARY_DIR}/bin")

# ---------------------------------------------------------------------------
# Compiler flags — arm64, Clang (Apple Clang 21+)
# ---------------------------------------------------------------------------
add_compile_options(
  -arch arm64
  -Wall
  -Wextra
  -Wno-unused-parameter
  -Wno-missing-field-initializers
)

# ---------------------------------------------------------------------------
# Module path
# ---------------------------------------------------------------------------
list(PREPEND CMAKE_MODULE_PATH "${CMAKE_SOURCE_DIR}/cmake")

# ---------------------------------------------------------------------------
# External dependencies (order matters: protos before nucleus)
# ---------------------------------------------------------------------------
include(deps) # FetchContent / find_package for htslib, abseil, protobuf, ssw
include(protos) # compile DeepVariant + nucleus + TF-example protos

# ---------------------------------------------------------------------------
# Core libraries (TF-free)
# ---------------------------------------------------------------------------
add_subdirectory(third_party/nucleus)
add_subdirectory(deepvariant/realigner)
add_subdirectory(deepvariant) # upstream C++ libs (Phase 3)
add_subdirectory(deepvariant/native) # runtime binary (Phase 2+)

# ---------------------------------------------------------------------------
# Tests (Phase 1 gate: ctest -V must pass)
# ---------------------------------------------------------------------------
enable_testing()
include(CTest)
add_subdirectory(tests/native) # thin wrappers around upstream C++ test code
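
The guards at the top of this CMakeLists.txt reject any host that is not macOS on arm64 with Darwin 23.x or newer. A wrapper script could run the same check before invoking CMake. The following is a minimal sketch of such a pre-flight check, not part of this PR; `is_supported_host` and `MIN_DARWIN_MAJOR` are hypothetical names.

```python
import platform

# macOS 14 (Sonoma) reports Darwin 23.x, per the comment in CMakeLists.txt.
MIN_DARWIN_MAJOR = 23


def is_supported_host(system, machine, release):
    """Return True when the host matches the CMake guards: macOS, arm64, Darwin >= 23."""
    if system != "Darwin" or machine != "arm64":
        return False
    try:
        major = int(release.split(".")[0])
    except ValueError:
        # Unparseable kernel release string: treat as unsupported.
        return False
    return major >= MIN_DARWIN_MAJOR


if __name__ == "__main__":
    ok = is_supported_host(platform.system(), platform.machine(), platform.release())
    print("supported" if ok else "unsupported: macOS 14+ on arm64 required")
```

Taking the three values as parameters rather than reading `platform` inside the function keeps the check testable on any machine.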