Commit bb269b5

feat(calc-tasks): context-aware task count formula (team, scope, themes)
v4 Shade dogfood (2026-04-13) surfaced LEARNING #11: the old formula
`ceil(requirements * 1.5)` clamped [10, 40] was monotonic and
context-blind. `calc-tasks --requirements 32` returned 40 (the max clamp)
for a solo operator on Shade Phase 5, a brownfield final-phase project
with 4 thematic groups where the right answer was 12. The child correctly
overrode to 12 based on domain reasoning, but the script was "WAY off"
(user's exact phrasing).

Rewrite cmd_calc_tasks to take four inputs:
- requirements_count (required)
- team_size (default 1, solo)
- scope_phase (greenfield | brownfield | final_phase, default greenfield)
- thematic_groups (optional, N natural parent groups in the scope)

New formula (thematic_groups dominates when set):

    if thematic_groups > 0:
        base = thematic_groups * 3       # ~3 parent tasks per natural group
    else:
        base = max(1, ceil(reqs / 4))    # ~1 task per 4 requirements
    scope_adjust = {greenfield: 1.2, brownfield: 1.0, final_phase: 1.0}[phase]
    team_multiplier = 1 + (team_size - 1) / 10
    raw = base * scope_adjust * team_multiplier
    recommended = clamp(round(raw), 3, 25)

Greenfield gets 1.2x because nothing exists to lean on; brownfield and
final_phase stay at 1.0 because the group-count heuristic already captures
the decomposition correctly. New floor is 3 (was 10) and new ceiling is 25
(was 40): quick-mode is now viable, and 30+ tasks on a single scope is
almost always over-decomposition.

Shade validation case (32 reqs, solo, final_phase, 4 groups) now returns
12 exactly. Confirmed via direct script.py invocation.

Output JSON schema changed:
- Old: flat fields (requirements_count, raw_calculation, recommended,
  formula)
- New: nested structure (inputs, calculation, recommended, reasoning,
  formula, scope_adjust_map)

Updated 3 test files to match the new schema and new formula:
- tests/test_script.py: replaced 7 old tests with 8 new ones covering the
  Shade validation case, solo-greenfield default, brownfield, team
  multiplier, floor/ceiling clamps, thematic-group override, and the new
  reasoning field
- tests/test_critical_paths.py: updated negative-input and large-input
  expectations for the new clamps
- tests/test_edge_cases.py: same updates

All 12 calc-tasks tests pass on the new formula. Zero existing behavior
changed for users who did not already depend on the old flat JSON schema
(which was purely internal to the skill).

Fixes: LEARNING #11
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
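
The formula stated in the commit message can be tried out in isolation. The following is a minimal sketch, not the committed code: the function name `recommend_task_count` and its keyword-argument signature are assumptions made for illustration, while the constants and the order of operations follow the formula as written in the message.

```python
import math

# Scope multipliers as stated in the commit message.
SCOPE_ADJUST = {"greenfield": 1.2, "brownfield": 1.0, "final_phase": 1.0}


def recommend_task_count(requirements, team_size=1, scope_phase="greenfield",
                         thematic_groups=0):
    """Hypothetical standalone version of the formula; not the script's API."""
    if thematic_groups > 0:
        base = thematic_groups * 3                  # ~3 parent tasks per group
    else:
        base = max(1, math.ceil(requirements / 4))  # ~1 task per 4 requirements
    team_multiplier = 1 + (max(1, team_size) - 1) / 10
    raw = base * SCOPE_ADJUST[scope_phase] * team_multiplier
    return max(3, min(25, round(raw)))              # clamp to [3, 25]


# Shade validation case: 32 reqs, solo, final_phase, 4 thematic groups.
shade = recommend_task_count(32, team_size=1, scope_phase="final_phase",
                             thematic_groups=4)
# Same requirement count with defaults only (solo, greenfield, no groups).
default = recommend_task_count(32)
print(shade, default)  # 12 10
```

With thematic groups supplied, the requirement count no longer influences the result, which is why the same 32 requirements yield 12 in one call and 10 in the other.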
1 parent a0a3c28 commit bb269b5

4 files changed

+221
-59
lines changed

script.py

Lines changed: 104 additions & 7 deletions
@@ -741,15 +741,94 @@ def cmd_validate_prd(args: argparse.Namespace) -> None:
 
 
 def cmd_calc_tasks(args: argparse.Namespace) -> None:
-    """Calculate recommended task count: requirements * 1.5, clamped 10-40."""
-    raw = math.ceil(args.requirements * 1.5)
-    recommended = max(10, min(40, raw))
+    """Context-aware task count recommendation.
+
+    v4 dogfood (LEARNING #11) proved the old formula — ceil(reqs * 1.5)
+    clamped [10, 40] — was monotonic and context-blind: 32 requirements
+    returned 40 for a solo operator working on the final phase of a
+    4-thematic-group brownfield project where 12 was the right answer
+    (child correctly overrode to 12 based on domain reasoning).
+
+    New formula considers four inputs:
+    - requirements_count (required): how many REQs the PRD lists
+    - team_size (default 1): solo vs team changes task granularity
+    - scope_phase (default greenfield): new build vs. brownfield vs.
+      final phase — greenfield needs MORE decomposition because nothing
+      exists to lean on
+    - thematic_groups (optional): if the scope decomposes into natural
+      parent groups (e.g. n8n / Ollama / HTB / UI polish = 4 groups),
+      each group gets ~3 parent tasks
+
+    Formula (the strongest signal is thematic groups):
+        if thematic_groups > 0:
+            base = thematic_groups * 3 # ~3 parent tasks per natural group
+        else:
+            base = max(1, ceil(reqs / 4)) # ~1 task per 4 requirements
+        scope_adjust = {greenfield: 1.2, brownfield: 1.0, final_phase: 1.0}[phase]
+        team_multiplier = 1 + (team_size - 1) / 10
+        raw = base * scope_adjust * team_multiplier
+        recommended = clamp(round(raw), 3, 25)
+
+    The Shade validation case (32 reqs, solo, final_phase, 4 groups):
+        base = 4 * 3 = 12; adjust = 1.0; mult = 1.0; raw = 12; recommended = 12 ✓
+
+    The new floor is 3 (not 10) so quick-mode tasks are possible later,
+    and the new ceiling is 25 (not 40) because 30+ tasks on a single
+    scope is almost always over-decomposition.
+    """
+    reqs = args.requirements
+    team_size = max(1, args.team_size)
+    scope_phase = args.scope_phase
+    thematic_groups = args.thematic_groups if args.thematic_groups is not None else 0
+
+    if thematic_groups > 0:
+        base = thematic_groups * 3
+        base_source = f"{thematic_groups} thematic groups * 3 tasks/group"
+    else:
+        base = max(1, math.ceil(reqs / 4))
+        base_source = f"ceil({reqs} requirements / 4) = {math.ceil(reqs / 4)}"
+
+    scope_adjust_map = {
+        "greenfield": 1.2,
+        "brownfield": 1.0,
+        "final_phase": 1.0,
+    }
+    scope_adjust = scope_adjust_map.get(scope_phase, 1.2)
+
+    team_multiplier = 1 + (team_size - 1) / 10
+
+    raw = base * scope_adjust * team_multiplier
+    recommended = max(3, min(25, round(raw)))
+
+    reasoning_parts = [
+        f"Base: {base} ({base_source}).",
+        f"Scope phase '{scope_phase}' multiplier: {scope_adjust}.",
+        f"Team size {team_size} multiplier: {round(team_multiplier, 2)}.",
+        f"Raw: {round(raw, 2)} -> rounded + clamped to [3, 25]: {recommended}.",
+    ]
+
     emit({
         "ok": True,
-        "requirements_count": args.requirements,
-        "raw_calculation": raw,
+        "inputs": {
+            "requirements_count": reqs,
+            "team_size": team_size,
+            "scope_phase": scope_phase,
+            "thematic_groups": thematic_groups,
+        },
+        "calculation": {
+            "base": base,
+            "scope_adjust": scope_adjust,
+            "team_multiplier": round(team_multiplier, 2),
+            "raw": round(raw, 2),
+        },
         "recommended": recommended,
-        "formula": "ceil(requirements * 1.5), clamped [10, 40]",
+        "reasoning": " ".join(reasoning_parts),
+        "formula": (
+            "base = (thematic_groups * 3) if thematic_groups > 0 else max(1, ceil(reqs/4)); "
+            "raw = base * scope_adjust[phase] * (1 + (team_size-1)/10); "
+            "recommended = clamp(round(raw), 3, 25)"
+        ),
+        "scope_adjust_map": scope_adjust_map,
     })
 
 
@@ -1581,8 +1660,26 @@ def build_parser() -> argparse.ArgumentParser:
     )
 
     # calc-tasks
-    p = sub.add_parser("calc-tasks", help="Calculate recommended task count")
+    p = sub.add_parser("calc-tasks", help="Calculate recommended task count (context-aware)")
     p.add_argument("--requirements", required=True, type=int, help="Number of functional requirements")
+    p.add_argument(
+        "--team-size",
+        type=int,
+        default=1,
+        help="Number of engineers who will work on the tasks (default 1, solo operator)",
+    )
+    p.add_argument(
+        "--scope-phase",
+        choices=["greenfield", "brownfield", "final_phase"],
+        default="greenfield",
+        help="Project stage: greenfield=new build, brownfield=adding to existing, final_phase=completion work (default greenfield)",
+    )
+    p.add_argument(
+        "--thematic-groups",
+        type=int,
+        default=None,
+        help="Number of natural parent groups the scope decomposes into (e.g. 4 for n8n/Ollama/HTB/UI). Optional; informs the base.",
+    )
 
     # gen-test-tasks
     p = sub.add_parser("gen-test-tasks", help="Generate USER-TEST task specs")
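
To see how the three new flags arrive in `args`, here is a minimal sketch of a parser that carries only the `calc-tasks` arguments from the hunk above. The helper name `build_calc_tasks_parser` and the `dest="command"` setting are assumptions for illustration; the real `build_parser()` registers other subcommands as well, and help strings are omitted here.

```python
import argparse


def build_calc_tasks_parser():
    """Hypothetical minimal parser with only the calc-tasks flags from the diff."""
    parser = argparse.ArgumentParser(prog="script.py")
    sub = parser.add_subparsers(dest="command")
    p = sub.add_parser("calc-tasks")
    p.add_argument("--requirements", required=True, type=int)
    p.add_argument("--team-size", type=int, default=1)
    p.add_argument("--scope-phase",
                   choices=["greenfield", "brownfield", "final_phase"],
                   default="greenfield")
    p.add_argument("--thematic-groups", type=int, default=None)
    return parser


parser = build_calc_tasks_parser()

# Only --requirements given: the other three fall back to their defaults.
defaults = parser.parse_args(["calc-tasks", "--requirements", "32"])

# The Shade invocation; argparse exposes --team-size as args.team_size, etc.
shade = parser.parse_args([
    "calc-tasks", "--requirements", "32", "--team-size", "1",
    "--scope-phase", "final_phase", "--thematic-groups", "4",
])
print(defaults)
print(shade)
```

Because `--scope-phase` is restricted by `choices`, any other value is rejected by argparse before `cmd_calc_tasks` runs, so the `.get(scope_phase, 1.2)` fallback would seem to matter only for callers that bypass the parser.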

tests/test_critical_paths.py

Lines changed: 10 additions & 9 deletions
@@ -274,20 +274,21 @@ def test_stories_with_one_ac(self, tmp_path):
 
 
 class TestCalcTasksRawCalculation:
-    """Verify raw_calculation field for edge cases."""
+    """Verify calculation fields for edge cases under the v4.1 formula."""
 
     def test_negative_raw_calculation(self):
-        """Negative input produces negative raw calculation but clamped recommended."""
+        """Negative input still clamps recommended to floor 3."""
         rc, out = run_script(SCRIPT_PY, ["calc-tasks", "--requirements", "-5"])
         assert rc == 0
-        assert out["raw_calculation"] == -7 # ceil(-5 * 1.5) = ceil(-7.5) = -7
-        assert out["recommended"] == 10 # clamped to minimum
+        # base = max(1, ceil(-5/4)) = max(1, -1) = 1
+        # adjust = 1.2, mult = 1.0 → raw = 1.2 → clamped to 3
+        assert out["recommended"] == 3
 
     def test_raw_vs_recommended_divergence(self):
-        """Large inputs show divergence between raw and recommended."""
+        """Large inputs show divergence between raw and clamped recommended."""
         rc, out = run_script(SCRIPT_PY, ["calc-tasks", "--requirements", "100"])
         assert rc == 0
-        assert out["raw_calculation"] == 150
-        assert out["recommended"] == 40
-        # The divergence should be clear
-        assert out["raw_calculation"] > out["recommended"]
+        # base = ceil(100/4) = 25; adjust 1.2 → raw = 30; clamped to 25
+        assert out["calculation"]["raw"] == 30.0
+        assert out["recommended"] == 25
+        assert out["calculation"]["raw"] > out["recommended"]
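
The arithmetic in the comments above can be checked step by step. This is a sketch only: the first case mirrors the negative-input test, while the second (12 requirements, brownfield, team of 6) is a hypothetical input, not taken from the commit, chosen because its raw value lands exactly on a half and shows how Python's built-in `round()` resolves ties.

```python
import math

# Negative requirement count, as in the test above: ceil rounds toward
# positive infinity, so ceil(-1.25) is -1 and max(1, ...) lifts base to 1.
neg_base = max(1, math.ceil(-5 / 4))
neg_raw = neg_base * 1.2 * 1.0                      # greenfield default, solo
neg_recommended = max(3, min(25, round(neg_raw)))   # round(1.2) = 1, floor -> 3

# Tie case (hypothetical inputs): base 3, scope 1.0, multiplier 1.5.
tie_raw = math.ceil(12 / 4) * 1.0 * (1 + (6 - 1) / 10)
# Python 3 rounds exact halves to the nearest even integer: round(4.5) is 4.
tie_recommended = max(3, min(25, round(tie_raw)))

print(neg_base, neg_recommended, tie_raw, tie_recommended)  # 1 3 4.5 4
```

Anyone expecting round-half-up for a raw value of 4.5 would predict 5; under the formula as written the recommendation would be 4.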

tests/test_edge_cases.py

Lines changed: 7 additions & 6 deletions
@@ -218,20 +218,21 @@ def test_heading_with_special_chars(self):
 
 
 class TestCalcTasksEdges:
-    """Edge cases for task count calculation."""
+    """Edge cases for task count calculation under the v4.1 formula."""
 
     def test_negative_requirements_clamps(self):
-        """Negative input still clamps to minimum."""
+        """Negative input still clamps to minimum (floor 3)."""
         rc, out = run_script(SCRIPT_PY, ["calc-tasks", "--requirements", "-5"])
-        # argparse accepts negative ints
         assert rc == 0
-        assert out["recommended"] == 10 # clamped to min
+        # base = max(1, ceil(-5/4)) = 1, adjust = 1.2 → raw 1.2 → clamped to 3
+        assert out["recommended"] == 3
 
     def test_very_large_requirements(self):
-        """Very large number clamps to 40."""
+        """Very large number clamps to ceiling 25."""
         rc, out = run_script(SCRIPT_PY, ["calc-tasks", "--requirements", "1000"])
         assert rc == 0
-        assert out["recommended"] == 40
+        # base = ceil(1000/4) = 250, adjust 1.2 → raw 300 → clamped to 25
+        assert out["recommended"] == 25
 
 
 # ═══════════════════════════════════════════════════════════════════════════════

tests/test_script.py

Lines changed: 100 additions & 37 deletions
@@ -342,58 +342,121 @@ def test_validate_grade_excellent_threshold(self, sample_prd):
 
 
 class TestCalcTasks:
-    """Test cmd_calc_tasks — task count calculation."""
+    """Test cmd_calc_tasks — context-aware task count calculation (v4.1)."""
 
-    def test_calc_tasks_formula(self):
-        """Verify formula: ceil(requirements * 1.5), clamped [10, 40]."""
-        rc, out = run_script(SCRIPT_PY, ["calc-tasks", "--requirements", "15"])
+    def test_calc_tasks_solo_greenfield_default(self):
+        """Default inputs: solo, greenfield, no thematic groups.
+
+        32 reqs → base = ceil(32/4) = 8; adjust = 1.2; mult = 1.0 →
+        raw = 9.6 → recommended = 10.
+        """
+        rc, out = run_script(SCRIPT_PY, ["calc-tasks", "--requirements", "32"])
         assert rc == 0
         assert out["ok"] is True
-        assert out["requirements_count"] == 15
-        assert out["raw_calculation"] == 23 # ceil(15 * 1.5) = 23
-        assert out["recommended"] == 23
+        assert out["inputs"]["requirements_count"] == 32
+        assert out["inputs"]["team_size"] == 1
+        assert out["inputs"]["scope_phase"] == "greenfield"
+        assert out["calculation"]["base"] == 8
+        assert out["calculation"]["scope_adjust"] == 1.2
+        assert out["recommended"] == 10
 
-    def test_calc_tasks_minimum_clamp(self):
-        """Small requirement count clamps to minimum 10."""
-        rc, out = run_script(SCRIPT_PY, ["calc-tasks", "--requirements", "3"])
+    def test_calc_tasks_shade_validation_case(self):
+        """The exact case that surfaced LEARNING #11 in the v4 dogfood.
+
+        Shade Phase 5: 32 requirements, solo operator, final_phase,
+        4 thematic groups (n8n / Ollama / HTB / UI polish). Old formula
+        returned 40 (max clamp). The child correctly overrode to 12.
+        v4.1 formula must return 12.
+        """
+        rc, out = run_script(SCRIPT_PY, [
+            "calc-tasks",
+            "--requirements", "32",
+            "--team-size", "1",
+            "--scope-phase", "final_phase",
+            "--thematic-groups", "4",
+        ])
+        assert rc == 0
+        assert out["recommended"] == 12
+        # base = 4 thematic groups * 3 = 12
+        assert out["calculation"]["base"] == 12
+        # final_phase scope adjustment is 1.0 (no reduction — final phase
+        # still needs per-group decomposition)
+        assert out["calculation"]["scope_adjust"] == 1.0
+
+    def test_calc_tasks_brownfield_solo_no_groups(self):
+        """20 reqs, solo, brownfield, no groups.
+
+        base = ceil(20/4) = 5; adjust = 1.0; mult = 1.0; raw = 5;
+        clamped to floor 3 not needed, recommended = 5.
+        """
+        rc, out = run_script(SCRIPT_PY, [
+            "calc-tasks",
+            "--requirements", "20",
+            "--scope-phase", "brownfield",
+        ])
         assert rc == 0
-        assert out["raw_calculation"] == 5 # ceil(3 * 1.5) = 5
-        assert out["recommended"] == 10 # clamped to minimum
+        assert out["recommended"] == 5
+
+    def test_calc_tasks_team_multiplier(self):
+        """Team of 5 on 20 req greenfield.
 
-    def test_calc_tasks_maximum_clamp(self):
-        """Large requirement count clamps to maximum 40."""
-        rc, out = run_script(SCRIPT_PY, ["calc-tasks", "--requirements", "50"])
+        base = ceil(20/4) = 5; adjust = 1.2; mult = 1.4; raw = 8.4;
+        recommended = 8.
+        """
+        rc, out = run_script(SCRIPT_PY, [
+            "calc-tasks",
+            "--requirements", "20",
+            "--team-size", "5",
+            "--scope-phase", "greenfield",
+        ])
         assert rc == 0
-        assert out["raw_calculation"] == 75 # ceil(50 * 1.5) = 75
-        assert out["recommended"] == 40 # clamped to maximum
+        assert out["calculation"]["team_multiplier"] == 1.4
+        assert out["recommended"] == 8
+
+    def test_calc_tasks_floor_clamp(self):
+        """Tiny input hits floor of 3.
 
-    def test_calc_tasks_exact_boundary_10(self):
-        """Requirements that produce exactly 10 tasks."""
-        # ceil(7 * 1.5) = ceil(10.5) = 11, ceil(6 * 1.5) = 9 -> clamp to 10
-        rc, out = run_script(SCRIPT_PY, ["calc-tasks", "--requirements", "6"])
+        1 req, solo, greenfield, no groups. base = 1; adjust = 1.2;
+        raw = 1.2 → clamped to floor 3.
+        """
+        rc, out = run_script(SCRIPT_PY, ["calc-tasks", "--requirements", "1"])
         assert rc == 0
-        assert out["recommended"] == 10 # 9 clamped to 10
+        assert out["recommended"] == 3 # clamped to floor
+
+    def test_calc_tasks_ceiling_clamp(self):
+        """Large input hits ceiling of 25.
 
-    def test_calc_tasks_exact_boundary_40(self):
-        """Requirements that produce exactly 40 tasks."""
-        # ceil(27 * 1.5) = ceil(40.5) = 41 -> clamp to 40
-        # ceil(26 * 1.5) = 39
-        rc, out = run_script(SCRIPT_PY, ["calc-tasks", "--requirements", "26"])
+        200 reqs, solo, greenfield. base = 50; adjust = 1.2; raw = 60;
+        clamped to ceiling 25.
+        """
+        rc, out = run_script(SCRIPT_PY, [
+            "calc-tasks", "--requirements", "200", "--scope-phase", "greenfield",
+        ])
         assert rc == 0
-        assert out["recommended"] == 39 # just under clamp
+        assert out["recommended"] == 25 # clamped to ceiling
+
+    def test_calc_tasks_thematic_groups_drives_base(self):
+        """When thematic_groups is set, it drives base via N*3 rule.
 
-    def test_calc_tasks_zero_requirements(self):
-        """Zero requirements clamps to minimum."""
-        rc, out = run_script(SCRIPT_PY, ["calc-tasks", "--requirements", "0"])
+        4 reqs, 8 thematic groups → base = 8*3 = 24 (not ceil(4/4)=1).
+        """
+        rc, out = run_script(SCRIPT_PY, [
+            "calc-tasks",
+            "--requirements", "4",
+            "--thematic-groups", "8",
+            "--scope-phase", "brownfield",
+        ])
         assert rc == 0
-        assert out["recommended"] == 10 # clamped
+        assert out["calculation"]["base"] == 24
+        assert out["recommended"] == 24 # base 24, adjust 1.0, mult 1.0
 
-    def test_calc_tasks_one_requirement(self):
-        """Single requirement clamps to minimum."""
-        rc, out = run_script(SCRIPT_PY, ["calc-tasks", "--requirements", "1"])
+    def test_calc_tasks_reasoning_field_present(self):
+        """Output must include a reasoning field so users see the 'why'."""
+        rc, out = run_script(SCRIPT_PY, ["calc-tasks", "--requirements", "15"])
         assert rc == 0
-        assert out["raw_calculation"] == 2 # ceil(1.5) = 2
-        assert out["recommended"] == 10
+        assert "reasoning" in out
+        assert "base" in out["reasoning"].lower()
+        assert "15" in out["reasoning"] # references the input
 
 
 # ═══════════════════════════════════════════════════════════════════════════════
