Motivation
Currently, group-level analysis in SPIMquant requires specifying a single --contrast_column and two --contrast_values, which limits flexibility for complex experimental designs (e.g., multiple grouping factors, continuous covariates). To better support common workflows (such as stratified pairwise comparisons within genotype and sex, treatment effects adjusted for covariates, etc.), we want to refactor the CLI to use a statistical modeling approach.
Proposal
- Replace the current --contrast_column/--contrast_values CLI interface with a formula/model-based approach:
- Add a --model argument that accepts a patsy/statsmodels-compatible formula (e.g. metric ~ C(treatment) * C(genotype) * C(sex) + age).
- Add --pairwise <factor> (repeatable) for requesting all pairwise comparisons between levels of a categorical variable, optionally within strata.
- Add --within <factor1> <factor2>... arguments to define strata for contrasts (e.g. all treatment effects within genotype×sex cells).
- Remove the --contrast_column/--contrast_values parameters (and their usage in Snakemake config, groupstats rules, and scripts).
- Implementation details:
- Fit a single global model per region/metric (as specified by --model), with the formula passed strictly as provided by the user (no automatic interaction expansion).
- To compute contrast effect sizes and stats, generate predicted means for each combination of contrast factor levels and strata, then compute all desired pairwise differences and their standard errors/p-values using the model’s covariance.
- Output one results table/map per contrast (with clear contrast and strata labels in filenames/headers).
- Document these changes in the CLI, usage guides, and output file conventions.
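The contrast step described above (difference of two predicted means, with a standard error taken from the model's coefficient covariance) can be sketched in plain Python. This is only an illustration under stated assumptions: the helper name `contrast`, the toy coefficients, and the covariance values are all hypothetical, and the formula is simplified to `metric ~ C(treatment) + age` to keep the design rows short. In a real run the coefficients and covariance would presumably come from the fitted statsmodels results (`params` and `cov_params()`).

```python
import math


def contrast(x_a, x_b, beta, cov):
    """Estimate (predicted mean at x_a) - (predicted mean at x_b) and its SE.

    x_a, x_b: design-matrix rows for the two cells being compared
    beta:     fitted coefficients, same order as the design columns
    cov:      coefficient covariance matrix as a list of rows
    """
    # Contrast vector: difference between the two design rows.
    c = [a - b for a, b in zip(x_a, x_b)]
    # Point estimate: c' * beta.
    estimate = sum(ci * bi for ci, bi in zip(c, beta))
    # Variance of the estimate: c' * cov * c.
    n = len(c)
    variance = sum(c[i] * cov[i][j] * c[j] for i in range(n) for j in range(n))
    return estimate, math.sqrt(variance)


# Hypothetical toy fit; columns are [Intercept, treatment[T.drug], age].
beta = [10.0, 2.5, 0.1]
cov = [
    [0.40, -0.20, 0.00],
    [-0.20, 0.50, 0.00],
    [0.00, 0.00, 0.01],
]

# Both cells hold the covariate at the same value (age = 30), so it cancels.
row_drug = [1.0, 1.0, 30.0]
row_control = [1.0, 0.0, 30.0]

est, se = contrast(row_drug, row_control, beta, cov)
print(f"drug - control: estimate={est:.3f}, se={se:.3f}")
```

Dividing the estimate by its standard error gives the test statistic. Because both cells are evaluated under the same fitted model, covariates held at a common value drop out of the difference.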
Benefits
- Users can specify arbitrary models including multiple grouping factors, interactions, and continuous covariates.
- Enables streamlined many-to-many (e.g., all pairwise) contrasts, stratified analyses, and effects of interest in a single run.
- Makes the workflow more general, future-proof, and accessible to both basic and advanced users.
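To illustrate how one --pairwise/--within request fans out into many contrasts, here is a minimal sketch that enumerates every pairwise comparison of one factor inside every stratum. The function name `enumerate_contrasts`, the factor levels, and the label format are assumptions made for illustration; the actual file-naming convention is still to be decided.

```python
from itertools import combinations, product


def enumerate_contrasts(levels, pairwise, within=()):
    """List all pairwise comparisons of `pairwise`, one set per stratum.

    levels:   dict mapping factor name -> list of its levels
    pairwise: factor whose levels are compared two at a time
    within:   factors whose level combinations define the strata
    """
    # With no `within` factors, product() yields a single empty stratum.
    strata = [
        dict(zip(within, combo))
        for combo in product(*(levels[f] for f in within))
    ]
    contrasts = []
    for stratum in strata:
        for a, b in combinations(levels[pairwise], 2):
            # Hypothetical label format, suitable for filenames/headers.
            parts = [f"{pairwise}-{a}_vs_{b}"]
            parts += [f"{name}-{level}" for name, level in stratum.items()]
            contrasts.append(
                {"label": "_".join(parts), "stratum": stratum, "a": a, "b": b}
            )
    return contrasts


# Hypothetical levels for the factors used in the example formula.
levels = {
    "treatment": ["control", "drugA", "drugB"],
    "genotype": ["WT", "KO"],
    "sex": ["F", "M"],
}

stratified = enumerate_contrasts(levels, "treatment", ["genotype", "sex"])
for item in stratified:
    print(item["label"])
```

With three treatment levels and a 2×2 genotype×sex grid, this yields 3 pairs in each of 4 strata, i.e. 12 contrasts from a single invocation.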
Not in scope
- No legacy CLI/backward compatibility required. All changes can break the old interface (major version bump if needed).
- No support for the old “per-stratum fit” mode—global model only.
- No automatic upgrading of the model formula beyond what the user supplies (to avoid ‘magic’).
Example usage
pixi run spimquant /bids /out group \
--model "metric ~ C(treatment) * C(genotype) * C(sex) + age" \
--pairwise treatment \
--within genotype sex \
--cores all
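The argument shapes used in the invocation above (a single --model string, a repeatable --pairwise, and a multi-value --within) could be parsed roughly as follows. This is a standalone argparse sketch, not SPIMquant's actual CLI wiring; the parser name and help strings are placeholders.

```python
import argparse


def build_parser():
    # Standalone illustration only; the program name is a placeholder.
    parser = argparse.ArgumentParser(prog="group-stats-sketch")
    parser.add_argument(
        "--model",
        required=True,
        help="patsy/statsmodels-style formula, passed through unchanged",
    )
    parser.add_argument(
        "--pairwise",
        action="append",  # repeatable: each use adds one factor
        default=[],
        metavar="FACTOR",
        help="factor whose levels are compared pairwise",
    )
    parser.add_argument(
        "--within",
        nargs="+",  # one or more factors defining the strata
        default=[],
        metavar="FACTOR",
        help="factors that define strata for the contrasts",
    )
    return parser


args = build_parser().parse_args(
    [
        "--model", "metric ~ C(treatment) * C(genotype) * C(sex) + age",
        "--pairwise", "treatment",
        "--within", "genotype", "sex",
    ]
)
print(args.model, args.pairwise, args.within)
```

Using `action="append"` for --pairwise keeps each requested factor as a separate entry, while `nargs="+"` lets --within take the whole list of stratifying factors in one flag, matching the proposed syntax.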
Tasks
- Remove the contrast_column/contrast_values interface and filtering logic
- Add the new CLI arguments (--model, --pairwise, --within)
cc @akhanf