KRYSTALM7/pymc-online-ssm

pymc-online-ssm

GSoC 2026 — NumFOCUS / PyMC
Scalable Online Bayesian State Space Models: Sequential Updates, Structured Linear Algebra, and Parallel Time Series Inference

Prototype implementation for the GSoC 2026 proposal. It implements a Cholesky-based Kalman filter, validated against a SciPy reference, as the foundation for online Bayesian inference in PyMC without re-running MCMC from scratch.

Mentors: Jesse Grabowski, Jonathan Dekermanjian
PyMC PR: #8211
Discourse: GSoC 2026 thread


The Problem

The state space models in pymc-extras require full re-sampling whenever new observations arrive. For streaming applications this costs O(n³) per step, which is computationally prohibitive.

This project solves it with a sequential Kalman update — O(n²) per step, no MCMC re-run.
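
As a rough illustration of what a sequential update involves, here is a minimal sketch of one predict + update step for a linear-Gaussian model. It is not the repository's code: the function name `kalman_step` and its signature are assumptions made for this example, and it uses only NumPy plus SciPy's `cho_factor` / `cho_solve`. The gain is obtained from a Cholesky solve rather than an explicit inverse, and the covariance is updated in Joseph form.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve


def kalman_step(x, P, y, F, H, Q, R):
    """One predict + update step (hypothetical helper, not the repo's API)."""
    # Predict: propagate the state mean and covariance through the transition.
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q

    # Innovation and its covariance S = H P_pred H^T + R.
    innovation = y - H @ x_pred
    S = H @ P_pred @ H.T + R

    # Gain K = P_pred H^T S^{-1}. Because S and P_pred are symmetric,
    # K^T = S^{-1} H P_pred, which is a Cholesky solve (no explicit inverse).
    S_chol = cho_factor(S, lower=True)
    K = cho_solve(S_chol, H @ P_pred).T

    # Joseph-form update keeps the covariance symmetric in floating point.
    x_new = x_pred + K @ innovation
    A = np.eye(P.shape[0]) - K @ H
    P_new = A @ P_pred @ A.T + K @ R @ K.T
    return x_new, P_new


# Usage: a local-linear-trend style model observed through its first component.
F = np.array([[1.0, 1.0], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
Q = 0.01 * np.eye(2)
R = np.array([[0.5]])
x, P = np.zeros(2), np.eye(2)
for y in [1.1, 2.0, 2.9]:
    x, P = kalman_step(x, P, np.array([y]), F, H, Q, R)
```

Each call consumes one observation and returns the updated posterior mean and covariance, so nothing from earlier time steps has to be revisited.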


Structure

statespace/
├── online/
│   ├── kalman_filter.py   # Cholesky-based predict + update (cho_solve, Joseph form)
│   ├── cholesky.py        # Covariance utilities: cholesky_predict, log_det, innovation_cov
│   ├── jax_backend.py     # jax.lax.scan filter, jnp.linalg.solve (no explicit inv)
│   └── api.py             # OnlineSSM: fit(x0, P0), update(y), forecast(h)
tests/
├── test_kalman.py          # 6 tests, validated against scipy reference
└── test_api.py             # 6 integration tests
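
The covariance utilities listed above are not shown in this README, so the following is only a guess at the kind of computation they involve: a sketch, under assumed names (`log_det_from_cholesky`, `innovation_loglik`), of how a Cholesky factor of the innovation covariance gives both the log-determinant and the Gaussian log-likelihood of an innovation without forming an inverse.

```python
import numpy as np


def log_det_from_cholesky(L):
    """log|S| for S = L L^T, read off the diagonal of the lower factor L."""
    return 2.0 * np.sum(np.log(np.diag(L)))


def innovation_loglik(innovation, S):
    """Log-density of an innovation v ~ N(0, S) (hypothetical helper)."""
    L = np.linalg.cholesky(S)           # S = L L^T with L lower-triangular
    z = np.linalg.solve(L, innovation)  # whitened innovation z = L^{-1} v
    m = innovation.shape[0]
    # v^T S^{-1} v equals z^T z once the innovation is whitened.
    return -0.5 * (m * np.log(2.0 * np.pi) + log_det_from_cholesky(L) + z @ z)


# Usage with a small 2x2 innovation covariance.
S = np.array([[2.0, 0.5], [0.5, 1.0]])
v = np.array([0.3, -0.2])
ll = innovation_loglik(v, S)
```

Summing this quantity over time steps gives the filter's log-likelihood. The sketch uses `np.linalg.solve` for brevity; a triangular solver would exploit the structure of `L`.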

Planned API

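from statespace.online import OnlineSSM

# Standard Kalman notation: F = state transition, H = observation matrix,
# Q = process noise covariance, R = observation noise covariance.
ssm = OnlineSSM(F, H, Q, R)
ssm.fit(x0, P0)                 # x0, P0: initial state mean and covariance

for obs in data_stream:
    x, P = ssm.update(obs)      # O(n²), Cholesky-based
    means, covs = ssm.forecast(h=10)   # forecast 10 steps ahead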
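
Since the API above is still planned, the semantics of `forecast(h)` are not specified here. One plausible reading, sketched below as an assumption, is that it iterates the predict step `h` times with no new observations and returns the predictive mean and covariance of each future observation. The standalone function `forecast_sketch` is hypothetical and not part of the package.

```python
import numpy as np


def forecast_sketch(x, P, F, H, Q, R, h):
    """h-step-ahead predictive means and covariances (assumed semantics)."""
    means, covs = [], []
    for _ in range(h):
        # Predict only: no observation arrives, so uncertainty can only grow.
        x = F @ x
        P = F @ P @ F.T + Q
        # Map the predicted state into observation space.
        means.append(H @ x)
        covs.append(H @ P @ H.T + R)
    return np.stack(means), np.stack(covs)


# Usage: a scalar random walk starting from a known state of 3.0.
one = np.eye(1)
means, covs = forecast_sketch(np.array([3.0]), np.zeros((1, 1)),
                              F=one, H=one, Q=one, R=np.zeros((1, 1)), h=3)
```

For the random walk in the usage lines the predictive mean stays constant while the variance grows by `Q` at each step, which is a quick sanity check for any implementation.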

Run

git clone https://github.com/KRYSTALM7/pymc-online-ssm
cd pymc-online-ssm
pip install -r requirements.txt
pytest tests/

Optimisations

| Method | Gain |
|---|---|
| Cholesky covariance | O(n³) → O(n²) per update |
| Diagonal Q and R | Simplified Kalman gain |
| Kronecker transition (F = Fₜ ⊗ Fₛ) | O(n⁶) → O(n³) predict |
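
To illustrate where the Kronecker saving comes from, the sketch below applies F = Fₜ ⊗ Fₛ to a state vector without ever building the full matrix. It relies on the identity (A ⊗ B) x = vec(A X Bᵀ), where X is the state reshaped row-major into an n_time × n_space array. The name `kron_matvec` is an assumption made for this example, and only the mean prediction is covered; the covariance predict would apply the same trick to the rows and columns of P.

```python
import numpy as np


def kron_matvec(F_t, F_s, x):
    """Compute (F_t kron F_s) @ x without forming the Kronecker product."""
    n_t, n_s = F_t.shape[0], F_s.shape[0]
    X = x.reshape(n_t, n_s)                 # row-major: x[i * n_s + j] = X[i, j]
    return (F_t @ X @ F_s.T).reshape(-1)    # two small matmuls, not one big one


# Usage: compare against the dense Kronecker product on a small problem.
rng = np.random.default_rng(0)
F_t = rng.normal(size=(4, 4))
F_s = rng.normal(size=(3, 3))
x = rng.normal(size=12)
fast = kron_matvec(F_t, F_s, x)
dense = np.kron(F_t, F_s) @ x
```

With n_time = n_space = n, the dense product touches an n² × n² matrix, while the structured version needs only two products of n × n matrices.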

GSoC Targets

| Metric | Target |
|---|---|
| Diagonal noise speedup | ≥ 5× at n=200 |
| Kronecker predict speedup | ≥ 10× at n_space=n_time=20 |
| Parallel scaling (Ray) | Linear to N=500 series |
| Test coverage | ≥ 80% |
| Tutorial notebook | Published to PyMC Examples |

License

MIT — see LICENSE
