Explicit error terms production code #1064

Open
janzill wants to merge 151 commits into ActivitySim:explicit_error_terms from outerl:explicit_error_terms

janzill commented Apr 3, 2026

This PR brings the explicit error term (EET) work up to production standard. Compared to the PoC implementation, it contains the following major changes:
1. It adds tests and documentation for EET.
2. It decouples the sampling method from the simulation method and adds Poisson sampling (based on @m-richards's implementation in #1065), including tests and documentation. It also addresses runtime and memory issues with the sampling method "eet".
3. It removes several inconsistencies in the EET simulation branch, where edge cases could lead to unexpected changes in choices for individual choosers because error terms were not aligned.
4. It implements EET consistently for nested logit models.

Note that the default simulation method remains Monte Carlo (MC), with all existing unit and integration tests unchanged and passing. Users should therefore not see any differences in their model runs unless they explicitly opt in to EET by adding use_explicit_error_terms: True to the settings. EET is a drop-in replacement; with default settings, it increased the runtime of a single demand model run by 3-10% for the models we have tested so far.
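To make the difference between the two simulation methods concrete, here is a minimal NumPy sketch for a plain multinomial logit. It is not the ActivitySim implementation: the function names are hypothetical, and it only illustrates that drawing from probabilities (MC) and taking the maximum of utility plus a Gumbel error term (EET) produce the same choice shares.

```python
import numpy as np


def softmax(utilities):
    """Multinomial logit probabilities, one row per chooser."""
    shifted = utilities - utilities.max(axis=1, keepdims=True)  # numerical stability
    expu = np.exp(shifted)
    return expu / expu.sum(axis=1, keepdims=True)


def choose_monte_carlo(utilities, rng):
    """MC: compute probabilities, then pick with one uniform draw per chooser."""
    cum_probs = softmax(utilities).cumsum(axis=1)
    u = rng.random((utilities.shape[0], 1))
    choice = (cum_probs < u).sum(axis=1)
    return np.minimum(choice, utilities.shape[1] - 1)  # guard against round-off


def choose_eet(utilities, rng):
    """EET: add one Gumbel (EV1) error term per chooser and alternative, take the max."""
    error_terms = rng.gumbel(size=utilities.shape)
    return (utilities + error_terms).argmax(axis=1)


rng = np.random.default_rng(0)
utilities = np.tile([1.0, 0.0, -0.5], (100_000, 1))  # identical choosers, 3 alternatives
expected = softmax(utilities[:1])[0]
mc_shares = np.bincount(choose_monte_carlo(utilities, rng), minlength=3) / len(utilities)
eet_shares = np.bincount(choose_eet(utilities, rng), minlength=3) / len(utilities)
print(expected, mc_shares, eet_shares)
```

Both share vectors should match the analytic probabilities up to sampling noise. The practical difference is that under EET a chooser keeps the same error term for each alternative, so a change in one utility can only move that chooser towards the improved alternative.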

This is a large PR and I will add details in comments below so we can keep discussions focused.

janzill and others added 30 commits March 19, 2026 15:21
Extend logit tests, add model tests, add simulate tests, draft docs

janzill commented May 21, 2026

Regarding tests:

  • For the integration tests with the two external models, we cannot compare outcomes between MC and EET because the household samples are too small.
  • We included EET functionality in the semcog tests and noticed a small difference between runs with 1 and 2 processes. Debugging revealed an edge case in probabilistic scheduling. This is independent of EET: the model just looks up probabilities from provided tables and therefore has no EET equivalent. It turns out that the way failed trips are grouped together can lead to very small differences between single-process and multi-process runs for small test sets. This applies to MC as well. Changing the base seed did not trigger the edge case, but it is worth discussing during the engineering meeting.
  • We noticed that the ARC tests are disabled but wanted to add them back in because, as far as I know, ARC is the only test model that uses trip_scheduling_choice. It turns out the regress trips had small differences in trip schedules when running with MC. Git history shows the test was disabled in BayDAG Contribution #16: Parking Locations in Trip Matrices #840. It looks like the test files were later touched in Trip Scheduling Choice -- Same Results Single Process and Multi-Process #1005, but the test was never re-enabled. I cannot see why the test was disabled in the first place and might be missing some context here, but I decided to update the MC regress file, re-enable the test, and add one for EET.

janzill commented May 21, 2026

Inconsistency fixes:

  • Improved consistency of error terms during sampling by aligning random draws to the universe of alternatives, not just zero-attraction alternatives.
  • Improved consistency of error terms for choices from sampled sets by aligning random draws to the universe of alternatives, not the position in the sampled set.
  • Improved consistency of two-zone pre-sampling by aligning MAZ choices to the universe of alternatives.
  • For shadow pricing, choosers now have consistent error terms for the sample and the final choice across shadow pricing iterations. Note that for the shadow pricing method simulation, fixing random numbers per chooser across iterations led to a markedly different solution (much longer distances) than not fixing them. This was independent of both the simulation method and the sampling method, and was purely due to reusing the same random numbers in the shadow pricing decisions. We therefore introduced a separate RNG channel to keep results in line with the current release code. This RNG is only used if use_explicit_error_terms = True and shadow pricing method = simulation.
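The idea of aligning draws to the universe of alternatives rather than to the position in the sampled set can be sketched as follows. This is a simplified illustration, not the PR's code: the helper names are hypothetical, and a per-chooser NumPy generator seeded from `[base_seed, chooser_id]` stands in for ActivitySim's own RNG channels.

```python
import numpy as np


def universe_error_terms(chooser_id, n_alternatives, base_seed=0):
    """One Gumbel error term per alternative in the full universe.

    The stream depends only on the chooser, so the error term attached to
    a given alternative id does not depend on which alternatives were sampled.
    """
    rng = np.random.default_rng([base_seed, chooser_id])
    return rng.gumbel(size=n_alternatives)


def choose_from_sample(chooser_id, sampled_alt_ids, utilities, n_alternatives):
    """Pick the sampled alternative with the highest utility plus error term."""
    sampled_alt_ids = np.asarray(sampled_alt_ids)
    eps = universe_error_terms(chooser_id, n_alternatives)
    total = np.asarray(utilities, dtype=float) + eps[sampled_alt_ids]  # index by id
    return int(sampled_alt_ids[total.argmax()])


# Alternative 7 carries the same error term in both sampled sets,
# even though it sits at a different position in each.
eps = universe_error_terms(chooser_id=42, n_alternatives=10)
first = choose_from_sample(42, [2, 7, 9], [0.0, 0.0, 0.0], 10)
other = next(a for a in [2, 7, 9] if a != first)
second = choose_from_sample(42, [other, first], [0.0, 0.0], 10)
print(first, second)
```

With equal utilities, the alternative chosen from the larger set is chosen again from any subset that still contains it, which is the kind of consistency that position-based draws cannot guarantee. Drawing the whole universe for every chooser is the simplest way to show the idea, but it is memory-hungry for large zone systems; how the PR avoids that cost is not covered by this sketch.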

janzill commented May 21, 2026

Regarding Poisson sampling and disaggregate accessibilities:

For the MTC_extended and SANDAG models, we saw large differences (about 50%) in disaggregate accessibilities when running with Poisson sampling compared to the other two methods, which agree with each other. Disaggregate accessibilities are destination choice logsums, generated by sampling a subset of alternatives, and we traced the differences back to how the sampling correction factor is specified. The bottom line is that in the current specification Poisson is unbiased, whereas both MC and EET sampling are biased by log(sample_size). This is material (on the order of half the mean value) and leads to significant differences in downstream models that use disaggregate accessibilities, such as auto_ownership and cdap in SANDAG. As mentioned in this week's meeting, we will present more details on this in the engineering meeting on 5/21.
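The log(sample_size) offset can be reproduced in a small simulation. The sketch below rests on assumptions that this comment does not spell out: that the with-replacement correction has the form log(pick_count / prob), and that Poisson sampling includes each alternative independently with probability 1 - (1 - prob)^sample_size and corrects by minus the log of that probability. All function and variable names are hypothetical.

```python
import numpy as np


def logsum_with_replacement(v, q, sample_size, rng):
    """Sampled logsum with the assumed correction log(pick_count / prob)."""
    draws = rng.choice(len(v), size=sample_size, p=q)
    alts, pick_count = np.unique(draws, return_counts=True)
    correction = np.log(pick_count / q[alts])
    return np.log(np.exp(v[alts] + correction).sum())


def logsum_poisson(v, q, sample_size, rng):
    """Sampled logsum under Poisson sampling with an inverse-inclusion correction."""
    inclusion = 1.0 - (1.0 - q) ** sample_size  # assumed inclusion probability
    keep = rng.random(len(v)) < inclusion       # independent inclusion per alternative
    correction = -np.log(inclusion[keep])
    return np.log(np.exp(v[keep] + correction).sum())


rng = np.random.default_rng(1)
n_alts, sample_size, n_reps = 500, 100, 300
v = rng.normal(size=n_alts)  # utilities of the full choice set
q = np.exp(0.5 * v)
q /= q.sum()                 # sampling probabilities (imperfect proxy for v)
exact = np.log(np.exp(v).sum())

bias_replacement = np.mean(
    [logsum_with_replacement(v, q, sample_size, rng) for _ in range(n_reps)]
) - exact
bias_poisson = np.mean(
    [logsum_poisson(v, q, sample_size, rng) for _ in range(n_reps)]
) - exact
print(bias_replacement, np.log(sample_size), bias_poisson)
```

Under these assumptions the with-replacement logsum sits about log(sample_size) above the full-choice-set logsum, while the Poisson logsum is approximately unbiased. The offset is constant across alternatives, so it cancels in choice probabilities and only matters where the logsum itself is used, as in disaggregate accessibilities.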

janzill commented May 21, 2026

Regarding nested logit:

The PoC implementation had a recursive structure: choices were made by walking down the nests and choosing at each level. Because this relies on logsums, there were edge cases where unintuitive changes (nest switching) could happen, and the recursive structure was slow. We replaced it with a method that draws nested logit error terms directly, which solves both problems.
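One known way to draw nested logit error terms directly, shown here for a two-level tree only, mixes independent Gumbel draws with a nest-level positive stable variable. This is a textbook construction and not necessarily the method used in this PR; the function names are hypothetical, and the nest parameter follows the utility-maximising convention with values in (0, 1].

```python
import numpy as np


def positive_stable(alpha, size, rng):
    """Positive stable draws with Laplace transform exp(-t**alpha) (Kanter's form)."""
    u = rng.uniform(0.0, np.pi, size)
    e = rng.exponential(1.0, size)
    return (np.sin(alpha * u) / np.sin(u) ** (1.0 / alpha)) * (
        np.sin((1.0 - alpha) * u) / e
    ) ** ((1.0 - alpha) / alpha)


def nested_error_terms(nest_of_alt, nest_scale, n_choosers, rng):
    """Jointly distributed error terms: correlated within a nest, independent across."""
    nest_of_alt = np.asarray(nest_of_alt)
    eps = np.empty((n_choosers, len(nest_of_alt)))
    for k, lam in enumerate(nest_scale):
        members = np.flatnonzero(nest_of_alt == k)
        log_s = np.log(positive_stable(lam, n_choosers, rng))  # shared within nest
        gumbel = rng.gumbel(size=(n_choosers, len(members)))   # alternative-specific
        eps[:, members] = lam * (log_s[:, None] + gumbel)
    return eps


def nested_logit_probs(v, nest_of_alt, nest_scale):
    """Closed-form two-level nested logit probabilities for comparison."""
    nest_of_alt = np.asarray(nest_of_alt)
    nest_scale = np.asarray(nest_scale, dtype=float)
    scaled = np.exp(np.asarray(v, dtype=float) / nest_scale[nest_of_alt])
    nest_sum = np.array([scaled[nest_of_alt == k].sum() for k in range(len(nest_scale))])
    nest_term = nest_sum ** nest_scale
    return scaled / nest_sum[nest_of_alt] * nest_term[nest_of_alt] / nest_term.sum()


rng = np.random.default_rng(2)
nest_of_alt = [0, 0, 1]   # alternatives 0 and 1 share a nest, 2 is on its own
nest_scale = [0.5, 1.0]
v = np.array([1.0, 0.5, 0.8])
eps = nested_error_terms(nest_of_alt, nest_scale, 200_000, rng)
shares = np.bincount((v + eps).argmax(axis=1), minlength=3) / len(eps)
expected = nested_logit_probs(v, nest_of_alt, nest_scale)
print(shares, expected)
```

Each chooser makes a single argmax over all elemental alternatives, so there is no level-by-level walk down the tree, and the simulated shares should match the closed-form probabilities up to sampling noise. Trees with more than two levels would need the construction applied recursively, which this sketch does not attempt.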

janzill changed the title from "Explicit error terms testing and documentation" to "Explicit error terms production code" May 21, 2026

3 participants