Roundtrip Test Matrix
This document tracks the intended roundtrip-test surface for single-file fits. It is meant to answer three questions at a glance:
Which workflow API is under test?
Which backend expectation is under test?
Which supported model family is under test?
Project-level fitting and MCMC/parallel execution are important too, but they are tracked as secondary dimensions so the main matrix stays readable.
Axes
Workflow axis
B: File.fit_baseline()
Sp: File.fit_spectrum()
SbS: File.fit_slice_by_slice()
2D: File.fit_2d()
Scope axis
SF: single-file workflows
P: project-level workflows via Project.fit_2d()
Backend axis
M: force MCP with project.spec_fun_str = "fit_model_mcp" and recover truth
G: run default compiled dispatch and recover truth on the GIR path
C: assert delta(MCP, GIR) = 0
For C, prefer fit_model_compare when the workflow supports it. Otherwise use
direct parity checks on residuals or evaluated outputs.
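
A direct parity check of the kind described for C could look like the sketch below. It is only an illustration under stated assumptions: max_abs_delta, assert_parity, and the two residual lists are hypothetical names, and the lists stand in for residuals produced by an MCP run and a GIR run of the same fit.

```python
import math
from typing import Sequence


def max_abs_delta(mcp: Sequence[float], gir: Sequence[float]) -> float:
    """Return the largest absolute element-wise difference between two outputs."""
    if len(mcp) != len(gir):
        raise ValueError(f"length mismatch: {len(mcp)} vs {len(gir)}")
    return max((abs(a - b) for a, b in zip(mcp, gir)), default=0.0)


def assert_parity(mcp: Sequence[float], gir: Sequence[float], atol: float = 0.0) -> None:
    """Fail unless the two backends agree to within atol (0.0 means exact)."""
    if any(math.isnan(v) for v in (*mcp, *gir)):
        raise AssertionError("NaN in backend output; parity is undefined")
    delta = max_abs_delta(mcp, gir)
    if delta > atol:
        raise AssertionError(f"delta(MCP, GIR) = {delta:g} exceeds atol = {atol:g}")


# Hypothetical residuals from the two backends for the same fit.
residuals_mcp = [0.0, 1.5e-3, -2.0e-3]
residuals_gir = [0.0, 1.5e-3, -2.0e-3]
assert_parity(residuals_mcp, residuals_gir)  # exact agreement passes
```

Rejecting NaN explicitly matters here because comparisons against NaN are always false, so a plain threshold test would silently pass.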
Cell notation
M/G/C: all three test types should exist for that workflow/model-family pair
-: not applicable for that pair
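
The cell notation can be expanded mechanically into the list of test types a cell requires. The following is a minimal sketch; expand_cell and TEST_TYPES are hypothetical names, not part of the test suite.

```python
from typing import List

TEST_TYPES = ("M", "G", "C")  # backend-axis codes used in this document


def expand_cell(cell: str) -> List[str]:
    """Turn a matrix cell such as 'M/G/C' or '-' into the required test types."""
    cell = cell.strip()
    if cell == "-":
        return []  # not applicable for this workflow/model-family pair
    codes = cell.split("/")
    unknown = [code for code in codes if code not in TEST_TYPES]
    if unknown:
        raise ValueError(f"unknown test type(s) in cell {cell!r}: {unknown}")
    return codes
```

An empty or misspelled cell raises ValueError instead of being treated as "not applicable", so a gap in the matrix cannot be mistaken for a deliberate exclusion.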
Execution-mode qualifiers
These are not a full extra matrix axis; they are required focused variants on top of the main matrix:
Opt: normal optimizer-based fit, no MCMC
MC1: MCMC enabled with workers=1
MC2: MCMC enabled with workers=2
W1: explicit serial execution for a workflow with worker support
W2: parallel execution with workers=2
Use 2 as the standard parallel test setting. We usually care about
1 versus >1, not about many-worker scaling in CI.
Canonical model families
Each row below should have one canonical fixture. Closely related variants can be parameterized under the same row instead of getting separate rows.
| ID | Model family | Representative fixture(s) |
|---|---|---|
| | Plain energy model | |
| | Static expressions in energy model: direct refs, fan-out, forward refs | |
| | Top-level standard dynamics | |
| | Top-level dynamics with IRF / convolution | |
| | Top-level subcycle / multi-cycle dynamics | |
| | Top-level profile only | |
| | Top-level profile plus separate top-level dynamics on another parameter | |
| | Profile-internal dynamics | |
| | Expression parameter referencing a top-level time-dependent base parameter | |
| | Expression parameter referencing a top-level profiled base parameter | |
| | Expression parameter referencing a profiled parameter whose profile internals are time-dependent | |
| | Mixed expression referencing both profiled and time-dependent base parameters | |
Target matrix
Scope: SF (single-file)
| Family | B | Sp | SbS | 2D | Notes |
|---|---|---|---|---|---|
| | | | | | Core 1D workflow coverage |
| | | | | | Include at least one direct-ref case and one fan-out or forward-ref case |
| | | | | | Core 2D dynamic family |
| | | | | | One canonical roundtrip plus parametrized parity across kernels |
| | | | | | Important for multi-cycle indexing and expression prefixing |
| | | | | | Covers aux-axis plumbing in 1D APIs |
| | | | | | Top-level mixed feature case |
| | | | | | Single-cycle only |
| | | | | | High-value bug class for update ordering and pickling |
| | | | | | Expression namespace must see profiled values |
| | | | | | Single-cycle only |
| | | | | | Stress case for combined namespace resolution |
Project-level matrix
Scope: P (Project.fit_2d())
Project-level fitting should be tracked separately because it is currently
wired through fit_project_mcp, so the single-file GIR/MCP expectations do not
apply cleanly yet.
| Family | 2D | Notes |
|---|---|---|
| | | Current core project roundtrip surface |
| | | Includes file/project prefix rewriting and shared refs |
| | | Covered with |
| | | Add once project fixtures exist |
Future:
if project-level GIR lands, upgrade applicable cells from M to M/G/C
until then, do not force fake GIR coverage into the project matrix
MCMC and worker policy
Yes: MCMC should be tracked.
Yes: worker mode should be tracked anywhere the code can execute differently between serial and parallel paths.
But neither should multiply every cell in the main matrix. Instead use focused requirements:
MCMC requirements
MCMC is a second-layer contract on top of the clean optimizer roundtrips.
Minimum MCMC set:
MC1: one canonical B test on F1
MC2: one canonical B test on F1
MC2: one expression-sensitive case, preferably F9 or F10
MC2: one 2D varying case, preferably F3 or F8
Rationale:
MC1 checks that MCMC itself still works
MC2 checks pickling / process-boundary behavior
expression-heavy and nested-model cases are the highest-value bug surfaces
SbS worker requirements
Yes, SbS should distinguish W1 and W2 because n_workers=1 uses the
serial path and n_workers>1 crosses a process boundary.
Current status:
the main SbS matrix runs with n_workers=1
focused W2 tests cover F1 and profile-bearing F6 with n_workers=2
Future requirement if worker-specific risk grows:
add a more expression-heavy W2 SbS case, likely F2, if process-boundary risk shows up beyond the existing plain/profile cases
Project worker requirements
For project-level fits, add worker variants only when project execution gains a parallel path that is semantically different from serial execution.
Practical rule
Use this rule to decide whether a new dimension deserves explicit tracking:
add it as a full matrix axis only if it changes almost every cell
otherwise add it as a focused secondary requirement
By that rule:
project-level fits: yes, but separate matrix
MCMC: yes, as focused secondary coverage
workers=1 vs workers=2: yes, but only for APIs that actually expose worker-dependent behavior
Minimum test shape per cell
For each required cell, the minimum useful test is:
simulate noiseless data from a truth model
fit through the target workflow API
assert recovered non-expression parameters match truth
for C, assert MCP and GIR agree exactly or within a tight tolerance
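
The first three steps above can be sketched as follows. This is a toy illustration, not the suite's code: a straight line stands in for the truth model, a closed-form least-squares fit stands in for the workflow API under test, and simulate, fit_line, and assert_recovered are hypothetical names.

```python
import math
from typing import Dict, List


def simulate(params: Dict[str, float], x: List[float]) -> List[float]:
    """Noiseless data from the truth model (a straight line as a stand-in)."""
    return [params["slope"] * xi + params["offset"] for xi in x]


def fit_line(x: List[float], y: List[float]) -> Dict[str, float]:
    """Closed-form least squares; stands in for the workflow API under test."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return {"slope": slope, "offset": mean_y - slope * mean_x}


def assert_recovered(truth: Dict[str, float], fitted: Dict[str, float],
                     rel_tol: float = 1e-9, abs_tol: float = 1e-12) -> None:
    """Fail if any truth parameter is not recovered within tolerance."""
    for name, expected in truth.items():
        if not math.isclose(fitted[name], expected, rel_tol=rel_tol, abs_tol=abs_tol):
            raise AssertionError(f"{name}: fitted {fitted[name]!r}, truth {expected!r}")


truth = {"slope": 2.5, "offset": -1.0}
x = [0.0, 1.0, 2.0, 3.0, 4.0]
fitted = fit_line(x, simulate(truth, x))  # simulate, then fit
assert_recovered(truth, fitted)           # recovered parameters match truth
```

Because the data are noiseless, the tolerance can stay tight; a loose tolerance here would hide exactly the bookkeeping bugs the clean matrix is meant to catch.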
For MCMC-focused cells, the minimum useful test is:
run the fit with mc_settings.use_mc=1
assert no crash for MC1
assert no crash and no pickling/serialization failure for MC2
when runtime allows, also assert basic parameter recovery or constraint preservation
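
A cheap in-process proxy for the MC2 pickling requirement is to round-trip the fit state through pickle before any workers are started. The sketch below assumes the state can be compared with ==; survives_pickle and both example states are hypothetical, and the check complements a real workers=2 run without replacing it.

```python
import pickle
from typing import Any


def survives_pickle(obj: Any) -> bool:
    """Return True if obj round-trips through pickle and compares equal afterwards."""
    try:
        restored = pickle.loads(pickle.dumps(obj))
    except Exception:  # the exception type varies by object and Python version
        return False
    return restored == obj


# Hypothetical fit state: plain data pickles, an inline callable does not.
plain_state = {"params": {"amplitude": 1.0, "width": 0.3}, "expr": "2 * amplitude"}
callable_state = {"params": {"amplitude": 1.0}, "expr": lambda p: 2 * p["amplitude"]}

assert survives_pickle(plain_state)
assert not survives_pickle(callable_state)
```

Running this kind of check in the serial test process makes a serialization failure easier to diagnose than the same failure surfacing inside a worker.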
Noisy roundtrip tests are still valuable, but should be a second layer. The clean matrix above is the baseline contract.
Current snapshot
This is the current high-level state of the suite, not a substitute for the table above.
Covered today:
the full single-file clean matrix above for M/G/C
F2 variants for direct, fan-out, and forward-reference expressions
noisy second-layer checks for F3, F6, and F8 on the GIR path
focused MCMC checks for MC1, MC2, expression-sensitive MC2, and 2D MC2
focused W2 coverage for plain and profile-bearing fit_slice_by_slice()
project-level M roundtrips for PF1, PF2, and PF3
Thin or missing today:
project-level PF4 shared subcycle dynamics
project-level G/C coverage, because project fitting is still MCP-only
expression-heavy W2 coverage for fit_slice_by_slice()
MCMC assertions beyond no-crash / process-boundary coverage
exhaustive noisy coverage, intentionally kept out of the main matrix
Suggested implementation order
The original single-file matrix is implemented. Highest-value next steps:
Add PF4 once a shared project-subcycle fixture exists.
Add a focused expression-heavy W2 SbS test if process-boundary risk shows up beyond the existing plain/profile cases.
Add lightweight recovery or constraint-preservation assertions to focused MCMC tests when runtime allows.
Upgrade project-level cells from M to M/G/C if project-level GIR lands.
Non-goals
This matrix does not try to track:
invalid / explicitly unsupported model combinations
low-level evaluator unit tests
plotting-only behavior
Those should stay in their existing focused tests.