Roundtrip Test Matrix

This document tracks the intended roundtrip-test surface for single-file fits. It is meant to answer three questions at a glance:

  1. Which workflow API is under test?

  2. Which backend expectation is under test?

  3. Which supported model family is under test?

Project-level fitting and MCMC/parallel execution are important too, but they are tracked as secondary dimensions so the main matrix stays readable.

Axes

Workflow axis

  • B: File.fit_baseline()

  • Sp: File.fit_spectrum()

  • SbS: File.fit_slice_by_slice()

  • 2D: File.fit_2d()

Scope axis

  • SF: single-file workflows

  • P: project-level workflows via Project.fit_2d()

Backend axis

  • M: force MCP with project.spec_fun_str = "fit_model_mcp" and recover truth

  • G: run default compiled dispatch and recover truth on the GIR path

  • C: assert that MCP and GIR agree, i.e. delta(MCP, GIR) = 0, exactly or within a tight tolerance

For C, prefer fit_model_compare when the workflow supports it. Otherwise use direct parity checks on residuals or evaluated outputs.
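A direct parity check on residuals or evaluated outputs can be sketched as follows. This is a minimal illustration, not the project's fit_model_compare: the helper name assert_parity and the plain-list inputs are assumptions made for the example. With both tolerances left at zero it is an exact delta = 0 check.

```python
import math


def assert_parity(mcp_values, gir_values, rel_tol=0.0, abs_tol=0.0):
    """Assert that two sequences of evaluated outputs agree element-wise.

    Hypothetical helper for illustration. With rel_tol=0.0 and abs_tol=0.0
    the comparison is exact; pass small tolerances for a tight-tolerance check.
    """
    if len(mcp_values) != len(gir_values):
        raise AssertionError(
            f"length mismatch: MCP={len(mcp_values)} GIR={len(gir_values)}"
        )
    for index, (mcp, gir) in enumerate(zip(mcp_values, gir_values)):
        if not math.isclose(mcp, gir, rel_tol=rel_tol, abs_tol=abs_tol):
            raise AssertionError(
                f"index {index}: MCP={mcp!r} GIR={gir!r} delta={mcp - gir!r}"
            )


# Exact agreement passes with the default zero tolerances.
assert_parity([0.0, 1.25, -3.5], [0.0, 1.25, -3.5])

# A tiny numerical difference needs an explicit tolerance.
assert_parity([1.0], [1.0 + 1e-12], rel_tol=1e-9)
```

Reporting the index and the signed delta in the failure message makes it easier to see whether a mismatch is a rounding-level difference or a real divergence between the two backends.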

Cell notation

  • M/G/C: all three test types should exist for that workflow/model-family pair

  • M: only the MCP roundtrip is required for that pair (used in the project-level matrix)

  • -: not applicable for that pair

Execution-mode qualifiers

These are not a full extra matrix axis; they are required focused variants on top of the main matrix:

  • Opt: normal optimizer-based fit, no MCMC

  • MC1: MCMC enabled with workers=1

  • MC2: MCMC enabled with workers=2

  • W1: explicit serial execution for a workflow with worker support

  • W2: parallel execution with workers=2

Use workers=2 as the standard parallel test setting. CI needs to distinguish 1 worker from more than 1, not to measure many-worker scaling.

Canonical model families

Each row below should have one canonical fixture. Closely related variants can be parameterized under the same row instead of getting separate rows.

| ID  | Model family | Representative fixture(s) |
|-----|--------------|---------------------------|
| F1  | Plain energy model | single_glp, glp_only |
| F2  | Static expressions in energy model: direct refs, fan-out, forward refs | two_glp_expr_amplitude, expression_fan_out, energy_expression_forward_reference, glp_expression |
| F3  | Top-level standard dynamics | single_glp + MonoExpPos |
| F4  | Top-level dynamics with IRF / convolution | single_glp + MonoExpPosIRF and other lowerable IRF kernels |
| F5  | Top-level subcycle / multi-cycle dynamics | single_glp + ["ModelNone", "MonoExpNeg", "MonoExpPosExpr"], frequency=10 |
| F6  | Top-level profile only | single_gauss + roundtrip_pLinear_x0 / roundtrip_pExpDecay_A |
| F7  | Top-level profile plus separate top-level dynamics on another parameter | single_gauss + profile on Gauss_01_A + dynamics on Gauss_01_x0 |
| F8  | Profile-internal dynamics | single_gauss + profile on Gauss_01_A + dynamics on Gauss_01_A_pExpDecay_01_A |
| F9  | Expression parameter referencing a top-level time-dependent base parameter | two_glp_expr_amplitude + dynamics on GLP_01_A |
| F10 | Expression parameter referencing a top-level profiled base parameter | two_glp_expr_amplitude + profile on GLP_01_A |
| F11 | Expression parameter referencing a profiled parameter whose profile internals are time-dependent | two_glp_expr_amplitude + profile on GLP_01_A + dynamics on GLP_01_A_pExpDecay_01_A |
| F12 | Mixed expression referencing both profiled and time-dependent base parameters | two_glp_mixed_profile_dynamics with profile on GLP_01_A and dynamics on GLP_01_x0 |

Target matrix

Scope: SF (single-file)

| Family | B | Sp | SbS | 2D | Notes |
|--------|---|----|-----|----|-------|
| F1 Plain energy | M/G/C | M/G/C | M/G/C | - | Core 1D workflow coverage |
| F2 Static expressions | M/G/C | M/G/C | M/G/C | - | Include at least one direct-ref case and one fan-out or forward-ref case |
| F3 Standard dynamics | - | - | - | M/G/C | Core 2D dynamic family |
| F4 IRF dynamics | - | - | - | M/G/C | One canonical roundtrip plus parametrized parity across kernels |
| F5 Subcycle dynamics | - | - | - | M/G/C | Important for multi-cycle indexing and expression prefixing |
| F6 Profile only | M/G/C | M/G/C | M/G/C | - | Covers aux-axis plumbing in 1D APIs |
| F7 Profile + separate dynamics | - | - | - | M/G/C | Top-level mixed feature case |
| F8 Profile-internal dynamics | - | - | - | M/G/C | Single-cycle only |
| F9 Expr -> time-dependent base par | - | - | - | M/G/C | High-value bug class for update ordering and pickling |
| F10 Expr -> profiled base par | M/G/C | M/G/C | M/G/C | - | Expression namespace must see profiled values |
| F11 Expr -> profiled base par with profile-internal dynamics | - | - | - | M/G/C | Single-cycle only |
| F12 Mixed expr(profile + dynamics refs) | - | - | - | M/G/C | Stress case for combined namespace resolution |
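One way to keep the suite in step with the single-file matrix is to encode the required cells as data and expand them into test IDs. The sketch below transcribes the matrix by hand; the names SINGLE_FILE_MATRIX, BACKENDS, and expand_cells are assumptions made for the example, not identifiers from the codebase.

```python
# Required single-file cells, transcribed from the target matrix.
# Families whose row is "-" for a workflow simply omit that workflow.
SINGLE_FILE_MATRIX = {
    "F1": ("B", "Sp", "SbS"),
    "F2": ("B", "Sp", "SbS"),
    "F3": ("2D",),
    "F4": ("2D",),
    "F5": ("2D",),
    "F6": ("B", "Sp", "SbS"),
    "F7": ("2D",),
    "F8": ("2D",),
    "F9": ("2D",),
    "F10": ("B", "Sp", "SbS"),
    "F11": ("2D",),
    "F12": ("2D",),
}

# Every required cell is M/G/C, so each expands to three tests.
BACKENDS = ("M", "G", "C")


def expand_cells(matrix, backends=BACKENDS):
    """Return one 'family-workflow-backend' ID per required test."""
    return [
        f"{family}-{workflow}-{backend}"
        for family, workflows in matrix.items()
        for workflow in workflows
        for backend in backends
    ]


test_ids = expand_cells(SINGLE_FILE_MATRIX)
print(len(test_ids))  # 20 required cells x 3 backends = 60
print(test_ids[:3])   # ['F1-B-M', 'F1-B-G', 'F1-B-C']
```

A list like this can feed a parametrized test or a coverage check that compares the expected IDs against the tests that actually exist.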

Project-level matrix

Scope: P (Project.fit_2d())

Project-level fitting should be tracked separately because it is currently wired through fit_project_mcp, so the single-file GIR/MCP expectations do not apply cleanly yet.

| Family | 2D | Notes |
|--------|----|-------|
| PF1 Shared plain dynamics across files | M | Current core project roundtrip surface |
| PF2 Project-level expressions | M | Includes file/project prefix rewriting and shared refs |
| PF3 Shared dynamics with IRF | M | Covered with BiExpProject + gaussCONV |
| PF4 Shared subcycle dynamics | M | Add once project fixtures exist |

Future:

  • if project-level GIR lands, upgrade applicable cells from M to M/G/C

  • until then, do not force fake GIR coverage into the project matrix

MCMC and worker policy

MCMC should be tracked.

Worker mode should be tracked anywhere the code can execute differently between serial and parallel paths.

Neither should multiply every cell in the main matrix, however. Use focused requirements instead:

MCMC requirements

MCMC is a second-layer contract on top of the clean optimizer roundtrips.

Minimum MCMC set:

  • MC1: one canonical B test on F1

  • MC2: one canonical B test on F1

  • MC2: one expression-sensitive case, preferably F9 or F10

  • MC2: one 2D varying case, preferably F3 or F8

Rationale:

  • MC1 checks that MCMC itself still works

  • MC2 checks pickling / process-boundary behavior

  • expression-heavy and nested-model cases are the highest-value bug surfaces

SbS worker requirements

SbS tests should distinguish W1 and W2 because n_workers=1 uses the serial path and n_workers>1 crosses a process boundary.

Current status:

  • the main SbS matrix runs with n_workers=1

  • focused W2 tests cover F1 and profile-bearing F6 with n_workers=2

Future requirement if worker-specific risk grows:

  • add a more expression-heavy W2 SbS case, likely F2, if process-boundary risk shows up beyond the existing plain/profile cases

Project worker requirements

For project-level fits, add worker variants only when project execution gains a parallel path that is semantically different from serial execution.

Practical rule

Use this rule to decide whether a new dimension deserves explicit tracking:

  • add it as a full matrix axis only if it changes almost every cell

  • otherwise add it as a focused secondary requirement

By that rule:

  • project-level fits: yes, but separate matrix

  • MCMC: yes, as focused secondary coverage

  • workers=1 vs workers=2: yes, but only for APIs that actually expose worker-dependent behavior

Minimum test shape per cell

For each required cell, the minimum useful test is:

  • simulate noiseless data from a truth model

  • fit through the target workflow API

  • assert recovered non-expression parameters match truth

  • for C, assert MCP and GIR agree exactly or within a tight tolerance
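The shape of such a test can be sketched with a stand-in model. The example below does not use the project's workflow APIs: it simulates noiseless data from a straight-line truth model and recovers it with closed-form least squares, purely to show the simulate / fit / assert structure. All function and parameter names here are hypothetical.

```python
import math


def simulate(x, truth):
    """Noiseless data from the truth model y = offset + slope * x."""
    return [truth["offset"] + truth["slope"] * xi for xi in x]


def fit_line(x, y):
    """Stand-in for the workflow under test: closed-form least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return {"offset": mean_y - slope * mean_x, "slope": slope}


def assert_recovered(fitted, truth, rel_tol=1e-9, abs_tol=1e-12):
    """Assert every truth parameter is recovered within tolerance."""
    for name, expected in truth.items():
        if not math.isclose(fitted[name], expected, rel_tol=rel_tol, abs_tol=abs_tol):
            raise AssertionError(
                f"{name}: fitted={fitted[name]!r} truth={expected!r}"
            )


truth = {"offset": 1.5, "slope": -0.25}
x = [0.5 * i for i in range(20)]
fitted = fit_line(x, simulate(x, truth))
assert_recovered(fitted, truth)
```

Because the data are noiseless, the tolerance can be tight; a real optimizer-based fit would typically need a looser tolerance tied to its convergence settings.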

For MCMC-focused cells, the minimum useful test is:

  • run the fit with mc_settings.use_mc=1

  • assert no crash for MC1

  • assert no crash and no pickling/serialization failure for MC2

  • when runtime allows, also assert basic parameter recovery or constraint preservation
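For the MC2 serialization requirement, a cheap proxy is a pickle round-trip of whatever state has to cross the process boundary, run before any workers are started. The sketch below is an assumption-laden illustration: find_unpicklable is a hypothetical helper, the state dictionary is made up for the example, and GLP_02_A and its expression string are hypothetical. It does not replace an actual workers=2 run.

```python
import pickle


def find_unpicklable(state):
    """Return keys of `state` whose values do not survive a pickle round-trip."""
    failures = []
    for key, value in state.items():
        try:
            restored = pickle.loads(pickle.dumps(value))
        except Exception:
            # Any failure to serialize or deserialize counts as unpicklable.
            failures.append(key)
            continue
        if restored != value:
            failures.append(key)
    return failures


# Hypothetical state: a numeric parameter, an expression stored as a string,
# and a lambda, which the standard pickle module cannot serialize.
state = {
    "GLP_01_A": 20.0,
    "GLP_02_A": "GLP_01_A * 0.5",
    "callback": lambda value: value,
}

print(find_unpicklable(state))  # ['callback']
```

Reporting the offending keys, rather than only failing, points directly at the object that would break a multi-worker run.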

Noisy roundtrip tests are still valuable, but should be a second layer. The clean matrix above is the baseline contract.

Current snapshot

This is the current high-level state of the suite, not a substitute for the table above.

  • Covered today:

    • the full single-file clean matrix above for M/G/C

    • F2 variants for direct, fan-out, and forward-reference expressions

    • noisy second-layer checks for F3, F6, and F8 on the GIR path

    • focused MCMC checks for MC1, MC2, expression-sensitive MC2, and 2D MC2

    • focused W2 coverage for plain and profile-bearing fit_slice_by_slice()

    • project-level M roundtrips for PF1, PF2, and PF3

  • Thin or missing today:

    • project-level PF4 shared subcycle dynamics

    • project-level G/C coverage, because project fitting is still MCP-only

    • expression-heavy W2 coverage for fit_slice_by_slice()

    • MCMC assertions beyond no-crash / process-boundary coverage

    • exhaustive noisy coverage, intentionally kept out of the main matrix

Suggested implementation order

The original single-file matrix is implemented. Highest-value next steps:

  1. Add PF4 once a shared project-subcycle fixture exists.

  2. Add a focused expression-heavy W2 SbS test if process-boundary risk shows up beyond the existing plain/profile cases.

  3. Add lightweight recovery or constraint-preservation assertions to focused MCMC tests when runtime allows.

  4. Upgrade project-level cells from M to M/G/C if project-level GIR lands.

Non-goals

This matrix does not try to track:

  • invalid / explicitly unsupported model combinations

  • low-level evaluator unit tests

  • plotting-only behavior

Those should stay in their existing focused tests.