Examples Upgrade Plan
Design note for a future branch that reorganizes the example notebooks around the way users actually approach the package. This is deliberately separate from the fit-results save/load branch: the current branch should finish the archive feature with minimal examples/docs coverage, then merge. The broader examples upgrade is a teaching and UX project with enough file movement and narrative work to deserve its own branch.
Decisions (locked)
Track-based navigation replaces the linear “walk forward” path. The quickstart still recommends
01_basic_fittingas the first notebook, but does not imply that every user should walk every example in order.Top-level directories for this examples-upgrade pass:
fitting_workflows/(existing name kept) andsynthetic_data/(renamed fromdata_generation/). Nodata_preparation/track in this pass.Inside
fitting_workflows/, layout is flat with three numeric blocks: 01–04 = fitting skills on a single file; 10–11 = post-fit work (comparison, persistence, export); 20+ = multi-file workflows.fitting_workflows/README.mddocuments the legend.10_model_comparisonis strictly about model comparison. The persistence / inspection / export side (save/load h5, browse loaded archives, ship single slots, the two-channels framing) moves to a sibling notebook11_save_load_export. Each notebook has one job, and notebook 11 gets a discoverable filesystem location that can be linked fromsave_fit/export_fitdocstrings, the README, and the CHANGELOG.Notebook 11 uses notebook 10’s full pipeline as its preamble via the IPython
%runmagic (%run ../10_model_comparison/example.ipynb), preceded by a one-line markdown pointer (“see10_model_comparisonfor the fitting/comparison details; this notebook focuses on what to do with the results”). Output suppression rides on the existingproject.yamlknobs already used by10_model_comparison(show_output: 0,auto_export: false);%%captureis the fallback, not the default mechanism. Expected preamble runtime ~30–40 s (baseline fits <1 s each, SbS ~10 s each, 2D fits a few seconds each) — acceptable for a notebook a reader opens deliberately. Single source of truth: notebook 10 owns the fit pipeline; notebook 11 inherits any future updates automatically and ends up with rich state (baseline + SbS + 2D slots, σ snapshot, conf_ci) available for the persistence demos.Casual user’s mental model =
File.file.save_fit()saves a snapshot of completed fits for this file, keeping the latest slot per model / fit type / selection (snapshot semantics inherited fromProject.save_fits). It is the casual user’s “save all current fits for this file” API.Model comparison (
FitResults.load+compare_models) is its own notebook (10_model_comparison), not part of01_basic_fitting— comparison requires fitting two models, which doubles the cognitive load of the basic notebook.Export terminology: export = one-way CSV/PNG for humans + tools like Origin; save/load = HDF5 round-trip via the
FitResultsarchive.Project.auto_export(defaultTrue, configurable viaproject.yamlor post-init mutation) gates the fit-completion CSV/PNG side effects. The basic notebook documents the default behavior and shows the opt-out.
Motivation
The fit-results work introduces a clearer split between:
Fileas the natural surface for fitting and exporting one dataset.Projectas configuration, workspace, multi-file coordination, and archive ownership.FitResultsas the inspection/comparison object for completed fits.
The examples should reinforce that split. Users who are thinking “fit file 1,
export the fit, load it into Origin” should not feel that they need to
understand the full Project model. Power users should still have a clear path
to multi-file fitting, project-level shared fits, archives, and model
comparison.
Current State
The examples are currently organized as:
examples/
data_generation/
simulator/
ml_training/
fitting_workflows/
01_basic_fitting/
02_dependent_parameters/
03_multi_cycle/
04_par_profiles/
05_project_level_fitting/
This is already close in one respect: simulation and ML training data are separate from fitting. The weakness is that all fitting notebooks live in one linear sequence, even though they represent different user mindsets:
Single-file fitting skills (
01through04).Post-fit comparison (currently absent).
Multi-file / project-level fitting (
05, with no bridge between the single-file basics and full project-level shared fits).
The current quickstart also tells users to start at 01 and work forward. That
is useful for a tutorial path, but less useful once examples cover multiple
tracks.
User Workflows
Single-file / Origin-style user
This user has one processed dataset and wants to fit it, export tables/plots, and keep working in external tools.
Primary API:
file.fit_baseline(...)
file.fit_2d(...)
file.export_fit() # CSV + PNG, Origin-friendly
file.save_fit() # HDF5 archive snapshot for this file
file.get_fit_results(fit_type="2d")
The Project should appear as setup/context, not as the main conceptual object.
For this user, Project is mostly where config, paths, plotting defaults, and
model files live.
Multi-file individual fitting user
This user has several files but wants to fit each file separately. They want a shared loop, consistent settings, per-file exports, and a summary view.
Primary API:
project = trspecfit.Project(...)
files = [...]
for file in files:
file.fit_baseline(...)
file.fit_2d(...)
project.export_fits() # one coherent tree across files
project.save_fits() # one portable HDF5 for the batch
project.results.compare_models(file=files[0], ...)
This is the bridge workflow: Project is useful as a collection and session
workspace, but each fit remains file-scoped.
Synthetic-data / ML user
This user is working with forward simulation, validation, or training data. They are not preparing experimental data and not primarily fitting an existing file. The current simulator and ML training notebooks belong together.
Proposed Directory Layout
examples/
fitting_workflows/ # existing name kept
01_basic_fitting/ # block 0x: fitting skills (single-file)
02_dependent_parameters/
03_multi_cycle_dynamics/ # renamed from 03_multi_cycle
04_parameter_profiles/ # renamed from 04_par_profiles
10_model_comparison/ # block 1x: post-fit work
11_save_load_export/ # block 1x: post-fit work (NEW)
20_fit_each_separately/ # block 2x: multi-file (NEW, bridge)
21_project_level_shared_fit/ # was 05_project_level_fitting
synthetic_data/ # renamed from data_generation
01_simulator/
02_ml_training_data/
Numbering convention documented in fitting_workflows/README.md:
0x — fitting skills on a single file.
1x — post-fit work (comparison, persistence, export).
2x — multi-file workflows.
The flat layout with three numeric blocks keeps alphabetical sort intact while making the category structure visible without an extra directory level. Numbering restarts at the next block boundary as new notebooks are added within a category.
Notebook Content Targets
fitting_workflows/01_basic_fitting
Core “one file” story:
Load processed
data,energy, andtime.Fit baseline, optional slice-by-slice, and 2D model.
Show
file.get_fit_results(...).Show
file.export_fit()as the Origin-friendly CSV/PNG workflow.Show
file.save_fit()as the archive-snapshot persistence (keeps the latest slot per model / fit type / selection for this file).Callout:
fit_*methods auto-write CSVs/PNGs toproject.path_resultson completion by default. The notebook shows both the default and theproject.auto_export = Falseopt-out (also settable viaproject.yaml).Out of scope here:
FitResults.loadandcompare_models— those move to10_model_comparisonso this notebook stays focused on the casual user’s single-file path.
fitting_workflows/10_model_comparison
Post-fit comparison story (NEW). Strictly model selection — persistence,
inspection, and export move to the sibling notebook 11_save_load_export
so each notebook has one job.
Two models, one file. Compress the fitting cells — readers have seen the fit API in 01–04, so this notebook glosses over fitting and focuses on comparison.
Three comparison stories, each isolating one structural choice: baseline (line shape), SbS (parsimony), 2D (instrument response).
In-session comparison via
project.results.compare_models(...)and the sugar delegatefile.compare_models(...).compare_modelsaggregation modes: defaultmedian,sum, andlongfor per-slice rows. Close §6 with a “two practical questions” payoff — “which model fits spectrum #4 best?” (long form + slice filter) and “which model fits best across the board?” (sum-aggregated, sorted by AIC). This motivateslongform for the batch-of-spectra use case (SbS as N independent 1D fits, not necessarily time).plot_residualsat both ends: 1D obs+fit+residual for baseline, shared-scale residual heatmaps for 2D (where the IRF residual band is the decisive visual).
Persistence content (save_fit / FitResults.load / compare_models
on loaded archives / slot anatomy / filtered single-slot ship / two
channels / overwrite semantics / σ-snapshot recalibration) does not
live here. See 11_save_load_export.
fitting_workflows/11_save_load_export
Save / load / export story (NEW). The canonical reference for the
FitResults archive API, used after a reader has seen fitting and
comparison.
Preamble pattern (first two cells):
Markdown pointer to
10_model_comparison(“see that notebook for the fits’ details; this one focuses on what to do with the results”).A single code cell that runs notebook 10’s content with suppressed output:
%run ../10_model_comparison/example.ipynb
IPython
%runexecutes the target notebook in the current kernel, so all of notebook 10’s variables (file,project, fitted models) are in scope below. Output suppression rides on the existingproject.yamlknobs (show_output: 0,auto_export: false).%%captureis the fallback if any output leaks past the YAML knobs. Runtime ~30–40 s.
After the preamble, the actual content:
file.save_fit("comparison.fit.h5")andFitResults.load(path)— the canonical round-trip with no liveProjecton the reload side.loaded.compare_models(...)showing the same comparison API works identically against an on-disk archive (sanity check, not the primary point).Filtered single-slot save (
save_fit(path, model=..., fit_type=...)) and the parallelexport_fitwith the same filters. “Ship the winners” as the natural next step once a reader has a verdict.The two channels framed by audience: HDF5 (structured, lossless, σ-snapshot included, round-trips back into trspecfit — for future you and other trspecfit users) vs CSV/PNG tree (one-way — for Origin, MATLAB, paper plots, non-trspecfit colleagues).
FitResultsquery API:files(),models(),find(),get(), and slot anatomy viadataclasses.fields(slot)rendering shapes for arrays/frames and keys for dicts (every constituent part is discoverable without opening the.h5).Overwrite / slot-collision semantics on
save_fit(append-by-default,FileExistsErroron slot collision unlessoverwrite=True).σ-snapshot semantics — calibrated columns survive load without re-
set_sigma(); what-if recalibration viachi2_red_raw.
The preamble pattern is preferred over an inline stripped-down setup because (a) single source of truth — notebook 10 owns the fit pipeline, notebook 11 inherits future updates automatically — and (b) rich state: all slot types (baseline + SbS + 2D, with conf_ci on baseline) are available for the persistence demos, not just a minimum quorum. The ~30–40 s runtime cost is acceptable for a notebook a reader opens deliberately.
fitting_workflows/20_fit_each_separately
Bridge story (NEW):
One model definition and one set of fit limits applied across N files (avoids the duplicated setup code a bare
for file in files: ...loop would require withoutProject).project.export_fits()produces a single coherent directory tree (<root>/<file_name>/<model>__<fit_type>/...) — easier to diff or zip than N separate per-file dumps.project.results.compare_models(file=...)works across the full batch, including replicates of the same physical sample.project.save_fits(path)packages the whole batch into one portable HDF5.Concrete contrast: mention what is lost when running the loop without
Project(the four points above) — makes the value prop explicit rather than implicit.
This notebook makes the distinction clear: multi-file workspace does not necessarily mean shared/project-level fitting.
synthetic_data
Forward-model story:
01_simulator: generate known-truth spectra for validation and demos.02_ml_training_data: sweep parameter space and save training datasets.
These examples can keep using Project/File internally because the simulator
needs a model, but the section is described as synthetic data generation, not
as a fitting tutorial.
Save/Export/Load Presentation
The examples should be careful about language:
Use export for one-way CSV/PNG output intended for humans and tools like Origin:
file.export_fit()/project.export_fits().Use save/load for round-trippable HDF5 fit-result archives:
file.save_fit(),project.save_fits(),FitResults.load(...),project.load_fits(...).Keep individual export visibly supported. The deprecated methods are the old method names and legacy implementations, not the single-file export workflow.
Present
FitResultsas the result browser/comparison object, not as something casual single-file users must understand before exporting.
Auto-export side effect. fit_* methods write CSVs/PNGs to
project.path_results automatically on completion by default. Explicit
file.export_fit() / project.export_fits() calls are the re-runnable,
slot-filtered version of that same content. project.auto_export = False
(also settable via auto_export: false in project.yaml) makes the explicit
path the only one that writes — useful for parameter sweeps, ML training-data
generation, and the long-term real-time fitting goal. Notebooks should
describe both the default and the opt-out, so users are not surprised by
files appearing on disk before they “exported.”
Migration Plan
Finish the current fit-results save/load branch with minimal notebook/docs coverage:
Extend
01_basic_fitting/example.ipynbwith a final section demonstratingfile.save_fit()+file.export_fit()(nocompare_models/load— those belong in10_model_comparison, written in the follow-up branch).Make sure
file.export_fit()is presented as the Origin-style path.Run tests and merge.
Start a new branch for the examples upgrade.
Move current notebooks into the new structure:
fitting_workflows/01_basic_fitting→ unchangedfitting_workflows/02_dependent_parameters→ unchangedfitting_workflows/03_multi_cycle→fitting_workflows/03_multi_cycle_dynamicsfitting_workflows/04_par_profiles→fitting_workflows/04_parameter_profilesfitting_workflows/05_project_level_fitting→fitting_workflows/21_project_level_shared_fitdata_generation/simulator→synthetic_data/01_simulatordata_generation/ml_training→synthetic_data/02_ml_training_data
Split and add notebooks:
fitting_workflows/10_model_comparison/— already exists post fit-saving merge. Trim to comparison-only: lift §8 (Save → Load → Compare Across Sessions), §9 (Browse the Loaded Archive), the “Ship just the winning fits” subsection, and the persistence bullets/tips into11_save_load_export. Update its intro table-of-contents (drop bullets about save/load/export) and the Tips block accordingly.fitting_workflows/11_save_load_export/— NEW. Two-cell preamble (markdown pointer +%run ../10_model_comparison/example.ipynb), then the content lifted from the pre-split notebook 10. See the content target above for the full scope.fitting_workflows/20_fit_each_separately/— NEW.
Add
fitting_workflows/README.mddocumenting the 0x / 1x / 2x numeric-block legend.Update
examples/README.md,docs/examples/index.rst, anddocs/quickstart.mdto use the track-based navigation. Grep for hardcoded old paths first.Run notebook smoke checks or at least path/import checks after the moves.
Non-goals For The Save/Load Branch
Do not reorganize the full
examples/tree in the save/load branch.Do not rewrite every existing notebook to the new teaching architecture before merging the archive work.
The save/load branch should only make the new feature discoverable enough that users are not stranded. The full teaching architecture belongs in the follow-up examples branch.
Data-preparation workflows. Dark subtraction, detector calibration, and
pixel-to-energy mapping are upstream preprocessing — out of scope for this
examples-upgrade pass. Dark subtraction and detector calibration may be too
instrument-specific for this repo’s core example tree. Energy-axis calibration
by fitting reference spectra is closer to trspecfit’s value proposition, so
it can be revisited later if we have a compact, shareable Au 4f / valence-band
style dataset and a workflow that teaches calibration without turning into an
instrument-control tutorial.
Open Questions
Should moved notebooks preserve old numeric prefixes exactly, or use the new block scheme? Resolved: use the new 0x / 1x / 2x block scheme; document the legend in
fitting_workflows/README.md.Should the examples upgrade include a compatibility note for old paths, or is this acceptable as a clean pre-1.0 examples reorganization? Resolve via grep of
docs/,README.md,examples/README.md, and any reference in the docstrings before the rename branch starts — the answer follows the number of hits.