Fitting Module
1D and 2D peak fitting functions based on lmfit.
This module provides the complete fitting workflow for spectroscopy data: - Residual calculation for optimization - Fitting with lmfit (including global + local optimization) - Confidence interval estimation (lmfit.conf_interval) - MCMC sampling via lmfit.emcee for uncertainty quantification - Result visualization and export
The fitting functions here work with the MCP (Model/Component/Parameter) system from trspecfit.mcp to provide a flexible, component-based fitting framework.
Key Functions
residual_fun : Compute residual for optimizer fit_wrapper : Main fitting function with CI and MCMC support plt_fit_res_1d : Plot 1D fit results with residuals plt_fit_res_2d : Plot 2D fit results with residual maps
- trspecfit.fitlib.compute_fit_metrics(*, observed: ndarray, fit: ndarray, n_free_pars: int, sigma_eff: float | None = None) dict[str, float][source]
Compute fit-quality metrics from observed and fitted arrays.
Always emits the raw (unweighted) diagnostics
chi2_rawandchi2_red_raw— these match lmfit’sMinimizerResult.chisqr / .redchifor unweighted fits. Whensigma_effis provided, also emits the σ-calibratedchi2andchi2_red(≈ 1for a fit at the noise floor); withoutsigma_effboth calibrated values areNaN.r2,aic,bicare unaffected by σ (R² is dimensionless; AIC/BIC depend on raw χ² but their differences are invariant under constant rescaling).- Parameters:
observed (ndarray) – Data view that was fit against (e.g.
data_basefor baseline, croppeddatafor sbs/2d). Any shape; the array is flattened.fit (ndarray) – Model evaluated at final parameters. Must broadcast to
observed.n_free_pars (int) – Number of varying (non-fixed, non-expression) parameters in the fit. Used as
nvarysin the AIC/BIC and reduced-χ² formulas.sigma_eff (float, optional) – Effective noise σ on the fit’s data view (per-pixel for SbS/2D,
σ_pixel / √N_avgfor baseline). WhenNone/NaN/ non-positive, the calibratedchi2/chi2_redfields areNaN. Caller is responsible for any view-specific scaling.
- Returns:
{"chi2_raw", "chi2_red_raw", "chi2", "chi2_red", "r2", "aic", "bic"}.chi2_red_raw,aic,bicareNaNwhenndata <= n_free_parsorchi2_raw == 0(degenerate fits);chi2/chi2_redare additionallyNaNwhensigma_effis missing or invalid.- Return type:
- trspecfit.fitlib.fit_wrapper(const: tuple[Any, ...], args: tuple[Any, ...], par_names: list[str], par: Any, stages: int, sigmas: list[float] | None = None, try_ci: int = 1, mc_settings: MC | None = None, fit_alg_1: str = 'Nelder', fit_alg_2: str = 'leastsq', show_output: int = 0, save_output: int = 0, save_path: PathLike = '', num_fmt: str = '%.6e', delim: str = ',') list[Any][source]
Comprehensive fitting wrapper with optimization, CI, and MCMC.
This is the main fitting function in trspecfit. It handles: - Single or two-stage optimization - Confidence interval estimation via lmfit.conf_interval - MCMC sampling via lmfit.emcee - Result visualization and export
Two-stage fitting (stages=2) is recommended for robust optimization: first finds global minimum with Nelder-Mead, then refines locally with leastsq for accurate error bars.
- Parameters:
const (tuple) – Constants for residual_fun: (x, data, function_str, unpack, e_lim, t_lim)
args (tuple) – Arguments for fit function (passed to residual_fun): Typically (model, dim) for MCP models
par_names (list of str) – Parameter names in order (for display and export)
par (lmfit.Parameters or list) –
Initial parameter guess:
lmfit.Parameters: Use directly
list: Convert to lmfit.Parameters using par_names. Each element:
[value, vary, min, max]or['expression']
stages ({1, 2}) –
Number of optimization stages:
1: Single optimization with
fit_alg_12: Two-stage fit (
fit_alg_1thenfit_alg_2)
sigmas (list of int or float, default=[1,2,3]) – Confidence levels for CI and MCMC (e.g., [1,2,3] for 1σ, 2σ, 3σ)
try_ci ({0, 1}, default=1) –
Confidence interval estimation:
0: Skip CI calculation
1: Calculate CI if error bars available (result.errorbars=True)
mc_settings (ulmfit.MC, default=ulmfit.MC()) –
MCMC configuration object:
use_emcee: 0 (skip), 1 (always), 2 (if CI fails)
steps: Number of MCMC steps per walker
nwalkers: Number of MCMC walkers
burn, thin, ntemps, workers, is_weighted: MCMC parameters
See ulmfit.MC class for details
fit_alg_1 (str, default='Nelder') –
First/only optimization method. Common options:
’Nelder’: Nelder-Mead (robust, no gradients, slower)
’powell’: Powell (robust, no gradients)
’leastsq’: Levenberg-Marquardt (fast, needs good initial guess)
See lmfit documentation for full list
fit_alg_2 (str, default='leastsq') – Second optimization method (stages=2 only). Typically ‘leastsq’ for accurate local optimization and error bars.
show_output ({0, 1}, default=0) –
Output mode:
0: Silent / programmatic / API mode – no prints
1: Interactive / notebook / UI mode – show timing, fit results, and confidence intervals
save_output ({-1, 0, 1}, default=0) –
Save results to files:
0: Don’t save
1: Save all results (parameters, CIs, MCMC, plots)
-1: Same as 1 (for compatibility)
save_path (str or Path, default='') – Base path for saved files (without extension). Files saved: _par_ini.csv, _par_fin.txt, _par_fin.csv, _conf_ci.csv, _emcee_fin.txt, _emcee_flatchain.csv, _emcee_ci.csv, _emcee_walker_acceptance_ratio.png, _emcee_corner_plot.png
num_fmt (str, default='%.6e') – Float format applied to CSV outputs (pandas
float_format).delim (str, default=',') – Delimiter applied to CSV outputs (pandas
sep).
- Returns:
Five-element list containing results: [par_ini, par_fin, conf_ci, emcee_fin, emcee_ci]
par_ini (lmfit.Parameters) – Initial parameter guess.
par_fin (lmfit.MinimizerResult or []) – Final fit result from lmfit.minimize.
conf_ci (pd.DataFrame) – Confidence intervals from lmfit.conf_interval. Columns:
['par[v]/sigma[>]', '-3σ', '-2σ', '-1σ', 'best', '+1σ', '+2σ', '+3σ']. Empty DataFrame if CI not calculated/failed.emcee_fin (lmfit.MinimizerResult or []) – MCMC result from lmfit.emcee. Empty list if MCMC not used.
emcee_ci (pd.DataFrame) – MCMC confidence intervals from quantiles of flatchain. Same column structure as conf_ci. Empty DataFrame if MCMC not used.
- Return type:
Examples
>>> # Basic single-stage fit >>> const = (energy, spectrum, spectra, 'fit_model_mcp', 0, [], []) >>> args = (model, 1, False) >>> results = fit_wrapper( ... const=const, ... args=args, ... par_names=model.parameter_names, ... par=model.lmfit_pars, ... stages=1, ... show_output=1 ... ) >>> par_ini, par_fin, conf_ci, emcee_fin, emcee_ci = results
>>> # Two-stage fit with confidence intervals >>> results = fit_wrapper( ... const=const, ... args=args, ... par_names=model.parameter_names, ... par=model.lmfit_pars, ... stages=2, ... try_ci=1, ... sigmas=[1, 2, 3], ... show_output=1, ... save_output=1, ... save_path='fit_results/baseline_fit' ... )
>>> # Fit with MCMC for uncertainty quantification >>> mc = ulmfit.MC(use_emcee=1, steps=5000, nwalkers=50) >>> results = fit_wrapper( ... const=const, ... args=args, ... par_names=model.parameter_names, ... par=model.lmfit_pars, ... stages=2, ... try_ci=1, ... mc_settings=mc, ... show_output=1 ... )
Notes
Two-Stage Fitting: Recommended approach for robust results: 1. Nelder-Mead finds global minimum (no gradient needed) 2. Levenberg-Marquardt refines locally (fast, accurate errors)
This combination avoids local minima while providing reliable error estimates.
Error Estimation Methods: - Confidence intervals (CI): Profile likelihood method, exact but slow - MCMC: Samples parameter space, handles correlations, provides full distributions - Use MCMC for complex models or when CI fails
MCMC Diagnostics: When using MCMC, check: - Acceptance ratios (saved plot): Should be 0.2-0.5 - Corner plot (saved): Should show well-defined peaks - Chain length: Increase steps if distributions look noisy
Performance Tips: - Use stages=1 for quick fits during model development - Use stages=2 for final/publication fits - MCMC is slow (minutes for complex models) but provides best uncertainties
File Outputs: When save_output=1: - CSV files: Comma-separated, easy to read in Excel/pandas - TXT files: Human-readable lmfit.fit_report format - PNG files: High-resolution plots for documentation
- trspecfit.fitlib.plt_fit_res_1d(x: ArrayLike, y: ArrayLike, fit_fun_str: str, par_init: Any, par_fin: Any, args: tuple[Any, ...] | None = None, *, plot_sum: bool = False, show_init: bool = True, title: str = '', fit_lim: list[int] | None = None, config: PlotConfig | None = None, legend: list[str] | None = None, **kwargs: Any) None[source]
Plot 1D fit results: data, initial guess, final fit, components, and residual.
Creates a comprehensive visualization showing data, model components, total fit, and residual. Essential for evaluating fit quality and understanding multi-component models.
- Parameters:
x (array) – X-axis data (energy or time)
y (array) – Y-axis data (spectrum to be fitted)
fit_fun_str (str) – Name of fitting function in
trspecfit.spectra(e.g.,'fit_model_mcp','fit_model_gir')par_init (list or lmfit.Parameters) – Initial parameter guess. Can be empty list [] if show_init=False.
par_fin (lmfit.MinimizerResult or lmfit.Parameters or list) –
Final fit parameters:
lmfit.MinimizerResult: From fit_wrapper result[1]
lmfit.Parameters: Manual parameter object
list: Empty list shows initial guess only (no final fit)
args (tuple, optional) – Additional arguments for fit function (model, dim). If None, defaults to empty tuple.
plot_sum (bool, default=False) –
Plot sum only:
False: Show each component separately (colored + filled)
True: Show only total fit (faster, cleaner for many components)
show_init (bool, default=True) –
Show initial parameter guess:
True: Plot initial guess as dotted gold line
False: Skip initial guess (cleaner when guess is far off)
title (str, default='') – Plot title. Use for file/model identification.
fit_lim (list of int, optional) – Fit limit indices [start, stop) to show as vertical dashed lines. Visualizes which data region was used for optimization.
config (PlotConfig, optional) – Plot configuration object. If None, uses defaults.
legend (list of str, optional) – Legend labels for components (used only if plot_sum=False). If None, auto-generates ‘component 0’, ‘component 1’, etc.
**kwargs (dict) – Override config attributes for this plot: x_label, y_label, x_lim, y_lim, x_dir, y_dir, res_mult, save_img, save_path, dpi_plot, dpi_save
Notes
When saving (save_img=1 or -1), provide full path with extension: save_path=’results/baseline_fit.png’
- trspecfit.fitlib.plt_fit_res_2d(data: ndarray, fit: ndarray, x: ArrayLike | None = None, y: ArrayLike | None = None, config: PlotConfig | None = None, **kwargs: Any) None[source]
Plot 2D fit results: data, fit, and residual maps.
Creates a three-panel visualization showing measured data, fitted data, and residual (data - fit) as 2D color maps. Use to improve fit by switching component type, changing number of components, etc.
- Parameters:
data (2D array) – Measured data (shape: [n_time, n_energy])
fit (2D array) – Fitted data (same shape as data)
x (array-like, optional) – X-axis (energy) coordinates. If None, uses column indices.
y (array-like, optional) – Y-axis (time) coordinates. If None, uses row indices.
config (PlotConfig, optional) – Plot configuration object. If None, uses defaults.
**kwargs (dict) –
Override config attributes for this plot.
Common options:
x_label, y_label : Axis labels (z_label used for colorbar title)
x_lim, y_lim : Fit limit indices
[left, right]or[start, stop]. Used for both slicing residual and drawing limit linesz_lim_top : Color scale
[min, max]for data and fit panels. Synchronized scale enables direct comparisonz_lim_res : Color scale
[min, max]for residual panel. Independent scale optimizes residual visibilityz_colormap : Colormap name (default ‘viridis’)
x_dir, y_dir : ‘def’ or ‘rev’ for axis direction
x_type, y_type : ‘lin’ or ‘log’ for axis scale
save_img : 0 (display), 1 (save+display), -1 (save only)
save_path : Directory path (file saved as ‘2D_data_fit_res.png’)
- trspecfit.fitlib.plt_fit_res_pars(df: DataFrame, x: ArrayLike | None = None, config: PlotConfig | None = None, save_img: int | list[int] = 0, save_path: PathLike = '') None[source]
Plot fit parameters individually as functions of time/index.
Creates separate plots for each parameter column in the DataFrame, showing how parameters evolve over time (from Slice-by-Slice fitting).
- Parameters:
df (pd.DataFrame) – DataFrame with parameters as columns. Typically from results_to_df(). Each row represents one fitted slice/time point.
x (array-like, optional) – X-axis (time) values for plotting. If None, uses row indices.
config (PlotConfig, optional) – Plot configuration object. If None, uses defaults.
save_img (int or list, default=0) –
Save/display control for each plot:
int: Apply same setting to all parameters
0: Display only
1: Display and save
-1: Save only (no display)
list: One element per parameter (per row in df). Allows selective saving (e.g., save only varied parameters)
save_path (str or Path, default='') – Directory path for saving plots. Each plot saved as: save_path/{parameter_name}.png Directory created if doesn’t exist.
- trspecfit.fitlib.residual_fun(par: Any, x: ArrayLike, data: ndarray, fit_fun_str: str, unpack: int = 0, e_lim: list[int] | None = None, t_lim: list[int] | None = None, res_type: str = 'lmfit', args: Sequence[Any] | None = None) ndarray | float[source]
Compute residual (data - fit) for optimization and analysis.
This function is called repeatedly by optimizers to evaluate how well current parameters fit the data. It supports both 1D and 2D fitting, with options for limiting the fitting region and returning different residual representations.
- Parameters:
par (list or lmfit.Parameters) – Current parameter values. If lmfit.Parameters, values are extracted. Order must match the fit function’s parameter expectations.
x (ndarray) – Independent variable axis (energy or time). Passed to fit function but may not be used if function has its own axes.
data (ndarray (1D or 2D)) –
Experimental data to fit. Shape determines dimensionality:
1D: [n_energy] for energy-resolved fits
2D: [n_time, n_energy] for time- and energy-resolved fits
fit_fun_str (str) – Name of fit function in
trspecfit.spectra(e.g.,'fit_model_mcp','fit_model_gir')unpack ({0, 1}, default=0) –
Parameter passing mode:
0: Pass parameters as list:
fit_fun(x, par, ...)1: Unpack parameters:
fit_fun(x, *par, ...)
e_lim (list of int, default=[]) – Energy axis limits [start, stop) for residual calculation. Uses slice notation: data[e_lim[0]:e_lim[1]] Empty list uses full energy range.
t_lim (list of int, default=[]) – Time axis limits [start, stop) for residual calculation. Uses slice notation: data[t_lim[0]:t_lim[1]] Empty list uses full time range.
res_type ({'lmfit', 'RSS', 'abs', 'res', 'fit'}, default='lmfit') –
Return type:
’lmfit’: Residual array (1D) or flattened residual (2D) for lmfit
’RSS’: Residual sum of squares (scalar, for scipy.optimize)
’abs’: Sum of absolute residuals (scalar, L1 norm)
’res’: Raw residual array (data - fit)
’fit’: Return fit itself instead of residual
args (tuple, default=()) – Additional arguments for fit function, passed via
*args
- Returns:
Return type depends on res_type: - ‘lmfit’: ndarray (1D for 1D data, flattened for 2D data) - ‘RSS’, ‘abs’: float (scalar metric) - ‘res’: ndarray (same shape as data) - ‘fit’: ndarray (same shape as data)
- Return type:
ndarray or float
- trspecfit.fitlib.results_to_df(results: list[Any], x: ArrayLike | None = None, index: ArrayLike | None = None, config: PlotConfig | None = None, save_df: int = 0, save_path: PathLike = '', num_fmt: str = '%.6e', delim: str = ',') DataFrame[source]
Convert Slice-by-Slice fit results to DataFrame with parameter plots.
Transforms list of fit results (from slice-by-slice fitting) into a pandas DataFrame with time/index as rows and parameters as columns. Optionally creates individual plots for each parameter vs. time.
- Parameters:
results (list) – List of fit results from fit_wrapper, one per time slice. Each element: [par_ini, par_fin, conf_ci, emcee_fin, emcee_ci]
x (array-like, optional) – Time axis values. If provided, included as column in DataFrame.
index (array-like, optional) – Index values (e.g., slice numbers). If provided, included as column.
config (PlotConfig, optional) – Plot configuration. If None, uses defaults.
save_df ({-1, 0, 1}, default=0) –
Save outputs:
0: Don’t save
1: Save DataFrame and parameter plots
-1: Same as 1
When save_df != 0, plots only varied (not fixed) parameters
save_path (str or Path, default='') – Directory path for saving files (not full filename) (created if not exists). Files saved: ‘fit_pars.csv’, ‘{param_name}.png’ for each parameter
num_fmt (str, default='%.6e') – Float format applied to
fit_pars.csv(pandasfloat_format).delim (str, default=',') – Delimiter applied to
fit_pars.csv(pandassep).
- Returns:
Results with structure: - Columns: [‘index’, config.y_label, param1, param2, …] - Rows: One per fitted slice - Values: Optimized parameter values
- Return type:
pd.DataFrame
- trspecfit.fitlib.results_to_fit_2d(results: list[Any] | DataFrame, const: tuple[Any, ...], args: tuple[Any, ...], num_fmt: str = '%.6e', delim: str = ',', save_2d: int = 0, save_path: PathLike = '') ndarray[source]
Reconstruct 2D fit spectrum from Slice-by-Slice fit results.
Takes individual 1D fit results (one per time slice) and stacks them into a complete 2D fit array. This allows visualization and comparison with the measured 2D data for Slice-by-Slice fitting.
- Parameters:
results (list or pd.DataFrame) –
Fit results, either:
list: Output from fit_wrapper for each slice. Each element:
[par_ini, par_fin, conf_ci, emcee_fin, emcee_ci]pd.DataFrame: From results_to_df() with parameters as columns
const (tuple) – Constants for residual_fun: (x, data, function_str, unpack, e_lim, t_lim) Used to evaluate fit function at each time point.
args (tuple) – Arguments for fit function (model, dim). Passed to residual_fun for spectrum generation.
num_fmt (str, default='%.6e') – Number format for saving (scientific notation with 6 decimals)
delim (str, default=',') – Delimiter for CSV output
save_2d ({-1, 0, 1}, default=0) –
Save 2D fit to file:
0: Don’t save
1: Save to CSV
-1: Save to CSV (same as 1)
save_path (str or Path, default='') – Directory path for saving. File saved as: save_path/fit_2d.csv Directory created if doesn’t exist.
- Returns:
2D fit array (shape: [n_time, n_energy]) Each row is the fitted spectrum for one time slice
- Return type:
ndarray
- trspecfit.fitlib.sigma_dict() dict[str, float][source]
Get percentage of distribution within N-sigma intervals.
Returns dictionary mapping sigma values (as strings) to the percentage of a normal distribution contained within ±N sigma of the mean.
- Returns:
Keys: Sigma values as strings (‘0.5’, ‘1.0’, …, ‘5.0’) Values: Percentage of distribution (float)
- Return type:
- trspecfit.fitlib.sigma_start_stop_percent(sigma_list: Sequence[float]) list[list[float]][source]
Calculate percentile bounds for symmetric confidence intervals.
For each sigma level, computes the lower and upper percentiles that define a symmetric interval containing the specified percentage of a normal distribution.
- Parameters:
sigma_list (list of float) – Sigma values (e.g., [1, 2, 3] for 1σ, 2σ, 3σ intervals). Values must be in range [0.5, 5.0] in 0.5 increments.
- Returns:
Percentile bounds for each sigma level. Each element: [lower_percentile, upper_percentile] Returns empty list if any sigma value not supported.
- Return type:
Examples
>>> # 1σ, 2σ, 3σ intervals >>> bounds = sigma_start_stop_percent([1.0, 2.0, 3.0]) >>> print(bounds) [[15.87, 84.13], [2.28, 97.72], [0.14, 99.86]]
>>> # Use with numpy.percentile on MCMC samples >>> import numpy as np >>> samples = np.random.normal(0, 1, 10000) >>> bounds_1sigma = sigma_start_stop_percent([1.0])[0] >>> ci = np.percentile(samples, bounds_1sigma) >>> print(f"1σ interval: [{ci[0]:.2f}, {ci[1]:.2f}]") 1σ interval: [-1.01, 1.02]
- trspecfit.fitlib.time_display(t_start: float, print_str: str = '', *, return_delta_seconds: bool = False) float | None[source]
Display elapsed time in human-readable format.
Computes time elapsed since t_start and displays it with appropriate units (seconds, minutes, hours, or days). Useful for benchmarking fitting operations.
- Parameters:
- Returns:
None normally, or elapsed seconds if return_delta_seconds=True
- Return type:
None or float
Examples
>>> import time >>> t0 = time.time() >>> # ... do some work ... >>> time_display(t0, 'Operation completed in: ') Operation completed in: 01:23.456(mm:ss.ms)