hqs_nmr.analysis.autoshift_tools
Tools for automatically adjusting chemical shifts (autoshift).
Functions
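| calculate_area | Calculate the area of a function. |
| calculate_center_of_mass | Calculate the center of mass of the spectrum. |
| calculate_cumulative_fit_scores | Calculate normalized cumulative fit scores. |
| calculate_cumulative_fit_scores_raw | Calculate normalized cumulative fit scores in [0, 1]. |
| calculate_fit_scores | Calculate fit scores between a reference and contributions summing to a signal. |
| cost_int_of_diff_interpolator_inverse | Cost function as the integral of the absolute differences between reference and signal. |
| cost_integral_diff_shift_adapt_eta | Integral of the absolute difference between reference and shifted and adapted signal. |
| cost_integral_interpolators_list_inverse_opt | Integral of the absolute differences between reference and shifted signal. |
| define_shift_boundaries | Find the shift boundaries clipping them to be within the given boundaries. |
| enforce_equal_shifts_for_equivalent_spins | Enforce equal shifts for equivalent spins. |
| get_new_guess_jc_eta | Get new guess for J-coupling adaptation. |
| get_spectrum_from_interpolator_inverse | Get the spectrum from the interpolators of the inverse of the Green's function. |
| lorentzian | Lorentzian function. |
| lorentzian_filter | Apply a lorentzian filter to the signal. |
| new_guess_via_cumulative_fit_score | Generate a new guess based on cumulative fit scores. |
| new_guess_via_step_size | Generate new guess by stepping from the current one. |
| process_experimental_spectrum | Process experimental data. |
| process_greens_function | Sum the Green's function over the equivalent spins. |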
- hqs_nmr.analysis.autoshift_tools._order_by_frequency(spectrum: ndarray, omegas: ndarray) → tuple[ndarray, ndarray]
Order spectrum by frequency.
- Parameters:
spectrum (np.ndarray) – spectrum.
omegas (np.ndarray) – frequencies.
- Returns:
Ordered spectrum and frequencies.
- Return type:
tuple[np.ndarray, np.ndarray]
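As a rough illustration of the ordering step, the sketch below sorts both arrays with a single permutation from `np.argsort`. The name `order_by_frequency_sketch` and the choice of ascending order are assumptions made for this example, not the library's implementation.

```python
import numpy as np


def order_by_frequency_sketch(spectrum, omegas):
    """Sort a spectrum and its frequency grid by ascending frequency (assumed order)."""
    order = np.argsort(omegas)  # permutation that sorts the frequencies
    return spectrum[order], omegas[order]


omegas = np.array([3.0, 1.0, 2.0])
spectrum = np.array([30.0, 10.0, 20.0])
sorted_spectrum, sorted_omegas = order_by_frequency_sketch(spectrum, omegas)
```

Applying the same permutation to both arrays keeps each intensity attached to its frequency.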
- hqs_nmr.analysis.autoshift_tools._scale_fwhm(gf: ndarray, omegas: ndarray, fwhm_scale: float) → ndarray
Apply a FWHM broadening to the green’s function.
- Parameters:
gf (np.ndarray) – green’s function.
omegas (np.ndarray) – frequencies.
fwhm_scale (float) – broadening scaling factor.
- Returns:
Broadened green’s function.
- Return type:
np.ndarray
- hqs_nmr.analysis.autoshift_tools.calculate_area(fx: ndarray, x: ndarray) → float
Calculate the area of a function.
- Parameters:
fx (np.ndarray) – function.
x (np.ndarray) – x-values on which the function is evaluated.
- Returns:
The area of the function.
- Return type:
float
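A minimal sketch of an area calculation on a sampled function, assuming the trapezoidal rule is an acceptable stand-in (the documentation above does not state which quadrature the library uses). The name `area_sketch` is hypothetical.

```python
import numpy as np


def area_sketch(fx, x):
    """Trapezoidal estimate of the integral of fx over x (assumed quadrature)."""
    return float(0.5 * np.sum(np.diff(x) * (fx[1:] + fx[:-1])))


x = np.linspace(0.0, 1.0, 101)
area = area_sketch(2.0 * x, x)  # the integral of 2x on [0, 1] is 1
```

The trapezoidal rule is exact for linear integrands, which makes this a convenient sanity check.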
- hqs_nmr.analysis.autoshift_tools.calculate_center_of_mass(spectrum: ndarray, omegas: ndarray) → float
Calculate the center of mass of the spectrum.
- Parameters:
spectrum (np.ndarray) – spectrum.
omegas (np.ndarray) – frequencies.
- Returns:
The center of mass of the spectrum.
- Return type:
float
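The center of mass of a spectrum is commonly taken as the intensity-weighted mean frequency, i.e. the ratio of the integral of ω·S(ω) to the integral of S(ω). The sketch below assumes that definition and trapezoidal integration; `center_of_mass_sketch` is a hypothetical name.

```python
import numpy as np


def center_of_mass_sketch(spectrum, omegas):
    """Intensity-weighted mean frequency (assumed definition)."""

    def trapezoid(y):
        return 0.5 * np.sum(np.diff(omegas) * (y[1:] + y[:-1]))

    return float(trapezoid(omegas * spectrum) / trapezoid(spectrum))


omegas = np.linspace(0.0, 4.0, 401)
spectrum = np.exp(-((omegas - 2.0) ** 2) / 0.02)  # symmetric peak centered at 2
center = center_of_mass_sketch(spectrum, omegas)
```

For a peak that is symmetric on the grid, the result coincides with the peak position.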
- hqs_nmr.analysis.autoshift_tools._index_bookkeeping(result_greens_function: NMRResultGreensFunction1D) → tuple[list[int], dict[int, int], dict[int, list[int]]]
Extract all indices for bookkeeping.
Finds all indices of hydrogens, groups the equivalent ones and creates the maps between these two representations.
- Parameters:
result_greens_function (NMRResultGreensFunction1D) – Green’s function result.
- Returns:
indices of the hydrogens, spins to groups map and groups to spins map.
- Return type:
tuple[list[int], dict[int, int], dict[int, list[int]]]
- hqs_nmr.analysis.autoshift_tools._determine_max_shifts(exchangeable_protons: list[int], groups_to_spin_map: dict[int, list[int]], max_shift_h: float, max_shift_exc: float) → ndarray
Return max allowed shift for each spin group, taking care of exchangeable protons.
- Parameters:
exchangeable_protons (list[int]) – exchangeable protons indices.
groups_to_spin_map (dict[int, list[int]]) – spin groups to spins map.
max_shift_h (float) – maximum shift for hydrogens.
max_shift_exc (float) – maximum shift for exchangeable protons.
- Returns:
max shift array.
- Return type:
np.ndarray
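One plausible reading of the per-group limit is sketched below: a group receives `max_shift_exc` if any of its spins is listed as exchangeable, and `max_shift_h` otherwise. Both that rule and the ordering of the output by sorted group index are assumptions for illustration; `max_shifts_sketch` is a hypothetical name.

```python
import numpy as np


def max_shifts_sketch(exchangeable_protons, groups_to_spin_map, max_shift_h, max_shift_exc):
    """Return one maximum shift per spin group (assumed rule, see lead-in)."""
    exchangeable = set(exchangeable_protons)
    groups = sorted(groups_to_spin_map)  # assumed output ordering
    max_shifts = np.empty(len(groups))
    for position, group in enumerate(groups):
        spins = groups_to_spin_map[group]
        is_exchangeable = any(spin in exchangeable for spin in spins)
        max_shifts[position] = max_shift_exc if is_exchangeable else max_shift_h
    return max_shifts


# A methyl-like group (spins 0, 1, 2) and a single exchangeable proton (spin 3).
max_shifts = max_shifts_sketch([3], {0: [0, 1, 2], 1: [3]}, max_shift_h=0.5, max_shift_exc=2.0)
```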
- hqs_nmr.analysis.autoshift_tools._resample_spectrum(spectrum: ndarray, omegas: ndarray, num_rediscretized_omegas: int, num_linear_omegas: int) → tuple[ndarray, ndarray]
Resample the spectrum to an optimized set of frequencies.
The spectrum is resampled on a frequency grid that combines a grid based on equal-area rediscretization with a grid ensuring that no frequency step is larger than that of a theoretical linear grid of the specified dimension.
- Parameters:
spectrum (np.ndarray) – spectrum to resample.
omegas (np.ndarray) – original reference frequencies.
num_rediscretized_omegas (int) – max total number of rediscretized frequency points.
num_linear_omegas (int) – max number of linearly discretized frequency points, ensuring that the maximum frequency step is frequency_span / num_linear_omegas.
- Returns:
resampled spectrum and new frequencies.
- Return type:
tuple[np.ndarray, np.ndarray]
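The combination of an equal-area grid with a linear safety grid can be sketched as follows. This is only an illustration of the idea under several assumptions: the equal-area points are obtained by inverting the cumulative integral of the absolute spectrum, the two grids are merged with `np.union1d`, and the spectrum is linearly interpolated onto the result. The actual point-count semantics of the library may differ, and `resample_sketch` is a hypothetical name.

```python
import numpy as np


def resample_sketch(spectrum, omegas, num_rediscretized, num_linear):
    """Merge an equal-area frequency grid with a linear one (illustrative only)."""
    # Cumulative trapezoidal area of |spectrum| on the original grid.
    magnitude = np.abs(spectrum)
    increments = 0.5 * np.diff(omegas) * (magnitude[1:] + magnitude[:-1])
    cumulative = np.concatenate(([0.0], np.cumsum(increments)))
    # Frequencies at which the cumulative area hits equally spaced targets.
    targets = np.linspace(0.0, cumulative[-1], num_rediscretized)
    equal_area = np.interp(targets, cumulative, omegas)
    # Linear grid bounding the largest step by span / num_linear.
    linear = np.linspace(omegas[0], omegas[-1], num_linear + 1)
    new_omegas = np.union1d(equal_area, linear)
    return np.interp(new_omegas, omegas, spectrum), new_omegas


omegas = np.linspace(-5.0, 5.0, 2001)
spectrum = 0.1 / (omegas**2 + 0.01)  # narrow positive peak at zero
new_spectrum, new_omegas = resample_sketch(spectrum, omegas, 50, 20)
```

In this toy case most of the equal-area points cluster around the peak, while the linear grid keeps the flat tails from being left without samples.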
- hqs_nmr.analysis.autoshift_tools._identify_multiplets_reduced(spectrum: NMRResultSpectrum1D, spin2groups_map: dict[int, int]) → list[NMRJoinedMultiplet]
Identify spectrum multiplets and group them according to the provided spin to groups map.
- Parameters:
spectrum (NMRResultSpectrum1D) – spectrum to identify multiplets from.
spin2groups_map (dict[int, int]) – map of spins to groups.
- Returns:
list of multiplets grouped according to the provided map.
- Return type:
list[NMRJoinedMultiplet]
- hqs_nmr.analysis.autoshift_tools._get_compatible_multiplets(ref_multiplet: NMRMultiplet, ref_multiplet_cms: float, multiplets: list[NMRJoinedMultiplet], multiplets_cms: ndarray, max_shifts: ndarray) → list[NMRJoinedMultiplet]
Get compatible multiplets for a given reference multiplet.
A multiplet is considered compatible with the reference one if its center of mass is within max_shift of the reference one.
- Parameters:
ref_multiplet (NMRMultiplet) – reference multiplet.
ref_multiplet_cms (float) – reference multiplet center of mass.
multiplets (list[NMRJoinedMultiplet]) – list of multiplets to check.
multiplets_cms (np.ndarray) – list of multiplets center of mass.
max_shifts (np.ndarray) – maximum shift allowed for center of mass for each multiplet.
- Returns:
list of compatible joined multiplets.
- Return type:
list[NMRJoinedMultiplet]
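The compatibility criterion described above reduces to a per-multiplet distance check on the centers of mass. The sketch below works on plain arrays and returns indices instead of multiplet objects, since the structure of `NMRJoinedMultiplet` is not described here; `compatible_indices_sketch` is a hypothetical helper.

```python
import numpy as np


def compatible_indices_sketch(ref_multiplet_cms, multiplets_cms, max_shifts):
    """Indices of multiplets whose center of mass lies within max_shift of the reference."""
    mask = np.abs(multiplets_cms - ref_multiplet_cms) <= max_shifts
    return np.flatnonzero(mask).tolist()


multiplets_cms = np.array([1.0, 2.4, 5.0])
max_shifts = np.array([0.5, 0.5, 2.0])
compatible = compatible_indices_sketch(2.0, multiplets_cms, max_shifts)
```

Only the second multiplet qualifies here: the first is 1.0 away with a limit of 0.5, and the third is 3.0 away with a limit of 2.0.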
- hqs_nmr.analysis.autoshift_tools._cut_out_multiplet_from_signal(signal: ndarray, frequencies: ndarray, multiplet: NMRJoinedMultiplet | NMRMultiplet, buffer: float = 0, eta: float | None = None) → tuple[ndarray, ndarray]
Single out a multiplet peak from the signal.
Here, “single out” means dropping the elements of signal and frequencies that lie outside the boundaries of the given multiplet.
- Parameters:
signal (np.ndarray) – signal to single out the peak from.
frequencies (np.ndarray) – frequencies of the signal.
multiplet (Union[NMRJoinedMultiplet, NMRMultiplet]) – multiplet to single out.
buffer (float) – buffer in terms of multiplet span percentage to add on either side of the singled out peak. Signal will be assumed 0 + i eta there. Defaults to 0.
eta (Optional[float]) – eta value to set in the buffer region. Defaults to None.
- Returns:
cut out signal and frequencies.
- Return type:
tuple[np.ndarray, np.ndarray]
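Dropping everything outside the multiplet boundaries amounts to a boolean mask. The sketch below takes the boundaries as two plain floats (`lower`, `upper`), which are hypothetical stand-ins for whatever the multiplet object exposes, and it ignores the optional buffer region.

```python
import numpy as np


def cut_out_sketch(signal, frequencies, lower, upper):
    """Keep only the samples with lower <= frequency <= upper (buffer not handled)."""
    mask = (frequencies >= lower) & (frequencies <= upper)
    return signal[mask], frequencies[mask]


frequencies = np.arange(11, dtype=float)
signal = frequencies**2
cut_signal, cut_frequencies = cut_out_sketch(signal, frequencies, 3.0, 6.0)
```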
- hqs_nmr.analysis.autoshift_tools._find_optimal_shifts_fwhm(cost_matrix: ndarray, shifts_matrix: ndarray, fwhm_scale_matrix: ndarray) → tuple[ndarray, ndarray]
Find optimal chemical shifts and fwhm given a cost matrix.
- Parameters:
cost_matrix (np.ndarray) – cost associated with the different pairs of shifts and fwhm scales. Its row index spans the different detected experimental multiplets and its column index spans the simulated spin contributions. NaNs should be used for the cost of non-compatible experimental and simulated multiplets.
shifts_matrix (np.ndarray) – shifts that minimize the cost for the corresponding experimental multiplets and simulated spin contribution.
fwhm_scale_matrix (np.ndarray) – fwhm that minimizes the cost for the corresponding experimental multiplets and simulated spin contribution.
- Returns:
optimal chemical shifts and fwhm
- Return type:
tuple[np.ndarray, np.ndarray]
- hqs_nmr.analysis.autoshift_tools._unreduce_shift_fwhm(shifts: ndarray, fwhm_scales: ndarray, num_isotopes: int, groups2spin_map: dict[int, list[int]]) → tuple[ndarray, ndarray]
Convert shifts and fwhm scalings from reduced space to the molecule level.
Takes the shifts and FWHM scalings in the autoshift indices and translates them to the original indices at the molecule level, so that they can be applied to the original NMR parameters.
- Parameters:
shifts (np.ndarray) – original shifts for each isotope.
fwhm_scales (np.ndarray) – additional spin-dependent fwhm scales.
num_isotopes (int) – total number of isotopes.
groups2spin_map (dict[int, list[int]]) – mapping from group index to spin indices.
- Returns:
shifts and fwhm scales for the molecule.
- Return type:
tuple[np.ndarray, np.ndarray]
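Translating per-group values back to per-isotope arrays can be sketched as a scatter through the group-to-spin map. The sketch assumes that the group index doubles as the position in the reduced array and that isotopes outside every group receive a neutral value (0 for shifts, 1 for FWHM scales); both are assumptions, and `unreduce_sketch` is a hypothetical name.

```python
import numpy as np


def unreduce_sketch(group_values, num_isotopes, groups2spin_map, fill_value):
    """Broadcast one value per group to all spins of that group."""
    full = np.full(num_isotopes, fill_value, dtype=float)
    for group, spins in groups2spin_map.items():
        full[spins] = group_values[group]  # every equivalent spin gets the group value
    return full


groups2spin_map = {0: [0, 1, 2], 1: [4]}  # isotope 3 belongs to no group
shifts = unreduce_sketch(np.array([0.1, -0.2]), 5, groups2spin_map, fill_value=0.0)
fwhm_scales = unreduce_sketch(np.array([1.5, 0.8]), 5, groups2spin_map, fill_value=1.0)
```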
- hqs_nmr.analysis.autoshift_tools.process_experimental_spectrum(experimental_spectrum: NMRSpectrum1DProtocol, num_rediscretized_omegas: int, num_linear_omegas: int, normalization: float) → tuple[ndarray, ndarray]
Process experimental data.
The processing steps are: order by frequency, apply the baseline corrector, rediscretize, and normalize the integral to the provided value.
- Parameters:
experimental_spectrum (NMRSpectrum1DProtocol) – experimental spectrum to process.
num_rediscretized_omegas (int) – max number of frequencies for the resampled spectrum.
num_linear_omegas (int) – number of linear frequencies to add in the resampling.
normalization (float) – value to which to normalize the integral.
- Returns:
resampled spectrum and corresponding frequencies.
- Return type:
tuple[np.ndarray, np.ndarray]
- hqs_nmr.analysis.autoshift_tools.process_greens_function(result_greens_function: NMRResultGreensFunction1D, groups2spin_map: dict[int, list[int]]) → tuple[ndarray, ndarray]
Sum the green’s function over the equivalent spins.
- Parameters:
result_greens_function (NMRResultGreensFunction1D) – Green’s function.
groups2spin_map (dict[int, list[int]]) – map from groups to spins
- Returns:
reduced green’s function and corresponding frequencies.
- Return type:
tuple[np.ndarray, np.ndarray]
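Summing over equivalent spins can be pictured as adding the rows of a per-spin array that belong to the same group. The sketch assumes the Green's function is available as a complex array of shape (n_spins, n_omegas), which is an assumption about the data layout rather than a description of `NMRResultGreensFunction1D`; `sum_over_groups_sketch` is a hypothetical name.

```python
import numpy as np


def sum_over_groups_sketch(greens_function, groups2spin_map):
    """Return one row per group, each the sum of the rows of its spins."""
    groups = sorted(groups2spin_map)
    reduced = np.zeros((len(groups), greens_function.shape[1]), dtype=greens_function.dtype)
    for row, group in enumerate(groups):
        reduced[row] = greens_function[groups2spin_map[group]].sum(axis=0)
    return reduced


omegas = np.linspace(0.0, 4.0, 5)
centers = np.array([1.0, 1.0, 3.0])  # spins 0 and 1 are equivalent
eta = 0.1
greens_function = 1.0 / (omegas[None, :] - centers[:, None] + 1j * eta)
reduced = sum_over_groups_sketch(greens_function, {0: [0, 1], 1: [2]})
```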
- hqs_nmr.analysis.autoshift_tools._find_shift_cms(reference: ndarray, reference_x: ndarray, signal: ndarray, signal_x: ndarray) → float
Find the global shift via center-of-mass.
- Parameters:
reference (np.ndarray) – reference spectrum.
reference_x (np.ndarray) – reference spectrum frequencies.
signal (np.ndarray) – signal to shift.
signal_x (np.ndarray) – signal frequencies.
- Returns:
global shift aligning the centers of mass of signal and reference.
- Return type:
float
- hqs_nmr.analysis.autoshift_tools._adapt_fwhm(reference: ndarray, reference_x: ndarray, greens_function: ndarray, greens_function_x: ndarray, max_fwhm_scale: float, tolerance: float = 0.0005, max_iter: int = 20) → float
Adapt the FWHM of the spectrum (obtained from the Green’s function) to match the reference spectrum.
- Parameters:
reference (np.ndarray) – reference spectrum.
reference_x (np.ndarray) – reference spectrum frequencies.
greens_function (np.ndarray) – green’s function to adapt.
greens_function_x (np.ndarray) – green’s function frequencies.
max_fwhm_scale (float) – maximum allowed scaling for the FWHM.
tolerance (float) – tolerance for FWHM changes in consecutive loops. Defaults to 5e-4.
max_iter (int) – Maximum number of iterations for FWHM refinement. Defaults to 20.
- Returns:
optimal scale to the FWHM to match the spectrum.
- Return type:
float
- hqs_nmr.analysis.autoshift_tools.define_shift_boundaries(simulated_spectrum: NMRResultSpectrum1D, frequency_boundaries: tuple[float, float], spin_dependent_shifts: list[float], spin2groups_map: dict[int, int], max_shift_h: float, max_shift_exc: float, exchangeable_protons: list[int]) → ndarray
Find the shift boundaries clipping them to be within the given boundaries.
Applies [-max_shift_h, max_shift_h] for protons and [-max_shift_exc, max_shift_exc] for exchangeable protons. Moreover, it ensures that none of the ranges would move a spin contribution outside of the boundaries.
- Parameters:
simulated_spectrum (NMRResultSpectrum1D) – calculated spectrum.
frequency_boundaries (tuple[float, float]) – boundaries for the frequency range.
spin_dependent_shifts (list[float]) – pre-computed shifts for each spin.
spin2groups_map (dict[int, int]) – map from spins to groups.
max_shift_h (float) – maximum allowed shift for protons.
max_shift_exc (float) – maximum allowed shift for exchangeable protons.
exchangeable_protons (list[int]) – indices of exchangeable protons.
- Returns:
shift boundaries for each group.
- Return type:
np.ndarray
- hqs_nmr.analysis.autoshift_tools.cost_int_of_diff_interpolator_inverse(shift: float, reference: ndarray, reference_x: ndarray, signal_interpolator_inverse: Callable) → float
Cost function as the integral of the absolute differences between reference and signal.
- Parameters:
shift (float) – shift to apply to the signal interpolator.
reference (np.ndarray) – reference.
reference_x (np.ndarray) – x-values on which the reference is defined.
signal_interpolator_inverse (Callable) – callable interpolating the inverse of the signal to shift.
- Returns:
integral of the absolute differences between reference and signal.
- Return type:
float
- hqs_nmr.analysis.autoshift_tools.cost_integral_interpolators_list_inverse_opt(shifts: ndarray, reference: ndarray, reference_x: ndarray, integral_weights: ndarray, signal_interpolators_inverse_re: list[Callable], signal_interpolators_inverse_im: list[Callable], scratch1: ndarray, scratch2: ndarray, scratch3: ndarray, num_spins: float = 1) → float
Integral of the absolute differences between reference and shifted signal.
The signal is constructed from the different interpolators of the inverses of the green’s functions. These are shifted accordingly, inverted, and summed. The integral of the absolute difference between reference and signal is calculated and normalized according to the number of spins.
This version is optimized for speed. The interpolators of the inverse green’s functions are split in real and imaginary parts. The cost function is then evaluated using three preallocated scratch arrays to avoid continuous allocation. Moreover the integral weights for the trapezoidal rule are precomputed as np.diff(reference_x) / 2 and simply dotted into y[1:] + y[:-1], again to avoid recomputing it.
- Parameters:
shifts (np.ndarray) – shifts to apply to the interpolators.
reference (np.ndarray) – reference.
reference_x (np.ndarray) – x-values on which the reference is defined.
integral_weights (np.ndarray) – precomputed trapezoidal-rule weights, defined as (reference_x[1:] - reference_x[:-1]) / 2.
signal_interpolators_inverse_re (list[Callable]) – interpolators of the real part of the inverse of the green’s function.
signal_interpolators_inverse_im (list[Callable]) – interpolators of the imaginary part of the inverse of the green’s function.
scratch1 (np.ndarray) – preallocated scratch array.
scratch2 (np.ndarray) – preallocated scratch array.
scratch3 (np.ndarray) – preallocated scratch array.
num_spins (float, optional) – normalization factor. Defaults to 1.
- Returns:
integral of the absolute difference between reference and shifted signal.
- Return type:
float
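The precomputed-weights form of the trapezoidal rule mentioned in the description can be checked with a small sketch. It covers only the integration step with a single scratch array and takes the signal as given, so the shifting and inversion of the interpolators are left out; `cost_sketch` is a hypothetical name.

```python
import numpy as np


def cost_sketch(reference, signal, integral_weights, scratch, num_spins=1.0):
    """Integral of |reference - signal| using precomputed trapezoidal weights."""
    np.subtract(reference, signal, out=scratch)  # reuse the scratch array
    np.abs(scratch, out=scratch)
    return float(integral_weights @ (scratch[1:] + scratch[:-1])) / num_spins


reference_x = np.linspace(0.0, 1.0, 50) ** 2  # non-uniform grid
reference = np.exp(-reference_x)
signal = np.exp(-1.2 * reference_x)

integral_weights = np.diff(reference_x) / 2.0  # computed once, reused per call
scratch = np.empty_like(reference_x)
cost = cost_sketch(reference, signal, integral_weights, scratch)

# Reference value from the textbook trapezoidal formula.
difference = np.abs(reference - signal)
direct = float(np.sum(np.diff(reference_x) * (difference[1:] + difference[:-1]) / 2.0))
```

Note that `scratch[1:] + scratch[:-1]` still allocates a temporary in this sketch; avoiding that would need a second preallocated array.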
- hqs_nmr.analysis.autoshift_tools.cost_integral_diff_shift_adapt_eta(parameters: ndarray, reference_spectrum: ndarray, reference_x: ndarray, jc_adaptation_parameters: JCouplingAdaptationParameters, normalization_factor: float) → float
Integral of the absolute difference between reference and shifted and adapted signal.
The signal is reconstructed from the information contained in the jc_adaptation_parameters. It is shifted and adjusted given the parameters. Finally, the integral of the absolute difference between reference and reconstructed signal is returned after normalization.
- Parameters:
parameters (np.ndarray) – parameters for adjusting the green’s function. The first element is the j-coupling change, the second the global broadening, the rest are the shifts for all spins.
reference_spectrum (np.ndarray) – reference spectrum to compare against.
reference_x (np.ndarray) – reference frequencies for the reference spectrum and at which the signal is reconstructed.
jc_adaptation_parameters (JCouplingAdaptationParameters) – parameters for reconstructing and adapting the signal.
normalization_factor (float) – normalization factor for the integral.
- Returns:
Integral of the absolute difference between reference and shifted and adapted signal.
- Return type:
float
- hqs_nmr.analysis.autoshift_tools.get_spectrum_from_interpolator_inverse(shifts: ndarray, signal_interpolators_inverse: list[Callable[[ndarray], ndarray]], x: ndarray) → ndarray
Get the spectrum from the interpolators of the inverse of the green’s function.
- Parameters:
shifts (np.ndarray) – shifts to apply to the interpolators.
signal_interpolators_inverse (list[Callable]) – callables interpolating the inverse of the green’s functions to shift.
x (np.ndarray) – x on which to evaluate the interpolators.
- Returns:
spectrum.
- Return type:
np.ndarray
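To make the "shift, invert, sum" idea concrete, the sketch below uses a toy model in which each inverse Green's function is the linear function ω − ω₀ + iη of a single pole. The sign convention of the shift (evaluating the interpolator at x − shift) and the final step of taking −Im/π as the spectrum are assumptions for this example, not confirmed details of the library; all names are hypothetical.

```python
import numpy as np

ETA = 0.05  # toy broadening


def make_inverse(omega0):
    """Inverse of a single-pole Green's function, 1 / G = x - omega0 + i * eta."""
    return lambda x: x - omega0 + 1j * ETA


def spectrum_sketch(shifts, signal_interpolators_inverse, x):
    """Shift each inverse, invert it, sum the results, and take -Im / pi."""
    total = np.zeros(x.shape, dtype=complex)
    for shift, inverse in zip(shifts, signal_interpolators_inverse):
        total += 1.0 / inverse(x - shift)  # assumed shift convention
    return -total.imag / np.pi


x = np.linspace(0.0, 5.0, 501)
inverses = [make_inverse(1.0), make_inverse(3.0)]
spectrum = spectrum_sketch(np.array([0.5, -1.0]), inverses, x)  # peaks move to 1.5 and 2.0
```

In this toy model the inverse is linear in frequency, which hints at why interpolating the inverse can be better behaved than interpolating the sharply peaked Green's function itself.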
- hqs_nmr.analysis.autoshift_tools.lorentzian(x: ndarray, x0: float, gamma: float, result: ndarray) → None
Lorentzian function.
- Parameters:
x (np.ndarray) – domain on which to define the function.
x0 (float) – center of the lorentzian.
gamma (float) – width of the lorentzian.
result (np.ndarray) – preallocated result array.
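The preallocated-result calling convention can be illustrated with an in-place evaluation. The normalization used here (unit area, with `gamma` as the half width at half maximum) is an assumption, since the documentation above does not specify it; `lorentzian_sketch` is a hypothetical name.

```python
import numpy as np


def lorentzian_sketch(x, x0, gamma, result):
    """Write (1 / pi) * gamma / ((x - x0)^2 + gamma^2) into `result` in place."""
    np.subtract(x, x0, out=result)
    np.square(result, out=result)
    result += gamma**2
    np.divide(gamma / np.pi, result, out=result)


x = np.linspace(-1.0, 1.0, 201)
result = np.empty_like(x)
lorentzian_sketch(x, 0.0, 0.1, result)
peak_height = result[100]  # value at x = 0
```

Writing into `result` avoids allocating a new array on every call, which matters when the function sits inside an optimization loop.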
- hqs_nmr.analysis.autoshift_tools.lorentzian_filter(fx: ndarray, x: ndarray, gamma: float, result: ndarray) → None
Apply a lorentzian filter to the signal.
- Parameters:
fx (np.ndarray) – signal to be filtered.
x (np.ndarray) – domain of the signal.
gamma (float) – lorentzian broadening.
result (np.ndarray) – preallocated array to store the result.
- hqs_nmr.analysis.autoshift_tools.calculate_fit_scores(reference: ndarray, reference_x: ndarray, contributions_interpolators: list[Callable], contributions_integrals: ndarray, offsets_boundaries: ndarray, num_offsets: int = 300) → tuple[ndarray, ndarray]
Calculate fit scores between a reference and contributions summing to a signal.
Fit score is defined as:
\[K(s) = \operatorname{clip}\!\left( \frac{\int dx\, \min(R(x),\, g_i(x - s))}{c_i},\; 0,\; 1 \right)\]
with $R(x) \geq 0$ the reference, $g_i(x)$ the i-th contribution (possibly with negative lobes), and $c_i = \int g_i\, dx > 0$ its signed integral (a positive integer).
Since the total spectrum $\sum_j g_j > 0$, $c_i > 0$ is guaranteed. Negative lobes in $g_i$ produce a negative contribution to the integral which is removed by the clip, correctly yielding $K = 0$ in the no-overlap case. The sparse-grid issue is resolved because $\min(R, g_i) \leq R$ always.
- Parameters:
reference (np.ndarray) – reference spectrum (non-negative).
reference_x (np.ndarray) – x-values on which the reference is defined.
contributions_interpolators (list[Callable]) – contributions’ interpolators.
contributions_integrals (np.ndarray) – signed integrals $c_i$ (positive integers).
offsets_boundaries (np.ndarray) – lower and upper boundaries of the offset values.
num_offsets (int) – number of offsets. Defaults to 300.
- Returns:
fit scores in [0, 1] and offset values.
- Return type:
tuple[np.ndarray, np.ndarray]
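The fit score formula can be evaluated directly for a single contribution, as in the sketch below. It uses a plain function in place of an interpolator, trapezoidal integration on the reference grid, and an explicit array of offsets; these choices and the names `fit_scores_sketch` and `gaussian` are assumptions for illustration.

```python
import numpy as np


def trapezoid(y, x):
    return float(0.5 * np.sum(np.diff(x) * (y[1:] + y[:-1])))


def fit_scores_sketch(reference, reference_x, interpolator, integral, offsets):
    """K(s) = clip(int min(R(x), g(x - s)) dx / c, 0, 1) for each offset s."""
    scores = np.empty(offsets.size)
    for k, offset in enumerate(offsets):
        shifted = interpolator(reference_x - offset)  # g(x - s)
        overlap = np.minimum(reference, shifted)
        scores[k] = trapezoid(overlap, reference_x) / integral
    return np.clip(scores, 0.0, 1.0)


def gaussian(x, center=0.0, sigma=0.1):
    """Unit-area Gaussian used as a toy line shape."""
    return np.exp(-0.5 * ((x - center) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))


reference_x = np.linspace(-1.0, 5.0, 1201)
reference = gaussian(reference_x, center=2.0)  # reference peak at 2
offsets = np.linspace(0.0, 4.0, 41)
scores = fit_scores_sketch(reference, reference_x, gaussian, 1.0, offsets)
best_offset = offsets[int(np.argmax(scores))]
```

The score approaches 1 when the shifted contribution lies on top of the reference peak and drops to essentially 0 when the two do not overlap.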
- hqs_nmr.analysis.autoshift_tools.calculate_cumulative_fit_scores(fit_scores: ndarray, fit_scores_domain: ndarray, power: float = 2, base_threshold: float = 0.4, min_threshold: float = 0.06) → ndarray
Calculate normalized cumulative fit scores.
Calculated as
\[C(s) = \int_{-\infty}^s dt\, F(t)^p, \qquad \text{CFS}(s) = \alpha_i \; n_p \;\frac{C(s)}{\max C(s)}\]
with $F(t)$ the fit score, $n_p$ the number of peaks in the fit score, and $\alpha_i$ a per-spin scaling factor that compensates for fit-score flatness.
When the fit score is peaked, a fixed CFS threshold produces small steps near peaks and large steps in between, which is the desired behavior. When the fit score is flat (~1 everywhere), the CFS becomes linear and the same threshold consumes a large fraction of the domain, causing boundary bouncing. The scaling factor $\alpha_i$ stretches the CFS for flat spins so that steps remain small relative to the domain, while leaving peaked spins unaffected.
- Parameters:
fit_scores (np.ndarray) – precomputed fit scores.
fit_scores_domain (np.ndarray) – domain of the fit scores.
power (float, optional) – power to which the fit scores are raised. Defaults to 2.
base_threshold (float, optional) – reference CFS threshold (peaked regime). Defaults to 0.4.
min_threshold (float, optional) – effective CFS threshold for flat spins. Defaults to 0.06.
- Returns:
Cumulative fit scores.
- Return type:
np.ndarray
- hqs_nmr.analysis.autoshift_tools.calculate_cumulative_fit_scores_raw(fit_scores: ndarray, fit_scores_domain: ndarray, power: float = 2) → ndarray
Calculate normalized cumulative fit scores in [0, 1].
The CFS is the cumulative integral of $f^p$, normalized to [0, 1]. This turns it into a quantile map: stepping uniformly in [0, 1] CFS-space produces small domain steps where the fit score is high (peaked regions) and large domain steps where it is low. For flat fit scores the CFS is linear and stepping becomes uniform in domain.
- Parameters:
fit_scores (np.ndarray) – precomputed fit scores.
fit_scores_domain (np.ndarray) – domain of the fit scores.
power (float, optional) – power to which fit scores are raised. Defaults to 2.
- Returns:
Cumulative fit scores normalized to [0, 1].
- Return type:
np.ndarray
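A minimal sketch of the raw CFS for a single spin, assuming a cumulative trapezoidal integral and a fit score that is not identically zero. The name `cfs_raw_sketch` is hypothetical.

```python
import numpy as np


def cfs_raw_sketch(fit_scores, fit_scores_domain, power=2.0):
    """Cumulative integral of fit_scores**power, normalized to [0, 1]."""
    integrand = fit_scores**power
    increments = 0.5 * np.diff(fit_scores_domain) * (integrand[1:] + integrand[:-1])
    cumulative = np.concatenate(([0.0], np.cumsum(increments)))
    return cumulative / cumulative[-1]


domain = np.linspace(-1.0, 1.0, 5)
flat_cfs = cfs_raw_sketch(np.ones_like(domain), domain)

fine_domain = np.linspace(-1.0, 1.0, 201)
peaked_cfs = cfs_raw_sketch(np.exp(-(fine_domain**2) / 0.02), fine_domain)
```

For the flat fit score the CFS is a straight line from 0 to 1, while for the peaked one almost all of the rise is concentrated around the peak.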
- hqs_nmr.analysis.autoshift_tools.new_guess_via_cumulative_fit_score(current_guess: ndarray, new_guess: ndarray, cumulative_fit_score: ndarray, cfs_domain: ndarray, cfs_threshold: float) → None
Generate a new guess based on cumulative fit scores.
Starting from a current guess, generate a new one by stepping such that a fixed amount of cumulative fit score (the threshold) is consumed. This produces small steps in regions of high fit score and large steps in regions of low fit score.
The CFS passed in should already be normalized and scaled by peak count, so that the threshold has a universal meaning (per-peak resolution).
- Parameters:
current_guess (np.ndarray) – Current guess values, shape (n_spins,).
new_guess (np.ndarray) – Preallocated array for the new guess, shape (n_spins,).
cumulative_fit_score (np.ndarray) – CFS for each spin, shape (n_spins, n_grid).
cfs_domain (np.ndarray) – Domain grid for each spin, shape (n_spins, n_grid).
cfs_threshold (float) – Threshold of CFS to consume per step.
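One way to picture stepping by a fixed amount of CFS is to map the current guess into CFS space, move by the threshold, and map back through the inverse of the monotonic CFS. The sketch below does this with `np.interp` and an explicit `direction` array (+1 or -1 per spin), which is a hypothetical argument: how the library chooses the step direction is not stated above. For simplicity it uses a CFS normalized to [0, 1] without the peak-count scaling, and it needs a strictly increasing CFS for the inverse interpolation.

```python
import numpy as np


def new_guess_via_cfs_sketch(current_guess, new_guess, cfs, cfs_domain, cfs_threshold, direction):
    """Step each spin so that `cfs_threshold` of CFS is consumed (illustrative)."""
    for spin in range(current_guess.size):
        # CFS value at the current position of this spin.
        here = np.interp(current_guess[spin], cfs_domain[spin], cfs[spin])
        # Move by the threshold, staying inside the tabulated CFS range.
        target = np.clip(here + direction[spin] * cfs_threshold, cfs[spin, 0], cfs[spin, -1])
        # Invert the monotonic CFS to get back to the domain.
        new_guess[spin] = np.interp(target, cfs[spin], cfs_domain[spin])


def normalized_cfs(fit_scores, domain, power=2.0):
    integrand = fit_scores**power
    increments = 0.5 * np.diff(domain) * (integrand[1:] + integrand[:-1])
    cumulative = np.concatenate(([0.0], np.cumsum(increments)))
    return cumulative / cumulative[-1]


grid = np.linspace(-1.0, 1.0, 201)
cfs_domain = np.vstack([grid, grid])
flat = np.ones_like(grid)
peaked = np.exp(-(grid**2) / (2.0 * 0.3**2))  # fit score peaked at zero
cfs = np.vstack([normalized_cfs(flat, grid), normalized_cfs(peaked, grid)])

current_guess = np.zeros(2)
new_guess = np.empty(2)
new_guess_via_cfs_sketch(current_guess, new_guess, cfs, cfs_domain, 0.1, np.array([1.0, 1.0]))
```

Starting from the same position and consuming the same amount of CFS, the spin with the peaked fit score moves much less than the spin with the flat one.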
- hqs_nmr.analysis.autoshift_tools.new_guess_via_step_size(current_guess: ndarray, new_guess: ndarray, step_size: float, boundaries: ndarray, distribution: Literal['uniform', 'cauchy'] = 'uniform') → None
Generate new guess by stepping from the current one.
- Parameters:
current_guess (np.ndarray) – current guess.
new_guess (np.ndarray) – preallocated array for the new guess.
step_size (float) – step size to use.
boundaries (np.ndarray) – boundaries of each value of the guess.
distribution (Literal["uniform", "cauchy"]) – which distribution to use for the sampling of the steps. Defaults to “uniform”.
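A sketch of random stepping with clipping to the boundaries. It assumes that `boundaries` has shape (n, 2) with lower and upper limits per value, that steps are drawn as `step_size` times a uniform number in [-1, 1] or a standard Cauchy variate, and that out-of-range values are clipped; the explicit `rng` argument is hypothetical and only makes the example reproducible.

```python
import numpy as np


def new_guess_via_step_size_sketch(
    current_guess, new_guess, step_size, boundaries, rng, distribution="uniform"
):
    """Fill `new_guess` with a random step from `current_guess`, clipped to the boundaries."""
    if distribution == "uniform":
        steps = rng.uniform(-1.0, 1.0, size=current_guess.size)
    elif distribution == "cauchy":
        steps = rng.standard_cauchy(size=current_guess.size)  # heavy tails, occasional long jumps
    else:
        raise ValueError(f"unknown distribution: {distribution}")
    np.add(current_guess, step_size * steps, out=new_guess)
    np.clip(new_guess, boundaries[:, 0], boundaries[:, 1], out=new_guess)


rng = np.random.default_rng(0)
current_guess = np.array([0.0, 0.45])
boundaries = np.array([[-0.5, 0.5], [-0.5, 0.5]])

uniform_guess = np.empty(2)
new_guess_via_step_size_sketch(current_guess, uniform_guess, 0.2, boundaries, rng)

cauchy_guess = np.empty(2)
new_guess_via_step_size_sketch(current_guess, cauchy_guess, 0.2, boundaries, rng, "cauchy")
```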
- hqs_nmr.analysis.autoshift_tools.enforce_equal_shifts_for_equivalent_spins(shifts: ndarray, equivalent_spins_list_of_arrays: list[ndarray]) → None
Enforce equal shifts for equivalent spins.
- Parameters:
shifts (np.ndarray) – shifts vector.
equivalent_spins_list_of_arrays (list[np.ndarray]) – list of arrays of equivalent spins.
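An in-place sketch of the equal-shift constraint. It assumes that every spin in a group takes the value of the first spin listed for that group; the library might instead use, for example, the group mean. The name `enforce_equal_shifts_sketch` is hypothetical.

```python
import numpy as np


def enforce_equal_shifts_sketch(shifts, equivalent_spins_list_of_arrays):
    """Overwrite the shifts of each group with the shift of its first spin (assumed rule)."""
    for spins in equivalent_spins_list_of_arrays:
        shifts[spins] = shifts[spins[0]]


shifts = np.array([0.10, 0.30, -0.20, 0.05])
enforce_equal_shifts_sketch(shifts, [np.array([0, 1]), np.array([2])])
```

Spin 3 is not part of any group in this example, so its shift is left untouched.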
- hqs_nmr.analysis.autoshift_tools.get_new_guess_jc_eta(current_guess: ndarray, preallocated_new_guess: ndarray, amplitude_j_coupling: float, amplitude_eta: float, amplitude_shifts: float, limits: ndarray, equivalent_spins: list[ndarray], non_reference_spins: list[int]) → None
Get new guess for J-coupling adaptation.
First element is the j-coupling, the rest are the shifts.
- Parameters:
current_guess (np.ndarray) – current guess.
preallocated_new_guess (np.ndarray) – preallocated array for the new guess.
amplitude_j_coupling (float) – step amplitude for j coupling.
amplitude_eta (float) – step amplitude for eta.
amplitude_shifts (float) – step amplitude for shifts.
limits (np.ndarray) – allowed range for the guess.
equivalent_spins (list[np.ndarray]) – list of arrays of equivalent spins.
non_reference_spins (list[int]) – list of non-reference spins for which the shift is zero.
- hqs_nmr.analysis.autoshift_tools._broaden_gf_and_calculate_interpolators(greens_function: ndarray, omegas_gf: ndarray, broadening: float) → tuple[list[Callable], list[Callable], list[Callable]]