OrbiTrack Dev Log 6: TOF Fitting Refiner

1. Motivation

Continuing from the previous post on molecular formula assignment, I ran into a common issue: overfitting during TOF peak assignment. In some cases, too many candidate formulas were assigned within a narrow TOF m/z window, resulting in high uncertainty for each individual ion. In other words, we might obtain a wrong intensity for a correct formula, which affects data reliability.

To address this, I implemented a refinement step after formula assignment:
Only the local maxima within a given ppm range are retained, along with known TOF apex peaks.

2. Concept

The idea is simple:

  • If multiple peaks are detected within a small ppm window, keep only the most intense one.
  • If a lower-intensity peak is near a known TOF apex position (from the missing-peak list), it is also retained, to preserve chemically real signals that Orbitrap may have only partially captured. This matters because a unit m/z window sometimes contains several TOF peaks: an ion at the apex of a small peak can fall within the ppm window of a more intense ion on the shoulder of the main peak. The first rule alone would then remove the apex ion, which is not what we want.
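To make the "small ppm window" concrete, here is a minimal sketch of how a ppm range translates into absolute m/z bounds. The helper name `ppm_window` is only illustrative and is not part of OrbiTrack; the point is that the same ppm value gives a wider absolute window at higher m/z.

```python
def ppm_window(mz, ppm):
    """Return the (lower, upper) m/z bounds of a +/- ppm window around mz."""
    half_width = mz * ppm / 1e6  # ppm is relative, so the width scales with m/z
    return mz - half_width, mz + half_width


# Same ppm range, two different masses (toy values)
for mz in (200.0, 600.0):
    lower, upper = ppm_window(mz, 50)
    print(f"m/z {mz:.1f}: {lower:.4f} .. {upper:.4f}")
# m/z 200.0: 199.9900 .. 200.0100
# m/z 600.0: 599.9700 .. 600.0300
```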

3. Function of Intensity-Based Refiner

import numpy as np

def refiner_by_orbitrap_intensity_ratio(mf_result, ppm_range=50, missing_peak_list=None):
    """
    Refine Orbitrap peak list by comparing intensities with neighbors in ±ppm_range.
    Keeps local maxima or known TOF apex ions (from missing_peak_list).

    Parameters:
    mf_result (pd.DataFrame): DataFrame with 'exp_mass' and 'abundance'
    ppm_range (float): PPM range for neighborhood comparison
    missing_peak_list (pd.DataFrame): Known TOF apex m/z values in an 'm.z' column (optional)

    Returns:
    pd.DataFrame: Filtered mf_result containing only retained peaks
    """
    # Known TOF apex positions; an empty array means no peak gets the apex exemption
    if missing_peak_list is None:
        tof_apex_mz = np.array([])
    else:
        tof_apex_mz = missing_peak_list['m.z'].values

    mf_result = mf_result.sort_values(by='exp_mass').reset_index(drop=True)
    exp_mass = mf_result['exp_mass'].values
    abundance = mf_result['abundance'].values
    keep_indices = np.ones(len(exp_mass), dtype=bool)

    for i in range(len(exp_mass)):
        if not keep_indices[i]:
            continue

        mz_i = exp_mass[i]
        intensity_i = abundance[i]
        ppm_tolerance = mz_i * ppm_range / 1e6

        lower = mz_i - ppm_tolerance
        upper = mz_i + ppm_tolerance

        # Neighbors: other peaks inside the window that have not been removed yet
        mask = (exp_mass >= lower) & (exp_mass <= upper) & (np.arange(len(exp_mass)) != i) & keep_indices
        neighbor_indices = np.where(mask)[0]

        is_max = all(intensity_i >= abundance[j] for j in neighbor_indices)

        if not is_max:
            # Keep only if close to a known TOF apex (within 0.001 m/z units)
            close_to_tof_apex = np.any(np.abs(tof_apex_mz - mz_i) <= 0.001)
            if not close_to_tof_apex:
                keep_indices[i] = False

    return mf_result[keep_indices].reset_index(drop=True)

4. Applying the Refiner

refining_distance = 40  # in ppm
print(len(orbi_list_all)) # Before refinement
refined_orbi_list = refiner_by_orbitrap_intensity_ratio(orbi_list_all, refining_distance, missing_peak_list)
print(len(refined_orbi_list)) # After refinement
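To sanity-check the two rules, the snippet below runs them on a toy peak list. So that it runs on its own, it uses `refine_sketch`, a condensed stand-in written for this example rather than the function from section 3; the masses, abundances and apex position are made-up values, and the 0.001 apex tolerance mirrors the one hard-coded above.

```python
import numpy as np
import pandas as pd


def refine_sketch(peaks, ppm_range, apex_mz, apex_tol=0.001):
    """Condensed stand-in: keep local maxima or ions close to a known TOF apex."""
    peaks = peaks.sort_values("exp_mass").reset_index(drop=True)
    mz = peaks["exp_mass"].to_numpy()
    inten = peaks["abundance"].to_numpy()
    apex = np.asarray(apex_mz, dtype=float)
    keep = np.ones(len(mz), dtype=bool)
    for i in range(len(mz)):
        tol = mz[i] * ppm_range / 1e6
        # Neighbors are still-kept peaks inside the ppm window, excluding the peak itself
        neighbors = (np.abs(mz - mz[i]) <= tol) & keep
        neighbors[i] = False
        is_max = np.all(inten[i] >= inten[neighbors])
        near_apex = np.any(np.abs(apex - mz[i]) <= apex_tol)
        if not (is_max or near_apex):
            keep[i] = False
    return peaks[keep].reset_index(drop=True)


# Toy data: a strong ion at 200.000 with two weaker neighbors close by
toy = pd.DataFrame({
    "exp_mass": [200.0000, 200.0040, 200.0060, 200.0100, 300.0000],
    "abundance": [100.0, 30.0, 20.0, 50.0, 10.0],
})

with_apex = refine_sketch(toy, 40, [200.0045])   # one known TOF apex near 200.004
without_apex = refine_sketch(toy, 40, [])        # no apex information

print(with_apex["exp_mass"].tolist())     # [200.0, 200.004, 200.01, 300.0]
print(without_apex["exp_mass"].tolist())  # [200.0, 200.01, 300.0]
```

In this toy case the ion at 200.004 survives only because it sits within 0.001 of the known apex, while the ion at 200.006 is removed in both runs.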

After refinement, the final peak list can be exported in text format for direct use in TOFware fitting. This refined list includes:

  • Orbitrap-resolved peaks
  • TOF-reconstructed (“missing”) peaks
  • Known inorganic salt peaks (primary ions and their clusters, e.g. K+, Mg+)

While isotope peaks are excluded from the fitting, unassigned ions are retained. Although these unassigned ions may not provide direct chemical information, their presence helps prevent signal over-allocation to nearby assigned peaks — particularly in dense or overlapping m/z regions.
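The export step could look roughly like the sketch below. It rests on assumptions not stated in this post: a boolean `is_isotope` column marking isotope peaks, a `formula` column that is empty for unassigned ions, and a tab-separated layout. The column layout that TOFware actually expects is not specified here, so treat the output format as a placeholder.

```python
import io

import pandas as pd


def export_peak_list(peaks, path_or_buf, isotope_col="is_isotope"):
    """Write non-isotope peaks (assigned or not) as tab-separated text, sorted by m/z."""
    if isotope_col in peaks.columns:
        # Isotope peaks are excluded; unassigned ions stay in the list
        peaks = peaks[~peaks[isotope_col].astype(bool)]
    out = peaks.sort_values("exp_mass")
    out.to_csv(path_or_buf, sep="\t", index=False, float_format="%.5f")
    return len(out)


# Toy list: one assigned ion, its isotope peak, and one unassigned ion
demo = pd.DataFrame({
    "exp_mass": [200.00000, 201.00335, 200.00400],
    "formula": ["formula_A", "formula_A", None],
    "is_isotope": [False, True, False],
})

buf = io.StringIO()  # a file path works here as well
n_written = export_peak_list(demo, buf)
print(n_written)
print(buf.getvalue())
```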

An illustration of the ion inflow–outflow process is shown below as a Sankey diagram.
Before integrating Orbitrap into the TOF fitting workflow, only ~2000 ions could be manually fit, often with significant chemical ambiguity. With the addition of Orbitrap’s ultra-high-resolution molecular information, the number of confidently assigned ions has increased to over 3500, significantly enhancing the chemical characterization of the analytes.
