Fragment Spectrum Generation#
Generating theoretical fragment spectra is central to many identification tasks in computational mass spectrometry.
TheoreticalSpectrumGenerator
can be configured to generate MS2 spectra from
a given peptide charge combination. There are various parameters which influence
the generated ions e.g. simulating different fragmentation techniques.
Y-ion Mass Spectrum#
First, we will generate a simple mass spectrum that only contains y-ions
1import pyopenms as oms
2
3tsg = oms.TheoreticalSpectrumGenerator()
4spec1 = oms.MSSpectrum()
5peptide = oms.AASequence.fromString("DFPIANGER")
6# standard behavior is adding b- and y-ions of charge 1
7p = oms.Param()
8p.setValue("add_b_ions", "false")
9p.setValue("add_metainfo", "true")
10tsg.setParameters(p)
11tsg.getSpectrum(spec1, peptide, 1, 1) # charge range 1:1
12
13# Iterate over annotated ions and their masses
14print("Spectrum 1 of", peptide, "has", spec1.size(), "peaks.")
15for ion, peak in zip(spec1.getStringDataArrays()[0], spec1):
16 print(ion.decode(), "is generated at m/z", peak.getMZ())
which produces all y single charged ions:
Spectrum 1 of DFPIANGER has 8 peaks.
y1+ is generated at m/z 175.118952913371
y2+ is generated at m/z 304.161547136671
y3+ is generated at m/z 361.18301123237103
y4+ is generated at m/z 475.225939423771
y5+ is generated at m/z 546.2630535832709
y6+ is generated at m/z 659.3471179341709
y7+ is generated at m/z 756.3998821574709
y8+ is generated at m/z 903.4682964445709
which you could plot with pyopenms.plotting.plot_spectrum()
, automatically showing annotated ions.:
1import matplotlib.pyplot as plt
2from pyopenms.plotting import plot_spectrum
3
4plot_spectrum(spec1)
5plt.show()
Full Fragment Ion Mass Spectrum#
We can also produce additional peaks in the fragment ion mass spectrum, such as isotopic peaks, precursor peaks, ions from higher charge states, additional ion series, or common neutral losses:
1spec2 = oms.MSSpectrum()
2# standard behavior is adding b- and y-ions
3p2 = oms.Param()
4p2.setValue("add_a_ions", "true")
5# adding n-term ion (in this case, a1 and b1)
6p2.setValue("add_first_prefix_ion", "true")
7p2.setValue("add_precursor_peaks", "true")
8# standard is to add precursor peaks with only the largest charge
9p2.setValue("add_all_precursor_charges", "true")
10p2.setValue("add_losses", "true")
11p2.setValue("add_metainfo", "true")
12tsg.setParameters(p2)
13tsg.getSpectrum(spec2, peptide, 1, 2)
14
15# Iterate over annotated ions and their masses
16print("Spectrum 2 of", peptide, "has", spec2.size(), "peaks.")
17for ion, peak in zip(spec2.getStringDataArrays()[0], spec2):
18 print(ion.decode(), "is generated at m/z", peak.getMZ())
19
20exp = oms.MSExperiment()
21exp.addSpectrum(spec1)
22exp.addSpectrum(spec2)
23oms.MzMLFile().store("DFPIANGER.mzML", exp)
which outputs all 160 peaks that are generated (this is without isotopic peaks), here we will just show the first few peaks:
Spectrum 2 of DFPIANGER has 160 peaks.
a1-H2O1++ is generated at m/z 35.518008514620995
a1++ is generated at m/z 44.523291046520995
b1-H2O1++ is generated at m/z 49.515466014621
b1++ is generated at m/z 58.520748546521
y1-C1H2N1O1++ is generated at m/z 66.05629515817103
y1-C1H2N2++ is generated at m/z 67.05221565817102
a1-H2O1+ is generated at m/z 70.02874056247099
y1-H3N1++ is generated at m/z 79.54984014222102
a1+ is generated at m/z 88.03930562627099
y1++ is generated at m/z 88.06311469007102
b1-H2O1+ is generated at m/z 98.02365556247099
a2-H2O1++ is generated at m/z 109.05221565817101
b1+ is generated at m/z 116.034220626271
a2++ is generated at m/z 118.05749819007102
b2-H2O1++ is generated at m/z 123.049673158171
[...]
which you can again visualize with:
1plot_spectrum(spec2, annotate_ions=False)
2plt.show()
The first example shows how to put peaks of a certain type, y-ions in this case, into a mass spectrum. The second mass spectrum is filled with a complete fragment ion mass spectrum of all peaks (a-, b-, y-ions, precursor peaks, and losses).
Here, from the peptide with 9 amino acids, fragments theoretically can occur in 8 different positions, resulting in 8 peaks per ion type (a, b, and y-ion in this example code). For instance, b-ions (prefix) and y-ions (suffix) are complementary, so b3(DFP) and y6(IANGER) fragments make up the peptide “DFPIANGER.”
Adding precursor ions with the parameter add_precursor_peaks
add 3 peaks with
the largest charge states (precursor ion (M+H) and its loss of water ([M+H]-H2O) or
ammonia ([M+H]-NH3)). To include all precursor ions with possible charge states, the
add_all_precursor_charges
parameter should be set to true.
The losses are based on commonly observed fragment ion losses for specific
amino acids and are defined in the Residues.xml
file, which means that not all
fragment ions will produce all possible losses, as can be observed above: water loss
is not observed for the y1 ion but for the y2 ion since glutamic acid can have a neutral
water loss but arginine cannot. Similarly, only water loss and no ammonia loss is simulated
in the a/b/c
ion series with the first fragment capable of ammonia loss being
asparagine at position 6.
The TheoreticalSpectrumGenerator
has many parameters which have a detailed description located in the class
documentation. Note how the add_metainfo
parameter
populates the StringDataArray
of the output spectrum, allowing us to
iterate over annotated ions and their masses.
Visualization#
We can now visualize the resulting spectra using TOPPView when we open the DFPIANGER.mzML file that we produced above in TOPPView:
We can see all eight y ion peaks that are produced in the
TheoreticalSpectrumGenerator
and when we hover over one of the peaks (\(546\ mz\) in
this example) there is an annotation in the bottom left corner that indicates
charge state and ion name (\(\ce{y5+}\) for every peak). The larger spectrum with
\(146\) peaks can also be interactively investigated with TOPPView (the second
spectrum in the file):
There are substantially more peaks here and the mass spectrum is much busier, with singly and double charged peaks of the b, y and a series creating \(44\) different individual fragment ion peaks as well as neutral losses adding an additional \(102\) peaks (neutral losses easily recognizable by their \(10-fold\) lower intensity in the simulated spectrum).