# Mass Decomposition

## Fragment Mass to Amino Acid Composition

A common challenge in mass spectrometry is to determine the composition of a fragment given only its mass. For example, for the internal fragment mass $$262.0953584466$$ there are three different interpretations within a narrow mass band of $$0.05\ Th$$:

from pyopenms import *

print(
    AASequence.fromString("MM").getMonoWeight(Residue.ResidueType.Internal, 0)
)
print(
    AASequence.fromString("VY").getMonoWeight(Residue.ResidueType.Internal, 0)
)
print(
    AASequence.fromString("DF").getMonoWeight(Residue.ResidueType.Internal, 0)
)

262.08097003420005
262.1317435742
262.0953584466
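
The same arithmetic can be checked without pyOpenMS. The following is a minimal sketch that assumes standard monoisotopic residue masses rounded to five decimals; the names `RESIDUE_MASS` and `internal_mass` are hypothetical helpers for illustration, not part of the pyOpenMS API.

```python
# Monoisotopic residue masses in Da (standard values, rounded to 5 decimals).
RESIDUE_MASS = {
    "M": 131.04049,
    "V": 99.06841,
    "Y": 163.06333,
    "D": 115.02694,
    "F": 147.06841,
}

target = 262.0953584466
tolerance = 0.05


def internal_mass(sequence):
    """Sum of residue masses, i.e. the mass of an internal fragment."""
    return sum(RESIDUE_MASS[aa] for aa in sequence)


# Deviation of each candidate from the target mass.
deviations = {seq: internal_mass(seq) - target for seq in ("MM", "VY", "DF")}
for seq, dev in deviations.items():
    print(f"{seq}\t{internal_mass(seq):.5f}\t{dev:+.5f}")
```

All three deviations are smaller than the tolerance of 0.05, so the mass alone cannot distinguish the candidates.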


As you can see, even for simple two-amino-acid combinations, multiple explanations may exist. OpenMS provides the MassDecompositionAlgorithm class, which computes all amino acid combinations that explain a given mass:

md_alg = MassDecompositionAlgorithm()
param = md_alg.getParameters()
param.setValue("tolerance", 0.05)
param.setValue("residue_set", b"Natural19WithoutI")
md_alg.setParameters(param)
decomps = []
md_alg.getDecompositions(decomps, 262.0953584466)
for d in decomps:
    print(d.toExpandedString())


This outputs the three potential compositions for the mass $$262.0953584466$$. Note that each combination of amino acids is printed only once: for example, only DF is reported, while the isobaric FD is not. This makes the algorithm more efficient.
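
To illustrate what reporting each combination only once means, here is a brute-force sketch in plain Python. It is not the algorithm used by OpenMS. It assumes a table of standard monoisotopic residue masses (rounded to five decimals) for the 19 residues without isoleucine, and `decompose` is a hypothetical helper name.

```python
from itertools import combinations_with_replacement

# Monoisotopic residue masses in Da for 19 residues (isoleucine left out,
# since it has the same mass as leucine). Values are rounded to 5 decimals.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def decompose(target, tolerance):
    """Return every residue multiset whose mass is within tolerance of target."""
    # No composition can contain more residues than fit using the lightest one.
    max_len = int((target + tolerance) // min(RESIDUE_MASS.values()))
    letters = sorted(RESIDUE_MASS)
    hits = []
    for n in range(1, max_len + 1):
        # combinations_with_replacement yields each multiset exactly once,
        # so DF appears but FD does not.
        for combo in combinations_with_replacement(letters, n):
            total = sum(RESIDUE_MASS[aa] for aa in combo)
            if abs(total - target) < tolerance:
                hits.append("".join(combo))
    return hits


compositions = decompose(262.0953584466, 0.05)
print(compositions)  # ['DF', 'MM', 'VY']
```

Enumerating multisets rather than ordered strings keeps the search space small, but this brute-force loop still scales poorly with mass and is only meant to show the idea.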

## Naive Algorithm

We can compare this result with a more naive algorithm that simply iterates through all combinations of amino acid residues until the sum of all residues equals the target mass:

mass = 262.0953584466
residues = ResidueDB().getResidues(b"Natural19WithoutI")


def recursive_mass_decomposition(mass_sum, peptide):
    # report the current string if it matches the target mass
    if abs(mass_sum - mass) < 0.05:
        print(peptide + "\t" + str(mass_sum))
    # try to extend the string by every residue that still fits
    for r in residues:
        new_mass = mass_sum + r.getMonoWeight(Residue.ResidueType.Internal)
        if new_mass < mass + 0.05:
            recursive_mass_decomposition(
                new_mass, peptide + r.getOneLetterCode()
            )


print("Mass explanations by naive algorithm:")
recursive_mass_decomposition(0, "")


Note that this approach is substantially slower than the OpenMS algorithm and also does not treat DF and FD as equivalent, instead outputting them both as viable solutions.
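
The difference between ordered strings and compositions can be made concrete with a small variation of the recursion. This is a sketch in plain Python under the assumption of a hard-coded table of standard monoisotopic residue masses; `enumerate_strings` and its `ordered` flag are hypothetical names. When `ordered` is false, each recursion step may only append residues at or after the position of the last one, so every composition is produced once.

```python
# Monoisotopic residue masses in Da for 19 residues (no isoleucine),
# rounded to 5 decimals.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
RESIDUES = sorted(RESIDUE_MASS.items())  # alphabetical (letter, mass) pairs


def enumerate_strings(target, tol, ordered):
    """Collect residue strings whose summed mass is within tol of target."""
    results = []

    def extend(mass_sum, peptide, start):
        if abs(mass_sum - target) < tol:
            results.append(peptide)
        # ordered=True restarts at the first residue (all permutations);
        # ordered=False only continues from the last residue used.
        first = 0 if ordered else start
        for i in range(first, len(RESIDUES)):
            letter, residue_mass = RESIDUES[i]
            if mass_sum + residue_mass < target + tol:
                extend(mass_sum + residue_mass, peptide + letter, i)

    extend(0.0, "", 0)
    return results


naive = enumerate_strings(262.0953584466, 0.05, ordered=True)
unique = enumerate_strings(262.0953584466, 0.05, ordered=False)
print(naive)   # ['DF', 'FD', 'MM', 'VY', 'YV']
print(unique)  # ['DF', 'MM', 'VY']
```

For this small mass the ordered search only yields two extra strings, but the number of permutations per composition grows quickly with the number of residues.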

## Stand-Alone Program

We can use pyOpenMS to write a short program that takes a mass and outputs all possible amino acid combinations for that mass within a given tolerance:

import sys

from pyopenms import *

# Example for mass decomposition (mass explanation)
# Internal residue masses (as observed e.g. as mass shifts in tandem mass spectra)
# are decomposed in possible amino acid strings that match in mass.

mass = float(sys.argv[1])
tol = float(sys.argv[2])

md_alg = MassDecompositionAlgorithm()
param = md_alg.getParameters()
param.setValue("tolerance", tol)
param.setValue("residue_set", b"Natural19WithoutI")
md_alg.setParameters(param)
decomps = []
md_alg.getDecompositions(decomps, mass)
for d in decomps:
    print(d.toExpandedString().decode())


If we copy the above code into a script, for example mass_decomposition.py, we have a stand-alone program that takes two arguments: first the mass to be decomposed and second the tolerance to be used (read from sys.argv[1] and sys.argv[2]). We can call it as follows:

python mass_decomposition.py 999.4773990735001 1.0
python mass_decomposition.py 999.4773990735001 0.001


Try changing the tolerance parameter; it has a very large influence on the reported results. For example, with a tolerance of $$1.0$$ the algorithm produces $$80,463$$ results, while with a tolerance of $$0.001$$ only $$911$$ results are expected.
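
The effect of the tolerance can also be seen on a much smaller scale without pyOpenMS. The sketch below counts compositions for the two-residue mass from the beginning of this page at three tolerances. It is a brute-force illustration with an assumed table of standard monoisotopic residue masses and a hypothetical helper `count_compositions`; it does not reproduce the result counts quoted above for the larger mass.

```python
from itertools import combinations_with_replacement

# Monoisotopic residue masses in Da for 19 residues (no isoleucine),
# rounded to 5 decimals.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def count_compositions(target, tolerance):
    """Count residue multisets whose mass is within tolerance of target."""
    max_len = int((target + tolerance) // min(RESIDUE_MASS.values()))
    count = 0
    for n in range(1, max_len + 1):
        for combo in combinations_with_replacement(RESIDUE_MASS, n):
            total = sum(RESIDUE_MASS[aa] for aa in combo)
            if abs(total - target) < tolerance:
                count += 1
    return count


# Tighter tolerances leave fewer candidate compositions.
counts = {
    tol: count_compositions(262.0953584466, tol) for tol in (0.05, 0.02, 0.001)
}
print(counts)  # {0.05: 3, 0.02: 2, 0.001: 1}
```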

## Spectrum Tagger

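The Tagger class extracts short sequence tags from a spectrum. The example below generates a theoretical spectrum for a test peptide and searches it for tags:

tsg = TheoreticalSpectrumGenerator()
param = tsg.getParameters()
# ... (parameter settings not shown)
tsg.setParameters(param)

# spectrum with charges +1 and +2
test_sequence = AASequence.fromString("PEPTIDETESTTHISTAGGER")
spec = MSSpectrum()
tsg.getSpectrum(spec, test_sequence, 1, 2)

print(spec.size())  # should be 357

# tagger searching only for charge +1
# arguments: min tag length 2, 10 ppm tolerance, max tag length 5,
# min and max charge 1, no fixed and no variable modifications
tags = []
tagger = Tagger(2, 10.0, 5, 1, 1, [], [])
tagger.getTag(spec, tags)

print(len(tags))  # should be 890

print(b"EPTID" in tags)  # True
print(b"PTIDE" in tags)  # True
print(b"PTIDEF" in tags)  # False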
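
The general idea behind sequence tagging can be sketched in plain Python: two peaks whose m/z difference matches a residue mass are connected, and tags are read off as paths through these connections. This is a simplified illustration and not the OpenMS Tagger implementation. It assumes singly charged peaks, an absolute tolerance in Da instead of ppm, and a residue table without isoleucine, so the test peptide is written with L. The names `b_ion_ladder` and `find_tags` are hypothetical.

```python
# Monoisotopic residue masses in Da for 19 residues (no isoleucine),
# rounded to 5 decimals.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276  # mass of a proton in Da


def b_ion_ladder(sequence):
    """Singly charged b-ion m/z values for a peptide (toy spectrum)."""
    mz, total = [], PROTON
    for aa in sequence:
        total += RESIDUE_MASS[aa]
        mz.append(total)
    return mz


def find_tags(peaks, tol=0.01, min_len=2, max_len=5):
    """Return all tags of min_len..max_len residues supported by the peaks."""
    peaks = sorted(peaks)
    # edges[i] lists (j, letter) where peaks[j] - peaks[i] matches a residue.
    edges = {i: [] for i in range(len(peaks))}
    for i, low in enumerate(peaks):
        for j in range(i + 1, len(peaks)):
            gap = peaks[j] - low
            for letter, mass in RESIDUE_MASS.items():
                if abs(gap - mass) <= tol:
                    edges[i].append((j, letter))

    tags = set()

    def walk(i, tag):
        if len(tag) >= min_len:
            tags.add(tag)
        if len(tag) == max_len:
            return
        for j, letter in edges[i]:
            walk(j, tag + letter)

    for i in range(len(peaks)):
        walk(i, "")
    return tags


tags = find_tags(b_ion_ladder("PEPTLDE"))
print(len(tags))         # 14
print("EPTLD" in tags)   # True
print("PTLDEF" in tags)  # False
```

Because the first b-ion only contributes its own m/z, the leading P is not recoverable from peak differences, which is why the tags start at E.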