BayesianProteinInferenceAlgorithm

class pyopenms.BayesianProteinInferenceAlgorithm

Bases: object

Cython implementation of _BayesianProteinInferenceAlgorithm

Original C++ documentation is available here: – Inherits from [‘DefaultParamHandler’, ‘ProgressLogger’]

Performs Bayesian protein inference on protein/peptide identifications or a ConsensusMap.

Filters for the best n PSMs per spectrum.
Calculates and filters for the best peptide per spectrum.
Builds a k-partite graph from the structures.
Finds connected components by depth-first search (DFS) and splits the graph into them.
Extends the graph by adding layers for indistinguishable protein groups, peptides with the same parents, and optionally some additional layers (peptide sequence, charge, replicate); this extended model is experimental.
Builds a factor graph representation of a Bayesian network using the Evergreen library (see the model param section). The model is based on the Fido noisy-OR model, with an option for regularizing the number of proteins per peptide.
Performs loopy belief propagation on the graph and queries protein, protein group, and/or peptide posteriors (see the loopy_belief_propagation param section).
Learns the best parameters via grid search if the parameters were not given in the param section.
Writes posteriors to peptides and/or proteins and adds indistinguishable protein groups to the underlying data structures.
Can use OpenMP to parallelize over connected components.
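The component-splitting step and the noisy-OR likelihood named in the steps above can be sketched in plain Python. This is an illustration only, not the OpenMS implementation: the graph is reduced to a bipartite peptide-to-protein mapping, and the names `connected_components`, `noisy_or`, `alpha` (probability that a present protein emits the peptide) and `beta` (probability of a spurious observation) are assumptions chosen for this example, following the usual Fido-style noisy-OR formulation.

```python
from collections import defaultdict


def connected_components(peptide_to_proteins):
    """Split a peptide-protein graph into connected components (iterative DFS)."""
    # Build an undirected adjacency list; nodes are tagged so that peptide
    # and protein names cannot collide.
    adjacency = defaultdict(set)
    for peptide, proteins in peptide_to_proteins.items():
        pep_node = ("peptide", peptide)
        adjacency[pep_node]  # register peptides even if they have no parents
        for protein in proteins:
            prot_node = ("protein", protein)
            adjacency[pep_node].add(prot_node)
            adjacency[prot_node].add(pep_node)

    seen = set()
    components = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        # Depth-first traversal from an unvisited node collects one component.
        stack = [start]
        seen.add(start)
        component = set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return components


def noisy_or(n_present_parents, alpha, beta):
    """P(peptide observed | n parent proteins present) under a noisy-OR model."""
    return 1.0 - (1.0 - beta) * (1.0 - alpha) ** n_present_parents


# Toy data (hypothetical): PEPA and PEPB share protein P2, PEPC stands alone.
graph = {"PEPA": ["P1", "P2"], "PEPB": ["P2"], "PEPC": ["P3"]}
for component in connected_components(graph):
    print(sorted(component))
print(noisy_or(2, alpha=0.1, beta=0.01))
```

Because components share no nodes, each one can be processed independently, which is what makes parallelizing over them straightforward.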

Usage:

from urllib.request import urlretrieve
from pyopenms import *

# Download the test data set used by the OpenMS class tests.
urlretrieve("https://raw.githubusercontent.com/OpenMS/OpenMS/develop/src/tests/class_tests/openms/data/BayesianProteinInference_test.idXML", "BayesianProteinInference_test.idXML")

# Load protein and peptide identifications from the idXML file.
proteins = []
peptides = []
idf = IdXMLFile()
idf.load("BayesianProteinInference_test.idXML", proteins, peptides)

# Configure the algorithm so that PSM probabilities are not updated.
bpia = BayesianProteinInferenceAlgorithm()
p = bpia.getParameters()
p.setValue("update_PSM_probabilities", "false")
bpia.setParameters(p)

# Every overload documented below takes greedy_group_resolution explicitly.
bpia.inferPosteriorProbabilities(proteins, peptides, False)

print(len(peptides)) # 9
print(peptides[0].getHits()[0].getScore()) # 0.6
print(proteins[0].getHits()[0].getScore()) # 0.624641
print(proteins[0].getHits()[1].getScore()) # 0.648346
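Before changing a setting such as `update_PSM_probabilities`, it can help to list what the algorithm exposes. The helper below is a small sketch that relies only on `getDefaults()` from this class plus `Param.keys()` and `Param.getValue()`. The function name `describe_defaults` is hypothetical, and the set of keys returned depends on the installed pyOpenMS version.

```python
def describe_defaults(algorithm):
    """Return {parameter name: default value} for a DefaultParamHandler-like object."""
    defaults = algorithm.getDefaults()
    result = {}
    for key in defaults.keys():
        # Keys may come back as bytes; decode them for readability.
        name = key.decode() if isinstance(key, bytes) else key
        result[name] = defaults.getValue(key)
    return result


try:
    from pyopenms import BayesianProteinInferenceAlgorithm
except ImportError:
    # pyOpenMS is not installed; the helper above still works on any
    # object that offers getDefaults().
    BayesianProteinInferenceAlgorithm = None

if BayesianProteinInferenceAlgorithm is not None:
    for name, value in describe_defaults(BayesianProteinInferenceAlgorithm()).items():
        print(name, "=", value)
```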

__init__()

Overload:

__init__(self) → None

Overload:

__init__(self, debug_lvl: int) → None

Methods

`__init__`	Overload:
`endProgress`(self)	Ends the progress display
`getDefaults`(self)	Returns the default parameters
`getLogType`(self)	Returns the type of progress log being used
`getName`(self)	Returns the name
`getParameters`(self)	Returns the parameters
`getSubsections`(self)
`inferPosteriorProbabilities`	Overload:
`nextProgress`(self)	Increment progress by 1 (according to range begin-end)
`setLogType`(self, in_0)	Sets the progress log that should be used.
`setName`(self, in_0)	Sets the name
`setParameters`(self, param)	Sets the parameters
`setProgress`(self, value)	Sets the current progress
`startProgress`(self, begin, end, label)

endProgress(self) → None: Ends the progress display

getDefaults(self) → Param: Returns the default parameters

getLogType(self) → int: Returns the type of progress log being used

getName(self) → bytes | str | String: Returns the name

getParameters(self) → Param: Returns the parameters

getSubsections(self) → List[bytes]

inferPosteriorProbabilities()

Overload:

inferPosteriorProbabilities(self, proteinIDs: List[ProteinIdentification], peptideIDs: List[PeptideIdentification], greedy_group_resolution: bool) → None

Optionally adds indistinguishable protein groups with separate scores, too. Currently only takes the first proteinID run and all peptides.

Parameters:

proteinIDs – Vector of protein identifications
peptideIDs – Vector of peptide identifications

Returns:

Writes its results into protein and (optionally also) peptide hits (as new score)

Overload:

inferPosteriorProbabilities(self, proteinIDs: List[ProteinIdentification], peptideIDs: List[PeptideIdentification], greedy_group_resolution: bool, exp_des: ExperimentalDesign) → None

Writes its results into protein and (optionally also) peptide hits (as new score). Optionally adds indistinguishable protein groups with separate scores, too. Currently only takes the first proteinID run and all peptides. The experimental design can be used to create an extended graph with replicate information (experimental).

Parameters:

proteinIDs – Vector of protein identifications
peptideIDs – Vector of peptide identifications
exp_des – Experimental Design

Returns:

Writes its results into protein and (optionally also) peptide hits (as new score)

Overload:

inferPosteriorProbabilities(self, cmap: ConsensusMap, greedy_group_resolution: bool) → None

Writes its results into protein and (optionally also) peptide hits (as new score). Optionally adds indistinguishable protein groups with separate scores, too. Loops over all runs in the ConsensusMap's protein IDs (experimental).

Parameters:

cmap – ConsensusMap with protein IDs
greedy_group_resolution – Adds indistinguishable protein groups with separate scores

Returns:

Writes its protein ID results into the ConsensusMap

Overload:

inferPosteriorProbabilities(self, cmap: ConsensusMap, greedy_group_resolution: bool, exp_des: ExperimentalDesign) → None

Parameters:

cmap – ConsensusMap with protein IDs
greedy_group_resolution – Adds indistinguishable protein groups with separate scores
exp_des – Experimental Design

Returns:

Writes its protein ID results into the ConsensusMap
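Since every overload returns None and stores the posteriors as scores on the hits, the results have to be read back from the data structures. A minimal sketch for the protein side, assuming the `getHits()`, `getAccession()` and `getScore()` accessors of `ProteinIdentification` and `ProteinHit`; the helper name `ranked_protein_posteriors` is hypothetical.

```python
def ranked_protein_posteriors(protein_id):
    """Return (accession, posterior) pairs from one identification run, best first."""
    pairs = [(hit.getAccession(), hit.getScore()) for hit in protein_id.getHits()]
    # Sort by posterior probability, highest first.
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)
```

With the `proteins` list from the usage example at the top of this page, `ranked_protein_posteriors(proteins[0])` would give the accessions of the first run ordered by their inferred posterior.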

nextProgress(self) → None: Increment progress by 1 (according to range begin-end)

setLogType(self, in_0: int) → None: Sets the progress log that should be used. The default type is NONE!

setName(self, in_0: bytes | str | String) → None: Sets the name

setParameters(self, param: Param) → None: Sets the parameters

setProgress(self, value: int) → None: Sets the current progress

startProgress(self, begin: int, end: int, label: bytes | str | String) → None
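The progress methods inherited from ProgressLogger are usually called in a start, set, end sequence. The sketch below shows that pattern around an arbitrary loop. `process_with_progress` and `work` are hypothetical names, and nothing is displayed unless a log type other than NONE has been chosen with `setLogType`.

```python
def process_with_progress(logger, items, work, label="processing"):
    """Apply work() to each item while reporting progress through logger."""
    results = []
    logger.startProgress(0, len(items), label)
    try:
        for index, item in enumerate(items):
            results.append(work(item))
            logger.setProgress(index + 1)
    finally:
        # Always close the progress display, even if work() raises.
        logger.endProgress()
    return results
```

Any object offering `startProgress`, `setProgress` and `endProgress` can be passed as `logger`, including an instance of this class.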