PeakSpectrum#
- class pyopenms.PeakSpectrum(*args, **kwargs)#
Bases: MSSpectrum
MSSpectrum with DataFrame export capabilities.
This class extends MSSpectrum with a get_df() method that converts spectrum data to a pandas DataFrame.
- __init__(*args, **kwargs)#
Overload:
- __init__(self) None
Overload:
- __init__(self, in_0: MSSpectrum) None
Methods
__init__(*args, **kwargs)
calculateTIC(self): Returns the total ion current (=sum) of peak intensities in the spectrum
clear(self, clear_meta_data): Clears all data (and meta data if clear_meta_data is true)
clearMetaInfo(self): Removes all meta values
clearRanges(self): Resets all range dimensions as empty
containsIMData(self)
findHighestInWindow(self, mz, ...): Returns the index of the highest peak in the provided absolute m/z tolerance window to the left and right (-1 if none match)
findNearest()
getAcquisitionInfo(self): Returns a const reference to the acquisition info
getAllNamesOfSpectrumType(): Returns all spectrum type names known to OpenMS
getComment(self): Returns the free-text comment
getDataProcessing(self)
getDriftTime(self): Returns the drift time (-1 if not set)
getDriftTimeUnit(self)
getDriftTimeUnitAsString(self)
getFloatDataArrays(self): Returns the additional float data arrays to store e.g. meta data.
getIMData(): Get the position of ion mobility data array and its unit.
getIMFormat(self): Returns the ion mobility format
getInstrumentSettings(self): Returns a const reference to the instrument settings of the current spectrum
getIntegerDataArrays(self): Returns the additional int data arrays to store e.g. meta data.
getKeys(self, keys): Fills the given vector with a list of all keys for which a value is set
getMSLevel(self): Returns the MS level
getMaxIntensity(self): Returns the maximum intensity
getMaxMZ(self): Returns the maximum m/z
getMetaValue(self, in_0): Returns the value corresponding to a string, or
getMinIntensity(self): Returns the minimum intensity
getMinMZ(self): Returns the minimum m/z
getName(self)
getNativeID(self): Returns the native identifier for the spectrum, used by the acquisition software
getPrecursors(self): Returns a const reference to the precursors
getProducts(self): Returns a const reference to the products
getRT(self): Returns the absolute retention time (in seconds)
getSourceFile(self): Returns a const reference to the source file
getStringDataArrays(self): Returns the additional string data arrays to store e.g. meta data.
getType(self): Returns the spectrum type (centroided (PEAKS) or profile data (RAW))
get_data_dict(): Returns a dictionary of NumPy arrays with m/z, intensities, and metadata.
get_df([columns, export_meta_values]): Returns a pandas DataFrame representation of the MSSpectrum.
get_df_columns(): Returns a list of column names that get_df() would produce for this spectrum.
get_drift_time_array(): Get the ion mobility drift time array as a numpy array (copy).
get_drift_time_array_mv(): Get the ion mobility drift time array as a memory view (no copy).
get_drift_time_unit(): Get the drift time unit for ion mobility data.
get_intensity_array(): Get the intensity values of the spectrum as a numpy array.
get_mz_array(): Get the m/z values of the spectrum as a numpy array.
get_peaks(): Cython signature: numpy_vector, numpy_vector get_peaks()
intensityInRange()
isMetaEmpty(self): Returns whether the MetaInfo is empty
isSorted(self): Returns true if the spectrum is sorted by m/z
metaRegistry(self): Returns a reference to the MetaInfoRegistry
metaValueExists(self, in_0): Returns whether an entry with the given name exists
push_back(self, in_0): Append a peak
removeMetaValue(self, in_0): Removes the DataValue corresponding to name if it exists
reserve(self, n)
resize(self, n): Resize the peak array
select(self, indices): Subset the spectrum by indices.
setAcquisitionInfo(self, in_0): Sets the acquisition info
setComment(self, in_0): Sets the free-text comment
setDataProcessing(self, in_0)
setDriftTime(self, in_0): Sets the drift time (-1 if not set)
setDriftTimeUnit(self, dt)
setFloatDataArrays(self, fda): Sets the additional float data arrays to store e.g. meta data.
setIMFormat(self, im_format): Sets the ion mobility format
setInstrumentSettings(self, in_0): Sets the instrument settings of the current spectrum
setIntegerDataArrays(self, ida): Sets the additional int data arrays to store e.g. meta data.
setMSLevel(self, in_0): Sets the MS level
setMetaValue(self, in_0, in_1): Sets the DataValue corresponding to a name
setName(self, in_0)
setNativeID(self, in_0): Sets the native identifier for the spectrum, used by the acquisition software
setPrecursors(self, in_0): Sets the precursors
setProducts(self, in_0): Sets the products
setRT(self, in_0): Sets the absolute retention time (in seconds)
setSourceFile(self, in_0): Sets the source file
setStringDataArrays(self, sda): Sets the additional string data arrays to store e.g. meta data.
setType(self, in_0): Sets the spectrum type
set_peaks(): Cython signature: set_peaks((numpy_vector, numpy_vector))
size(self): Returns the number of peaks in the spectrum
sortByIntensity(self, reverse)
sortByPosition(self)
unify(self, in_0)
updateRanges(self)
- calculateTIC(self) float#
Returns the total ion current (=sum) of peak intensities in the spectrum
- clear(self, clear_meta_data: bool) None#
Clears all data (and meta data if clear_meta_data is true)
- clearMetaInfo(self) None#
Removes all meta values
- clearRanges(self) None#
Resets all range dimensions as empty
- containsIMData(self) bool#
- findHighestInWindow(self, mz: float, tolerance_left: float, tolerance_right: float) int#
Returns the index of the highest peak in the provided absolute m/z tolerance window to the left and right (-1 if none match)
- findNearest()#
Overload:
- findNearest(self, mz: float) int
Returns the index of the closest peak in m/z
Overload:
- findNearest(self, mz: float, tolerance: float) int
Returns the index of the closest peak in the provided +/- m/z tolerance window (-1 if none match)
Overload:
- findNearest(self, mz: float, tolerance_left: float, tolerance_right: float) int
Returns the index of the closest peak in the provided absolute m/z tolerance window to the left and right (-1 if none match)
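The window semantics of findHighestInWindow and the findNearest overloads can be pictured without pyopenms. The following is a minimal pure-Python sketch, not the library's implementation. It assumes the m/z list is sorted ascending and that window bounds are inclusive. The helper names `highest_index_in_window` and `nearest_index` are hypothetical.

```python
from bisect import bisect_left, bisect_right


def highest_index_in_window(mzs, intensities, mz, tol_left, tol_right):
    """Index of the most intense peak in [mz - tol_left, mz + tol_right], or -1."""
    lo = bisect_left(mzs, mz - tol_left)
    hi = bisect_right(mzs, mz + tol_right)
    if lo >= hi:
        return -1  # no peak falls inside the window
    return max(range(lo, hi), key=lambda i: intensities[i])


def nearest_index(mzs, mz, tol_left=None, tol_right=None):
    """Index of the peak closest to mz, optionally restricted to a window, or -1."""
    if tol_left is None:
        lo, hi = 0, len(mzs)  # no window: consider every peak
    else:
        if tol_right is None:
            tol_right = tol_left  # symmetric +/- tolerance
        lo = bisect_left(mzs, mz - tol_left)
        hi = bisect_right(mzs, mz + tol_right)
    if lo >= hi:
        return -1  # assumption: an empty spectrum also counts as "no match"
    return min(range(lo, hi), key=lambda i: abs(mzs[i] - mz))


mzs = [100.0, 200.0, 300.0]
intensities = [10.0, 50.0, 20.0]
print(nearest_index(mzs, 210.0))             # 1
print(nearest_index(mzs, 210.0, 5.0))        # -1
print(nearest_index(mzs, 240.0, 5.0, 70.0))  # 2
```

In the last call the overall nearest peak (200.0) lies outside the asymmetric window, so the sketch returns the closest peak that is inside it.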
- getAcquisitionInfo(self) AcquisitionInfo#
Returns a const reference to the acquisition info
- static getAllNamesOfSpectrumType() List[bytes]#
Returns all spectrum type names known to OpenMS
- getDataProcessing(self) List[DataProcessing]#
- getDriftTime(self) float#
Returns the drift time (-1 if not set)
- getDriftTimeUnit(self) int#
- getFloatDataArrays(self) List[FloatDataArray]#
Returns the additional float data arrays to store e.g. meta data
- getIMData()#
Get the position of ion mobility data array and its unit.
- Returns:
- tuple: (index, unit) where index is the position in FloatDataArrays
and unit is the DriftTimeUnit enum value.
- Raises:
Exception: If no ion mobility data is present. Use containsIMData() first.
- Example:
>>> if spectrum.containsIMData():
...     idx, unit = spectrum.getIMData()
...     im_array = spectrum.getFloatDataArrays()[idx]
- getIMFormat(self) int#
Returns the ion mobility format
- getInstrumentSettings(self) InstrumentSettings#
Returns a const reference to the instrument settings of the current spectrum
- getIntegerDataArrays(self) List[IntegerDataArray]#
Returns the additional int data arrays to store e.g. meta data
- getKeys(self, keys: List[bytes]) None#
Fills the given vector with a list of all keys for which a value is set
- getMSLevel(self) int#
Returns the MS level
- getMaxIntensity(self) float#
Returns the maximum intensity
- getMaxMZ(self) float#
Returns the maximum m/z
- getMetaValue(self, in_0: bytes | str | String) int | float | bytes | str | List[int] | List[float] | List[bytes]#
Returns the value corresponding to a string, or
- getMinIntensity(self) float#
Returns the minimum intensity
- getMinMZ(self) float#
Returns the minimum m/z
- getNativeID(self) bytes | str | String#
Returns the native identifier for the spectrum, used by the acquisition software
- getRT(self) float#
Returns the absolute retention time (in seconds)
- getSourceFile(self) SourceFile#
Returns a const reference to the source file
- getStringDataArrays(self) List[StringDataArray]#
Returns the additional string data arrays to store e.g. meta data
- getType(self) int#
Returns the spectrum type (centroided (PEAKS) or profile data (RAW))
- get_data_dict()#
Returns a dictionary of NumPy arrays with m/z, intensities, and metadata.
This method extracts spectrum data including peaks, retention time, MS level, ion mobility data (if present), precursor information, and optional meta values into a dictionary format suitable for conversion to a pandas DataFrame.
- Args:
- columns (list or None): List of column names to include. If None, includes
all default columns. Use get_df_columns('all') to see all available columns including custom data arrays.
- export_meta_values (bool): Whether to include meta values in the output.
Only applies when columns=None. Defaults to True.
- Returns:
- dict: Dictionary with requested columns as keys and numpy arrays as values.
Default columns include:
'mz': numpy array of m/z values (float64)
'intensity': numpy array of intensity values (float32)
'rt': numpy array of retention time values (float64)
'ms_level': numpy array of MS level values (uint16)
'native_id': numpy array of native ID strings
'ion_mobility': ion mobility values (if IM data present)
'precursor_mz': precursor m/z (if precursor present)
'precursor_charge': precursor charge (if precursor present)
'ion_annotation': ion annotations (if IonNames StringDataArray present)
Additional meta value columns (if export_meta_values=True)
Non-default columns (must be explicitly requested):
'ion_mobility_unit': ion mobility unit string
'float_array:<name>': custom FloatDataArray values
'int_array:<name>': custom IntegerDataArray values
'string_array:<name>': custom StringDataArray values
- Example:
>>> # Get all columns (default)
>>> data = spectrum.get_data_dict()
>>> # Get only specific columns for performance
>>> data = spectrum.get_data_dict(columns=['mz', 'intensity'])
>>> # Get all available columns including custom data arrays
>>> all_cols = spectrum.get_df_columns('all')
>>> data = spectrum.get_data_dict(columns=all_cols)
- get_df(columns: None | List[str] = None, export_meta_values: bool = True) DataFrame#
Returns a pandas DataFrame representation of the MSSpectrum.
This method converts the spectrum data (peaks, metadata, precursor info, ion mobility) into a pandas DataFrame format.
- Args:
- columns (list or None): List of column names to include. If None,
includes all default columns. Use get_df_columns() to discover available columns.
- export_meta_values (bool): Whether to include meta values. Only applies
when columns=None. Defaults to True.
- Returns:
- pd.DataFrame: DataFrame with requested columns. Default columns include:
mz: m/z values of peaks
intensity: intensity values of peaks
rt: retention time (replicated for each peak)
ms_level: MS level (replicated for each peak)
native_id: native spectrum identifier
ion_mobility: ion mobility values (if IM data present)
precursor_mz: precursor m/z (if precursor present)
precursor_charge: precursor charge (if precursor present)
ion_annotation: ion annotation strings (if IonNames present)
Additional meta value columns (if export_meta_values=True)
- Example:
>>> # Get all default columns
>>> df = spectrum.get_df()
>>> # Discover available columns
>>> print(spectrum.get_df_columns())
>>> # Get only specific columns (faster)
>>> df = spectrum.get_df(columns=['mz', 'intensity'])
>>> # Get all columns including non-defaults like ion_mobility_unit
>>> cols = spectrum.get_df_columns()
>>> cols.append('ion_mobility_unit')
>>> df = spectrum.get_df(columns=cols)
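The layout described above has one row per peak, with per-spectrum values such as rt and ms_level replicated on every row. It can be pictured with plain Python lists. This is a minimal sketch under stated assumptions, not the pyopenms implementation: `spectrum_to_columns` is a hypothetical helper, lists stand in for NumPy arrays, and only five of the default columns are modeled.

```python
def spectrum_to_columns(mz, intensity, rt, ms_level, native_id, columns=None):
    """Build a dict of equal-length columns, one entry per peak (hypothetical helper)."""
    if len(mz) != len(intensity):
        raise ValueError("mz and intensity must have the same length")
    n = len(mz)
    table = {
        "mz": list(mz),
        "intensity": list(intensity),
        # Per-spectrum scalars are repeated so that every row is self-describing.
        "rt": [rt] * n,
        "ms_level": [ms_level] * n,
        "native_id": [native_id] * n,
    }
    if columns is None:
        return table
    # Column selection: keep only the requested names, in the requested order.
    return {name: table[name] for name in columns}


cols = spectrum_to_columns(
    [100.0, 200.0], [1000.0, 500.0],
    rt=12.5, ms_level=2, native_id="scan=1",
    columns=["mz", "rt"],
)
print(cols)  # {'mz': [100.0, 200.0], 'rt': [12.5, 12.5]}
```

Building only the requested columns is what makes a narrow selection cheaper than exporting everything.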
- get_df_columns()#
Returns a list of column names that get_df() would produce for this spectrum.
Useful for discovering available columns before export, especially when selecting specific columns for performance optimization.
- Args:
- columns (str): 'default' for standard columns, 'all' for all available
columns including non-default ones (ion_mobility_unit, custom data arrays).
- export_meta_values (bool): Whether to include meta value column names.
Defaults to True.
- Returns:
list: List of column name strings.
- Example:
>>> # See default columns
>>> cols = spectrum.get_df_columns()
>>> cols
['mz', 'intensity', 'rt', ...]
>>> # See ALL available columns including custom data arrays
>>> cols = spectrum.get_df_columns('all')
>>> cols
['mz', 'intensity', ..., 'ion_mobility_unit', 'float_array:MyData']
>>> # Export everything
>>> df = spectrum.get_df(columns=spectrum.get_df_columns('all'))
- get_drift_time_array()#
Get the ion mobility drift time array as a numpy array (copy).
This is a convenience method that retrieves the ion mobility data from the FloatDataArrays and returns it as a numpy array.
- Returns:
- np.ndarray or None: A 1D numpy array (float32) containing drift time
values for each peak, or None if no IM data present.
- Example:
>>> spectrum = MSSpectrum()
>>> drift_times = spectrum.get_drift_time_array()
>>> if drift_times is not None:
...     print(f"Drift time range: {drift_times.min():.2f} - {drift_times.max():.2f}")
- get_drift_time_array_mv()#
Get the ion mobility drift time array as a memory view (no copy).
This method provides direct access to the underlying drift time data without copying, which is more memory efficient for large datasets.
- Returns:
- memoryview or None: A memory view of drift time values, or None if
no IM data is present or array is empty.
- Warning:
The returned memory view refers directly to the underlying data in a FloatDataArray. You must keep a reference to the FloatDataArray (via getFloatDataArrays()) to ensure the data remains valid.
For safer access, use get_drift_time_array() which returns a copy.
- Example:
>>> if spectrum.containsIMData():
...     # Keep reference to data arrays to prevent garbage collection
...     fdas = spectrum.getFloatDataArrays()
...     idx, unit = spectrum.getIMData()
...     drift_mv = spectrum.get_drift_time_array_mv()
...     total = sum(drift_mv)
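The difference between a view and a copy can be shown with the standard library alone. In this minimal sketch, `array.array("f", ...)` is only a stand-in for a float32 data array and nothing here calls pyopenms. A pure-Python memoryview keeps its buffer alive, so the sketch illustrates the sharing behavior, not the lifetime hazard described in the warning.

```python
from array import array

# Stand-in for float32 drift time data owned by another object.
drift_times = array("f", [1.5, 2.5, 3.5])

view = memoryview(drift_times)  # no copy: shares the underlying buffer
snapshot = list(drift_times)    # independent copy of the values

drift_times[0] = 9.0            # change the data through its owner

print(view[0])      # 9.0, the view reflects the change
print(snapshot[0])  # 1.5, the copy does not
```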
- get_drift_time_unit()#
Get the drift time unit for ion mobility data.
- Returns:
- int or None: The DriftTimeUnit enum value, or None if no IM data present.
Values: 0=NONE, 1=MILLISECOND, 2=VSSC, 3=FAIMS_COMPENSATION_VOLTAGE
- Example:
>>> unit = spectrum.get_drift_time_unit()
>>> if unit == 1:  # DriftTimeUnit.MILLISECOND
...     print("Drift time is in milliseconds")
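Comparing against bare integers is easy to get wrong, so the values listed above can be mirrored in a small local enum. This is a sketch only: the `DriftTimeUnit` class below is defined locally from the documented values and is not imported from pyopenms, and `describe_unit` is a hypothetical helper.

```python
from enum import IntEnum


class DriftTimeUnit(IntEnum):
    """Local mirror of the documented values (not the pyopenms enum itself)."""
    NONE = 0
    MILLISECOND = 1
    VSSC = 2
    FAIMS_COMPENSATION_VOLTAGE = 3


def describe_unit(value):
    """Turn the int-or-None result of a unit lookup into a readable label."""
    if value is None:
        return "no ion mobility data"
    return DriftTimeUnit(value).name  # raises ValueError for unknown codes


print(describe_unit(1))     # MILLISECOND
print(describe_unit(None))  # no ion mobility data
```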
- get_intensity_array()#
Get the intensity values of the spectrum as a numpy array.
- Returns:
- np.ndarray: A 1D numpy array (float32) containing the intensity values
for each peak in the spectrum.
- Example:
>>> spectrum = MSSpectrum()
>>> intensities = spectrum.get_intensity_array()
>>> print(f"Total ion current: {intensities.sum():.2f}")
- get_mz_array()#
Get the m/z values of the spectrum as a numpy array.
- Returns:
- np.ndarray: A 1D numpy array (float64) containing the m/z values
for each peak in the spectrum.
- Example:
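>>> spectrum = MSSpectrum()
>>> spectrum.set_peaks(([100.0, 200.0, 300.0], [1000.0, 2000.0, 500.0]))  # min()/max() fail on an empty array
>>> mz_values = spectrum.get_mz_array()
>>> print(f"m/z range: {mz_values.min():.2f} - {mz_values.max():.2f}")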
- get_peaks()#
Cython signature: numpy_vector, numpy_vector get_peaks()
Will return a tuple of two numpy arrays (m/z, intensity) corresponding to the peaks in the MSSpectrum. Provides fast access to peaks.
- Returns:
- tuple: A tuple of (mz_array, intensity_array) where:
mz_array is np.ndarray[float64] of m/z values
intensity_array is np.ndarray[float32] of intensity values
- Example:
>>> spectrum = MSSpectrum()
>>> spectrum.set_peaks(([100.0, 200.0, 300.0], [1000.0, 2000.0, 500.0]))
>>> mz, intensities = spectrum.get_peaks()
>>> print(f"Base peak m/z: {mz[intensities.argmax()]}")
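A (m/z, intensity) pair like the one returned here is enough to derive simple summaries such as the total ion current (documented above as the sum of peak intensities) and the base peak. The following is a minimal pure-Python sketch. `summarize_peaks` is a hypothetical helper that accepts any two equal-length sequences, and the empty-spectrum handling is a choice made for this sketch.

```python
def summarize_peaks(peaks):
    """Return total ion current and base peak m/z for a (mz, intensity) pair."""
    mz, intensity = peaks
    if len(mz) != len(intensity):
        raise ValueError("mz and intensity must have the same length")
    if len(mz) == 0:
        # An empty spectrum has no base peak; avoid calling max() on nothing.
        return {"tic": 0.0, "base_peak_mz": None}
    top = max(range(len(mz)), key=lambda i: intensity[i])
    return {"tic": float(sum(intensity)), "base_peak_mz": mz[top]}


summary = summarize_peaks(([100.0, 200.0, 300.0], [1000.0, 2000.0, 500.0]))
print(summary)  # {'tic': 3500.0, 'base_peak_mz': 200.0}
```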
- intensityInRange()#
- isMetaEmpty(self) bool#
Returns whether the MetaInfo is empty
- isSorted(self) bool#
Returns true if the spectrum is sorted by m/z
- metaRegistry(self) MetaInfoRegistry#
Returns a reference to the MetaInfoRegistry
- metaValueExists(self, in_0: bytes | str | String) bool#
Returns whether an entry with the given name exists
- removeMetaValue(self, in_0: bytes | str | String) None#
Removes the DataValue corresponding to name if it exists
- reserve(self, n: int) None#
- resize(self, n: int) None#
Resize the peak array
- select(self, indices: List[int]) MSSpectrum#
Subset the spectrum by indices. Also applies to associated data arrays if present.
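Subsetting by indices means applying the same index list to the peaks and to every per-peak data array, so that all of them stay aligned. This is a minimal pure-Python sketch of that idea, not the library's implementation. `select_peaks` is a hypothetical helper, and it is an assumption of the sketch that the output order follows the index list.

```python
def select_peaks(mz, intensity, indices, data_arrays=None):
    """Apply one index list to peaks and to all parallel per-peak arrays."""
    data_arrays = data_arrays or {}
    new_mz = [mz[i] for i in indices]
    new_intensity = [intensity[i] for i in indices]
    # Each associated array is subset with the same indices to keep rows aligned.
    new_arrays = {
        name: [values[i] for i in indices]
        for name, values in data_arrays.items()
    }
    return new_mz, new_intensity, new_arrays


mz, inten, arrays = select_peaks(
    [100.0, 200.0, 300.0],
    [10.0, 50.0, 20.0],
    indices=[2, 0],
    data_arrays={"ion_mobility": [0.8, 0.9, 1.1]},
)
print(mz, inten, arrays)
```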
- setAcquisitionInfo(self, in_0: AcquisitionInfo) None#
Sets the acquisition info
- setDataProcessing(self, in_0: List[DataProcessing]) None#
- setDriftTime(self, in_0: float) None#
Sets the drift time (-1 means not set)
- setDriftTimeUnit(self, dt: int) None#
- setFloatDataArrays(self, fda: List[FloatDataArray]) None#
Sets the additional float data arrays to store e.g. meta data
- setIMFormat(self, im_format: int) None#
Sets the ion mobility format
- setInstrumentSettings(self, in_0: InstrumentSettings) None#
Sets the instrument settings of the current spectrum
- setIntegerDataArrays(self, ida: List[IntegerDataArray]) None#
Sets the additional int data arrays to store e.g. meta data
- setMSLevel(self, in_0: int) None#
Sets the MS level
- setMetaValue(self, in_0: bytes | str | String, in_1: int | float | bytes | str | List[int] | List[float] | List[bytes]) None#
Sets the DataValue corresponding to a name
- setNativeID(self, in_0: bytes | str | String) None#
Sets the native identifier for the spectrum, used by the acquisition software
- setRT(self, in_0: float) None#
Sets the absolute retention time (in seconds)
- setSourceFile(self, in_0: SourceFile) None#
Sets the source file
- setStringDataArrays(self, sda: List[StringDataArray]) None#
Sets the additional string data arrays to store e.g. meta data
- setType(self, in_0: int) None#
Sets the spectrum type
- set_peaks()#
Cython signature: set_peaks((numpy_vector, numpy_vector))
Takes a tuple or list of two arrays (m/z, intensity) and populates the MSSpectrum. The arrays can be numpy arrays (faster).
- size(self) int#
Returns the number of peaks in the spectrum
- sortByIntensity(self, reverse: bool) None#
- sortByPosition(self) None#
- unify(self, in_0: SpectrumSettings) None#
- updateRanges(self) None#