PeakSpectrum#

class pyopenms.PeakSpectrum(*args, **kwargs)#

Bases: MSSpectrum

MSSpectrum with DataFrame export capabilities.

This class extends MSSpectrum with a get_df() method that converts spectrum data to a pandas DataFrame.

__init__(*args, **kwargs)#

Overload:

__init__(self) None

Overload:

__init__(self, in_0: MSSpectrum) None

Methods

__init__(*args, **kwargs)

calculateTIC(self)

Returns the total ion current (=sum) of peak intensities in the spectrum

clear(self, clear_meta_data)

Clears all data (and meta data if clear_meta_data is true)

clearMetaInfo(self)

Removes all meta values

clearRanges(self)

Resets all range dimensions as empty

containsIMData(self)

findHighestInWindow(self, mz, ...)

Returns the index of the highest peak in the provided abs. m/z tolerance window to the left and right (-1 if none match)

findNearest

getAcquisitionInfo(self)

Returns a const reference to the acquisition info

getAllNamesOfSpectrumType()

Returns all spectrum type names known to OpenMS

getComment(self)

Returns the free-text comment

getDataProcessing(self)

getDriftTime(self)

Returns the drift time (-1 if not set)

getDriftTimeUnit(self)

getDriftTimeUnitAsString(self)

getFloatDataArrays(self)

Returns the additional float data arrays to store e.g. meta data.

getIMData

Get the position of ion mobility data array and its unit.

getIMFormat(self)

Returns the ion mobility format

getInstrumentSettings(self)

Returns a const reference to the instrument settings of the current spectrum

getIntegerDataArrays(self)

Returns the additional int data arrays to store e.g. meta data.

getKeys(self, keys)

Fills the given vector with a list of all keys for which a value is set

getMSLevel(self)

Returns the MS level

getMaxIntensity(self)

Returns the maximum intensity

getMaxMZ(self)

Returns the maximum m/z

getMetaValue(self, in_0)

Returns the value corresponding to a string

getMinIntensity(self)

Returns the minimum intensity

getMinMZ(self)

Returns the minimum m/z

getName(self)

getNativeID(self)

Returns the native identifier for the spectrum, used by the acquisition software

getPrecursors(self)

Returns a const reference to the precursors

getProducts(self)

Returns a const reference to the products

getRT(self)

Returns the absolute retention time (in seconds)

getSourceFile(self)

Returns a const reference to the source file

getStringDataArrays(self)

Returns the additional string data arrays to store e.g. meta data.

getType(self)

Returns the spectrum type (centroided (PEAKS) or profile data (RAW))

get_data_dict

Returns a dictionary of NumPy arrays with m/z, intensities, and metadata.

get_df([columns, export_meta_values])

Returns a pandas DataFrame representation of the MSSpectrum.

get_df_columns

Returns a list of column names that get_df() would produce for this spectrum.

get_drift_time_array

Get the ion mobility drift time array as a numpy array (copy).

get_drift_time_array_mv

Get the ion mobility drift time array as a memory view (no copy).

get_drift_time_unit

Get the drift time unit for ion mobility data.

get_intensity_array

Get the intensity values of the spectrum as a numpy array.

get_mz_array

Get the m/z values of the spectrum as a numpy array.

get_peaks

Cython signature: numpy_vector, numpy_vector get_peaks()

intensityInRange

isMetaEmpty(self)

Returns whether the MetaInfo is empty

isSorted(self)

Returns true if the spectrum is sorted by m/z

metaRegistry(self)

Returns a reference to the MetaInfoRegistry

metaValueExists(self, in_0)

Returns whether an entry with the given name exists

push_back(self, in_0)

Append a peak

removeMetaValue(self, in_0)

Removes the DataValue corresponding to name if it exists

reserve(self, n)

resize(self, n)

Resize the peak array

select(self, indices)

Subset the spectrum by indices.

setAcquisitionInfo(self, in_0)

Sets the acquisition info

setComment(self, in_0)

Sets the free-text comment

setDataProcessing(self, in_0)

setDriftTime(self, in_0)

Sets the drift time (-1 means not set)

setDriftTimeUnit(self, dt)

setFloatDataArrays(self, fda)

Sets the additional float data arrays to store e.g. meta data.

setIMFormat(self, im_format)

Sets the ion mobility format

setInstrumentSettings(self, in_0)

Sets the instrument settings of the current spectrum

setIntegerDataArrays(self, ida)

Sets the additional int data arrays to store e.g. meta data.

setMSLevel(self, in_0)

Sets the MS level

setMetaValue(self, in_0, in_1)

Sets the DataValue corresponding to a name

setName(self, in_0)

setNativeID(self, in_0)

Sets the native identifier for the spectrum, used by the acquisition software

setPrecursors(self, in_0)

Sets the precursors

setProducts(self, in_0)

Sets the products

setRT(self, in_0)

Sets the absolute retention time (in seconds)

setSourceFile(self, in_0)

Sets the source file

setStringDataArrays(self, sda)

Sets the additional string data arrays to store e.g. meta data.

setType(self, in_0)

Sets the spectrum type

set_peaks

Cython signature: set_peaks((numpy_vector, numpy_vector))

size(self)

Returns the number of peaks in the spectrum

sortByIntensity(self, reverse)

sortByPosition(self)

unify(self, in_0)

updateRanges(self)

calculateTIC(self) float#

Returns the total ion current (=sum) of peak intensities in the spectrum

clear(self, clear_meta_data: bool) None#

Clears all data (and meta data if clear_meta_data is true)

clearMetaInfo(self) None#

Removes all meta values

clearRanges(self) None#

Resets all range dimensions as empty

containsIMData(self) bool#

findHighestInWindow(self, mz: float, tolerance_left: float, tolerance_right: float) int#

Returns the index of the highest peak in the provided abs. m/z tolerance window to the left and right (-1 if none match)
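
The window semantics can be sketched in plain Python. The helper below is hypothetical and not part of pyopenms; it assumes the window is the closed interval [mz - tolerance_left, mz + tolerance_right] and that ties go to the first peak found, neither of which is confirmed by the description above.

```python
from typing import Sequence


def find_highest_in_window(mzs: Sequence[float], intensities: Sequence[float],
                           mz: float, tolerance_left: float,
                           tolerance_right: float) -> int:
    """Hypothetical stand-in: index of the most intense peak in the window, else -1."""
    low, high = mz - tolerance_left, mz + tolerance_right  # assumed closed interval
    best = -1
    for i, (pos, inten) in enumerate(zip(mzs, intensities)):
        if low <= pos <= high and (best == -1 or inten > intensities[best]):
            best = i
    return best


mzs = [100.0, 100.4, 101.2, 150.0]
intensities = [10.0, 80.0, 500.0, 5.0]
print(find_highest_in_window(mzs, intensities, 100.2, 0.5, 0.5))  # prints 1
print(find_highest_in_window(mzs, intensities, 120.0, 1.0, 1.0))  # prints -1
```

The much more intense peak at 101.2 is ignored in the first call because it lies outside the window.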

findNearest()#

Overload:

findNearest(self, mz: float) int

Returns the index of the closest peak in m/z

Overload:

findNearest(self, mz: float, tolerance: float) int

Returns the index of the closest peak in the provided +/- m/z tolerance window (-1 if none match)

Overload:

findNearest(self, mz: float, tolerance_left: float, tolerance_right: float) int

Returns the index of the closest peak in the provided abs. m/z tolerance window to the left and right (-1 if none match)
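
The three overloads differ only in how the tolerance window is given. A minimal pure-Python model follows, assuming the m/z values are sorted in ascending order and the window bounds are inclusive. The function name `find_nearest` and its defaults are hypothetical, and its handling of an empty list (returning -1) is a choice made for this sketch.

```python
from bisect import bisect_left
from typing import Sequence


def find_nearest(mzs: Sequence[float], mz: float,
                 tolerance_left: float = float("inf"),
                 tolerance_right: float = float("inf")) -> int:
    """Hypothetical stand-in: index of the closest m/z inside the window, else -1."""
    pos = bisect_left(mzs, mz)  # first index whose value is >= mz
    # Only the neighbors on either side of the insertion point can be closest.
    candidates = [i for i in (pos - 1, pos)
                  if 0 <= i < len(mzs)
                  and -tolerance_left <= mzs[i] - mz <= tolerance_right]
    if not candidates:
        return -1
    return min(candidates, key=lambda i: abs(mzs[i] - mz))


mzs = [100.0, 200.0, 300.0]
print(find_nearest(mzs, 210.0))              # prints 1 (no tolerance)
print(find_nearest(mzs, 210.0, 5.0, 5.0))    # prints -1 (symmetric +/- 5)
print(find_nearest(mzs, 210.0, 0.5, 100.0))  # prints 2 (asymmetric window)
```

In the last call the peak at 200.0 is closer but falls outside the left tolerance, so the peak at 300.0 is returned.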

getAcquisitionInfo(self) AcquisitionInfo#

Returns a const reference to the acquisition info

static getAllNamesOfSpectrumType() List[bytes]#

Returns all spectrum type names known to OpenMS

getComment(self) bytes | str | String#

Returns the free-text comment

getDataProcessing(self) List[DataProcessing]#

getDriftTime(self) float#

Returns the drift time (-1 if not set)

getDriftTimeUnit(self) int#

getDriftTimeUnitAsString(self) bytes | str | String#

getFloatDataArrays(self) List[FloatDataArray]#

Returns the additional float data arrays to store e.g. meta data

getIMData()#

Get the position of ion mobility data array and its unit.

Returns:
tuple: (index, unit), where index is the position in FloatDataArrays and unit is the DriftTimeUnit enum value.

Raises:

Exception: If no ion mobility data is present. Use containsIMData() first.

Example:
>>> if spectrum.containsIMData():
...     idx, unit = spectrum.getIMData()
...     im_array = spectrum.getFloatDataArrays()[idx]

getIMFormat(self) int#

Returns the ion mobility format

getInstrumentSettings(self) InstrumentSettings#

Returns a const reference to the instrument settings of the current spectrum

getIntegerDataArrays(self) List[IntegerDataArray]#

Returns the additional int data arrays to store e.g. meta data

getKeys(self, keys: List[bytes]) None#

Fills the given vector with a list of all keys for which a value is set

getMSLevel(self) int#

Returns the MS level

getMaxIntensity(self) float#

Returns the maximum intensity

getMaxMZ(self) float#

Returns the maximum m/z

getMetaValue(self, in_0: bytes | str | String) int | float | bytes | str | List[int] | List[float] | List[bytes]#

Returns the value corresponding to a string

getMinIntensity(self) float#

Returns the minimum intensity

getMinMZ(self) float#

Returns the minimum m/z

getName(self) bytes | str | String#

getNativeID(self) bytes | str | String#

Returns the native identifier for the spectrum, used by the acquisition software

getPrecursors(self) List[Precursor]#

Returns a const reference to the precursors

getProducts(self) List[Product]#

Returns a const reference to the products

getRT(self) float#

Returns the absolute retention time (in seconds)

getSourceFile(self) SourceFile#

Returns a const reference to the source file

getStringDataArrays(self) List[StringDataArray]#

Returns the additional string data arrays to store e.g. meta data

getType(self) int#

Returns the spectrum type (centroided (PEAKS) or profile data (RAW))

get_data_dict()#

Returns a dictionary of NumPy arrays with m/z, intensities, and metadata.

This method extracts spectrum data including peaks, retention time, MS level, ion mobility data (if present), precursor information, and optional meta values into a dictionary format suitable for conversion to a pandas DataFrame.

Args:
columns (list or None): List of column names to include. If None, includes all default columns. Use get_df_columns('all') to see all available columns including custom data arrays.

export_meta_values (bool): Whether to include meta values in the output. Only applies when columns=None. Defaults to True.

Returns:
dict: Dictionary with requested columns as keys and numpy arrays as values.

Default columns include:

  • ‘mz’: numpy array of m/z values (float64)

  • ‘intensity’: numpy array of intensity values (float32)

  • ‘rt’: numpy array of retention time values (float64)

  • ‘ms_level’: numpy array of MS level values (uint16)

  • ‘native_id’: numpy array of native ID strings

  • ‘ion_mobility’: ion mobility values (if IM data present)

  • ‘precursor_mz’: precursor m/z (if precursor present)

  • ‘precursor_charge’: precursor charge (if precursor present)

  • ‘ion_annotation’: ion annotations (if IonNames StringDataArray present)

  • Additional meta value columns (if export_meta_values=True)

Non-default columns (must be explicitly requested):

  • ‘ion_mobility_unit’: ion mobility unit string

  • ‘float_array:<name>’: custom FloatDataArray values

  • ‘int_array:<name>’: custom IntegerDataArray values

  • ‘string_array:<name>’: custom StringDataArray values

Example:
>>> # Get all columns (default)
>>> data = spectrum.get_data_dict()
>>> # Get only specific columns for performance
>>> data = spectrum.get_data_dict(columns=['mz', 'intensity'])
>>> # Get all available columns including custom data arrays
>>> all_cols = spectrum.get_df_columns('all')
>>> data = spectrum.get_data_dict(columns=all_cols)
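
The prefixed names of the non-default columns follow a simple `<kind>:<name>` pattern. The sketch below builds and parses such names in plain Python. Both helpers are hypothetical illustrations of the naming scheme listed above and are not pyopenms functions.

```python
from typing import Iterable, List, Optional, Tuple

ARRAY_PREFIXES = ("float_array", "int_array", "string_array")


def custom_array_columns(float_names: Iterable[str], int_names: Iterable[str],
                         string_names: Iterable[str]) -> List[str]:
    """Hypothetical helper: build the prefixed column names for custom data arrays."""
    columns = [f"float_array:{name}" for name in float_names]
    columns += [f"int_array:{name}" for name in int_names]
    columns += [f"string_array:{name}" for name in string_names]
    return columns


def split_column(column: str) -> Tuple[Optional[str], str]:
    """Split a column name into (kind, array name); kind is None for plain columns."""
    prefix, sep, name = column.partition(":")
    if sep and prefix in ARRAY_PREFIXES:
        return prefix, name
    return None, column


print(custom_array_columns(["MyData"], [], ["Labels"]))
print(split_column("float_array:MyData"))
print(split_column("mz"))
```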

get_df(columns: None | List[str] = None, export_meta_values: bool = True) DataFrame#

Returns a pandas DataFrame representation of the MSSpectrum.

This method converts the spectrum data (peaks, metadata, precursor info, ion mobility) into a pandas DataFrame format.

Args:
columns (list or None): List of column names to include. If None, includes all default columns. Use get_df_columns() to discover available columns.

export_meta_values (bool): Whether to include meta values. Only applies when columns=None. Defaults to True.

Returns:
pd.DataFrame: DataFrame with requested columns. Default columns include:
  • mz: m/z values of peaks

  • intensity: intensity values of peaks

  • rt: retention time (replicated for each peak)

  • ms_level: MS level (replicated for each peak)

  • native_id: native spectrum identifier

  • ion_mobility: ion mobility values (if IM data present)

  • precursor_mz: precursor m/z (if precursor present)

  • precursor_charge: precursor charge (if precursor present)

  • ion_annotation: ion annotation strings (if IonNames present)

  • Additional meta value columns (if export_meta_values=True)

Example:
>>> # Get all default columns
>>> df = spectrum.get_df()
>>> # Discover available columns
>>> print(spectrum.get_df_columns())
>>> # Get only specific columns (faster)
>>> df = spectrum.get_df(columns=['mz', 'intensity'])
>>> # Get all columns including non-defaults like ion_mobility_unit
>>> cols = spectrum.get_df_columns()
>>> cols.append('ion_mobility_unit')
>>> df = spectrum.get_df(columns=cols)
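
The resulting table has one row per peak, so spectrum-level values such as rt and ms_level are repeated on every row. The plain-Python model below shows that layout. The function `spectrum_to_columns` is hypothetical, and it assumes that native_id is replicated in the same way as rt and ms_level.

```python
from typing import Dict, List, Sequence


def spectrum_to_columns(mzs: Sequence[float], intensities: Sequence[float],
                        rt: float, ms_level: int, native_id: str) -> Dict[str, List]:
    """Hypothetical model of the long format: one entry per peak in every column."""
    n = len(mzs)
    if len(intensities) != n:
        raise ValueError("mz and intensity must have the same length")
    return {
        "mz": list(mzs),
        "intensity": list(intensities),
        # Spectrum-level scalars are repeated so that every column has n entries.
        "rt": [rt] * n,
        "ms_level": [ms_level] * n,
        "native_id": [native_id] * n,  # assumption: replicated like rt
    }


columns = spectrum_to_columns([100.0, 200.0], [10.0, 20.0],
                              rt=12.5, ms_level=1, native_id="scan=1")
for name, values in columns.items():
    print(name, values)
```

Requesting only 'mz' and 'intensity' avoids building the replicated columns, which is one plausible reason the examples above describe column selection as faster.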

get_df_columns()#

Returns a list of column names that get_df() would produce for this spectrum.

Useful for discovering available columns before export, especially when selecting specific columns for performance optimization.

Args:
columns (str): ‘default’ for standard columns, ‘all’ for all available columns including non-default ones (ion_mobility_unit, custom data arrays).

export_meta_values (bool): Whether to include meta value column names. Defaults to True.

Returns:

list: List of column name strings.

Example:
>>> # See default columns
>>> cols = spectrum.get_df_columns()
>>> cols
['mz', 'intensity', 'rt', ...]
>>> # See ALL available columns including custom data arrays
>>> cols = spectrum.get_df_columns('all')
>>> cols
['mz', 'intensity', ..., 'ion_mobility_unit', 'float_array:MyData']
>>> # Export everything
>>> df = spectrum.get_df(columns=spectrum.get_df_columns('all'))

get_drift_time_array()#

Get the ion mobility drift time array as a numpy array (copy).

This is a convenience method that retrieves the ion mobility data from the FloatDataArrays and returns it as a numpy array.

Returns:
np.ndarray or None: A 1D numpy array (float32) containing drift time values for each peak, or None if no IM data is present.

Example:
>>> spectrum = MSSpectrum()
>>> drift_times = spectrum.get_drift_time_array()
>>> if drift_times is not None:
...     print(f"Drift time range: {drift_times.min():.2f} - {drift_times.max():.2f}")

get_drift_time_array_mv()#

Get the ion mobility drift time array as a memory view (no copy).

This method provides direct access to the underlying drift time data without copying, which is more memory efficient for large datasets.

Returns:
memoryview or None: A memory view of drift time values, or None if no IM data is present or the array is empty.

Warning:

The returned memory view refers directly to the underlying data in a FloatDataArray. You must keep a reference to the FloatDataArray (via getFloatDataArrays()) to ensure the data remains valid.

For safer access, use get_drift_time_array() which returns a copy.

Example:
>>> if spectrum.containsIMData():
...     # Keep reference to data arrays to prevent garbage collection
...     fdas = spectrum.getFloatDataArrays()
...     idx, unit = spectrum.getIMData()
...     drift_mv = spectrum.get_drift_time_array_mv()
...     total = sum(drift_mv)
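
The difference between a view and a copy can be seen with the standard library alone. The sketch below uses an `array.array` of float32 values as a stand-in for the drift time buffer; the values are made up. It only models the sharing behavior. The lifetime caveat in the warning above concerns the wrapped FloatDataArray and is not reproduced here.

```python
from array import array

# Stand-in for a float32 drift time buffer (hypothetical values).
drift_times = array("f", [1.0, 2.0, 3.0])

view = memoryview(drift_times)   # no copy: shares the underlying buffer
copy = array("f", drift_times)   # independent copy of the values

drift_times[0] = 9.0             # modify the underlying data

print(view[0])    # prints 9.0, the view sees the change
print(copy[0])    # prints 1.0, the copy does not
print(sum(view))  # prints 14.0
```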

get_drift_time_unit()#

Get the drift time unit for ion mobility data.

Returns:
int or None: The DriftTimeUnit enum value, or None if no IM data present.

Values: 0=NONE, 1=MILLISECOND, 2=VSSC, 3=FAIMS_COMPENSATION_VOLTAGE

Example:
>>> unit = spectrum.get_drift_time_unit()
>>> if unit == 1:  # DriftTimeUnit.MILLISECOND
...     print("Drift time is in milliseconds")
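
Comparing against a bare integer such as 1 is easy to get wrong. One option is a small local enum that mirrors the values listed above. The class `DriftTimeUnitValue` and the helper `describe_unit` are hypothetical names for this sketch and are not the pyopenms enum.

```python
from enum import IntEnum
from typing import Optional


class DriftTimeUnitValue(IntEnum):
    """Local, hypothetical mirror of the integer values listed above."""
    NONE = 0
    MILLISECOND = 1
    VSSC = 2
    FAIMS_COMPENSATION_VOLTAGE = 3


def describe_unit(unit: Optional[int]) -> str:
    """Return a readable name for a raw unit value; None means no IM data."""
    if unit is None:
        return "no ion mobility data"
    return DriftTimeUnitValue(unit).name


print(describe_unit(1))     # prints MILLISECOND
print(describe_unit(None))  # prints no ion mobility data
```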

get_intensity_array()#

Get the intensity values of the spectrum as a numpy array.

Returns:
np.ndarray: A 1D numpy array (float32) containing the intensity values for each peak in the spectrum.

Example:
>>> spectrum = MSSpectrum()
>>> intensities = spectrum.get_intensity_array()
>>> print(f"Total ion current: {intensities.sum():.2f}")

get_mz_array()#

Get the m/z values of the spectrum as a numpy array.

Returns:
np.ndarray: A 1D numpy array (float64) containing the m/z values for each peak in the spectrum.

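Example:
>>> spectrum = MSSpectrum()
>>> # Add peaks first: min() and max() raise on an empty array
>>> spectrum.set_peaks(([100.0, 200.0, 300.0], [1000.0, 2000.0, 500.0]))
>>> mz_values = spectrum.get_mz_array()
>>> print(f"m/z range: {mz_values.min():.2f} - {mz_values.max():.2f}")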

get_peaks()#

Cython signature: numpy_vector, numpy_vector get_peaks()

Will return a tuple of two numpy arrays (m/z, intensity) corresponding to the peaks in the MSSpectrum. Provides fast access to peaks.

Returns:
tuple: A tuple of (mz_array, intensity_array) where:
  • mz_array is np.ndarray[float64] of m/z values

  • intensity_array is np.ndarray[float32] of intensity values

Example:
>>> spectrum = MSSpectrum()
>>> spectrum.set_peaks(([100.0, 200.0, 300.0], [1000.0, 2000.0, 500.0]))
>>> mz, intensities = spectrum.get_peaks()
>>> print(f"Base peak m/z: {mz[intensities.argmax()]}")
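
Because intensities are float32 while Python floats are float64, a value passed in may not compare equal to the value read back. The sketch below uses only the standard `struct` module to show the rounding and assumes the dtypes stated above. The helper `as_float32` is hypothetical.

```python
import struct


def as_float32(value: float) -> float:
    """Round a Python float (float64) to the nearest float32 and back."""
    return struct.unpack("f", struct.pack("f", value))[0]


print(as_float32(0.5) == 0.5)  # prints True, 0.5 is exactly representable
print(as_float32(0.1) == 0.1)  # prints False, 0.1 is rounded in float32
print(as_float32(16777217.0))  # prints 16777216.0, 2**24 + 1 does not fit
```

When comparing intensities read back from a spectrum with the values that were set, a tolerance-based comparison such as `math.isclose` is therefore safer than `==`.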

intensityInRange()#

isMetaEmpty(self) bool#

Returns whether the MetaInfo is empty

isSorted(self) bool#

Returns true if the spectrum is sorted by m/z

metaRegistry(self) MetaInfoRegistry#

Returns a reference to the MetaInfoRegistry

metaValueExists(self, in_0: bytes | str | String) bool#

Returns whether an entry with the given name exists

push_back(self, in_0: Peak1D) None#

Append a peak

removeMetaValue(self, in_0: bytes | str | String) None#

Removes the DataValue corresponding to name if it exists

reserve(self, n: int) None#

resize(self, n: int) None#

Resize the peak array

select(self, indices: List[int]) MSSpectrum#

Subset the spectrum by indices. Also applies to associated data arrays if present.
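
Subsetting the peaks and their associated data arrays with the same indices keeps every per-peak value aligned with its peak. A plain-Python model of that idea follows. The function `select_peaks` is hypothetical, and it assumes the result follows the order of the given indices.

```python
from typing import Dict, List, Sequence, Tuple


def select_peaks(mzs: Sequence[float], intensities: Sequence[float],
                 data_arrays: Dict[str, Sequence], indices: Sequence[int]
                 ) -> Tuple[List[float], List[float], Dict[str, List]]:
    """Hypothetical model: keep only the peaks at `indices`, in the given order."""
    new_mzs = [mzs[i] for i in indices]
    new_intensities = [intensities[i] for i in indices]
    # Per-peak data arrays are subset with the same indices so rows stay aligned.
    new_arrays = {name: [values[i] for i in indices]
                  for name, values in data_arrays.items()}
    return new_mzs, new_intensities, new_arrays


sel_mzs, sel_intensities, sel_arrays = select_peaks(
    [100.0, 200.0, 300.0], [10.0, 20.0, 30.0],
    {"ion_mobility": [0.8, 0.9, 1.0]}, indices=[0, 2])
print(sel_mzs, sel_intensities, sel_arrays)
# prints [100.0, 300.0] [10.0, 30.0] {'ion_mobility': [0.8, 1.0]}
```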

setAcquisitionInfo(self, in_0: AcquisitionInfo) None#

Sets the acquisition info

setComment(self, in_0: bytes | str | String) None#

Sets the free-text comment

setDataProcessing(self, in_0: List[DataProcessing]) None#

setDriftTime(self, in_0: float) None#

Sets the drift time (-1 means not set)

setDriftTimeUnit(self, dt: int) None#

setFloatDataArrays(self, fda: List[FloatDataArray]) None#

Sets the additional float data arrays to store e.g. meta data

setIMFormat(self, im_format: int) None#

Sets the ion mobility format

setInstrumentSettings(self, in_0: InstrumentSettings) None#

Sets the instrument settings of the current spectrum

setIntegerDataArrays(self, ida: List[IntegerDataArray]) None#

Sets the additional int data arrays to store e.g. meta data

setMSLevel(self, in_0: int) None#

Sets the MS level

setMetaValue(self, in_0: bytes | str | String, in_1: int | float | bytes | str | List[int] | List[float] | List[bytes]) None#

Sets the DataValue corresponding to a name

setName(self, in_0: bytes | str | String) None#

setNativeID(self, in_0: bytes | str | String) None#

Sets the native identifier for the spectrum, used by the acquisition software

setPrecursors(self, in_0: List[Precursor]) None#

Sets the precursors

setProducts(self, in_0: List[Product]) None#

Sets the products

setRT(self, in_0: float) None#

Sets the absolute retention time (in seconds)

setSourceFile(self, in_0: SourceFile) None#

Sets the source file

setStringDataArrays(self, sda: List[StringDataArray]) None#

Sets the additional string data arrays to store e.g. meta data

setType(self, in_0: int) None#

Sets the spectrum type

set_peaks()#

Cython signature: set_peaks((numpy_vector, numpy_vector))

Takes a tuple or list of two arrays (m/z, intensity) and populates the MSSpectrum. The arrays can be numpy arrays (faster).

size(self) int#

Returns the number of peaks in the spectrum

sortByIntensity(self, reverse: bool) None#

sortByPosition(self) None#

unify(self, in_0: SpectrumSettings) None#

updateRanges(self) None#