Wrapping Workflows and New Classes#
How pyOpenMS Wraps C++ Classes#
pyOpenMS uses nanobind to expose OpenMS
C++ classes to Python. The binding code is hand-maintained in C++ source files
under src/pyOpenMS/bindings/.
The wrapping process works as follows:
Step 1: The developer writes nanobind binding code in the appropriate bind_<domain>.cpp file under src/pyOpenMS/bindings/. Each file corresponds to a domain of the OpenMS library (e.g., bind_kernel.cpp for kernel classes, bind_format.cpp for file format classes).
Step 2: CMake compiles each binding file into a separate Python extension module (e.g., _pyopenms_kernel.so). All modules share type information via the NB_DOMAIN "pyopenms" mechanism.
Step 3: The top-level pyopenms/__init__.py imports and re-exports all classes, so users simply write import pyopenms.
In addition to the C++ bindings, pure Python convenience methods can be added
via the addon system in pyopenms/addons/. These are injected into the
wrapped classes at import time.
Binding File Organization#
Each binding file covers a domain of the OpenMS C++ library:
| C++ Header Path | Binding File |
|---|---|
| Everything else | |
Each class in a binding file is preceded by a section comment for navigation:
// --- ClassName ---
Maintaining Existing Wrappers#
If the C++ API of a wrapped class changes (e.g., a method is renamed, removed,
or has its signature changed), the corresponding binding code in
src/pyOpenMS/bindings/bind_<domain>.cpp must be updated to match.
How to Wrap New Methods in Existing Classes#
To expose a new C++ method to Python, find the class in the appropriate
bind_<domain>.cpp file and add a .def() call.
For a simple method:
.def("methodName", &OpenMS::ClassName::methodName,
"Short description of the method")
For methods that need argument adaptation (e.g., filling output parameters by reference), use a lambda wrapper:
.def("getValues", [](const OpenMS::ClassName& self) {
    std::vector<double> result;
    self.getValues(result);
    return result;
}, "Returns the values as a list")
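Seen from Python, the lambda turns an output-parameter call into an ordinary return value. The same adapter idea can be sketched in pure Python. `CppStyleSource`, `fill_values`, and `get_values` are hypothetical stand-ins for illustration, not pyOpenMS names.

```python
class CppStyleSource:
    """Stand-in for a C++ class whose getter fills a caller-supplied container."""

    def __init__(self, values):
        self._values = [float(v) for v in values]

    def fill_values(self, out):
        # Mirrors a C++ signature like `void getValues(std::vector<double>& out) const`.
        out.clear()
        out.extend(self._values)


def get_values(source):
    """Adapter: create the container, let the source fill it, return it."""
    result = []
    source.fill_values(result)
    return result


source = CppStyleSource([1, 2.5, 4])
values = get_values(source)
```

Each call returns a fresh list, so modifying the result does not affect the underlying object, which matches the by-value return in the lambda.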
Use named arguments for clarity (the _a suffix requires using namespace nb::literals;):
.def("setValue", &OpenMS::ClassName::setValue,
     "value"_a, "Sets the value")
How to Wrap New Classes#
A Simple Example#
To wrap a new OpenMS class:
Choose the binding file based on the C++ header location (see table above).
Add the include for the C++ header.
Add the class binding with constructors, copy support, and methods.
#include <OpenMS/MODULE/MyClass.h>

// --- MyClass ---
nb::class_<OpenMS::MyClass>(m, "MyClass", "Short class description")
    .def(nb::init<>())                         // default constructor
    .def(nb::init<const OpenMS::MyClass &>())  // copy constructor
    // Support copy.copy() and copy.deepcopy() from Python
    .def("__copy__", [](const OpenMS::MyClass& self) {
        return OpenMS::MyClass(self);
    })
    .def("__deepcopy__", [](const OpenMS::MyClass& self, nb::dict) {
        return OpenMS::MyClass(self);
    }, "memo"_a)
    .def("getValue", &OpenMS::MyClass::getValue,
         "Gets value (between 0 and 5)")
    .def("setValue", &OpenMS::MyClass::setValue,
         "v"_a, "Sets value (between 0 and 5)")
    ;
Key points:
Always provide both a default constructor and a copy constructor.
Always add __copy__ and __deepcopy__ methods.
Use "param"_a for named parameters.
Add a short docstring to each method. For longer docstrings, use raw strings:
.def("myMethod", ..., R"doc(
    Longer description of the method.

    :param x: Description of x
    :returns: Description of return value
)doc")
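The `__copy__` and `__deepcopy__` methods matter because Python's standard `copy` module looks for these hooks. `copy.deepcopy` passes a `memo` dictionary, which is why the binding's `__deepcopy__` takes a dict argument. The sketch below uses a hypothetical pure-Python stand-in (`Wrapped`) rather than a real pyOpenMS class.

```python
import copy


class Wrapped:
    """Hypothetical stand-in for a wrapped class with value semantics."""

    def __init__(self, value=0):
        self._value = value

    def getValue(self):
        return self._value

    def setValue(self, v):
        self._value = v

    def __copy__(self):
        # Called by copy.copy(obj); receives no extra arguments.
        return Wrapped(self._value)

    def __deepcopy__(self, memo):
        # Called by copy.deepcopy(obj); `memo` maps ids of already-copied objects.
        return Wrapped(self._value)


original = Wrapped(3)
shallow = copy.copy(original)
deep = copy.deepcopy(original)
shallow.setValue(5)  # must not affect `original`
```

A test of a newly wrapped class could follow the same shape: copy, mutate the copy, and check that the original is unchanged.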
Inheritance#
Simple inheritance (no virtual destructor mismatch):
If the class and its base do not have a virtual destructor mismatch (described below), declare the base class directly:
nb::class_<OpenMS::Acquisition, OpenMS::MetaInfoInterface>(
    m, "Acquisition", "An acquisition")
    .def(nb::init<>())
    // ...
    ;
Virtual destructor mismatch (critical):
If the derived class has a virtual destructor (virtual ~ClassName() or
~ClassName() override) but the base class (e.g., MetaInfoInterface)
does not, you must not declare the base in nanobind. Instead, use
the helper templates from binding_utils.h:
// WRONG - will segfault!
nb::class_<OpenMS::PeptideHit, OpenMS::MetaInfoInterface>(m, "PeptideHit", ...)

// CORRECT - use helper template
auto peptidehit = nb::class_<OpenMS::PeptideHit>(m, "PeptideHit", "A peptide hit")
    .def(nb::init<>())
    // ... other methods ...
    ;
def_MetaInfoInterface<OpenMS::PeptideHit>(peptidehit);
This pattern is required because nanobind computes incorrect pointer offsets when a derived class introduces a vtable pointer that the base class lacks.
How to check: Look for virtual ~ClassName() or ~ClassName() override
in the C++ header file. If present and the base class has no virtual destructor,
use the helper template pattern.
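The header check can be scripted. The following is a minimal sketch, assuming the destructor is declared in one of the two textual forms named above. `declares_virtual_destructor` is a hypothetical helper, and the two header snippets are made up for illustration. A regular expression will not catch every declaration style, so treat a negative result as a hint rather than proof.

```python
import re


def declares_virtual_destructor(header_text: str, class_name: str) -> bool:
    """Return True if the text contains `virtual ~Name()` or `~Name() override`."""
    name = re.escape(class_name)
    pattern = (
        rf"virtual\s+~{name}\s*\(\s*\)"                       # virtual ~Name()
        rf"|~{name}\s*\(\s*\)\s*(?:noexcept\s*)?override"     # ~Name() override
    )
    return re.search(pattern, header_text) is not None


# Made-up header snippets, not copied from OpenMS.
derived_header = """
class Derived : public Base {
public:
  virtual ~Derived();
};
"""
base_header = """
class Base {
public:
  ~Base();
};
"""

# Mismatch: the derived class has a virtual destructor, the base does not.
needs_helper_template = (
    declares_virtual_destructor(derived_header, "Derived")
    and not declares_virtual_destructor(base_header, "Base")
)
```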
Helper Templates#
Reusable helper templates in binding_utils.h bind common OpenMS interfaces:
def_MetaInfoInterface<T>(cls) – binds getMetaValue, setMetaValue, isMetaEmpty, etc.
def_UniqueIdInterface<T>(cls) – binds getUniqueId, setUniqueId, etc.
def_CVTermList<T>(cls) – binds addCVTerm, getCVTerms, hasCVTerm, etc.
def_DefaultParamHandler<T>(cls) – binds setParameters, getParameters, getDefaults, etc.
def_ProgressLogger<T>(cls) – binds setLogType, startProgress, endProgress, etc.
def_DocumentIdentifier<T>(cls) – binds setIdentifier, getLoadedFilePath, etc.
Usage:
auto cls = nb::class_<OpenMS::MyAlgorithm>(m, "MyAlgorithm", "...")
    .def(nb::init<>())
    // ... class-specific methods ...
    ;
def_DefaultParamHandler<OpenMS::MyAlgorithm>(cls);
def_ProgressLogger<OpenMS::MyAlgorithm>(cls);
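After adding a helper call, one way to confirm from Python that the interface methods actually landed on the class is a small attribute check. This is a sketch only: `missing_methods` is a hypothetical helper, `FakeBound` is a stand-in for a wrapped class, and the method list covers just the names mentioned above.

```python
def missing_methods(cls, expected):
    """Return the names in `expected` that `cls` does not expose as callables."""
    return [name for name in expected if not callable(getattr(cls, name, None))]


# Names that def_MetaInfoInterface is described as binding (partial list).
META_INFO_METHODS = ["getMetaValue", "setMetaValue", "isMetaEmpty"]


class FakeBound:
    """Stand-in for a wrapped class where one helper method was forgotten."""

    def getMetaValue(self, key):
        return None

    def setMetaValue(self, key, value):
        pass


missing = missing_methods(FakeBound, META_INFO_METHODS)
```

In a real unit test the stand-in would be replaced by the class imported from `pyopenms`, and the test would assert that the returned list is empty.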
Enums#
Basic enum:
nb::enum_<OpenMS::MyEnum>(m, "MyEnum")
.value("VALUE1", OpenMS::MyEnum::VALUE1)
.value("VALUE2", OpenMS::MyEnum::VALUE2)
;
Enum supporting int() conversion (add nb::is_arithmetic()):
nb::enum_<OpenMS::DriftTimeUnit>(m, "DriftTimeUnit", nb::is_arithmetic())
.value("NONE", OpenMS::DriftTimeUnit::NONE)
.value("MILLISECOND", OpenMS::DriftTimeUnit::MILLISECOND)
;
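On the Python side, the practical difference resembles that between `enum.Enum` and `enum.IntEnum` in the standard library. The sketch below uses standard-library stand-ins, not the classes nanobind generates, and the numeric values 0 and 1 are assumptions.

```python
import enum


class PlainUnit(enum.Enum):
    """Stand-in for an enum bound without nb::is_arithmetic()."""
    NONE = 0
    MILLISECOND = 1


class ArithmeticUnit(enum.IntEnum):
    """Stand-in for an enum bound with nb::is_arithmetic()."""
    NONE = 0
    MILLISECOND = 1


try:
    int(PlainUnit.MILLISECOND)
    plain_converts = True
except TypeError:
    # A plain Enum member is not a number, so int() rejects it.
    plain_converts = False

arithmetic_value = int(ArithmeticUnit.MILLISECOND)
```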
Nested enum (scoped under a class):
nb::enum_<OpenMS::MyClass::InnerEnum>(myclass, "InnerEnum")
.value("A", OpenMS::MyClass::InnerEnum::A)
;
Operator Overloading#
Common operators can be bound using nanobind’s operator support (declared in nanobind/operators.h):
.def(nb::self == nb::self)
.def(nb::self != nb::self)
.def(nb::self < nb::self)
.def(nb::self <= nb::self)
.def(nb::self > nb::self)
.def(nb::self >= nb::self)
Methods Returning NumPy Arrays#
For C++ methods that fill output vectors by reference, wrap them with a lambda that returns the data as NumPy arrays:
.def("getData", [](const OpenMS::MyClass& self) {
    std::vector<double> mz, intensity;
    self.getData(mz, intensity);
    // Return as numpy arrays using nanobind's ndarray support
    // ...
}, "Returns (mz, intensity) arrays")
Adding Pure Python Methods via Addons#
For convenience methods that don’t need C++ performance, use the addon system
in pyopenms/addons/. Addon methods are pure Python functions injected into
wrapped classes at import time.
Create a file pyopenms/addons/classname.py (snake_case filename):
from __future__ import annotations
from . import addon

@addon("ClassName")
def to_tuple(self) -> tuple:
    """Return as a (mz, intensity) tuple."""
    return (self.getMZ(), self.getIntensity())
The @addon("ClassName") decorator (with PascalCase class name) attaches the
function as a method on the specified class.
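To make the injection mechanism concrete, here is a minimal sketch of how a decorator of this kind could work. It is not the actual pyOpenMS implementation: the `_WRAPPED_CLASSES` registry and the `Peak` stand-in class are assumptions made for illustration.

```python
_WRAPPED_CLASSES = {}  # hypothetical registry: class name -> class object


def addon(class_name):
    """Return a decorator that attaches the function to the named class."""
    def decorator(func):
        setattr(_WRAPPED_CLASSES[class_name], func.__name__, func)
        return func
    return decorator


class Peak:
    """Stand-in for a wrapped class exposing getMZ() and getIntensity()."""

    def __init__(self, mz, intensity):
        self._mz, self._intensity = mz, intensity

    def getMZ(self):
        return self._mz

    def getIntensity(self):
        return self._intensity


_WRAPPED_CLASSES["Peak"] = Peak


@addon("Peak")
def to_tuple(self):
    """Return as a (mz, intensity) tuple."""
    return (self.getMZ(), self.getIntensity())


peak_tuple = Peak(445.12, 1000.0).to_tuple()
```

Because the function is set as a class attribute, it becomes available on every instance, including instances created before the addon was applied.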
Guidelines for addons:
Keep addons minimal – only for non-performance-critical convenience methods.
Performance-critical methods should be C++ lambdas in the binding files.
Each addon file should document its methods with standard Python docstrings.
Build and Test#
# Build pyOpenMS
cmake --build OpenMS-build --target pyopenms -j$(nproc)
# Run tests (from /tmp to avoid import shadowing with source pyopenms/)
cd /tmp && PYTHONPATH=/path/to/OpenMS-build/pyOpenMS \
python3 -m pytest /path/to/src/pyOpenMS/tests/ -v
# Run a specific test file
PYTHONPATH=OpenMS-build/pyOpenMS \
python3 -m pytest src/pyOpenMS/tests/unittests/test_MyClass.py -v