Memory Management
In order to save memory, we can avoid loading the whole file into memory and
use the OnDiscMSExperiment for reading data.
from pyopenms import *

# Open the file on disk; spectra are read only when requested
od_exp = OnDiscMSExperiment()
od_exp.openFile("test.mzML")

# Collect the matching spectra in an in-memory experiment
e = MSExperiment()
for k in range(od_exp.getNrSpectra()):
    s = od_exp.getSpectrum(k)
    if s.getNativeID().startswith("scan="):
        e.addSpectrum(s)

MzMLFile().store("test_filtered.mzML", e)
Note that with this approach the output data e is still held completely in
memory and may end up using a substantial amount of it. We can avoid that by
writing each spectrum to disk as soon as it has been read, using the
PlainMSDataWritingConsumer:
from pyopenms import *

od_exp = OnDiscMSExperiment()
od_exp.openFile("test.mzML")

# The consumer writes each spectrum it receives straight to the output file
consumer = PlainMSDataWritingConsumer("test_filtered.mzML")

for k in range(od_exp.getNrSpectra()):
    s = od_exp.getSpectrum(k)
    if s.getNativeID().startswith("scan="):
        consumer.consumeSpectrum(s)

# Deleting the consumer finalizes the output file
del consumer
Make sure you do not forget del consumer, since otherwise the final part of
the mzML file may not get written to disk (the consumer is still waiting for
new data).
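
One way to make the cleanup harder to forget is to wrap the streaming loop in a small helper. The following is a minimal sketch, not part of pyOpenMS: the names stream_filter, filter_mzml and keep are hypothetical, and it assumes only the calls already used above (openFile, getNrSpectra, getSpectrum, consumeSpectrum).

```python
def stream_filter(reader, consumer, keep):
    """Pass every spectrum for which keep(spectrum) is true to the consumer.

    Hypothetical helper: reader is any object offering getNrSpectra() and
    getSpectrum(k), consumer any object offering consumeSpectrum(s).
    Only one spectrum is held in memory at a time.
    Returns the number of spectra handed to the consumer.
    """
    written = 0
    for k in range(reader.getNrSpectra()):
        s = reader.getSpectrum(k)
        if keep(s):
            consumer.consumeSpectrum(s)
            written += 1
    return written


def filter_mzml(in_path, out_path, keep):
    """Hypothetical wrapper that streams a filtered copy of in_path to out_path."""
    import pyopenms  # imported here so stream_filter can be used without it

    od_exp = pyopenms.OnDiscMSExperiment()
    od_exp.openFile(in_path)
    consumer = pyopenms.PlainMSDataWritingConsumer(out_path)
    try:
        return stream_filter(od_exp, consumer, keep)
    finally:
        # Release the consumer even if keep() raises, so that the end of
        # the output file is still written.
        del consumer
```

With this in place, the example above could be written as filter_mzml("test.mzML", "test_filtered.mzML", lambda s: s.getNativeID().startswith("scan=")).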