If you have to profile an application, in Python for example, it's worth reading this blog post, which I found full of useful information.
The best tool so far seems to be the Massif profiler, which comes with the Valgrind suite. Here is how it works:
This runs the script through Valgrind:
valgrind --tool=massif python test_scal.py
This produces a "massif.out.?????" file, which is a text file but not in a very readable format. To get a more human-readable report, use ms_print:
ms_print massif.out.????? > profile.txt
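If you only need the peak figure, the massif.out file is also easy to parse directly: each snapshot in it carries a line of the form mem_heap_B=&lt;bytes&gt;, so the peak is just the largest of those values. A minimal sketch (the helper name and parsing logic here are my own, not part of the Valgrind tooling):

```python
import re

def peak_heap_mb(massif_text):
    """Return the peak heap size (in MB) recorded in a massif.out file.

    Each snapshot carries a 'mem_heap_B=<bytes>' line; the peak is
    simply the largest of those values.
    """
    sizes = [int(m) for m in re.findall(r'^mem_heap_B=(\d+)', massif_text, re.M)]
    return max(sizes) / 1e6 if sizes else 0.0

# Tiny synthetic example with two snapshots:
sample = "snapshot=0\nmem_heap_B=1000000\nsnapshot=1\nmem_heap_B=87000000\n"
print(peak_heap_mb(sample))  # 87.0
```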
So I've run some tests to check the memory scalability of HDF5 (through PyTables).
import tables
import numpy as np

h5file = tables.openFile('test4.h5', mode='w', title="Test Array")

array_len = 10000000
arrays = np.arange(1)  # change the argument to write more arrays

for x in arrays:
    x_a = np.zeros(array_len, dtype=float)
    h5file.createArray(h5file.root, "test" + str(x), x_a)

h5file.close()
This is the memory used for one array
This is for two arrays
And this is for fifty
As soon as you enter the loop, memory usage plateaus in a really nice way:
- one array ~ 87 MB
- two arrays ~ 163 MB
- four arrays ~ 163 MB
- fifty arrays ~ 163 MB
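The plateau makes sense: each pass through the loop rebinds x_a, so the previous array becomes unreachable and is freed. At most two arrays are alive at once (the new one is allocated before the old binding is dropped), which would match roughly two arrays' worth of memory no matter how long the loop runs. A stdlib-only sketch of the same effect, using bytearray in place of NumPy arrays and tracemalloc in place of Massif:

```python
import tracemalloc

array_len = 1_000_000  # ~1 MB per buffer, small enough to run anywhere

tracemalloc.start()
for x in range(50):
    # Rebinding x_a frees the previous buffer, so at most two
    # buffers are ever alive at once (the new one is built first).
    x_a = bytearray(array_len)
peak = tracemalloc.get_traced_memory()[1]
tracemalloc.stop()

# Peak stays near two buffers' size regardless of iteration count.
print(peak < 3 * array_len)
```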
So the problem is not in PyTables; it lies somewhere else.