I'm trying to increase the cache size for my HDF5 files, but it doesn't seem to be working. This is what I have:
import h5py
with h5py.File("test.h5", 'w'
The h5py-cache project might be helpful, although I haven't used it:
import h5py_cache
# The mode must come before keyword arguments; a positional argument after one is a SyntaxError
with h5py_cache.File('test.h5', 'a', chunk_cache_mem_size=1024**3) as f:
    f.create_dataset(...)
As of h5py version 2.9.0, this behavior is available directly through the main h5py.File interface. There are three parameters that control the "raw data chunk cache" (rdcc_nbytes, rdcc_w0, and rdcc_nslots), which are documented in the h5py documentation. The OP was trying to adjust the rdcc_nbytes setting, which can now simply be done as
import h5py
with h5py.File("test.h5", "w", rdcc_nbytes=5242880) as fid:
    # Use fid for something here
    pass
The only difference is that you have to know how much space you actually need, rather than just multiplying the default by 5 as the OP wanted. The current default values are the same as the OP found. Of course, if you really wanted to do this programmatically, you could open the file once, read the current cache settings, close it, and then reopen it with the desired parameters.
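The open, read, close, reopen approach could look roughly like the sketch below. It assumes h5py 2.9.0 or newer, and the scratch file path is made up purely for illustration; the cache tuple is read with the same get_access_plist().get_cache() call used in the low-level answer.

```python
import os
import tempfile

import h5py  # assumes h5py >= 2.9.0 for the rdcc_* keywords

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "test.h5")

# First pass: create the file and read back the cache settings in effect.
with h5py.File(path, "w") as fid:
    # get_cache() returns (mdc_nelmts, rdcc_nslots, rdcc_nbytes, rdcc_w0)
    _, nslots, nbytes, w0 = fid.id.get_access_plist().get_cache()

# Second pass: reopen with a chunk cache five times larger,
# keeping the slot count and preemption policy unchanged.
with h5py.File(path, "a", rdcc_nbytes=5 * nbytes,
               rdcc_nslots=nslots, rdcc_w0=w0) as fid:
    new_settings = fid.id.get_access_plist().get_cache()
    print(new_settings)
```

Multiplying the default is only a starting point; a cache sized from the actual chunk size and the number of chunks you expect to touch is usually the better choice.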
If you are using h5py version 2.9.0 or newer, see Mike's answer.
According to the docs, get_access_plist() returns a copy of the file access property list. So it is not surprising that modifying the copy does not affect the original.
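A small check along these lines should make the copy behaviour visible. This is only a sketch: the scratch file name is hypothetical, and the variable names are mine rather than anything from the question.

```python
import os
import tempfile

import h5py

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "copy_demo.h5")

with h5py.File(path, "w") as f:
    plist = f.id.get_access_plist()   # a copy of the file's access property list
    before = plist.get_cache()

    settings = list(before)
    settings[2] *= 5                  # enlarge rdcc_nbytes on the copy only
    plist.set_cache(*settings)

    copy_now = plist.get_cache()                     # the copy reflects the change
    file_now = f.id.get_access_plist().get_cache()   # a fresh copy from the open file

    print("modified copy:", copy_now)
    print("open file:    ", file_now)
```

The second tuple should still match the values read before the modification, which is why the cache has to be configured on a property list that is passed in when the file is opened.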
It appears the high-level interface does not provide a way to change the cache settings.
Here is how you could do it using the low-level interface.
propfaid = h5py.h5p.create(h5py.h5p.FILE_ACCESS)
settings = list(propfaid.get_cache())
print(settings)
# [0, 521, 1048576, 0.75]
settings[2] *= 5
propfaid.set_cache(*settings)
settings = propfaid.get_cache()
print(settings)
# (0, 521, 5242880, 0.75)
The above creates a PropFAID. We can then open the file and get a FileID this way:
import contextlib
# On Python 3 the low-level API expects the filename as bytes, hence .encode()
with contextlib.closing(h5py.h5f.open(
        filename.encode(), flags=h5py.h5f.ACC_RDWR, fapl=propfaid)) as fid:
    # <h5py.h5f.FileID object at 0x9abc694>
    settings = list(fid.get_access_plist().get_cache())
    print(settings)
    # [0, 521, 5242880, 0.75]
And we can use the fid to open the file with the high-level interface by passing fid to h5py.File:
    # (inside the with block, while fid is still open)
    f = h5py.File(fid)
    print(f.id.get_access_plist().get_cache())
    # (0, 521, 5242880, 0.75)
Thus, you can still use the high-level interface, but it takes some fiddling to get there. On the other hand, if you distill it to just the essentials, perhaps it isn't so bad:
import h5py
import contextlib
filename = '/tmp/foo.hdf5'
propfaid = h5py.h5p.create(h5py.h5p.FILE_ACCESS)
settings = list(propfaid.get_cache())
settings[2] *= 5
propfaid.set_cache(*settings)
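# On Python 3 the low-level API expects the filename as bytes, hence .encode()
with contextlib.closing(h5py.h5f.open(filename.encode(), fapl=propfaid)) as fid:
    f = h5py.File(fid)
    # use f here, while fid is still open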