Which is better: multiple small HDF5 files or one huge file?
Question: I'm working with huge satellite data that I'm splitting into small tiles to feed a deep learning model. I'm using PyTorch, which means the data loader can work with multiple threads. [Settings: Python, Ubuntu 18.04] I can't find any answer as to which is best, in terms of data access and storage, between: (1) storing all the data in one huge HDF5 file (over 20 GB), or (2) splitting it into multiple (over 16,000) small HDF5 files (approx. 1.4 MB each). Is there any problem of multiple access of one file by