Does anyone out there have enough experience with NetCDF and HDF5 to give some pluses and minuses about them as a way of storing scientific data?
I've used HDF5 and would like
I know this is an older post, and the original poster has indicated they've moved on, but for anyone who ends up here: the netCDF-Java library (as of 4.3.13) has netCDF-4 write support via the netCDF C library. It's still in beta, but it does work, and feedback is certainly appreciated!
Please see the netCDF-Java reference docs for more details.
NetCDF, which translates HDF5 into its own data model, looks and works great... until you find out that NetCDF doesn't support unsigned values! See also my question on how to detect unsigned values in existing HDF5 files using NetCDF.
Update: Actually, it turns out that although NetCDF-3 doesn't support unsigned values, NetCDF-4 does, even though the NetCDF API in Java for determining signedness is a little convoluted.
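For what it's worth, a common workaround on the NetCDF-3 side is to store the bits in a signed type and flag the variable with an `_Unsigned = "true"` attribute, which readers can then use to reinterpret the values. Below is a minimal sketch in plain Python of that reinterpretation step only. The `attrs` dict and the `as_unsigned` helper are made up for illustration and are not part of any NetCDF API.

```python
import struct

def as_unsigned(values, fmt="h"):
    """Reinterpret signed integers as unsigned integers of the same width.

    fmt is a struct code for the signed type ("b", "h", "i" or "q");
    its upper-case form is the matching unsigned code.
    """
    count = len(values)
    packed = struct.pack(f"<{count}{fmt}", *values)
    return list(struct.unpack(f"<{count}{fmt.upper()}", packed))

# Hypothetical variable as it might come out of a NetCDF-3 file:
# 16-bit signed storage plus an attribute saying "treat me as unsigned".
stored = [0, 1, -1, -32768]
attrs = {"_Unsigned": "true"}

decoded = as_unsigned(stored) if attrs.get("_Unsigned") == "true" else stored
print(decoded)
```

The bit patterns are untouched; only their interpretation changes, so -1 in 16-bit storage reads back as 65535.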
I have to admit that using HDF5 is much easier in the long run. It's not hard to get simple data structures into NetCDF format, but manipulating them down the road is kind of a pain.
The "H" in HDF5 stands for "Hierarchical", which translated (for me anyway) into a REALLY easy way to manipulate data, by just moving nodes around and referencing nodes from other places.
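To make "moving nodes around and referencing nodes from other places" concrete, here is a small sketch using the h5py Python bindings (my choice for brevity, not something the answer above mentions). It assumes h5py is installed; the group and dataset names are invented, and the file lives only in memory.

```python
import h5py

# In-memory HDF5 file (core driver, nothing is written to disk).
# All group and dataset names below are hypothetical.
with h5py.File("demo.h5", "w", driver="core", backing_store=False) as f:
    # Intermediate groups "raw" and "raw/run1" are created automatically.
    f.create_dataset("raw/run1/energy", data=[1.0, 2.0, 3.0])
    f.create_group("processed")

    # Move a whole subtree to a new place in the hierarchy.
    f.move("raw/run1", "processed/run1")

    # Reference the same node from another place via a hard link.
    f["latest"] = f["processed/run1"]

    moved = "processed/run1/energy" in f
    still_in_old_place = "raw/run1" in f
    via_link = f["latest/energy"][:].tolist()

print(moved, still_in_old_place, via_link)
```

The move only rewrites links inside the file, so no data is copied, and `latest` is just a second name for the same group.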
Can I ask what kind of project this is? I use these both for a lot of HPC scientific modeling tasks. Can I assume you're doing the same? If so, the trend I'm seeing is people moving to HDF5, but that might be different in your particular domain.
However you end up going, best of luck!
NetCDF, starting with version 4.0 (2008), can read and write most HDF5 files, and provides access to the hierarchical features of HDF5 via the enhanced data model.
HDF5 is extremely feature-rich, and has some great performance features.
NetCDF has a simpler API, and a much wider tool base. There are many tools that handle netCDF data.
Try writing some small sample application in each, and compare the experience. If future scalability of your code to parallel execution (via MPI or the like) is important to you, I know that HDF has a parallel implementation, which people are constantly working to improve. I'm not sure about NetCDF.
Late edit: For NetCDF, there is now Parallel NetCDF from Argonne. It works quite well, and the development team is quite active in improving it further.
I strongly suggest you use HDF5 instead of NetCDF. NetCDF is flat, and it gets very messy after a while if you are not able to classify stuff. Of course, how to classify is also a matter of debate, but at least you have this flexibility.
We did a careful evaluation of HDF5 vs. NetCDF when I wrote Q5Cost, and HDF5 won hands down.