Inexplicable behavior when using vlen with h5py

I am using h5py to build a dataset. Since I want to store arrays with a varying number of rows, I use the h5py special_dtype vlen. However, I experience behavior I can't

1 answer

北荒

2020-12-07 04:35
I think
```
train_targets[0] = test
```
has stored your (11,5) array, read in F order, in a single row of train_targets. Given the (9549,5) shape, that row has 5 elements. And since the dtype is vlen, each element is a 1d array of length 11.

That's what you get back in train_targets[0] - an array of 5 arrays, each shape (11,), with values taken from test (order F).

So I think there are 2 issues - what a 2d shape means, and what vlen allows.

My version of h5py is pre v2.3, so I only get string vlen. But I suspect your problem may be that vlen only works with 1d arrays, an extension, so to speak, of byte strings.

Does the 5 in shape=(9549, 5,) have anything to do with 5 in the test.shape? I don't think it does, at least not as numpy and h5py see it.

When I make a file following the string vlen example:
```
>>> import h5py
>>> f = h5py.File('foo.hdf5', 'w')  # explicit write mode; newer h5py defaults to read-only
>>> dt = h5py.special_dtype(vlen=str)
>>> ds = f.create_dataset('VLDS', (100,100), dtype=dt)
```
and then do:
```
ds[0]='this one string'
```
and look at ds[0], I get an object array with 100 elements, each being this string. That is, I've set a whole row of ds.
```
ds[0,0]='another'
```
is the correct way to set just one element.
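
The difference between the two assignments can be checked with a small sketch. It assumes an in-memory file (the `core` driver with `backing_store=False`), an arbitrary file name and a smaller (3, 4) dataset than the one above. The `as_text` helper is hypothetical and only there because newer h5py versions return vlen strings as bytes.

```python
import h5py

def as_text(value):
    # Newer h5py returns vlen strings as bytes, older versions as str.
    return value.decode() if isinstance(value, bytes) else value

# In-memory file: nothing is written to disk; the name is arbitrary.
with h5py.File('row_vs_element.hdf5', 'w', driver='core', backing_store=False) as f:
    dt = h5py.special_dtype(vlen=str)
    ds = f.create_dataset('VLDS', (3, 4), dtype=dt)

    ds[0] = 'this one string'   # scalar is broadcast over the whole row
    ds[1, 0] = 'another'        # only a single element is set

    row0 = [as_text(v) for v in ds[0]]
    row1 = [as_text(v) for v in ds[1]]
```

Here `row0` should hold the same string four times, while only the first entry of `row1` is set.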

vlen is 'variable length', not 'variable shape'. While the https://www.hdfgroup.org/HDF5/doc/TechNotes/VLTypes.html documentation is not entirely clear on this, I think you can store 1d arrays with shape (11,) and (38,) with vlen, but not 2d ones.
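
If that reading is right, one possible workaround (an assumption on my part, not something taken from the h5py docs) is to give the dataset one vlen element per sample, flatten each 2d array before storing it, and keep the original shape in a companion dataset. The dataset names, the file name and the sample values below are all made up for illustration, and the file lives in memory only.

```python
import h5py
import numpy as np

# Made-up samples with a varying number of rows and 5 columns each.
samples = [np.arange(55, dtype='float64').reshape(11, 5),
           np.arange(190, dtype='float64').reshape(38, 5)]

# In-memory file: nothing is written to disk; the name is arbitrary.
with h5py.File('vlen_sketch.hdf5', 'w', driver='core', backing_store=False) as f:
    dt = h5py.special_dtype(vlen=np.dtype('float64'))

    # One vlen element per sample, so the dataset is 1d: (n,), not (n, 5).
    ds = f.create_dataset('targets', (len(samples),), dtype=dt)
    # Hypothetical companion dataset holding each sample's 2d shape.
    shapes = f.create_dataset('target_shapes', (len(samples), 2), dtype='int64')

    for i, arr in enumerate(samples):
        ds[i] = arr.ravel()      # flatten to 1d (C order) before storing
        shapes[i] = arr.shape

    # Read back: each element is a 1d array of a different length.
    lengths = [ds[i].shape for i in range(len(samples))]
    restored = [ds[i].reshape(tuple(shapes[i])) for i in range(len(samples))]
```

The stored elements have lengths 55 and 190, and reshaping with the saved shapes gives back the (11, 5) and (38, 5) arrays.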

Actually, train_targets output is reproduced with:
```
In [54]: test1=np.empty((5,),dtype=object)
In [55]: for i in range(5):
   ....:     test1[i]=test.T.flatten()[i:i+11]
```
It's 11 values taken from the transpose (F order), but shifted for each sub array.
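
A self-contained version of that reproduction, for anyone who wants to run it without the original data. The values of `test` are not shown in the question, so `np.arange(55).reshape(11, 5)` is just an assumed stand-in with the same (11,5) shape.

```python
import numpy as np

# Assumed stand-in for the question's `test` array (real values unknown).
test = np.arange(55).reshape(11, 5)

# Flattening the transpose in C order is the same as flattening test in F order.
flat_f = test.T.flatten()

test1 = np.empty((5,), dtype=object)
for i in range(5):
    # 11 consecutive values, with the window shifted by one for each sub array.
    test1[i] = flat_f[i:i + 11]
```

With this stand-in, `test1[0]` is exactly the first column of `test`, and each later element is the same window moved one step further along the F-ordered values.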