TypeError when appending fields to a structured array of size ONE

家住魔仙堡 submitted on 2019-12-10 18:28:45


I'm getting a run-time error when trying to append field(s) to a structured array of size ONE. I've written a simple example below:

import numpy as np
import numpy.lib.recfunctions as rcfuncs

dtype_ = np.dtype( { 'names': ["field_a","field_b","field_c"]
                  , 'formats': ['S32', 'i4', 'f8']} )
data_ = [("1",17, 123.45)]
numpy_array = np.array(data_, dtype_)            

# append 2 fields
numpy_array = rcfuncs.append_fields( numpy_array,["field_d","field_e"],data=[ "1","3" ] )

# append 1 field fails :(
numpy_array = rcfuncs.append_fields( numpy_array, "field_f", data=["123456"] )

I'm getting the error:

TypeError: descriptor 'ravel' requires a 'numpy.ndarray' object but received a 'numpy.void'

Likewise, if I swap the order of the appends, the two-field append is the one that fails:

# append 1 field
numpy_array = rcfuncs.append_fields( numpy_array, "field_f", data=["123456"] )

# append 2 fields fails :(
numpy_array = rcfuncs.append_fields( numpy_array,["field_d","field_e"],data=[ "1", "3" ] )

I am running Python 2.7.11 and numpy 1.11.0, and I do not have the issue when the initial array has a size greater than 2.

How can I solve this TypeError?



We do not get the TypeError when the optional parameter usemask is set to False:

numpy_array = \
  rcfuncs.append_fields(numpy_array, "field_f", data=["123456"], usemask=False)
numpy_array = \
  rcfuncs.append_fields(numpy_array,["field_d","field_e"],data=[ "1", "3" ], usemask=False)


For reference, the full traceback appears further down, after a look at the arrays involved.

Start with a structured array, with one record:

array([('1', 17, 123.45)], 
      dtype=[('field_a', 'S32'), ('field_b', '<i4'), ('field_c', '<f8')])

After the first append, we have a masked array, still with 1 record:

masked_array(data = [('1', 17, 123.45, '1', '3')],
             mask = [(False, False, False, False, False)],
       fill_value = ('N/A', 999999, 1e+20, 'N', 'N'),
            dtype = [('field_a', 'S32'), ('field_b', '<i4'), ('field_c', '<f8'), ('field_d', 'S1'), ('field_e', 'S1')])

The error looks like it has more to do with the masked array code than recfunctions - though I'll have to look at the code to see why it is using ravel.

Traceback (most recent call last):
  File "stack36440557.py", line 15, in <module>
    numpy_array2 = rcfuncs.append_fields( numpy_array1, "field_f", data=["123456"] ,usemask=False)
  File "/usr/local/lib/python2.7/site-packages/numpy/lib/recfunctions.py", line 633, in append_fields
    base = merge_arrays(base, usemask=usemask, fill_value=fill_value)
  File "/usr/local/lib/python2.7/site-packages/numpy/lib/recfunctions.py", line 389, in merge_arrays
    seqarrays = seqarrays.ravel()
  File "/usr/local/lib/python2.7/site-packages/numpy/ma/core.py", line 4022, in ravel
    r = ndarray.ravel(self._data).view(type(self))
TypeError: descriptor 'ravel' requires a 'numpy.ndarray' object but received a 'numpy.void'

So one fix is to turn off the use of masked arrays (usemask=False). Masking isn't needed unless the added fields are missing some data.
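As a self-contained sketch of this first fix, the snippet below rebuilds the one-record array from the question and performs both appends with usemask=False. The variable names base, step1 and step2 are only illustrative, and the new field data is passed as one length-1 sequence per field, which is an assumption about the intended input rather than the exact call from the question.

```python
import numpy as np
import numpy.lib.recfunctions as rcfuncs

dtype_ = np.dtype({"names": ["field_a", "field_b", "field_c"],
                   "formats": ["S32", "i4", "f8"]})
base = np.array([("1", 17, 123.45)], dtype=dtype_)  # a single record

# usemask=False makes append_fields return a plain ndarray, so the
# second call never receives a one-record masked array.
step1 = rcfuncs.append_fields(base, "field_f", data=["123456"], usemask=False)
step2 = rcfuncs.append_fields(step1, ["field_d", "field_e"],
                              data=[["1"], ["3"]], usemask=False)

print(type(step2).__name__, step2.dtype.names)
```

Because every intermediate result is a regular structured array, the order of the two appends should not matter with this approach.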

Another is to put the 1st argument in a list:

rcfuncs.append_fields( [numpy_array1], "field_f", data=['12345'])
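A runnable sketch of the list workaround, for anyone who wants to keep the masked result: the first append is left at its default (usemask=True), and the masked array it returns is wrapped in a list for the second append. The names base, masked and result are illustrative only.

```python
import numpy as np
import numpy.lib.recfunctions as rcfuncs

base = np.array([("1", 17, 123.45)],
                dtype=[("field_a", "S32"), ("field_b", "i4"), ("field_c", "f8")])

# The first append (default usemask=True) returns a one-record masked array.
masked = rcfuncs.append_fields(base, ["field_d", "field_e"],
                               data=[["1"], ["3"]])

# Wrapping the masked array in a list passes a length-one sequence,
# so the list is unwrapped instead of the array being indexed.
result = rcfuncs.append_fields([masked], "field_f", data=["123456"])

print(type(result).__name__, result.dtype.names)
```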

append_fields(base, ....) calls

merge_arrays(base, usemask=usemask, fill_value=fill_value)

which in turn calls

base.ravel()  # base is now called seqarrays

But first it checks whether the input has length one:

# Only one item in the input sequence ?
if (len(seqarrays) == 1):
    seqarrays = np.asanyarray(seqarrays[0])

For a simple structured array, y, and its masked equivalent, ym:

In [405]: y
Out[405]: 
array([(b'xxx', 1)], 
      dtype=[('f0', 'S5'), ('f1', '<i4')])
In [406]: ym=np.ma.masked_array(y)

This length 1 action produces another array for the regular structured array:

In [407]: np.asanyarray(y[0])
Out[407]: 
array((b'xxx', 1), 
      dtype=[('f0', 'S5'), ('f1', '<i4')])

but a masked void, mvoid (a single structured-array record/element), for the masked one:

In [408]: np.asanyarray(ym[0])
Out[408]: (b'xxx', 1)
In [409]: type(np.asanyarray(ym[0]))
Out[409]: numpy.ma.core.mvoid

np.asanyarray(ym[0]).ravel() produces this TypeError.

If the base is a list, [ym], this just extracts ym. If the base has shape (2,) or longer, it doesn't pass through this statement.
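The three cases can be compared side by side with a small stand-in for the length-one check quoted above. The function unwrap_single is hypothetical (it is not part of numpy) and copies only that two-line branch, not the rest of merge_arrays.

```python
import numpy as np

def unwrap_single(seqarrays):
    """Hypothetical stand-in for the length-one branch in merge_arrays."""
    if len(seqarrays) == 1:
        seqarrays = np.asanyarray(seqarrays[0])
    return seqarrays

dt = [("f0", "S5"), ("f1", "<i4")]
ym = np.ma.masked_array(np.array([(b"xxx", 1)], dtype=dt))
ym2 = np.ma.masked_array(np.array([(b"xxx", 1), (b"yyy", 2)], dtype=dt))

bare = unwrap_single(ym)       # one-record masked array passed directly
wrapped = unwrap_single([ym])  # the same array wrapped in a list
longer = unwrap_single(ym2)    # two records: the branch is skipped

for label, obj in [("bare", bare), ("wrapped", wrapped), ("longer", longer)]:
    print(label, type(obj).__name__, getattr(obj, "shape", None))
```

Only the bare one-record case ends up as a single record instead of an array, which is the object the later ravel call fails on.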

I haven't thought of a fix yet, other than the user-level kludge of passing the masked array in a list.

A possible fix is to simply remove the base = merge_arrays(base, ...) line in append_fields. But I need to know why it is there in the first place. The intent may be to clean up certain base array inputs.

The unit test file, test/test_recfunctions.py, runs fine with this line commented out.

I've added a comment on this to an old numpy issue


