Question
I'm trying to write a subclass of masked_array. What I've got so far is this:
    class gridded_array(ma.core.masked_array):
        def __init__(self, data, dimensions, mask=False, dtype=None,
                     copy=False, subok=True, ndmin=0, fill_value=None,
                     keep_mask=True, hard_mask=None, shrink=True):
            ma.core.masked_array.__init__(data, mask, dtype, copy, subok,
                                          ndmin, fill_value, keep_mask, hard_mask,
                                          shrink)
            self.dimensions = dimensions
However, when I now create a gridded_array, I don't get what I expect:
    dims = OrderedDict()
    dims['x'] = np.arange(4)
    gridded_array(np.random.randn(4), dims)

    masked_array(data = [-- -- -- --],
                 mask = [ True True True True],
                 fill_value = 1e+20)
I would expect an unmasked array. I suspect that the dimensions argument I'm passing gets passed on to the masked_array.__init__ call, but since I'm quite new to OOP, I don't know how to resolve this.
Any help is greatly appreciated.
PS: I'm on Python 2.7
Answer 1:
A word of warning: if you're brand new to OOP, subclassing ndarrays and MaskedArrays is not the easiest way to get started, by far...
Before anything else, you should go and check this tutorial. That should introduce you to the mechanisms involved in subclassing ndarrays.
MaskedArrays, like ndarrays, use the __new__ method for creating class instances, not __init__. By the time you get to the __init__ of your subclass, you already have a fully instantiated object, with the actual initialization delegated to the __array_finalize__ method. In simpler terms: your __init__ doesn't work as you would expect with a standard Python object. (Actually, I wonder whether it's called at all... After __array_finalize__, if I recall correctly...)
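To make the call order concrete, here is a minimal sketch using a plain ndarray subclass. The class name Traced and the calls list are hypothetical and exist only for illustration; the sketch just records which special methods run during construction and during slicing.

```python
import numpy as np

calls = []  # records the order in which the special methods run


class Traced(np.ndarray):
    """Hypothetical ndarray subclass used only to trace the call order."""

    def __new__(cls, data):
        calls.append("__new__")
        # Viewing a plain ndarray as cls triggers __array_finalize__.
        return np.asarray(data).view(cls)

    def __array_finalize__(self, obj):
        calls.append("__array_finalize__")

    def __init__(self, data):
        # Runs last, once the array object already exists.
        calls.append("__init__")


t = Traced([1, 2, 3])
construction_calls = list(calls)

del calls[:]
part = t[:2]  # slicing creates a new instance without __new__ or __init__
slicing_calls = list(calls)

print(construction_calls)
print(slicing_calls)
```

In this sketch __init__ does run, but only after __new__ and __array_finalize__, and it is skipped entirely when a new instance comes from slicing. That is why per-instance setup placed in __init__ is unreliable for array subclasses.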
Now that you've been warned, you may want to consider whether you really need to go through the hassle of subclassing an ndarray:
- What are your objectives with your gridded_array?
- Should you support all methods of ndarrays, or only some? All dtypes?
- What should happen when you take a single element or a slice of your object?
- Will you be using gridded_arrays extensively as inputs of NumPy functions?
If you are in doubt, it might be easier to design gridded_array as a generic class that takes an ndarray (or a MaskedArray) as an attribute (say, gridded_array._array), and add only the methods you need to operate on self._array.
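A minimal sketch of that composition approach follows. The class name GriddedArray and the particular methods exposed (indexing, mean, shape) are assumptions chosen for illustration; only the _array attribute name comes from the suggestion above.

```python
import numpy as np


class GriddedArray(object):
    """Hypothetical wrapper holding a masked array plus its dimensions."""

    def __init__(self, data, dimensions, mask=False):
        # A normal __init__ works here because this is a plain Python class.
        self._array = np.ma.MaskedArray(data, mask=mask)
        self.dimensions = dimensions

    def __getitem__(self, index):
        # Delegate indexing to the wrapped masked array.
        return self._array[index]

    def mean(self):
        # Expose only the operations that are actually needed.
        return self._array.mean()

    @property
    def shape(self):
        return self._array.shape


dims = {"x": np.arange(4)}
g = GriddedArray([1.0, 2.0, 3.0, 4.0], dims, mask=[False, False, False, True])
print(g.shape, g.mean())
```

The trade-off is that the wrapper is not itself an array, so NumPy functions have to be applied to g._array (or to methods added by hand) rather than to g directly.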
Suggestions
- If you only need to "tag" each item of your gridded_array, you may be interested in pandas.
- If you only have to deal with floats, MaskedArray might be a bit overkill: just use NaNs to represent invalid data; many NumPy functions have NaN-aware equivalents (np.nanmean, for example). At worst, you can always mask your gridded_array when needed: taking a view of a subclass of ndarray with .view(np.ma.MaskedArray) should return a masked version of your input...
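The NaN-based suggestion could look like the sketch below. The Tagged class is a hypothetical stand-in for an ndarray subclass; np.nanmean, np.ma.masked_invalid and .view are existing NumPy calls.

```python
import numpy as np


class Tagged(np.ndarray):
    """Hypothetical ndarray subclass standing in for gridded_array."""


values = np.array([1.0, np.nan, 3.0])

# NaN-aware reduction: the NaN entry is ignored.
nan_mean = np.nanmean(values)

# Mask the invalid entries explicitly when a MaskedArray is needed.
masked = np.ma.masked_invalid(values)
masked_mean = masked.mean()

# Viewing a subclass instance as a MaskedArray changes the type only.
viewed = values.view(Tagged).view(np.ma.MaskedArray)

print(nan_mean, masked_mean, type(viewed).__name__)
```

Note that the view on its own starts out with nothing masked; NaN entries still have to be masked explicitly, for instance with np.ma.masked_invalid as above.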
Answer 2:
The issue is that masked_array uses __new__ instead of __init__, so your dimensions argument, passed as the second positional argument, is being interpreted as the mask.
To override __new__, use:
    import numpy.ma as ma

    class gridded_array(ma.core.masked_array):
        def __new__(cls, data, dimensions, *args, **kwargs):
            # Let masked_array build the instance; dimensions is kept out of its arguments.
            self = super(gridded_array, cls).__new__(cls, data, *args, **kwargs)
            self.dimensions = dimensions
            return self
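As a quick check of this approach, the sketch below rebuilds the class with np.ma.MaskedArray as the base (the same class as ma.core.masked_array) and repeats the call from the question. The variable names g and partly are hypothetical.

```python
from collections import OrderedDict

import numpy as np


class gridded_array(np.ma.MaskedArray):
    def __new__(cls, data, dimensions, *args, **kwargs):
        # dimensions is consumed here and never reaches MaskedArray.__new__,
        # so it can no longer be mistaken for the mask argument.
        self = super(gridded_array, cls).__new__(cls, data, *args, **kwargs)
        self.dimensions = dimensions
        return self


dims = OrderedDict()
dims["x"] = np.arange(4)

# Same call as in the question: nothing should be masked now.
g = gridded_array(np.random.randn(4), dims)

# A mask can still be supplied, here as a keyword argument.
partly = gridded_array(np.arange(4.0), dims, mask=[True, False, False, False])

print(g.count(), partly.count())
```

This sketch only attaches dimensions to the object returned by the constructor. Whether the attribute carries over to slices and other derived arrays depends on how __array_finalize__ is handled, which is not addressed here.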
Source: https://stackoverflow.com/questions/12597827/how-to-subclass-numpy-ma-core-masked-array