How to keep column names when converting from pandas to numpy

挽巷 2021-02-20 08:03

According to this post, I should be able to access the names of columns in an ndarray as a.dtype.names
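For reference, here is a minimal sketch of what that looks like, assuming a hand-built structured array. The field names are only illustrative and are not taken from the linked post:

```python
import numpy as np

# A structured array: each element is a record with named fields.
a = np.array([(40.0, 140.0), (50.0, 150.0)],
             dtype=[("age", "f8"), ("sys_blood_pressure", "f8")])

print(a.dtype.names)  # ('age', 'sys_blood_pressure')

# A plain (unstructured) float array carries no field names.
b = np.zeros((2, 2))
print(b.dtype.names)  # None
```

So dtype.names is only populated when the array has a structured dtype; an ordinary 2-D float array has nothing to report.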

However, if I convert a pandas DataFrame to an ndarray with df.a

4 answers
  •  我在风中等你
    2021-02-20 08:16

    A pandas DataFrame also has a handy to_records method. Demo:

    import pandas as pd

    X = pd.DataFrame(dict(age=[40., 50., 60.],
                          sys_blood_pressure=[140., 150., 160.]))
    # index=False leaves the DataFrame index out of the resulting records
    m = X.to_records(index=False)
    print(repr(m))
    

    Returns:

    rec.array([(40.0, 140.0), (50.0, 150.0), (60.0, 160.0)], 
              dtype=[('age', '<f8'), ('sys_blood_pressure', '<f8')])

    This is a "record array", which is an ndarray subclass that allows field access using attributes, e.g. m.age in addition to m['age'].
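As a small sketch of that field access, reusing the same example frame as above (nothing here goes beyond to_records and standard record-array indexing):

```python
import numpy as np
import pandas as pd

X = pd.DataFrame(dict(age=[40., 50., 60.],
                      sys_blood_pressure=[140., 150., 160.]))
m = X.to_records(index=False)

# The column names survive as the field names of the record dtype.
print(m.dtype.names)  # ('age', 'sys_blood_pressure')

# Attribute access and key access return the same column data.
ages_by_attr = m.age
ages_by_key = m["age"]
print(np.array_equal(ages_by_attr, ages_by_key))  # True
```

Key access is the safer choice when a column name is not a valid Python identifier or clashes with an existing array attribute.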

    You can pass this to a Cython function as a regular float array by constructing a view:

    m_float = m.view(float).reshape(m.shape + (-1,))
    print(repr(m_float))
    

    Which gives:

    rec.array([[  40.,  140.],
               [  50.,  150.],
               [  60.,  160.]], 
              dtype=float64)
    

    Note that for this to work, every column of the original DataFrame must have a float dtype. To make sure, use m = X.astype(float, copy=False).to_records(index=False).
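To illustrate why the cast matters, here is a hedged sketch with a hypothetical frame whose first column is integer-typed. It casts to float before calling to_records, then uses the saved field names to find a column in the plain float view. The copy argument is left out here, since it only affects whether data is copied:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: one integer column and one float column.
X = pd.DataFrame({"age": [40, 50, 60],
                  "sys_blood_pressure": [140.0, 150.0, 160.0]})

# Cast every column to float64 first so all fields have the same type.
m = X.astype(float).to_records(index=False)
names = m.dtype.names  # keep the column names for later lookups

# Reinterpret the records as a 2-D float array, one row per record.
m_float = m.view(float).reshape(m.shape + (-1,))

# Look up a column of the float view by its original name.
age_column = m_float[:, names.index("age")]
print(m_float.shape)  # (3, 2)
```

The view itself no longer carries names, so keeping the names tuple alongside it is one simple way to map positions back to columns.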
