Flatten numpy array but also keep index of value positions?

前端未结


I have several 2D numpy arrays (matrices), and for each one I would like to convert it to a vector containing the values of the array plus vectors containing the row and column index of each value.


9 answers

囚心锁ツ

2021-02-04 14:11

You can try this using itertools

import itertools
import numpy as np
import pandas as pd

def convert2dataframe(array):
    a, b = array.shape
    x, y = zip(*list(itertools.product(range(a), range(b))))
    df = pd.DataFrame(data={'V':array.ravel(), 'x':x, 'y':y})
    return df

This works for any 2D array, not only square matrices.
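As a quick check on a non-square input, here is a minimal sketch. The 2x3 array `demo` is made up for illustration, and the function is repeated (with the index tuples converted to lists) so the snippet runs on its own.

```python
import itertools
import numpy as np
import pandas as pd

def convert2dataframe(array):
    # Same logic as the function above, repeated so this snippet is self-contained.
    a, b = array.shape
    # product() yields (row, col) pairs in row-major order, matching ravel().
    x, y = zip(*itertools.product(range(a), range(b)))
    return pd.DataFrame(data={'V': array.ravel(), 'x': list(x), 'y': list(y)})

# Hypothetical 2x3 input, chosen only to show a non-square case.
demo = np.array([[3, 1, 4],
                 [1, 5, 9]])
df = convert2dataframe(demo)
print(df)
```

Each row of `df` holds one value together with its row index `x` and column index `y`.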


悲&欢浪女

2021-02-04 14:11

Like @miguel-capllonch I would suggest using np.ndindex which allows you to create the desired output like this:

np.array([(v, *i) for (i, v) in zip(np.ndindex(x.shape), x.ravel())])

which results in an array that looks like this:

array([[ 3.  0.  0.]
       [ 1.  0.  1.]
       [ 4.  0.  2.]
       [ 1.  1.  0.]
       [ 5.  1.  1.]
       [ 9.  1.  2.]
       [ 2.  2.  0.]
       [ 6.  2.  1.]
       [ 5.  2.  2.]])

Alternatively, using only numpy commands

np.hstack((x.reshape((-1, 1)), list(np.ndindex(x.shape))))
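For reference, here is a self-contained sketch of both np.ndindex variants with the value column placed first. The 3x3 array `x` is assumed from the sample output above, and the variable names are illustrative only. With integer input the resulting table is an integer array.

```python
import numpy as np

# Assumed input, reconstructed from the sample output shown above.
x = np.array([[3, 1, 4],
              [1, 5, 9],
              [2, 6, 5]])

# One row per element: the value first, then its index along each axis.
via_comprehension = np.array(
    [(v, *i) for i, v in zip(np.ndindex(x.shape), x.ravel())]
)

# Same table built with hstack: a column of values next to the index pairs.
via_hstack = np.hstack((x.reshape(-1, 1), list(np.ndindex(x.shape))))

print(via_hstack)
```

Both variants rely on np.ndindex iterating in the same row-major order that ravel() uses by default.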


无人及你

2021-02-04 14:16

You could also let pandas do the work for you since you'll be using it in a dataframe:

x = np.array([[3, 1, 4],
              [1, 5, 9],
              [2, 6, 5]])
df = pd.DataFrame(x)
# unstack() moves the column labels into the index, so level_0 is the
# column label and level_1 is the row label; reset_index() then turns
# both index levels into ordinary columns.
df = df.unstack().reset_index()
df

   level_0  level_1  0
0        0        0  3
1        0        1  1
2        0        2  2
3        1        0  1
4        1        1  5
5        1        2  6
6        2        0  4
7        2        1  9
8        2        2  5

# name the columns and switch the column order
# (here 'x' holds the column label and 'y' the row label of the original array)
df.columns = ['x', 'y', 'V']
cols = df.columns.tolist()
cols = cols[-1:] + cols[:-1]
df = df[cols]
df

   V  x  y
0  3  0  0
1  1  0  1
2  2  0  2
3  1  1  0
4  5  1  1
5  6  1  2
6  4  2  0
7  9  2  1
8  5  2  2
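Note that in the output above the values are listed column by column, so `x` is the column label and `y` is the row label of the original array. If row-major order is wanted instead (with `x` as the row index, as in the other answers), one option is `stack()` rather than `unstack()`. This is a sketch under that assumption, not part of the original answer:

```python
import numpy as np
import pandas as pd

x = np.array([[3, 1, 4],
              [1, 5, 9],
              [2, 6, 5]])

# stack() moves the column labels into the inner index level, so after
# reset_index() the first column is the row label and the second is the
# column label of each value.
df = pd.DataFrame(x).stack().reset_index()
df.columns = ['x', 'y', 'V']

# Put the value column first.
df = df[['V', 'x', 'y']]
print(df)
```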


无人共我

2021-02-04 14:25
The class np.ndindex is meant for exactly this and easily does the trick. Its efficiency is similar to the np.meshgrid method above, but it requires less code:
```
indices = np.array(list(np.ndindex(x.shape)))
```
For the dataframe, do:
```
df = pd.DataFrame({'V': x.flatten(), 'x': indices[:, 0], 'y': indices[:, 1]})
```
If you don't need the dataframe, just do list(np.ndindex(x.shape)).

Note: don't get confused between x (the array at hand), and 'x' (the name of the second column).

I know this question was posted a very long time ago, but just in case it's useful to anyone, as I didn't see np.ndindex being mentioned.
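Since np.ndindex accepts a shape of any length, the same idea extends beyond 2D. A minimal sketch follows; the helper name `flatten_with_indices` and the column names `i0`, `i1`, ... are hypothetical, chosen only for this example.

```python
import numpy as np
import pandas as pd

def flatten_with_indices(arr):
    """Return one row per element: the value plus one index column per axis.

    Hypothetical helper for illustration; assumes arr has at least one dimension.
    """
    indices = np.array(list(np.ndindex(arr.shape)))
    data = {'V': arr.flatten()}
    for axis in range(arr.ndim):
        data[f'i{axis}'] = indices[:, axis]
    return pd.DataFrame(data)

# Made-up 2x2x2 input to show the 3D case.
cube = np.arange(8).reshape(2, 2, 2)
df = flatten_with_indices(cube)
print(df)
```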

旧时难觅i

2021-02-04 14:25

I am resurrecting this because I think I know a different answer that is way easier to understand. Here is how I do it:

xn = np.zeros((np.size(x), np.ndim(x)+1), dtype=np.float32)
row = 0
for ind, data in np.ndenumerate(x):
    xn[row, 0] = data
    xn[row, 1:] = np.asarray(ind)
    row += 1

In xn we have

[[ 3.  0.  0.]
 [ 1.  0.  1.]
 [ 4.  0.  2.]
 [ 1.  1.  0.]
 [ 5.  1.  1.]
 [ 9.  1.  2.]
 [ 2.  2.  0.]
 [ 6.  2.  1.]
 [ 5.  2.  2.]]
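The same loop can be written without the manual row counter by wrapping np.ndenumerate in enumerate. This is a sketch assuming the 3x3 sample array `x` used elsewhere in the thread. Because values and indices share one float32 array, integers larger than 2**24 would not be stored exactly, so a wider dtype may be preferable for large arrays.

```python
import numpy as np

# Assumed sample input, matching the output shown above.
x = np.array([[3, 1, 4],
              [1, 5, 9],
              [2, 6, 5]])

# One row per element: column 0 is the value, the remaining columns are the index.
xn = np.zeros((x.size, x.ndim + 1), dtype=np.float32)
for row, (ind, data) in enumerate(np.ndenumerate(x)):
    xn[row, 0] = data
    xn[row, 1:] = ind

print(xn)
```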


说谎

2021-02-04 14:30

Another way:

arr = np.array([[3, 1, 4],
                [1, 5, 9],
                [2, 6, 5]])

# build out rows array
x = np.arange(arr.shape[0]).reshape(arr.shape[0], 1).repeat(arr.shape[1], axis=1)
# build out columns array (reshape with shape[1], so non-square arrays work too)
y = np.arange(arr.shape[1]).reshape(1, arr.shape[1]).repeat(arr.shape[0], axis=0)

# combine into table
table = np.vstack((arr.reshape(arr.size),x.reshape(arr.size),y.reshape(arr.size))).T
print(table)
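The row and column grids built by hand above can also be obtained from np.indices, which returns one index array per axis. A minimal sketch, using a made-up 2x4 array to show that the approach does not depend on the input being square:

```python
import numpy as np

# Hypothetical non-square input for illustration.
arr = np.array([[3, 1, 4, 1],
                [5, 9, 2, 6]])

# np.indices returns an array of shape (ndim, *arr.shape);
# unpacking it gives the row grid and the column grid.
rows, cols = np.indices(arr.shape)

# combine into a table with one (value, row, column) triple per element
table = np.column_stack((arr.ravel(), rows.ravel(), cols.ravel()))
print(table)
```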
