Flatten numpy array but also keep index of value positions?

北荒 2021-02-04 13:58

I have several 2D numpy arrays (matrices), and for each one I would like to convert it to a vector containing the values of the array and a vector containing the row/column index of each value.

9 answers
  • 2021-02-04 14:31

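    I don't know if it's the most efficient approach, but numpy.meshgrid is designed for this:

    import numpy as np

    x = np.array([[3, 1, 4],
                  [1, 5, 9],
                  [2, 6, 5]])
    # XX holds the column index and YY the row index of every element
    XX, YY = np.meshgrid(np.arange(x.shape[1]), np.arange(x.shape[0]))
    # One row per element: value, column index, row index
    table = np.vstack((x.ravel(), XX.ravel(), YY.ravel())).T
    print(table)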
    

    This produces:

    [[3 0 0]
     [1 1 0]
     [4 2 0]
     [1 0 1]
     [5 1 1]
     [9 2 1]
     [2 0 2]
     [6 1 2]
     [5 2 2]]
    

    Then I think df = pandas.DataFrame(table) will give you your desired data frame.
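
    A minimal sketch of that last step, assuming the same x as above. The column names "value", "col" and "row" are illustrative choices, not part of the original answer:

```python
import numpy as np
import pandas as pd

x = np.array([[3, 1, 4],
              [1, 5, 9],
              [2, 6, 5]])

# Default indexing="xy": XX holds column indices, YY holds row indices.
XX, YY = np.meshgrid(np.arange(x.shape[1]), np.arange(x.shape[0]))
table = np.vstack((x.ravel(), XX.ravel(), YY.ravel())).T

# Column names are assumptions chosen for readability.
df = pd.DataFrame(table, columns=["value", "col", "row"])
print(df.head(3))
```

    Naming the columns up front avoids the default 0, 1, 2 labels, which make it easy to mix up the row and column indices later.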

  • 2021-02-04 14:36

    You can simply use loops.

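    import numpy as np

    x = np.array([[3, 1, 4],
                  [1, 5, 9],
                  [2, 6, 5]])
    values = []
    coordinates = []
    data_frame = []
    for v in range(x.shape[0]):        # v: row index
        for h in range(x.shape[1]):    # h: column index
            values.append(x[v, h])
            coordinates.append((h, v))
            # list.append takes exactly one argument, so store a tuple
            data_frame.append((x[v, h], h, v))
            # Printed as value | column | row, matching the tuple order
            print('%s | %s | %s' % (x[v, h], h, v))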
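
    If you want to keep the loop style but avoid managing the indices by hand, np.ndenumerate may be a tidier option. A small sketch, assuming the same x; the name records is only illustrative:

```python
import numpy as np

x = np.array([[3, 1, 4],
              [1, 5, 9],
              [2, 6, 5]])

# np.ndenumerate yields ((row, col), value) pairs in row-major order.
records = [(int(value), col, row) for (row, col), value in np.ndenumerate(x)]

# Printed as value | column | row.
for value, col, row in records:
    print('%s | %s | %s' % (value, col, row))
```

    This is still a Python-level loop, so for large arrays the vectorized approaches in the other answers are likely to be faster.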
    
  • 2021-02-04 14:36

    Update November 2020 (tested on pandas v1.1.3 and numpy v1.19):

    This is straightforward with np.meshgrid and .reshape(-1).

    import numpy as np
    import pandas as pd

    x = np.array([[3, 1, 4],
                  [1, 5, 9]])

    # Columns are passed first, so x_coor holds column indices and y_coor row indices
    x_coor, y_coor = np.meshgrid(range(x.shape[1]), range(x.shape[0]))
    df = pd.DataFrame({"V": x.reshape(-1), "x": x_coor.reshape(-1), "y": y_coor.reshape(-1)})
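
    The argument order matters here because np.meshgrid defaults to indexing="xy", where the first output varies along the columns. A quick check of that behaviour against indexing="ij", assuming the same x (rows and cols are illustrative names):

```python
import numpy as np

x = np.array([[3, 1, 4],
              [1, 5, 9]])

# Default indexing="xy": pass the column range first, then the row range.
x_coor, y_coor = np.meshgrid(range(x.shape[1]), range(x.shape[0]))

# indexing="ij": outputs follow the array's own axis order (rows, then columns).
rows, cols = np.meshgrid(range(x.shape[0]), range(x.shape[1]), indexing="ij")

print(x_coor.shape == x.shape)         # both grids have the shape of x
print(np.array_equal(x_coor, cols))    # column indices agree
print(np.array_equal(y_coor, rows))    # row indices agree
```

    Either convention works as long as the grids end up with the same shape as x before they are flattened.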
    

    For 2-dimensional cases, you don't even need a meshgrid. Just np.tile the column index range once per row, and np.repeat each row index once per column.

    df = pd.DataFrame({
        "V": x.reshape(-1),
        "x": np.tile(np.arange(x.shape[1]), x.shape[0]),
        "y": np.repeat(np.arange(x.shape[0]), x.shape[1])
    })
    

    The sample data is trimmed to shape=(2, 3) to make it clearer which axis is which.

    Result

    print(df)
    
       V  x  y
    0  3  0  0
    1  1  1  0
    2  4  2  0
    3  1  0  1
    4  5  1  1
    5  9  2  1
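
    For arrays with more than two dimensions, one possible generalization is np.indices, which returns one index grid per axis. The helper name flatten_with_indices and the axis0, axis1, ... column names below are hypothetical, chosen only for this sketch:

```python
import numpy as np
import pandas as pd

def flatten_with_indices(a):
    """Return one row per element: its value plus its index along each axis."""
    a = np.asarray(a)
    # np.indices gives an array of shape (a.ndim, *a.shape); flatten each grid.
    idx = np.indices(a.shape).reshape(a.ndim, -1)
    data = {"V": a.reshape(-1)}
    for axis in range(a.ndim):
        data["axis%d" % axis] = idx[axis]
    return pd.DataFrame(data)

x = np.array([[3, 1, 4],
              [1, 5, 9]])
df_nd = flatten_with_indices(x)
print(df_nd)
```

    In the 2-D case axis0 is the row index (y above) and axis1 is the column index (x above).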
    