Putting incomplete nested lists in a rectangular ndarray

Submitted by 大憨熊 on 2019-12-24 00:27:30

Question


In Python (also using numpy) I have a list of lists of lists, where the lists at each level have different lengths.

[
    [
         ["header1","header2"],
         ["---"],
         [],
         ["item1","value1"]
    ],

    [
         ["header1","header2","header3"],
         ["item2","value2"],
         ["item3","value3","value4","value5"]
    ]
]

I want to make this data structure rectangular: i.e. guarantee that len(list[x]) is constant for all x, len(list[x][y]) is constant for all x,y, etc.

(This is because I want to import the data structure into numpy)

I can think of various unpythonic ways of doing such a thing (iterate over the structure, record the maximum length at each level, then make a second pass and pad the values with None), but there must be a better way.

(I would also like the solution not to depend on the dimensionality of the structure; i.e. it should work on lists of such structures, too...)

Is there a simple way of doing this that I'm missing?
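
For reference, the two-pass approach described above might look roughly like the sketch below. The helper names max_shape and pad are hypothetical (they are not numpy functions), and the sketch assumes every leaf value sits at the same nesting depth.

```python
import numpy as np

def max_shape(nested):
    """First pass: the longest length found at each nesting level."""
    if not isinstance(nested, list):
        return ()  # a leaf (e.g. a string) contributes no further dimensions
    shape = ()
    for child in nested:
        child_shape = max_shape(child)
        # Element-wise maximum, treating missing levels as length 0.
        width = max(len(shape), len(child_shape))
        a = shape + (0,) * (width - len(shape))
        b = child_shape + (0,) * (width - len(child_shape))
        shape = tuple(max(x, y) for x, y in zip(a, b))
    return (len(nested),) + shape

def pad(nested, shape, fill=None):
    """Second pass: pad every level out to `shape` with `fill`."""
    if not shape:
        return nested  # reached a leaf
    # Missing entries are leaves at the last level, empty lists above it.
    missing = fill if len(shape) == 1 else []
    items = list(nested) + [missing] * (shape[0] - len(nested))
    return [pad(item, shape[1:], fill) for item in items]

data = [
    [["header1", "header2"], ["---"], [], ["item1", "value1"]],
    [["header1", "header2", "header3"],
     ["item2", "value2"],
     ["item3", "value3", "value4", "value5"]],
]

shape = max_shape(data)
padded = pad(data, shape)
array = np.array(padded, dtype=object)  # object dtype so None can be stored
```

Because both helpers recurse on the nesting level, the same code should work unchanged on a list of such structures.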


Answer 1:


You can create an ndarray with the desired dimensions and copy your list into it element by element. Since your list is incomplete, some indices will not exist, so you must catch the resulting IndexError in a try/except block.

Using numpy.ndenumerate allows the solution to be easily extensible to more dimensions (adding more indexes i,j,k,l,m,n,... in the for loop below):

import numpy as np
test = [ [ ["header1","header2"],
           ["---"],
           [],
           ["item1","value1"] ],
         [ ["header1","header2","header3"],
           ["item2","value2"],
           ["item3","value3","value4","value5"] ] ]


# Unicode strings of up to 20 characters; the shape holds the longest
# length found at each nesting level.
collector = np.empty((2,4,4),dtype='U20')

for (i,j,k), v in np.ndenumerate( collector ):
    try:
        collector[i,j,k] = test[i][j][k]
    except IndexError:
        # No such element in the nested list: pad with an empty string.
        collector[i,j,k] = ''


print(repr(collector))
#array([[['header1', 'header2', '', ''],
#        ['---', '', '', ''],
#        ['', '', '', ''],
#        ['item1', 'value1', '', '']],
#
#       [['header1', 'header2', 'header3', ''],
#        ['item2', 'value2', '', ''],
#        ['item3', 'value3', 'value4', 'value5'],
#        ['', '', '', '']]], dtype='<U20')
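
If you would rather not write out one index variable per dimension, the same try/except idea can be sketched with numpy.ndindex, which yields index tuples of any length. The function name fill_rectangular and its parameters are hypothetical, and the sketch assumes the target shape is supplied by the caller and has exactly as many dimensions as the list has nesting levels.

```python
import numpy as np

def fill_rectangular(nested, shape, fill="", dtype="U20"):
    """Copy a ragged nested list into an array of the given shape."""
    out = np.full(shape, fill, dtype=dtype)
    for index in np.ndindex(*shape):
        item = nested
        try:
            # Walk down one nesting level per index component.
            for i in index:
                item = item[i]
        except IndexError:
            continue  # no such element: keep the fill value
        out[index] = item
    return out

data = [
    [["header1", "header2"], ["---"], [], ["item1", "value1"]],
    [["header1", "header2", "header3"],
     ["item2", "value2"],
     ["item3", "value3", "value4", "value5"]],
]

result = fill_rectangular(data, (2, 4, 4))
```

The loop body never names individual indices, so a deeper structure only needs a longer shape tuple.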


Source: https://stackoverflow.com/questions/16689048/putting-incomplete-nested-lists-in-a-rectangular-ndarray
