Question
In Python (with NumPy), I have a list of lists of lists, where the inner lists have different lengths:
[
    [
        ["header1","header2"],
        ["---"],
        [],
        ["item1","value1"]
    ],
    [
        ["header1","header2","header3"],
        ["item2","value2"],
        ["item3","value3","value4","value5"]
    ]
]
I want to make this data structure rectangular, i.e. guarantee that len(list[x]) is constant for all x, len(list[x][y]) is constant for all x, y, and so on.
(This is because I want to import the data structure into numpy)
I can think of various unpythonic ways of doing this (iterate over the structure, record the maximum length at each level, then make a second pass and pad the values with None), but there must be a better way.
(I would also like the solution not to depend on the dimensionality of the structure, i.e. it should work on lists of such structures, too...)
Is there a simple way of doing this that I'm missing?
Answer 1:
You can create an ndarray with the desired dimensions and then read your list into it. Since your list is incomplete, some indices will not exist, so you must catch the IndexError, which can be done in a try / except block.
Using numpy.ndenumerate makes the solution easy to extend to more dimensions (by adding more indices i, j, k, l, m, n, ... in the for loop below):
import numpy as np

test = [ [ ["header1","header2"],
           ["---"],
           [],
           ["item1","value1"] ],
         [ ["header1","header2","header3"],
           ["item2","value2"],
           ["item3","value3","value4","value5"] ] ]

# Target shape: 2 blocks, up to 4 rows per block, up to 4 items per row
collector = np.empty((2,4,4),dtype='|S20')
for (i,j,k), v in np.ndenumerate( collector ):
    try:
        collector[i,j,k] = test[i][j][k]
    except IndexError:
        # Entries missing from the nested list are padded with an empty string
        collector[i,j,k] = ''
print(repr(collector))
# Output under Python 2 (Python 3 shows the values as bytes, e.g. b'header1'):
#array([[['header1', 'header2', '', ''],
#        ['---', '', '', ''],
#        ['', '', '', ''],
#        ['item1', 'value1', '', '']],
#       [['header1', 'header2', 'header3', ''],
#        ['item2', 'value2', '', ''],
#        ['item3', 'value3', 'value4', 'value5'],
#        ['', '', '', '']]], dtype='|S20')
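The example above hard-codes the shape (2,4,4) and the three indices i, j, k. Since the question asks for something that does not depend on the dimensionality, here is a minimal sketch of one way to generalise the same idea: a first pass finds the maximum length at each nesting level, then np.ndindex walks every position and the IndexError is caught as before. The helper names max_shape and to_rectangular, the fill value and the 'U20' dtype are assumptions made for this illustration (they are not part of the original answer), and the sketch assumes Python 3 and that all leaf values sit at the same nesting depth. Strings longer than 20 characters would be truncated by this dtype.

```python
from itertools import zip_longest

import numpy as np


def max_shape(nested):
    """First pass: return the largest length found at each nesting level."""
    if not isinstance(nested, list):
        return ()  # a leaf value (e.g. a string) adds no dimension
    shape = ()
    for child in nested:
        # Element-wise maximum of the shapes seen so far; shorter tuples
        # are treated as if padded with zeros.
        shape = tuple(
            max(a, b)
            for a, b in zip_longest(shape, max_shape(child), fillvalue=0)
        )
    return (len(nested),) + shape


def to_rectangular(nested, fill='', dtype='U20'):
    """Second pass: copy the nested list into a padded ndarray."""
    shape = max_shape(nested)
    out = np.full(shape, fill, dtype=dtype)
    for index in np.ndindex(*shape):
        item = nested
        try:
            for i in index:
                item = item[i]  # descend one level per index component
        except IndexError:
            continue  # position missing in the input: keep the fill value
        out[index] = item
    return out


data = [[["header1", "header2"],
         ["---"],
         [],
         ["item1", "value1"]],
        [["header1", "header2", "header3"],
         ["item2", "value2"],
         ["item3", "value3", "value4", "value5"]]]

rect = to_rectangular(data)
print(rect.shape)
```

For the data from the question this yields an array of shape (2, 4, 4), with the missing positions holding the fill value. The same function can be applied to a list of such structures without adding any indices by hand.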
Source: https://stackoverflow.com/questions/16689048/putting-incomplete-nested-lists-in-a-rectangular-ndarray