Question
Given some arbitrary dictionary:
mydict = {
    'first': {
        'second': {
            'third': {
                'fourth': 'the end'
            }
        }
    }
}
I've written a small routine to flatten it in the process of writing an answer to another question.
def recursive_flatten(mydict):
    d = {}
    for k, v in mydict.items():
        if isinstance(v, dict):
            # flatten the nested dict first, then prefix its keys with ours
            for k2, v2 in recursive_flatten(v).items():
                d[k + '.' + k2] = v2
        else:
            d[k] = v
    return d
It works, giving me what I want:
new_dict = recursive_flatten(mydict)
print(new_dict)
{'first.second.third.fourth': 'the end'}
It should also work for just about any arbitrarily structured dictionary. Unfortunately, it does not:
mydict['new_key'] = mydict
Now recursive_flatten(mydict) will run until I run out of stack space. I'm trying to figure out how to gracefully handle self-references (basically, ignore or remove them). To complicate matters, a self-reference may occur in any sub-dictionary, not just at the top level. How would I handle self-references elegantly? I can think of a mutable default argument, but there should be a better way... right?
Pointers appreciated, thanks for reading. I welcome any other suggestions/improvements to recursive_flatten if you have them.
Answer 1:
One way to do it is with set and id. Note that this solution also uses generators, which means we can start consuming the flattened result before the entire result is computed:
def recursive_flatten(mydict):
    def loop(seen, path, value):
        # if this value is a dict...
        if isinstance(value, dict):
            # if we've seen this value, skip it
            if id(value) in seen:
                return
            # if we haven't seen this value, now we have
            seen.add(id(value))
            for k, v in value.items():
                yield from loop(seen, path + [k], v)
        else:
            # base case: leaves are not tracked in seen, because equal
            # leaf values (small ints, string literals) often share one id
            yield (".".join(path), value)
    # init the loop
    yield from loop(set(), [], mydict)
Program demo
mydict = {
    'first': {
        'second': {
            'third': {
                'fourth': 'the end'
            }
        }
    }
}
for (k, v) in recursive_flatten(mydict):
    print(k, v)
# first.second.third.fourth the end
mydict['new_key'] = mydict
for (k, v) in recursive_flatten(mydict):
    print(k, v)
# first.second.third.fourth the end
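Since the function is a generator, its pairs can be collected into a dict (matching what the question's function returned) or consumed lazily. The sketch below restates a compact variant of the generator so that it runs on its own; the sample dictionary is made up for illustration.

```python
from itertools import islice

def recursive_flatten(mydict):
    # Compact variant of the generator above, repeated so this snippet is self-contained.
    def loop(seen, path, value):
        if isinstance(value, dict):
            if id(value) in seen:
                return
            seen.add(id(value))
            for k, v in value.items():
                yield from loop(seen, path + [k], v)
        else:
            yield ".".join(path), value
    yield from loop(set(), [], mydict)

# Illustrative data, not from the original post.
mydict = {'a': {'b': 1, 'c': {'d': 2}}, 'e': 3}
mydict['a']['loop'] = mydict  # self-reference below the top level

# Collect everything into a plain dict.
flat = dict(recursive_flatten(mydict))
print(flat)  # {'a.b': 1, 'a.c.d': 2, 'e': 3}

# Or take only the first two pairs without walking the rest.
first_two = list(islice(recursive_flatten(mydict), 2))
print(first_two)  # [('a.b', 1), ('a.c.d', 2)]
```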
We can make a slight modification if you would like to see output for self-referential values:
# if we've seen this value, skip it
if id(value) in seen:
    # this is the new line
    yield (".".join(path), "*self-reference* %d" % id(value))
    return
Now the output of the program will be
first.second.third.fourth the end
first.second.third.fourth the end
new_key *self-reference* 139700111853032
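One caveat with a single seen set shared across the whole walk: a dict that is referenced twice without forming a cycle is flattened only the first time it is met. If only true cycles should be dropped, one option is to track just the dicts on the current path. This is a minimal sketch under that assumption; the name flatten_skip_cycles and the sample data are hypothetical and not part of the original answer.

```python
def flatten_skip_cycles(mydict):
    # Hypothetical variant: `ancestors` holds ids of the dicts on the current path only.
    def loop(ancestors, path, value):
        if not isinstance(value, dict):
            yield ".".join(path), value
            return
        if id(value) in ancestors:
            return  # this dict contains itself somewhere above: a real cycle
        ancestors.add(id(value))
        for k, v in value.items():
            yield from loop(ancestors, path + [k], v)
        # Leaving this dict, so it may legitimately appear again elsewhere.
        ancestors.discard(id(value))
    yield from loop(set(), [], mydict)

shared = {'x': 1}
mydict = {'a': shared, 'b': shared}  # shared, but not cyclic
mydict['me'] = mydict                # cyclic

result = dict(flatten_skip_cycles(mydict))
print(result)  # {'a.x': 1, 'b.x': 1}
```

The trade-off is that shared sub-dictionaries are walked once per reference, so heavily shared structures take longer to flatten.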
Answer 2:
I'm not sure what your definition of "graceful" is, but this can be done with some bookkeeping of what has been seen before in a set of object ids:
class RecursiveFlatten:
    def __init__(self):
        # ids of every dict visited so far during this flatten
        self.seen = set()

    def __call__(self, mydict):
        self.seen.add(id(mydict))
        d = {}
        for k, v in mydict.items():
            if isinstance(v, dict):
                # only descend into dicts we have not visited yet
                if id(v) not in self.seen:
                    self.seen.add(id(v))
                    for k2, v2 in self(v).items():
                        d[k + '.' + k2] = v2
            else:
                d[k] = v
        return d

def recursive_flatten(mydict):
    # a fresh instance per call, so the seen set never leaks between calls
    return RecursiveFlatten()(mydict)
Testing it out gives me what I expect
mydict = {
    'first': {
        'second': {
            'third': {
                'fourth': 'the end'
            }
        },
        'second2': {
            'third2': 'the end2'
        }
    }
}
mydict['first']['second']['new_key'] = mydict
mydict['new_key'] = mydict
print(recursive_flatten(mydict))
Out:
{'first.second2.third2': 'the end2', 'first.second.third.fourth': 'the end'}
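The same bookkeeping also fits in a plain function if the seen set is an optional parameter that defaults to None, which sidesteps the mutable default argument the question was wary of. This is a sketch rather than part of the original answer, and the sample dictionary is made up for illustration.

```python
def recursive_flatten(mydict, seen=None):
    # A fresh set is created for each top-level call; recursive calls share it.
    if seen is None:
        seen = set()
    seen.add(id(mydict))
    d = {}
    for k, v in mydict.items():
        if isinstance(v, dict):
            # Skip any dict that has already been visited in this call.
            if id(v) not in seen:
                for k2, v2 in recursive_flatten(v, seen).items():
                    d[k + '.' + k2] = v2
        else:
            d[k] = v
    return d

# Illustrative data, not from the original post.
mydict = {'first': {'second': {'third': 'the end'}}, 'top': 1}
mydict['first']['second']['back'] = mydict
mydict['self'] = mydict

flat = recursive_flatten(mydict)
print(flat)  # {'first.second.third': 'the end', 'top': 1}

# A second call starts with its own empty set, so the result is the same.
flat_again = recursive_flatten(mydict)
```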
Source: https://stackoverflow.com/questions/49946398/handle-self-references-when-flattening-dictionary