Python - Flatten the list of dictionaries

情话喂你 2021-01-01 09:54

List of dictionaries:

data = [{
         'a':{'l':'Apple',
                'b':'Milk',
                'd':'Meatball'},
         'b':{'favou


        
4 answers
  • 2021-01-01 10:31

    You can use functools.reduce along with a simple list comprehension to flatten the list of dicts:

    >>> from functools import reduce

    >>> data = [{'b': {'dislike': 'juice', 'favourite': 'coke'}, 'a': {'l': 'Apple', 'b': 'Milk', 'd': 'Meatball'}}, {'b': {'dislike': 'juice3', 'favourite': 'coke2'}, 'a': {'l': 'Apple1', 'b': 'Milk1', 'd': 'Meatball2'}}]
    >>> [reduce(lambda x,y: {**x,**y},d.values()) for d in data]
    [{'dislike': 'juice', 'l': 'Apple', 'd': 'Meatball', 'b': 'Milk', 'favourite': 'coke'}, {'dislike': 'juice3', 'l': 'Apple1', 'd': 'Meatball2', 'b': 'Milk1', 'favourite': 'coke2'}]
    

    The timing benchmark is as follows:

    >>> import timeit
    >>> setup = """
    ... from functools import reduce
    ... data = [{'b': {'dislike': 'juice', 'favourite': 'coke'}, 'a': {'l': 'Apple', 'b': 'Milk', 'd': 'Meatball'}}, {'b': {'dislike': 'juice3', 'favourite': 'coke2'}, 'a': {'l': 'Apple1', 'b': 'Milk1', 'd': 'Meatball2'}}]
    ... """
    >>> min(timeit.Timer("[reduce(lambda x,y: {**x,**y},d.values()) for d in data]",setup=setup).repeat(3,1000000))
    1.525032774952706
    

    Timings of the other answers on my machine:

    >>> setup = """
    ... data = [{'b': {'dislike': 'juice', 'favourite': 'coke'}, 'a': {'l': 'Apple', 'b': 'Milk', 'd': 'Meatball'}}, {'b': {'dislike': 'juice3', 'favourite': 'coke2'}, 'a': {'l': 'Apple1', 'b': 'Milk1', 'd': 'Meatball2'}}]
    ... """
    >>> min(timeit.Timer("[{k: v for x in d.values() for k, v in x.items()} for d in data]",setup=setup).repeat(3,1000000))
    2.2488374650129117

    >>> min(timeit.Timer("[{k: x[k] for x in d.values() for k in x} for d in data]",setup=setup).repeat(3,1000000))
    1.8990078769857064

    >>> code = """
    ... L = []
    ... for d in data:
    ...     temp = {}
    ...     for key in d:
    ...         temp.update(d[key])
    ...     L.append(temp)
    ... """

    >>> min(timeit.Timer(code,setup=setup).repeat(3,1000000))
    1.4258553800173104

    >>> setup = """
    ... from itertools import chain
    ... data = [{'b': {'dislike': 'juice', 'favourite': 'coke'}, 'a': {'l': 'Apple', 'b': 'Milk', 'd': 'Meatball'}}, {'b': {'dislike': 'juice3', 'favourite': 'coke2'}, 'a': {'l': 'Apple1', 'b': 'Milk1', 'd': 'Meatball2'}}]
    ... """
    >>> min(timeit.Timer("[dict(chain(*map(dict.items, d.values()))) for d in data]",setup=setup).repeat(3,1000000))
    3.774383604992181
    
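
The reduce call in the answer above folds the inner dictionaries left to right with {**x, **y}. The following is only a rough sketch of two edge cases that follow from that: what happens when inner dictionaries share a key, and what changes when an initial value is passed to reduce. The sample records here are hypothetical and not taken from the question.

```python
from functools import reduce

# Hypothetical record: both inner dicts contain the key 'b'.
record = {'a': {'l': 'Apple', 'b': 'Milk'},
          'b': {'b': 'Water', 'favourite': 'coke'}}

# {**x, **y} builds a new dict at every step; on a shared key the
# value from the later dict (y) replaces the earlier one.
flat = reduce(lambda x, y: {**x, **y}, record.values())
print(flat)  # {'l': 'Apple', 'b': 'Water', 'favourite': 'coke'}

# With a single inner dict and no initial value, reduce never calls the
# lambda and hands back that inner dict itself rather than a copy.
single = {'a': {'l': 'Apple'}}
same_object = reduce(lambda x, y: {**x, **y}, single.values())
print(same_object is single['a'])  # True

# Passing {} as the initial value forces a fresh dict in every case,
# and also avoids the TypeError reduce raises on an empty sequence.
fresh_copy = reduce(lambda x, y: {**x, **y}, single.values(), {})
empty = reduce(lambda x, y: {**x, **y}, {}.values(), {})
print(fresh_copy is single['a'], empty)  # False {}
```

Whether the extra initial value is worth it depends on whether records with zero or one inner dictionary can actually occur in your data.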
  • 2021-01-01 10:44

    If the nested dictionaries only ever have the keys 'a' and 'b', I suggest the following solution, which I find fast and easy to read:

    # Collect the 'a' and 'b' sub-dictionaries of every record.
    L = [x['a'] for x in data]
    b = [x['b'] for x in data]

    # Merge each 'b' dict into the corresponding 'a' dict.
    for i in range(len(L)):
        L[i].update(b[i])

    # timeit ~1.4
    
    print(L)
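
One thing to be aware of in the snippet above: L holds references to the 'a' dictionaries, so update() also changes data. If that side effect is unwanted, a minimal sketch of a variant that builds new dictionaries could look like this. It still assumes every record has exactly the keys 'a' and 'b', and it needs Python 3.5+ for the {**x, **y} syntax; the sample values are the ones used in the other answers.

```python
data = [
    {'a': {'l': 'Apple', 'b': 'Milk', 'd': 'Meatball'},
     'b': {'favourite': 'coke', 'dislike': 'juice'}},
    {'a': {'l': 'Apple1', 'b': 'Milk1', 'd': 'Meatball2'},
     'b': {'favourite': 'coke2', 'dislike': 'juice3'}},
]

# Build a fresh dict per record instead of updating x['a'] in place.
L = [{**x['a'], **x['b']} for x in data]

print(L[0])
# The original 'a' dict still has only its own three keys.
print(data[0]['a'])
```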
    
  • 2021-01-01 10:45

    You can do this with two nested loops, using dict.update() to merge the inner dictionaries into a temporary dictionary and appending that dictionary to the result list at the end of each outer iteration:

    L = []
    for d in data:
        temp = {}
        for key in d:
            temp.update(d[key])
    
        L.append(temp)
    
    # timeit ~1.4
    print(L)
    

    Which outputs:

    [{'l': 'Apple', 'b': 'Milk', 'd': 'Meatball', 'favourite': 'coke', 'dislike': 'juice'}, {'l': 'Apple1', 'b': 'Milk1', 'd': 'Meatball2', 'favourite': 'coke2', 'dislike': 'juice3'}]
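
Since the loop above iterates over whatever keys each record has, it is not limited to 'a' and 'b'. As a sketch only, here is the same loop wrapped in a function and applied to a record with three inner groups. The function name flatten_records and the sample record are hypothetical, introduced just for illustration.

```python
def flatten_records(records):
    """Merge the inner dicts of each record into one flat dict (one level only)."""
    flat = []
    for record in records:
        merged = {}
        # Later inner dicts overwrite earlier ones on duplicate keys.
        for inner in record.values():
            merged.update(inner)
        flat.append(merged)
    return flat


# Hypothetical record with three inner groups instead of two.
sample = [{'a': {'l': 'Apple'}, 'b': {'favourite': 'coke'}, 'c': {'side': 'fries'}}]
print(flatten_records(sample))  # [{'l': 'Apple', 'favourite': 'coke', 'side': 'fries'}]
```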
    
  • 2021-01-01 10:46

    You can do the following, using itertools.chain:

    >>> from itertools import chain
    # timeit: ~3.40
    >>> [dict(chain(*map(dict.items, d.values()))) for d in data]
    [{'l': 'Apple', 
      'b': 'Milk', 
      'd': 'Meatball', 
      'favourite': 'coke', 
      'dislike': 'juice'}, 
     {'l': 'Apple1', 
      'b': 'Milk1', 
      'dislike': 'juice3', 
      'favourite': 'coke2', 
      'd': 'Meatball2'}]
    

    The use of chain, map and * makes this expression shorthand for the following doubly nested comprehension, which actually performs better on my system (Python 3.5.2) and isn't much longer:

    # timeit: ~2.04
    [{k: v for x in d.values() for k, v in x.items()} for d in data]
    # Or, not using items, but lookup by key
    # timeit: ~1.67
    [{k: x[k] for x in d.values() for k in x} for d in data]
    

    Note:

    RoadRunner's loop-and-update approach outperforms both of these one-liners, at timeit: ~1.37.
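
To compare the approaches from this thread on your own machine, something along these lines may help. It is a sketch under a few assumptions: the function names are made up for illustration, each variant is wrapped in a function (which adds call overhead the original measurements did not have), and the iteration count is far lower than the 1000000 used above, so absolute numbers will not match the figures quoted in the answers. It needs Python 3.6+ because of the f-string.

```python
import timeit
from functools import partial, reduce
from itertools import chain

data = [
    {'a': {'l': 'Apple', 'b': 'Milk', 'd': 'Meatball'},
     'b': {'favourite': 'coke', 'dislike': 'juice'}},
    {'a': {'l': 'Apple1', 'b': 'Milk1', 'd': 'Meatball2'},
     'b': {'favourite': 'coke2', 'dislike': 'juice3'}},
]


def with_reduce(data):
    return [reduce(lambda x, y: {**x, **y}, d.values()) for d in data]


def with_items(data):
    return [{k: v for x in d.values() for k, v in x.items()} for d in data]


def with_lookup(data):
    return [{k: x[k] for x in d.values() for k in x} for d in data]


def with_update(data):
    result = []
    for d in data:
        temp = {}
        for key in d:
            temp.update(d[key])
        result.append(temp)
    return result


def with_chain(data):
    return [dict(chain(*map(dict.items, d.values()))) for d in data]


candidates = [with_reduce, with_items, with_lookup, with_update, with_chain]

# Check that every variant produces the same result before timing anything.
expected = with_update(data)
assert all(f(data) == expected for f in candidates)

# Best of 3 runs; number is kept small so the sketch finishes quickly.
timings = {
    f.__name__: min(timeit.repeat(partial(f, data), repeat=3, number=10000))
    for f in candidates
}

for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f"{name}: {seconds:.4f}s")
```

None of these variants mutates data, so the same list can be reused across all timing runs.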
