remove duplicates from nested dictionaries in list

前端 未结 6 1903
逝去的感伤
逝去的感伤 2021-02-06 16:13

quick and very basic newbie question.

If i have list of dictionaries looking like this:

L = []
L.append({\"value1\": value1, \"value2\": value2, \"value3         


        
相关标签:
6条回答
  • 2021-02-06 16:49

    You can use a temporary array to store an items dict. The previous code was bugged for removing items in the for loop.

    (v,r) = ([],[])
    for i in l:
        if ('value4', i['value4']) not in v and ('value3', i['value3']) not in v:
            r.append(i)
        v.extend(i.items())
    l = r
    

    Your test:

    l = [{"value1": 'fssd', "value2": 'dsfds', "value3": 'abcd', "value4": 'gk'},
        {"value1": 'asdasd', "value2": 'asdas', "value3": 'dafdd', "value4": 'sdfsdf'},
        {"value1": 'sdfsf', "value2": 'sdfsdf', "value3": 'abcd', "value4": 'gk'},
        {"value1": 'asddas', "value2": 'asdsa', "value3": 'abcd', "value4": 'gk'},
        {"value1": 'asdasd', "value2": 'dskksks', "value3": 'ldlsld', "value4": 'sdlsld'}]
    

    ouputs

    {'value4': 'gk', 'value3': 'abcd', 'value2': 'dsfds', 'value1': 'fssd'}
    {'value4': 'sdfsdf', 'value3': 'dafdd', 'value2': 'asdas', 'value1': 'asdasd'}
    {'value4': 'sdlsld', 'value3': 'ldlsld', 'value2': 'dskksks', 'value1': 'asdasd'}
    
    0 讨论(0)
  • 2021-02-06 16:54

    In Python 2.6 or 3.*:

    import itertools
    import pprint
    
    L = [{"value1": "fssd", "value2": "dsfds", "value3": "abcd", "value4": "gk"},
        {"value1": "asdasd", "value2": "asdas", "value3": "dafdd", "value4": "sdfsdf"},
        {"value1": "sdfsf", "value2": "sdfsdf", "value3": "abcd", "value4": "gk"},
        {"value1": "asddas", "value2": "asdsa", "value3": "abcd", "value4": "gk"},
        {"value1": "asdasd", "value2": "dskksks", "value3": "ldlsld", "value4": "sdlsld"}]
    
    getvals = operator.itemgetter('value3', 'value4')
    
    L.sort(key=getvals)
    
    result = []
    for k, g in itertools.groupby(L, getvals):
        result.append(g.next())
    
    L[:] = result
    pprint.pprint(L)
    

    Almost the same in Python 2.5, except you have to use g.next() instead of next(g) in the append.

    0 讨论(0)
  • 2021-02-06 16:54
    for dic in list: 
      for anotherdic in list:
        if dic != anotherdic:
          if dic["value3"] == anotherdic["value3"] or dic["value4"] == anotherdic["value4"]:
            list.remove(anotherdic)
    

    Tested with

    list = [{"value1": 'fssd', "value2": 'dsfds', "value3": 'abcd', "value4": 'gk'},
    {"value1": 'asdasd', "value2": 'asdas', "value3": 'dafdd', "value4": 'sdfsdf'},
    {"value1": 'sdfsf', "value2": 'sdfsdf', "value3": 'abcd', "value4": 'gk'},
    {"value1": 'asddas', "value2": 'asdsa', "value3": 'abcd', "value4": 'gk'},
    {"value1": 'asdasd', "value2": 'dskksks', "value3": 'ldlsld', "value4": 'sdlsld'}]
    

    worked fine for me :)

    0 讨论(0)
  • 2021-02-06 17:00

    That's a list of one dictionary and but, assuming there are more dictionaries in the list l:

    l = [ldict for ldict in l if ldict.get("value3") != value3 or ldict.get("value4") != value4]
    

    But is that what you really want to do? Perhaps you need to refine your description.

    BTW, don't use list as a name since it is the name of a Python built-in.

    EDIT: Assuming you started with a list of dictionaries, rather than a list of lists of 1 dictionary each that should work with your example. It wouldn't work if either of the values were None, so better something like:

    l = [ldict for ldict in l if not ( ("value3" in ldict and ldict["value3"] == value3) and ("value4" in ldict and ldict["value4"] == value4) )]
    

    But it still seems like an unusual data structure.

    EDIT: no need to use explicit gets.

    Also, there are always tradeoffs in solutions. Without more info and without actually measuring, it's hard to know which performance tradeoffs are most important for the problem. But, as the Zen sez: "Simple is better than complex".

    0 讨论(0)
  • 2021-02-06 17:13

    If I understand correctly, you want to discard matches that come later in the original list but do not care about the order of the resulting list, so:

    (Tested with 2.5.2)

    tempDict = {}
    for d in L[::-1]:
        tempDict[(d["value3"],d["value4"])] = d
    L[:] = tempDict.itervalues()
    tempDict = None
    
    0 讨论(0)
  • 2021-02-06 17:14

    Here's one way:

    keyfunc = lambda d: (d['value3'], d['value4'])
    
    from itertools import groupby
    giter = groupby(sorted(L, key=keyfunc), keyfunc)
    
    L2 = [g[1].next() for g in giter]
    print L2
    
    0 讨论(0)
提交回复
热议问题