How to count the number of unique lists within a list?

心在旅途 2021-01-27 12:10

I've tried using Counter and itertools, but since a list is unhashable, they don't work.

My data looks like this: [ [1,2,3], [2,3,4], [1,2,3] ]

I would like to

4 Answers
  • 2021-01-27 12:17

    I think using the Counter class on tuples, like

    Counter(tuple(item) for item in li)
    

    will be optimal in terms of elegance and "pythonicity": it's probably the shortest solution, it's perfectly clear what you want to achieve and how it's done, and it combines standard tools (and thus avoids reinventing the wheel).
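
    As a minimal runnable sketch of this approach, assuming the sample data from the question (the variable names `counts` and `n_distinct` are only illustrative):

```python
from collections import Counter

li = [[1, 2, 3], [2, 3, 4], [1, 2, 3]]

# Lists are unhashable, so each sublist is converted to a tuple before counting.
counts = Counter(tuple(item) for item in li)

# The number of keys is the number of distinct sublists (compared by value).
n_distinct = len(counts)

print(counts)
print(n_distinct)
```

    Here `counts` maps each distinct sublist (as a tuple) to how often it occurs, so `len(counts)` gives the number of distinct sublists.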

    The only performance drawback I can see is that every sublist has to be converted to a tuple (in order to be hashable), which means that the references to all elements of all sublists have to be copied once. Also, the internal hash function on tuples may be suboptimal if you know that the list elements will, e.g., always be integers.

    In order to improve on performance, you would have to

    • Implement some kind of hash algorithm working directly on lists (more or less reimplementing the hashing of tuples but for lists)
    • Somehow reimplement the Counter class in order to use this hash algorithm and provide some suitable output (this class would probably use a dictionary using the hash values as key and a combination of the "original" list and the count as value)
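
    The two steps above might be sketched in pure Python as follows. This only illustrates the structure: `list_hash` and `ListCounter` are hypothetical names, the hash assumes integer elements, and a pure-Python version like this is not expected to beat Counter on tuples.

```python
def list_hash(items):
    """Hypothetical hash for a list of integers (simple polynomial hash)."""
    h = 0
    for x in items:
        # Keep the value within 64 bits so it stays a small, non-negative int.
        h = (h * 1000003 + x) & 0xFFFFFFFFFFFFFFFF
    return h


class ListCounter:
    """Hypothetical counter keyed by list_hash instead of by tuple."""

    def __init__(self, lists=()):
        # hash value -> list of [original_list, count] entries
        self._buckets = {}
        for lst in lists:
            self.add(lst)

    def add(self, lst):
        bucket = self._buckets.setdefault(list_hash(lst), [])
        for entry in bucket:
            # The equality check guards against hash collisions.
            if entry[0] == lst:
                entry[1] += 1
                return
        bucket.append([lst, 1])

    def items(self):
        # Yield (original_list, count) pairs.
        for bucket in self._buckets.values():
            for lst, count in bucket:
                yield lst, count

    def __len__(self):
        # Number of distinct lists seen so far.
        return sum(len(bucket) for bucket in self._buckets.values())


li = [[1, 2, 3], [2, 3, 4], [1, 2, 3]]
counts = ListCounter(li)
print(len(counts), list(counts.items()))
```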

    At least the first step would need to be done in C/C++ in order to match the speed of the internal hash function. If you know the type of the list elements you could probably even improve the performance.

    As for the Counter class, I do not know whether its standard implementation is in Python or in C; if the latter is the case, you'll probably also have to reimplement it in C in order to achieve the same (or better) performance.

    So the question "Is there a better solution" cannot be answered (as always) without knowing your specific requirements.

  • 2021-01-27 12:20
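    data = [[1, 2, 3], [2, 3, 4], [1, 2, 3]]
    repeats = []  # sublists that occur more than once
    unique = 0    # number of sublists that occur exactly once
    for i in data:
        count = 0
        if i not in repeats:
            # Count how often this sublist appears in the whole list.
            for i2 in data:
                if i == i2:
                    count += 1
        if count > 1:
            repeats.append(i)
        elif count == 1:
            unique += 1
    
    print("Repeated Items")
    for r in repeats:
        print(r, end=" ")
    
    print("\nUnique items:", unique)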
    

    This loops through the list to find repeated sequences, skipping items that have already been detected as repeats, adds each repeated sequence to the repeats list, and counts the lists that occur exactly once.
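
    The nested loop compares every sublist with every other one. As a hedged alternative sketch, the same two results (`repeats` and `unique`) could be derived in a single pass with a Counter over tuples, assuming the sublists contain only hashable elements:

```python
from collections import Counter

data = [[1, 2, 3], [2, 3, 4], [1, 2, 3]]

# Count each sublist once, using tuples as hashable stand-ins.
counts = Counter(tuple(item) for item in data)

# Sublists seen more than once, converted back to lists.
repeats = [list(key) for key, c in counts.items() if c > 1]

# Number of sublists that occur exactly once.
unique = sum(1 for c in counts.values() if c == 1)

print("Repeated items:", repeats)
print("Unique items:", unique)
```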

  • 2021-01-27 12:21
    >>> from collections import Counter
    >>> li=[ [1,2,3], [2,3,4], [1,2,3] ]
    >>> Counter(str(e) for e in li)
    Counter({'[1, 2, 3]': 2, '[2, 3, 4]': 1})
    

    The method that you state also works as long as there are no nested mutables in each sublist (such as in [ [1,2,3], [2,3,4,[11,12]], [1,2,3] ]):

    >>> Counter(tuple(e) for e in li)
    Counter({(1, 2, 3): 2, (2, 3, 4): 1})
    

    If you do have other unhashable types nested in the sublists, use the str or repr method, since that handles all nested sublists as well. Or recursively convert everything to tuples (more work).
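
    The recursive conversion could look roughly like this. `to_hashable` is a hypothetical helper name, and the sketch only handles nested lists (not dicts or sets):

```python
from collections import Counter


def to_hashable(obj):
    """Hypothetical helper: recursively turn lists into tuples."""
    if isinstance(obj, list):
        return tuple(to_hashable(x) for x in obj)
    # Anything that is not a list is assumed to be hashable already.
    return obj


li = [[1, 2, 3], [2, 3, 4, [11, 12]], [1, 2, 3]]
counts = Counter(to_hashable(e) for e in li)
print(counts)
```

    Unlike the str approach, the keys stay structured tuples, so they can be converted back to lists if needed.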

  • 2021-01-27 12:31
    ll = [ [1,2,3], [2,3,4], [1,2,3] ]
    print(len(set(map(tuple, ll))))
    

    Also, if you wanted to count the occurrences of a unique* list:

     print(ll.count([1,2,3]))
    

    *(unique by value, not by reference)
