Find and replace duplicates in Array, but replace each nth instance with a different string

栀梦 2021-01-22 09:00

I have an array below which consists of repeated strings. I want to find and replace those strings, but each time a match is made I want to change the value of the replacement string.

6 Answers
  • 2021-01-22 09:23

    I'd use collections.Counter:

    from collections import Counter

    # sample is the input list, e.g. ['champ', 'king', 'king', 'mak', 'mak', 'mak']
    numbers = {
        word: iter([""] if count == 1 else range(1, count + 1))
        for word, count in Counter(sample).items()
    }

    result = [
        word + str(next(numbers[word]))
        for word in sample
    ]


    This doesn't require the list to be sorted or grouped in any way.

    This solution uses iterators to generate sequential numbers:

    • First, we count how many times each word occurs in the list: Counter(sample).

    • Then we build a dictionary, numbers, that maps each word to its own "numbering" iterator, iter(...). If the word occurs only once (count == 1), the iterator yields a single empty string; otherwise it yields the sequential numbers from 1 to count. This is the expression [""] if count == 1 else range(1, count + 1). (On Python 2, xrange can be used in place of range.)

    • Finally, we iterate over the list once more and, for each word, take the next value from its own numbering iterator: next(numbers[word]). Since the iterators return numbers, we convert them to strings with str(...).
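The three steps above can be wrapped in a small function. This is only a sketch, not the answerer's exact code: the name `number_duplicates` is hypothetical, and it uses `range` rather than `xrange` so that it runs on Python 3.

```python
from collections import Counter


def number_duplicates(words):
    """Append 1, 2, 3, ... to repeated words; leave unique words unchanged."""
    # One iterator of suffixes per distinct word.
    suffixes = {
        word: iter([""] if count == 1 else range(1, count + 1))
        for word, count in Counter(words).items()
    }
    # Each occurrence consumes the next suffix for its word.
    return [word + str(next(suffixes[word])) for word in words]


# Duplicates do not have to be adjacent.
print(number_duplicates(['mak', 'champ', 'king', 'mak', 'king', 'mak']))
# ['mak1', 'champ', 'king1', 'mak2', 'king2', 'mak3']
```

Because the suffixes are looked up per word, the original order of the list is preserved.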

  • 2021-01-22 09:24

    groupby is a convenient way to group duplicates, provided that equal items are adjacent (sort the list first if they are not):

    >>> from itertools import groupby
    >>> FinalArray = []
    >>> for k, g in groupby(SampleArray):
    ...     # g is an iterator, so get a list of it for further handling
    ...     items = list(g)
    ...     # If only one item, add it unchanged
    ...     if len(items) == 1:
    ...         FinalArray.append(k)
    ...     # Else add index at the end
    ...     else:
    ...         FinalArray.extend([j + str(i) for i, j in enumerate(items, 1)])
    ...
    >>> FinalArray
    ['champ', 'king1', 'king2', 'mak1', 'mak2', 'mak3']
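
One caveat with the loop above is that `groupby` only merges equal items that sit next to each other. The following sketch illustrates this with a shuffled list; the variable names `words`, `runs_unsorted` and `runs_sorted` are made up for the example.

```python
from itertools import groupby

words = ['mak', 'king', 'mak', 'king', 'champ', 'mak']

# Unsorted: every run of equal neighbours becomes its own group.
runs_unsorted = [(k, len(list(g))) for k, g in groupby(words)]
print(runs_unsorted)
# [('mak', 1), ('king', 1), ('mak', 1), ('king', 1), ('champ', 1), ('mak', 1)]

# Sorted: equal words are adjacent, so each word forms a single group.
runs_sorted = [(k, len(list(g))) for k, g in groupby(sorted(words))]
print(runs_sorted)
# [('champ', 1), ('king', 2), ('mak', 3)]
```

So for input that is not already grouped, sorting first is needed, at the cost of losing the original order.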
    
  • 2021-01-22 09:26

    One way would be to convert your array into a dictionary like this:

    SampleDict = {}
    for key in SampleArray:
        if key in SampleDict:
            SampleDict[key][0] = True # means: duplicates
            SampleDict[key][1] += 1 
        else:
            SampleDict[key] = [False, 1] # means: no duplicates
    

    Now you can easily convert that dict back to an array. However, if the order in SampleArray is important, you can do it like this:

    for i in range(len(SampleArray)):
        key = SampleArray[i]
        counter = SampleDict[key]  # [has_duplicates, remaining_count]
        if counter[0]:  # only rename words that occur more than once
            SampleArray[i] = key + str(counter[1])
        counter[1] -= 1
    

    However, this numbers the duplicates in reverse order, i.e.

    SampleArray = ['champ', 'king2', 'king1', 'mak3', 'mak2', 'mak1']
    

    But I'm sure you'll be able to tweak it to your needs.
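
One possible tweak, sketched here under the assumption that the suffixes should count upwards, is to keep a second dictionary of how many times each word has been seen so far. The names `totals`, `seen` and `result` are illustrative and not part of the answer above.

```python
SampleArray = ['champ', 'king', 'king', 'mak', 'mak', 'mak']

# First pass: total number of occurrences of each word.
totals = {}
for key in SampleArray:
    totals[key] = totals.get(key, 0) + 1

# Second pass: count upwards per word, so suffixes run 1, 2, 3, ...
seen = {}
result = []
for key in SampleArray:
    seen[key] = seen.get(key, 0) + 1
    # Only words that occur more than once get a suffix.
    result.append(key + str(seen[key]) if totals[key] > 1 else key)

print(result)
# ['champ', 'king1', 'king2', 'mak1', 'mak2', 'mak3']
```

This builds a new list instead of modifying SampleArray in place, which keeps the input available for comparison.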

  • 2021-01-22 09:31

    EDIT

    Using Counter and then sorting is simpler:

    from collections import Counter

    L = ['champ', 'king', 'king', 'mak', 'mak', 'mak']
    counts = Counter(L)
    res = []
    for word in sorted(counts.keys()):
        if counts[word] == 1:
            res.append(word)
        else:
            res.extend(['{}{}'.format(word, index) for index in 
                       range(1, counts[word] + 1)])
    

    So this

    ['champ', 'mak', 'king', 'king', 'mak', 'mak']
    

    also gives:

    ['champ', 'king1', 'king2', 'mak1', 'mak2', 'mak3']
    
  • 2021-01-22 09:43

    Assuming you want the array sorted:

    import collections    
    counter = collections.Counter(SampleArray)
    res = []
    for key in sorted(counter.keys()):
        if counter[key] == 1:
            res.append(key)
        else:
            res.extend([key+str(i) for i in range(1, counter[key]+1)])
    
    >>> res
    ['champ', 'king1', 'king2', 'mak1', 'mak2', 'mak3']
    
  • 2021-01-22 09:48
    f = ['champ', 'king', 'king', 'mak', 'mak', 'mak']
    
    fields_out = [x + str(f.count(x) - f[i + 1:].count(x)) for i, x in enumerate(f)]
    print(fields_out)
    
    # ['champ1', 'king1', 'king2', 'mak1', 'mak2', 'mak3']
    

    or

    fields_out = [(x if i == f.index(x) else x + str(f.count(x) - f[i + 1:].count(x))) for i, x in enumerate(f)]
    print(fields_out)
    
    # ['champ', 'king', 'king2', 'mak', 'mak2', 'mak3']
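
The expression `f.count(x) - f[i + 1:].count(x)` used above is the number of occurrences of x up to and including position i. The sketch below checks this against a direct prefix count; the names `by_subtraction`, `by_prefix` and `ranks` are made up for the illustration.

```python
f = ['champ', 'king', 'king', 'mak', 'mak', 'mak']

for i, x in enumerate(f):
    # Total occurrences minus those after position i ...
    by_subtraction = f.count(x) - f[i + 1:].count(x)
    # ... equals the occurrences up to and including position i.
    by_prefix = f[:i + 1].count(x)
    assert by_subtraction == by_prefix

# The running occurrence number of each element.
ranks = [f[:i + 1].count(x) for i, x in enumerate(f)]
print(ranks)
# [1, 1, 2, 1, 2, 3]
```

Each call to `count` scans the list, so this approach takes quadratic time; for long lists a dictionary of running counts is likely the better choice.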
    