I have an array below which consists of repeated strings. I want to find and replace those strings, but each time a match is made I want to change the value of the replace strin
I'd use collections.Counter:
from collections import Counter

# sample is the input list; xrange is Python 2 (use range on Python 3)
numbers = {
    word: iter([""] if count == 1 else xrange(1, count + 1))
    for word, count in Counter(sample).items()
}
result = [
    word + str(next(numbers[word]))
    for word in sample
]
This doesn't require the list to be sorted or grouped in any way.
This solution uses iterators to generate sequential numbers:
First, we calculate how many times each word occurs in the list (Counter(sample)).
Then we create a dictionary numbers which, for each word, contains its "numbering" iterator iter(...). If the word occurs only once (count == 1), this iterator will return ("yield") an empty string; otherwise it will yield sequential numbers in the range from 1 to count ([""] if count == 1 else xrange(1, count + 1)).
Finally, we iterate over the list once again and, for each word, pick the next value from its own numbering iterator (next(numbers[word])). Since our iterators return numbers, we have to convert them to strings (str(...)).
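The three steps above can be put into a single function. The following is a minimal sketch for Python 3, where range takes the place of xrange; the function name number_duplicates, the variable suffixes, and the unsorted sample list are illustrative assumptions, not part of the original answer.

```python
from collections import Counter


def number_duplicates(words):
    """Append 1, 2, 3, ... to repeated words; leave unique words unchanged."""
    # One iterator per distinct word: a single empty suffix for unique words,
    # otherwise the numbers 1..count.
    suffixes = {
        word: iter([""] if count == 1 else range(1, count + 1))
        for word, count in Counter(words).items()
    }
    # Each occurrence consumes the next suffix of its own word.
    return [word + str(next(suffixes[word])) for word in words]


sample = ["mak", "champ", "king", "mak", "king", "mak"]
print(number_duplicates(sample))
# ['mak1', 'champ', 'king1', 'mak2', 'king2', 'mak3']
```

Because every word has its own iterator, the input order is preserved even when duplicates are not adjacent.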
groupby is a convenient way to group consecutive duplicates:
>>> from itertools import groupby
>>> FinalArray = []
>>> for k, g in groupby(SampleArray):
...     # g is an iterator, so get a list of it for further handling
...     items = list(g)
...     # If only one item, add it unchanged
...     if len(items) == 1:
...         FinalArray.append(k)
...     # Else add index at the end
...     else:
...         FinalArray.extend([j + str(i) for i, j in enumerate(items, 1)])
...
>>> FinalArray
['champ', 'king1', 'king2', 'mak1', 'mak2', 'mak3']
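Note that groupby only merges equal items that sit next to each other, so the loop above relies on the duplicates in SampleArray already being adjacent. For arbitrary input, one option is to sort first, at the cost of losing the original order. A hedged sketch of that variant follows; the function name number_sorted_groups and the sample list are hypothetical.

```python
from itertools import groupby


def number_sorted_groups(words):
    """Sort the words, then number each run of equal words from 1."""
    result = []
    # sorted() places equal words next to each other,
    # so groupby yields exactly one group per distinct word.
    for key, group in groupby(sorted(words)):
        items = list(group)
        if len(items) == 1:
            result.append(key)  # unique word: keep unchanged
        else:
            result.extend(key + str(i) for i in range(1, len(items) + 1))
    return result


print(number_sorted_groups(["mak", "champ", "king", "mak", "king", "mak"]))
# ['champ', 'king1', 'king2', 'mak1', 'mak2', 'mak3']
```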
One way would be to convert your array into a dictionary like this:
SampleDict = {}
for key in SampleArray:
    if key in SampleDict:
        SampleDict[key][0] = True  # means: duplicates
        SampleDict[key][1] += 1
    else:
        SampleDict[key] = [False, 1]  # means: no duplicates
Now you can easily convert that dict back to an array. However, if the order in SampleArray is important, you can do it like this:
for i in range(len(SampleArray)):
    key = SampleArray[i]
    counter = SampleDict[key]
    if counter[0]:  # only rename keys that have duplicates
        SampleArray[i] = key + str(counter[1])
        counter[1] -= 1
However, this numbers the duplicates in reverse order, i.e.
SampleArray = ['champ', 'king2', 'king1', 'mak3', 'mak2', 'mak1']
But I'm sure you'll be able to tweak it to your needs.
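One possible tweak that yields ascending numbers while keeping the original order is to count all occurrences up front and then keep a running tally of how often each word has been seen so far. This is only a sketch; number_in_order, totals, and seen are illustrative names, and it uses collections.Counter in place of the hand-built dictionary.

```python
from collections import Counter


def number_in_order(words):
    """Number repeated words 1, 2, 3, ... in order of appearance."""
    totals = Counter(words)  # total occurrences of each word
    seen = Counter()         # occurrences encountered so far
    result = []
    for word in words:
        if totals[word] == 1:
            result.append(word)  # unique word: leave unchanged
        else:
            seen[word] += 1
            result.append(word + str(seen[word]))
    return result


print(number_in_order(["champ", "king", "king", "mak", "mak", "mak"]))
# ['champ', 'king1', 'king2', 'mak1', 'mak2', 'mak3']
```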
EDIT
Using Counter and then sorting is simpler:
from collections import Counter

L = ['champ', 'king', 'king', 'mak', 'mak', 'mak']
counts = Counter(L)
res = []
for word in sorted(counts.keys()):
    if counts[word] == 1:
        res.append(word)
    else:
        res.extend(['{}{}'.format(word, index) for index in
                    range(1, counts[word] + 1)])
So this
['champ', 'mak', 'king', 'king', 'mak', 'mak']
also gives:
['champ', 'king1', 'king2', 'mak1', 'mak2', 'mak3']
Assuming you want the array sorted:
import collections
counter = collections.Counter(SampleArray)
res = []
for key in sorted(counter.keys()):
    if counter[key] == 1:
        res.append(key)
    else:
        res.extend([key+str(i) for i in range(1, counter[key]+1)])
>>> res
['champ', 'king1', 'king2', 'mak1', 'mak2', 'mak3']
f = ['champ', 'king', 'king', 'mak', 'mak', 'mak']
fields_out = [x + str(f.count(x) - f[i + 1:].count(x)) for i, x in enumerate(f)]
print(fields_out)
>>['champ1', 'king1', 'king2', 'mak1', 'mak2', 'mak3']
or
fields_out = [(x if i == f.index(x) else x + str(f.count(x) - f[i + 1:].count(x))) for i, x in enumerate(f)]
print(fields_out)
>>['champ', 'king', 'king2', 'mak', 'mak2', 'mak3']
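Neither one-liner both leaves unique items untouched and numbers every repeated item starting from 1. A variant that does both is sketched below; like the two versions above it calls count inside the comprehension, so its running time grows quadratically with the list length, which is fine for short lists only.

```python
f = ['champ', 'king', 'king', 'mak', 'mak', 'mak']
fields_out = [
    # unique items stay as they are; repeated items get the number of
    # occurrences seen up to and including the current position
    x if f.count(x) == 1 else x + str(f[:i + 1].count(x))
    for i, x in enumerate(f)
]
print(fields_out)
# ['champ', 'king1', 'king2', 'mak1', 'mak2', 'mak3']
```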