I have a data frame that looks like this:
    id1         | id2
    ----------------------------
    ab51c-ee-1a | cga--=%abd21
I am looking to randomize the letters only:
    id1         | id2
    ----------------------------
    ge51r-eq-1b | olp--=%cqw21
I think I can do something like this:
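    import random
    import string

    newid1 = []
    for index, row in df.iterrows():
        new_string = ''
        for ch in row['id1']:
            if ch.isalpha():
                # swap each letter for a random one
                new_string += random.choice(string.ascii_letters)
            else:
                # keep digits and punctuation as they are
                new_string += ch
        newid1.append(new_string)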
But it doesn't seem very efficient. Is there a better way?
Let's use apply together with str.replace to replace only the letters using a regex, i.e.:
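    import string
    import random

    letters = list(string.ascii_lowercase)

    def rand(match):
        # called once per regex match; returns one random lowercase letter
        return random.choice(letters)

    # regex=True is needed on pandas 2.0+, where str.replace defaults to regex=False
    df.apply(lambda x: x.str.replace('[a-z]', rand, regex=True))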
Output:

               id1           id2
    0  gp51e-id-1v  jvj--=%glw21
For one specific column use:

    df['id1'].str.replace('[a-z]', rand, regex=True)
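To make the approach above easy to try, here is a minimal self-contained sketch. The seed and the sanity check are my additions for illustration, not part of the original answer; the data frame is the one from the question.

```python
import random
import re
import string

import pandas as pd

random.seed(0)  # assumption: seeded only so repeated runs give the same result

df = pd.DataFrame({'id1': ['ab51c-ee-1a'], 'id2': ['cga--=%abd21']})
letters = list(string.ascii_lowercase)

def rand(match):
    # str.replace calls this once per regex match and inserts the return value
    return random.choice(letters)

# Apply column by column; regex=True lets str.replace accept a callable
randomized = df.apply(lambda col: col.str.replace('[a-z]', rand, regex=True))

# Sanity check: everything that is not a lowercase letter stays in place
def mask(s):
    return re.sub('[a-z]', '_', s)

assert mask(randomized.loc[0, 'id1']) == mask(df.loc[0, 'id1'])
print(randomized)
```

The callable receives a regex match object, so the parameter is unused here; it only has to return the replacement string.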
Added by @antonvbr: For future reference, if we want to change upper and lower case we could do this:
    letters = dict(u=list(string.ascii_uppercase),
                   l=list(string.ascii_lowercase))

    # first pass randomizes lowercase letters, second pass uppercase ones
    (df['id1'].str.replace('[a-z]', lambda x: random.choice(letters['l']), regex=True)
              .str.replace('[A-Z]', lambda x: random.choice(letters['u']), regex=True))
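If two chained passes feel clumsy, the same idea might be done in a single regex pass by looking at the case of each matched letter. This is only a sketch using the standard library's re.sub; the function names random_same_case and randomize_letters are hypothetical.

```python
import random
import re
import string

def random_same_case(match):
    # match.group(0) is the single letter matched by [A-Za-z]
    if match.group(0).isupper():
        pool = string.ascii_uppercase
    else:
        pool = string.ascii_lowercase
    return random.choice(pool)

def randomize_letters(text):
    # One pass over the string; non-letters never match, so they are kept as-is
    return re.sub('[A-Za-z]', random_same_case, text)

print(randomize_letters('Ab51C-ee-1A'))
```

Since pandas passes the regex match object to a callable replacement, random_same_case should also work as df['id1'].str.replace('[A-Za-z]', random_same_case, regex=True).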
How about this:
    import pandas as pd
    from string import ascii_lowercase as al
    import random

    df = pd.DataFrame({'id1': ['ab51c-ee-1a'], 'id2': ['cga--=%abd21']})

    al = list(al)
    # applymap is deprecated since pandas 2.1; DataFrame.map is the replacement there
    df = df.applymap(lambda x: ''.join([random.choice(al) if i in al else i for i in list(x)]))
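The per-cell logic above can also be pulled out into a small named function, which makes it easier to test on its own. A rough sketch, assuming lowercase-only randomization as in the answer; randomize_lowercase and its rng parameter are hypothetical names added for illustration.

```python
import random
from string import ascii_lowercase

# A set avoids scanning a list of letters for every character
LOWER = set(ascii_lowercase)

def randomize_lowercase(text, rng=random):
    # rng is a hypothetical parameter: pass random.Random(seed) for reproducible output
    return ''.join(rng.choice(ascii_lowercase) if ch in LOWER else ch
                   for ch in text)

rng = random.Random(42)
print(randomize_lowercase('cga--=%abd21', rng))
```

It can then be applied to every cell with df.applymap(randomize_lowercase), or df.map(randomize_lowercase) on pandas 2.1 and later.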