Pandas: Randomize letters in a column

Anonymous (unverified), submitted 2019-12-03 01:00:01

Question:

I have a data frame that looks like this:

id1           | id2
----------------------------
ab51c-ee-1a   | cga--=%abd21

I am looking to randomize the letters only:

id1           | id2
----------------------------
ge51r-eq-1b   | olp--=%cqw21

I think I can do something like this:

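import random
import string

new_id1 = []
for index, row in df.iterrows():
    new_value = ''  # do not name this "string": that would shadow the module
    for ch in row['id1']:
        if ch.isalpha():
            # string.letters exists only in Python 2; ascii_lowercase matches the example above
            new_value += random.choice(string.ascii_lowercase)
        else:
            new_value += ch
    new_id1.append(new_value)

But iterating row by row doesn't seem very efficient. Is there a better way?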

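Answer 1:

Let's use apply together with str.replace, which accepts a regex pattern and a callable replacement, so that only the letters are replaced:

import string
import random

letters = list(string.ascii_lowercase)

def rand(match):
    # called once per regex match; the matched letter itself is ignored
    return random.choice(letters)

# regex=True is required for a callable replacement (the default is False since pandas 2.0)
df.apply(lambda x: x.str.replace('[a-z]', rand, regex=True))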

Output:

           id1           id2
0  gp51e-id-1v  jvj--=%glw21

For one specific column, use:

df['id1'].str.replace('[a-z]', rand, regex=True)

Added by @antonvbr: For future reference, if we want to randomize both upper and lower case letters while keeping each letter's case, we could do this:

letters = dict(u=list(string.ascii_uppercase), l=list(string.ascii_lowercase))

(df['id1'].str.replace('[a-z]', lambda x: random.choice(letters['l']), regex=True)
          .str.replace('[A-Z]', lambda x: random.choice(letters['u']), regex=True))
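The two chained replacements above could also be folded into a single pass. The following is only a minimal sketch using the standard library's re.sub with a callable. The names randomize_letters and swap are made up for illustration and are not part of pandas or of the answer above.

```python
import random
import re
import string


def randomize_letters(text, rng=random):
    """Replace every ASCII letter in text with a random letter of the same case.

    Hypothetical helper for illustration; rng can be the random module
    or a random.Random instance (e.g. seeded for repeatable results).
    """
    def swap(match):
        ch = match.group(0)
        # pick from the pool that matches the case of the original letter
        pool = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
        return rng.choice(pool)

    # digits, dashes and other symbols do not match and are left untouched
    return re.sub(r"[A-Za-z]", swap, text)


print(randomize_letters("ab51c-EE-1a"))
```

On a DataFrame, such a function could be applied to one column with df['id1'].map(randomize_letters). Note that a randomly chosen letter may happen to equal the original one.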


Answer 2:

How about this:

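import pandas as pd
from string import ascii_lowercase as al
import random

df = pd.DataFrame({'id1': ['ab51c-ee-1a'],
                   'id2': ['cga--=%abd21']})

al = list(al)
# swap each lowercase letter for a random one and keep every other character
# (applymap is deprecated since pandas 2.1 in favour of DataFrame.map)
df = df.applymap(lambda x: ''.join([random.choice(al) if i in al else i for i in x]))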

