Slice/split string Series at various positions

借酒劲吻你 2021-01-18 21:48

I'm looking to split a string Series at different points depending on the length of certain substrings:

In [47]: df = pd.DataFrame(['group9class1', 'grou


        
3 answers
  •  挽巷 (original poster)
    2021-01-18 22:40

    Use a regular expression to split the string:

     import re
    
     regex = re.compile("(class)")
     s = "group1class23"  # named `s` so the built-in `str` is not shadowed
     # Insert a space in front of "class", then split on that space
     # to separate the group part from the class part.
     split_string = re.sub(regex, " \\1", s).split(" ")
    

    This returns the list:

     ['group1', 'class23']
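
As a side note, the insert-a-space trick above breaks if the input itself contains spaces. A minimal sketch of an alternative, assuming Python 3.7 or later (where `re.split` accepts patterns that match an empty string), is to split on a zero-width lookahead instead. The variable names here are illustrative only:

```python
import re

s = "group1class23"

# (?=class) matches the empty position just before "class",
# so nothing is consumed and no sentinel character is needed.
parts = re.split(r"(?=class)", s)
print(parts)
```

Note that if the string starts with "class", the first element of the result is an empty string.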
    

    So to append two new columns to your DataFrame you can do:

    new_cols = [re.sub(regex, " \\1", x).split(" ") for x in df.group_class]
    df['group'], df['class'] = zip(*new_cols)
    

    Which results in:

          group_class    group    class
    0    group9class1   group9   class1
    1   group10class2  group10   class2
    2  group11class20  group11  class20
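
The list comprehension above can also be written with pandas' vectorized string methods. The following is only a sketch, assuming the column is named `group_class` and every value contains the literal text "class"; the sample values are the ones from the result table, and the name `parts` is made up for illustration:

```python
import pandas as pd

df = pd.DataFrame(
    {"group_class": ["group9class1", "group10class2", "group11class20"]}
)

# First capture group: everything before "class" (lazy match).
# Second capture group: "class" and whatever follows it.
parts = df["group_class"].str.extract(r"^(.*?)(class.*)$")
parts.columns = ["group", "class"]

# Attach the two extracted columns to the original frame.
df = df.join(parts)
print(df)
```

Rows that do not match the pattern come back as NaN in both new columns, so it may be worth checking for missing values afterwards.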
    
