Extract string from column following a specific pattern

别跟我提以往 2020-12-22 04:58

Please forgive my pandas newbie question, but I have a column of U.S. towns and states, such as the truncated version shown below (for some strange reason, the name of the co

1 answer
  • 2020-12-22 05:18

    Without much context or access to your data, I'd suggest something along these lines. First, modify the code that reads your data:

    df = pd.read_csv(..., header=None, names=['RegionName'])
    # header=None tells pandas to read the first row as data, not as column names
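To illustrate what header=None changes, here is a minimal sketch. The sample text and the variable names raw, df and df_default are made up for illustration and stand in for the real file, which is not shown in the question.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the real file (assumed to have no header row).
raw = (
    "Alabama[edit]\n"
    "Auburn (Auburn University)\n"
    "Alaska[edit]\n"
    "Fairbanks (University of Alaska Fairbanks)\n"
)

# With header=None the first line is kept as data and the column is named explicitly.
df = pd.read_csv(io.StringIO(raw), header=None, names=["RegionName"])

# With the default settings the first line is consumed as the column name instead.
df_default = pd.read_csv(io.StringIO(raw))
```

With the default call, the first state line would end up as the column label and be lost from the data, which is why the explicit header=None and names arguments are suggested.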
    

    Now extract the state name using str.extract. The pattern only matches rows where the name is followed by the substring "[edit]"; every other row yields NaN. You can then forward-fill those NaN values using ffill, so that each town row inherits the most recent state above it.

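    # .*? is lazy, so whitespace before "[edit]" is not captured;
    # expand=False returns a Series rather than a one-column DataFrame
    df['State'] = df['RegionName'].str.extract(
        r'(?P<State>.*?)(?=\s*\[edit\])', expand=False
    ).ffill()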
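A small end-to-end sketch of the extract-then-ffill idea follows. The rows are invented sample data, and the lazy quantifier and expand=False are assumptions of this sketch rather than requirements.

```python
import pandas as pd

# Hypothetical rows: state lines end in "[edit]", town lines do not.
df = pd.DataFrame({
    "RegionName": [
        "Alabama[edit]",
        "Auburn (Auburn University)",
        "Florence (University of North Alabama)",
        "Alaska[edit]",
        "Fairbanks (University of Alaska Fairbanks)",
    ]
})

# Capture the text before "[edit]"; rows without that marker become NaN.
state = df["RegionName"].str.extract(
    r"(?P<State>.*?)(?=\s*\[edit\])", expand=False
)

# Forward-fill so every town row carries the state that precedes it.
df["State"] = state.ffill()
```

Keeping the unfilled result in state is optional, but it makes it easy to tell state rows (non-NaN) from town rows (NaN) later on.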
    