Replace values in multiple untitled columns with 0, 1, 2 depending on column

被撕碎了的回忆 2021-01-19 04:45

EDITED AS PER COMMENTS

Background: Here is what the current dataframe looks like. The row labels are informational text from the original Excel file. But I

1 Answer
  •  逝去的感伤
    2021-01-19 05:38

    Here is one way to do it:

    1. Define a function to replace the "x" markers:
    import re
    
    def replaceX(col):
        # True for cells that are NOT "x"/"X"; where() keeps those unchanged
        cond = ~((col == "x") | (col == "X"))
        # Titled column (name does not look like "Unnamed: N"): x becomes 0
        if not re.match(r'Unnamed: \d+', str(col.name)):
            return col.where(cond, 0)
        else:
            # Untitled column: the value in the first row decides the replacement
            if col.iloc[0] == "Commented":
                return col.where(cond, 1)
            elif col.iloc[0] == "No comment":
                return col.where(cond, 2)
        # Untitled column with any other first-row value: leave it as-is
        return col
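
The column-name test in the function above can be tried on its own. This is only a small sketch: the helper name `is_untitled` and the sample names are made up for illustration, and it assumes pandas labelled the header-less columns "Unnamed: N", as in the output shown in this answer.

```python
import re

# Pattern for the placeholder names that show up for header-less columns
UNNAMED = re.compile(r"Unnamed: \d+")

def is_untitled(name):
    # str() guards against non-string labels such as integers
    return UNNAMED.match(str(name)) is not None

# Hypothetical column labels, including one non-string label
names = ["title", "Unnamed: 2", "Unnamed: 3", 5]
flags = [is_untitled(n) for n in names]
print(flags)
```

Wrapping the label in `str()` is a defensive choice: `re.match` raises a `TypeError` when given a non-string label.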
    

    Or, if the first row of your titled columns never contains "Commented" or "No comment", you can use a solution without regex:

    def replaceX(col):
        cond = ~((col == "x") | (col == "X"))
        # Check what is the value of the first row
        if col.iloc[0] == "Commented":
            return col.where(cond, 1)
        elif col.iloc[0] == "No comment":
            return col.where(cond, 2)
        return col.where(cond, 0)
    
    2. Apply this function to the DataFrame:
    # apply() returns a new DataFrame, so assign the result back.
    # axis defaults to 0, so the function receives one column at a time.
    df = df.apply(replaceX)
    

    Output:

      title Unnamed: 2  Unnamed: 3
    0        Commented  No comment
    1                             
    2     0                      2
    3                1            
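
A self-contained sketch that should reproduce a result of the same shape as the output above. The sample data is an assumption (the original frame is not shown in full, and its blank cells may be NaN rather than empty strings), and `replace_x` is just the regex-free variant restated under a different name.

```python
import pandas as pd

def replace_x(col):
    # True where the cell is NOT an "x"/"X" marker; those cells are kept
    keep = ~col.isin(["x", "X"])
    first = col.iloc[0]
    if first == "Commented":
        return col.where(keep, 1)
    if first == "No comment":
        return col.where(keep, 2)
    return col.where(keep, 0)

# Assumed sample data; object dtype so text and numbers can share a column
df = pd.DataFrame(
    {
        "title": ["", "", "x", ""],
        "Unnamed: 2": ["Commented", "", "", "x"],
        "Unnamed: 3": ["No comment", "", "X", ""],
    },
    dtype=object,
)

# One call per column; each returned Series becomes a column of the result
result = df.apply(replace_x)
print(result)
```

The marker rows are untouched because only cells equal to "x" or "X" fail the `keep` condition.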
    

    Documentation:

    • Apply: applies a function to every column or row, depending on the axis.
    • Where: keeps the values of a Series where a condition is met and replaces the others with the specified value.
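
The direction of `where` is easy to get backwards, so here is a minimal sketch on a made-up Series: values are kept where the condition is True and replaced where it is False. `mask` is the mirror image, replacing where the condition is True.

```python
import pandas as pd

# Made-up values; object dtype so the replacement 0 can sit next to strings
s = pd.Series(["a", "x", "b", "X"], dtype=object)

# True for cells to keep, False for the "x"/"X" cells to replace
keep = ~s.isin(["x", "X"])

replaced = s.where(keep, 0)   # keep where True, put 0 where False
masked = s.mask(~keep, 0)     # same result, written from the other side

print(replaced.tolist())
```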
