Drop columns whose name contains a specific string from pandas DataFrame

臣服心动 2020-11-29 16:19

I have a pandas dataframe with the following column names:

Result1, Test1, Result2, Test2, Result3, Test3, etc...

I want to drop all the columns whose name contains the word "Test".

11 Answers
  • 2020-11-29 16:57

    Don't drop. Select the opposite of what you want: keep only the columns whose name does not contain the unwanted word.

    df = df.filter(regex='^((?!badword).)*$')
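
    To make the negative lookahead concrete, here is a minimal sketch on a toy frame. The column names follow the question; the values and the variable name kept are assumptions made for illustration.

```python
import pandas as pd

# Toy frame: column names follow the question, the values are made up.
df = pd.DataFrame(
    [[1, 2, 3, 4]],
    columns=["Result1", "Test1", "Result2", "Test2"],
)

# (?!Test) must succeed at every character position, so a name
# matches only if "Test" never appears anywhere in it.
kept = df.filter(regex=r"^((?!Test).)*$")

print(list(kept.columns))
```

    Note that filter returns a new DataFrame, so df itself is unchanged unless you reassign it.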
    
  • 2020-11-29 17:01

    Question states 'I want to drop all the columns whose name contains the word "Test".'

    test_columns = [col for col in df if 'Test' in col]
    df.drop(columns=test_columns, inplace=True)
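
    If you would rather not mutate the frame in place, the same idea can be wrapped in a small function. This is only a sketch: drop_columns_containing is a hypothetical helper name, not a pandas API, and converting labels with str() is an assumption to tolerate non-string column labels.

```python
import pandas as pd


def drop_columns_containing(df: pd.DataFrame, word: str) -> pd.DataFrame:
    """Return a new frame without the columns whose name contains `word`.

    Hypothetical helper, not part of pandas. Labels are passed through
    str() so that non-string labels do not raise a TypeError.
    """
    to_drop = [col for col in df.columns if word in str(col)]
    return df.drop(columns=to_drop)


# Toy frame: column names follow the question, the values are made up.
df = pd.DataFrame(
    [[1, 2, 3, 4]],
    columns=["Result1", "Test1", "Result2", "Test2"],
)

result = drop_columns_containing(df, "Test")
print(list(result.columns))
print(list(df.columns))  # the original frame still has all four columns
```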
    
  • 2020-11-29 17:03

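    The shortest way to select the columns whose name contains 'Test' is filter(like=...). Note that this keeps the matching columns, so pass them to drop if you want to remove them:

    resdf = df.drop(columns=df.filter(like='Test', axis=1).columns)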
    
  • 2020-11-29 17:06

    Here is one way to do this:

    df = df[df.columns.drop(list(df.filter(regex='Test')))]
    
  • 2020-11-29 17:07

    Cheaper, Faster, and Idiomatic: str.contains

    In recent versions of pandas, you can use string methods on the index and columns. Here, str.startswith seems like a good fit.

    To remove all columns starting with a given substring:

    df.columns.str.startswith('Test')
    # array([ True, False, False, False])
    
    df.loc[:,~df.columns.str.startswith('Test')]
    
      toto test2 riri
    0    x     x    x
    1    x     x    x
    

    For case-insensitive matching, you can use regex-based matching with str.contains and a start-of-line anchor (^):

    df.columns.str.contains('^test', case=False)
    # array([ True, False,  True, False])
    
    df.loc[:,~df.columns.str.contains('^test', case=False)] 
    
      toto riri
    0    x    x
    1    x    x
    

    If the column labels may have mixed types, specify na=False as well, so that non-string labels count as non-matches instead of producing NaN in the mask.
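
    A minimal sketch of the mixed-type case, assuming a frame where one column label is an integer. The frame and its values are made up for illustration.

```python
import pandas as pd

# Hypothetical frame: one integer label among the string labels.
df = pd.DataFrame([[1, 2, 3, 4]], columns=["Test1", "toto", 2, "riri"])

# The integer label cannot be matched as a string; na=False makes it
# count as "no match", so the mask stays purely boolean.
mask = df.columns.str.contains("^test", case=False, na=False)

result = df.loc[:, ~mask]
print(list(result.columns))
```

    Without na=False the mask would contain a missing value for the integer label, and inverting it with ~ would not be reliable.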

  • 2020-11-29 17:10

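    A solution for dropping columns from a list of names that may themselves be regex patterns. I prefer this approach because I frequently edit the drop list. It builds a negative-lookahead regex from the drop list and passes it to filter, which keeps every column that fully matches none of the patterns.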

    import re

    drop_column_names = ['A','B.+','C.*']
    # The regex matches the names to keep, i.e. those that fully match no drop pattern.
    drop_columns_regex = '^(?!(?:'+'|'.join(drop_column_names)+')$)'
    print('Dropping columns:',', '.join([c for c in df.columns if not re.search(drop_columns_regex,c)]))
    df = df.filter(regex=drop_columns_regex,axis=1)
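
    To check which names the drop list actually catches, here is a small sketch on a made-up frame. The column names, the values, and the variable names keep_regex, dropped and kept are assumptions for illustration.

```python
import re

import pandas as pd

# Made-up frame chosen to exercise each pattern in the drop list.
df = pd.DataFrame([[1, 2, 3, 4, 5]], columns=["A", "AB", "B", "Bx", "C"])

drop_column_names = ["A", "B.+", "C.*"]
# Matches only names that are NOT a full match of any drop pattern.
keep_regex = "^(?!(?:" + "|".join(drop_column_names) + ")$)"

# Names the regex rejects are the ones that will be dropped.
dropped = [c for c in df.columns if not re.search(keep_regex, c)]
kept = df.filter(regex=keep_regex, axis=1)

print("Dropping columns:", ", ".join(dropped))
print(list(kept.columns))
```

    Because each pattern is anchored with $, 'A' drops only the column named exactly A (not AB), and 'B.+' requires at least one character after B, so a column named just B survives.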
    