I have a pandas dataframe with the following column names:
Result1, Test1, Result2, Test2, Result3, Test3, etc...
I want to drop all the columns whose name contains the word "Test".
Don't drop. Catch the opposite of what you want.
df = df.filter(regex='^((?!badword).)*$')
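As a minimal sketch of the negative-lookahead idea applied to the question's column names (the sample frame is made up for illustration, and 'Test' stands in for badword):

```python
import pandas as pd

# Hypothetical sample data using the column names from the question.
df = pd.DataFrame(
    [[1, "a", 2, "b"], [3, "c", 4, "d"]],
    columns=["Result1", "Test1", "Result2", "Test2"],
)

# Keep only the columns whose names do not contain "Test" anywhere.
kept = df.filter(regex=r"^((?!Test).)*$")
print(list(kept.columns))
```

filter returns a new DataFrame, so df itself is unchanged unless the result is assigned back.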
Question states 'I want to drop all the columns whose name contains the word "Test".'
test_columns = [col for col in df if 'Test' in col]
df.drop(columns=test_columns, inplace=True)
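The shortest way to select the columns whose names contain 'Test' is below. Note that filter keeps the matching columns rather than dropping them:
resdf = df.filter(like='Test', axis=1)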
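Since filter(like=...) returns the matching columns, one way to turn it into a drop is to pass those column labels to drop. A small sketch, assuming a made-up frame with the question's column names:

```python
import pandas as pd

# Hypothetical sample data using the column names from the question.
df = pd.DataFrame([[1, "a", 2, "b"]], columns=["Result1", "Test1", "Result2", "Test2"])

# filter(like=...) selects the columns whose names contain the substring.
test_only = df.filter(like="Test", axis=1)

# Dropping those labels leaves everything else.
without_test = df.drop(columns=test_only.columns)
print(list(without_test.columns))
```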
Here is one way to do this:
df = df[df.columns.drop(list(df.filter(regex='Test')))]
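The one-liner above packs three steps together. Spelled out on a made-up frame (the intermediate variable names are only for illustration), it looks roughly like this:

```python
import pandas as pd

# Hypothetical sample data using the column names from the question.
df = pd.DataFrame([[1, "a", 2, "b"]], columns=["Result1", "Test1", "Result2", "Test2"])

# Iterating a DataFrame yields its column names, so list() gives the matches.
matching = list(df.filter(regex="Test"))

# Index.drop returns a new Index without the given labels.
remaining = df.columns.drop(matching)

# Selecting with that Index keeps only the remaining columns.
result = df[remaining]
print(list(result.columns))
```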
str.contains
In recent versions of pandas, you can use string methods on the index and columns. Here, str.startswith seems like a good fit.
To remove all columns starting with a given substring:
df.columns.str.startswith('Test')
# array([ True, False, False, False])
df.loc[:,~df.columns.str.startswith('Test')]
  toto test2 riri
0    x     x    x
1    x     x    x
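To reproduce the output above, here is a self-contained sketch. The column names Test1, toto, test2 and riri are inferred from the printed masks and results, so treat the sample frame as an assumption:

```python
import pandas as pd

# Assumed sample frame: four columns, two rows of "x".
df = pd.DataFrame([["x"] * 4, ["x"] * 4], columns=["Test1", "toto", "test2", "riri"])

# Boolean mask, True where the column name starts with "Test" (case-sensitive).
mask = df.columns.str.startswith("Test")

# Invert the mask to keep every column that does not start with "Test".
result = df.loc[:, ~mask]
print(result)
```

The match is case-sensitive, which is why test2 survives here.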
For case-insensitive matching, you can use regex-based matching with str.contains and a start-of-line anchor (^):
df.columns.str.contains('^test', case=False)
# array([ True, False, True, False])
df.loc[:,~df.columns.str.contains('^test', case=False)]
  toto riri
0    x    x
1    x    x
If the column names may have mixed types, specify na=False as well, so that non-string names produce False rather than NaN in the mask.
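A short sketch of the mixed-type case, assuming a made-up frame in which one column label is the integer 2:

```python
import pandas as pd

# Hypothetical frame with a non-string column label.
df = pd.DataFrame([["x", "x", "x"]], columns=["Test1", 2, "toto"])

# na=False makes the non-string label count as "no match" instead of NaN,
# so the mask stays boolean and can be inverted with ~.
mask = df.columns.str.contains("^test", case=False, na=False)

result = df.loc[:, ~mask]
print(list(result.columns))
```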
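A solution for dropping columns from a list of names that may contain regex patterns. I prefer this approach because I frequently edit the drop list. It builds a negative-lookahead regex from the drop list and passes it to filter, so only the columns that match none of the patterns are kept.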
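import re

drop_column_names = ['A', 'B.+', 'C.*']
# Negative lookahead: matches only names that are NOT a full match for any drop pattern.
drop_columns_regex = '^(?!(?:' + '|'.join(drop_column_names) + ')$)'
# The regex matches the columns to keep, so the dropped ones are those that do not match.
print('Dropping columns:', ', '.join(c for c in df.columns if not re.search(drop_columns_regex, c)))
df = df.filter(regex=drop_columns_regex, axis=1)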
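For reuse, the same idea can be wrapped in a small function. This is only a sketch: drop_matching_columns is a hypothetical helper name, not a pandas API, and the sample columns are made up to show how each pattern behaves.

```python
import re

import pandas as pd


def drop_matching_columns(df, patterns):
    """Hypothetical helper: drop columns whose full name matches any pattern."""
    # The lookahead rejects names that fully match one of the patterns.
    keep_regex = "^(?!(?:" + "|".join(patterns) + ")$)"
    # Names that fail the keep-regex are the ones being dropped.
    dropped = [c for c in df.columns if not re.search(keep_regex, c)]
    return df.filter(regex=keep_regex, axis=1), dropped


df = pd.DataFrame([[0] * 7], columns=["A", "AB", "B", "Bx", "C", "Cat", "D"])
kept, dropped = drop_matching_columns(df, ["A", "B.+", "C.*"])
print("Dropped:", dropped)
print("Kept:", list(kept.columns))
```

Because each pattern must match the whole name, 'A' drops only the column named A (not AB), and 'B.+' requires at least one character after B, so a column named B is kept.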