I have a huge DataFrame where some columns have the same name. When I try to pick a column that exists twice (e.g. del df['col name']
or df2=
The following function removes columns with duplicate names and keeps only the first one. It is not exactly what you asked for, but you can use snippets of it to solve your problem. The idea is to get the positional indices of the columns, so that you can then address specific columns directly by position. The positions are unique, while the column names aren't.
def remove_multiples(df, varname):
    """
    Makes a copy of the first column of all columns with the same name,
    deletes all columns with that name and inserts the first column again.
    """
    dfout = df.copy()  # deep copy, so the caller's frame is left untouched
    if varname in dfout.columns:
        # positions of every column whose label equals varname
        positions = [i for i, x in enumerate(dfout.columns == varname) if x]
        tmp = dfout.iloc[:, min(positions)]
        # drop every column with that label, then re-add the first one (at the end)
        dfout = dfout.drop(columns=varname)
        dfout[varname] = tmp
    return dfout
where
[i for i, x in enumerate(dfout.columns == varname) if x]
is the part you need: it returns the positions of all columns named varname.
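As a rough sketch of how that list of positions could be used on its own, the example below picks out and drops just one of two identically named columns with iloc. The frame, the column labels and the helper name positions_of are made up for illustration and are not part of the original question.

```python
import pandas as pd

# Hypothetical example frame in which the label "a" is used twice.
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=["a", "b", "a"])

def positions_of(df, varname):
    """Return the integer positions of every column labelled varname."""
    return [i for i, x in enumerate(df.columns == varname) if x]

pos = positions_of(df, "a")

# Select the second "a" column by position; this returns a single Series.
second_a = df.iloc[:, pos[1]]

# Drop only the second "a" by keeping every other position.
keep = [i for i in range(df.shape[1]) if i != pos[1]]
df_single = df.iloc[:, keep]
```

If the goal is only to keep the first occurrence of every duplicated name, df.loc[:, ~df.columns.duplicated()] should do the same in one step, and it leaves the kept column in its original position.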