I am creating a dataframe from a CSV file. I have just started with Pandas and have gone through the docs, multiple SO posts, and other links, but didn't get it. The CSV file
The relevant parameter is mangle_dupe_cols. From the docs:
mangle_dupe_cols : boolean, default True
    Duplicate columns will be specified as 'X.0'...'X.N', rather than 'X'...'X'
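By default, your duplicate 'a' columns are renamed so that every column name is unique. In practice pandas keeps the first one as 'a' and names the rest 'a.1'...'a.N', rather than the 'a.0'...'a.N' that the docstring quoted above suggests.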
If you used mangle_dupe_cols=False, importing this CSV would produce an error.
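If the goal is to keep the repeated names exactly as they appear in the file, one possible workaround is to skip the header while parsing and assign the names afterwards. This is only a sketch: the `txt` sample mirrors the demonstration data, and `raw_names` is a hypothetical helper name, not something from the pandas API.

```python
from io import StringIO
import pandas as pd

txt = """a, a, a, b, c, d
1, 2, 3, 4, 5, 6
7, 8, 9, 10, 11, 12"""

# Parse only the data rows; pandas assigns integer column labels 0..5.
raw = pd.read_csv(StringIO(txt), skipinitialspace=True, header=None, skiprows=1)

# Take the header line as-is (hypothetical helper variable) and assign it,
# duplicates included. Assigning to .columns does not rename duplicates.
raw_names = [name.strip() for name in txt.splitlines()[0].split(',')]
raw.columns = raw_names

# Selecting a duplicated label returns a DataFrame with every matching column.
all_a = raw['a']
```

Note that duplicate labels make many operations ambiguous, so the default renaming is usually the safer choice.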
You can get all of your 'a' columns with df.filter(like='a').
Demonstration:
from io import StringIO
import pandas as pd
txt = """a, a, a, b, c, d
1, 2, 3, 4, 5, 6
7, 8, 9, 10, 11, 12"""
df = pd.read_csv(StringIO(txt), skipinitialspace=True)
df
df.filter(like='a')
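One caveat worth sketching: like='a' is a substring match, so it also picks up any other column whose name happens to contain an "a". Assuming the renamed columns follow the 'a', 'a.1', 'a.2' pattern, an anchored regex is a tighter alternative. The extra `data` column below is hypothetical and exists only to show the difference.

```python
from io import StringIO
import pandas as pd

txt = """a, a, a, b, c, d
1, 2, 3, 4, 5, 6
7, 8, 9, 10, 11, 12"""

df = pd.read_csv(StringIO(txt), skipinitialspace=True)
columns = list(df.columns)  # names after the default duplicate handling

# Hypothetical extra column whose name merely contains the letter "a".
wider = df.assign(data=0)

# Substring match: keeps every column with "a" anywhere in its name.
by_substring = wider.filter(like='a')

# Anchored regex: keeps only 'a' and its numbered duplicates ('a.1', 'a.2', ...).
by_regex = wider.filter(regex=r'^a(\.\d+)?$')
```

Here `by_substring` includes the `data` column, while `by_regex` contains only the three columns that came from the duplicated 'a' header.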