I could not find such an option in the documentation. A measuring device spits out everything in Excel:
<>
I have the same problem. My first line is:
# id ra dec ...
Where #
is the commenting-character in Python. import_csv
thinks that #
is a column header, but it's not.
The workaround I used was to define the headers manually:
headerlist = ['id', 'ra', 'dec', ...]
df = pd.read_csv('data.txt', index_col=False, header=0, names=headerlist)
Note that index_col
is optional in regards to this problem.
If there is any option to ignore a certain character in header line, I haven't found it. Hope this solution can be improved upon.
Another option would be:
f = open(fname, 'r')
line1 = f.readline()
data1 = pd.read_csv(f, sep='\s+', names=line1.replace(' #', '').split(), dtype=np.float)
You might have a different separator though.
I have the same problem. My first line is
# id x y ...
So pandas header keyword doesn't work. I did the following by reading it twice:
cos_phot_header = pd.read_csv(table, delim_whitespace=True, header=None, engine='python', nrows=1)
cos_plot_text_header = cos_phot_header.drop(0, axis=1).values.tolist()
cos_phot_data = pd.read_csv(table, skip_blank_lines=True, comment='#',
delim_whitespace=True, header=None, engine='python', names=cos_plot_text_header[0])
I don't understand why there is no such option in pandas to do this, it is a very common problem that everyone encounters. You can also read the table with no lines (nrows=0) and use .columns, but honestly I think it is an equally ugly solution to the problem.
Pandas read_csv()
supports regex
. You can avoid matching the white space if it is preceded by something (in your case #). Just as an example, avoiding "!":
sep='(?<!\\!)\s+'
if you want you could rename the column to remove the initial character and white space.
cheers