Given this data frame from some other question:
Constraint Name TotalSP Onpeak Offpeak
Constraint_ID
77127 aaaaaaaaaaaaaaaa
read_clipboard by default uses whitespace to separate the columns. The problem you see is caused by the whitespace in the first column. If you specify two or more spaces as the separator, pandas will work out the index column itself from the table layout:
df = pd.read_clipboard(sep=r'\s{2,}')  # raw string, so \s reaches the regex engine untouched
df
Out:
                  Constraint Name  TotalSP   Onpeak  Offpeak
Constraint_ID
77127          aaaaaaaaaaaaaaaaaa -2174.50 -2027.21  -147.29
98333          bbbbbbbbbbbbbbbbbb -1180.62 -1180.62     0.00
1049           cccccccccccccccccc -1036.53  -886.77  -149.76
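If you want to try this without touching the clipboard, here is a minimal sketch that feeds the same kind of text to read_csv through io.StringIO (read_clipboard passes its keyword arguments on to read_csv). The sample text is my own reconstruction of the table, spaced so that every gap between columns is at least two spaces wide; where a gap is only one space, this separator will not split there.

```python
import io

import pandas as pd

# Reconstructed sample (an assumption, not the asker's exact clipboard text).
# The index name sits on its own line and "Constraint Name" contains a
# single space, which is why plain whitespace splitting goes wrong.
text = """\
                  Constraint Name   TotalSP    Onpeak  Offpeak
Constraint_ID
77127          aaaaaaaaaaaaaaaaaa  -2174.50  -2027.21  -147.29
98333          bbbbbbbbbbbbbbbbbb  -1180.62  -1180.62     0.00
1049           cccccccccccccccccc  -1036.53   -886.77  -149.76
"""

# A regex separator needs the Python parsing engine; naming it explicitly
# avoids the fallback warning from read_csv.
df = pd.read_csv(io.StringIO(text), sep=r"\s{2,}", engine="python")

print(df.index.name)     # Constraint_ID
print(list(df.columns))  # ['Constraint Name', 'TotalSP', 'Onpeak', 'Offpeak']
```

The header row has four names, the second row has one, and each data row has five fields, which is what lets pandas treat the second row as the index name.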
The index_col argument can also be used to tell pandas the first column is the index, in case the structure cannot be inferred from the separator alone:
df = pd.read_clipboard(index_col=0, sep=r'\s{2,}')
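As a rough illustration of when index_col matters, the sketch below uses a variant of the table where the index name shares the header line, so the header and the data rows have the same number of fields and nothing can be inferred. The text is made up for the example, and it goes through read_csv with io.StringIO instead of the clipboard so it can be run anywhere.

```python
import io

import pandas as pd

# Hypothetical variant of the table: all five names are on one header line.
# Every gap between columns is at least two spaces wide.
text = """\
Constraint_ID  Constraint Name     TotalSP   Onpeak    Offpeak
77127          aaaaaaaaaaaaaaaaaa  -2174.50  -2027.21  -147.29
98333          bbbbbbbbbbbbbbbbbb  -1180.62  -1180.62  0.00
1049           cccccccccccccccccc  -1036.53  -886.77   -149.76
"""

# index_col=0 says explicitly that the first column is the index.
df = pd.read_csv(
    io.StringIO(text), sep=r"\s{2,}", engine="python", index_col=0
)

print(df.index.name)     # Constraint_ID
print(list(df.columns))  # ['Constraint Name', 'TotalSP', 'Onpeak', 'Offpeak']
```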
This is not as cool as @ayhan's answer, but it works pretty well most of the time. Assuming you are using IPython or Jupyter, just copy and paste the data into a %%file cell.
Then do some quick edits. When the index name sits on its own line, as it does here (and as with multi-indexes), just move it up onto the header line, something like this (also shortening "Constraint_ID" to "ID" to save a little space in this case):
%%file foo.txt
ID     Constraint Name     TotalSP   Onpeak    Offpeak
77127  aaaaaaaaaaaaaaaaaa  -2174.5   -2027.21  -147.29
98333  bbbbbbbbbbbbbbbbbb  -1180.62  -1180.62  0
1049   cccccccccccccccccc  -1036.53  -886.77   -149.76
pd.read_fwf('foo.txt')
Out[338]:
      ID     Constraint Name  TotalSP   Onpeak  Offpeak
0  77127  aaaaaaaaaaaaaaaaaa -2174.50 -2027.21  -147.29
1  98333  bbbbbbbbbbbbbbbbbb -1180.62 -1180.62     0.00
2   1049  cccccccccccccccccc -1036.53  -886.77  -149.76
read_fwf generally works pretty well on tabular data like this, and usually deals correctly with spaces in column names. Of course, you can also use this basic method with read_csv.
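If you are not in a notebook, the same idea works on an in-memory string. This is only a sketch: the text is my own aligned copy of the table (read_fwf infers the column boundaries from character positions, so the columns have to line up), and the variable names are just for illustration.

```python
import io

import pandas as pd

# Aligned sample, with the index name moved up onto the header line.
# Each column starts at the same character position on every row.
text = """\
ID     Constraint Name     TotalSP   Onpeak    Offpeak
77127  aaaaaaaaaaaaaaaaaa  -2174.5   -2027.21  -147.29
98333  bbbbbbbbbbbbbbbbbb  -1180.62  -1180.62  0
1049   cccccccccccccccccc  -1036.53  -886.77   -149.76
"""

# Fixed-width parsing: column boundaries are inferred from the blank
# character positions shared by all rows.
from_fwf = pd.read_fwf(io.StringIO(text))

# The read_csv route: split on runs of two or more spaces instead.
from_csv = pd.read_csv(io.StringIO(text), sep=r"\s{2,}", engine="python")

print(list(from_fwf.columns))
print(list(from_csv.columns))
```

Either frame can then get its index back with set_index("ID") if you need it.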
The nice thing about this method is that for small sample data you can deal with just about any of the weird ways that users post data here. And there are a lot of weird ways. ;-)