I'm using pandas to manage a large array of 8-byte integers. These integers are included as space-delimited elements of a column in a comma-delimited CSV file, and the array si
There are two issues here, one fundamental and one you simply haven't come across yet. :^)
First, after you write to c, you're at the end of the (virtual) file. You need to seek back to the start. We'll use a smaller grid as an example:
>>> import numpy as np
>>> import pandas as pd
>>> from io import StringIO   # on Python 2, use: from StringIO import StringIO
>>> a = np.random.randint(0,256,(10,10)).astype('uint8')
>>> b = pd.DataFrame(a)
>>> c = StringIO()
>>> b.to_csv(c, delimiter=' ', header=False, index=False)
>>> next(c)
Traceback (most recent call last):
  File "<ipython-input-57-73b012f9653f>", line 1, in <module>
    next(c)
StopIteration
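And if you then try to read from that exhausted buffer, pandas sees an empty stream. A minimal illustration (in current pandas the exception is pandas.errors.EmptyDataError; older versions raise it from a different module path, and the exact wording may vary):

>>> pd.read_csv(c)
Traceback (most recent call last):
  ...
pandas.errors.EmptyDataError: No columns to parse from file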
That's where the "no columns" error comes from. If we seek first, though:
>>> c.seek(0)
>>> next(c)
'103,3,171,239,150,35,224,190,225,57\n'
But now you'll notice the second issue: commas? I thought we requested space delimiters? But to_csv only accepts sep, not delimiter. Seems to me it should either accept it or object that it doesn't; silently ignoring it feels like a bug. Anyway, if we use sep (or, on the read_csv side, delim_whitespace=True):
>>> a = np.random.randint(0,256,(10,10)).astype('uint8')
>>> b = pd.DataFrame(a)
>>> c = StringIO()
>>> b.to_csv(c, sep=' ', header=False, index=False)
>>> c.seek(0)
>>> d = pd.read_csv(c, sep=' ', header=None, dtype='uint8')
>>> d
     0    1    2    3    4    5    6    7    8    9
0  209   65  218  242  178  213  187   63  137  145
1  161  222   50   92  157   31   49   62  218   30
2  182  255  146  249  115   91  160   53  200  252
3  192  116   87   85  164   46  192  228  104  113
4   89  137  142  188  183  199  106  128  110    1
5  208  140  116   50   66  208  116   72  158  169
6   50  221   82  235   16   31  222    9   95  111
7   88   36  204   96  186  205  210  223   22  235
8  136  221   98  191   31  174   83  208  226  150
9   62   93  168  181   26  128  116   92   68  153
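For completeness, here's the whole round trip as a minimal sketch (assuming Python 3 and io.StringIO, which is why seek(0) echoes 0; the variable names are just for illustration):

>>> import numpy as np
>>> import pandas as pd
>>> from io import StringIO
>>> a = np.random.randint(0, 256, (10, 10)).astype('uint8')
>>> buf = StringIO()
>>> pd.DataFrame(a).to_csv(buf, sep=' ', header=False, index=False)
>>> buf.seek(0)                  # rewind before reading
0
>>> d = pd.read_csv(buf, sep=' ', header=None, dtype='uint8')
>>> np.array_equal(d.values, a)  # the data survives the round trip
True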