pandas unable to read from large StringIO object

暗喜 2021-02-14 07:42

I'm using pandas to manage a large array of 8-byte integers. These integers are included as space-delimited elements of a column in a comma-delimited CSV file, and the array si

1 answer
  •  闹比i (OP)
    2021-02-14 07:56

    There are two issues here, one fundamental and one you simply haven't come across yet. :^)

    First, after you write to c, you're at the end of the (virtual) file. You need to seek back to the start. We'll use a smaller grid as an example:

    >>> import numpy as np
    >>> import pandas as pd
    >>> from io import StringIO
    >>> a = np.random.randint(0,256,(10,10)).astype('uint8')
    >>> b = pd.DataFrame(a)
    >>> c = StringIO()
    >>> b.to_csv(c, delimiter=' ', header=False, index=False)
    >>> next(c)
    Traceback (most recent call last):
      ...
    StopIteration
    

    read_csv runs into the same immediate end of file, which is what generates the "no columns" error. If we seek first, though:

    >>> c.seek(0)
    >>> next(c)
    '103,3,171,239,150,35,224,190,225,57\n'
    

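    The position behaviour can also be checked without pandas. The following is a minimal sketch, assuming Python 3's io.StringIO; the buffer contents and variable names are made up for illustration:

```python
from io import StringIO

buf = StringIO()
buf.write("1 2 3\n4 5 6\n")

# After writing, the position sits at the end of the buffer,
# so a reader that starts here finds nothing left to read.
pos_after_write = buf.tell()
left_to_read = buf.read()

# Rewind to the start; the full contents are readable again.
buf.seek(0)
first_line = buf.readline()

print(pos_after_write, repr(left_to_read), repr(first_line))
```

    If you only need the text itself, buf.getvalue() returns the whole buffer regardless of the current position.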
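    But now you'll notice the second issue: commas? I thought we requested space delimiters? The catch is that to_csv only accepts sep, not delimiter. In the pandas version used here, the unknown delimiter keyword is silently ignored, which feels like a bug: it should either be accepted or be rejected with an error (as far as I know, newer pandas releases do reject it with a TypeError). Anyway, if we write with sep (and read back with sep=' ' or delim_whitespace=True):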

    >>> a = np.random.randint(0,256,(10,10)).astype('uint8')
    >>> b = pd.DataFrame(a)
    >>> c = StringIO()
    >>> b.to_csv(c, sep=' ', header=False, index=False)
    >>> c.seek(0)
    >>> d = pd.read_csv(c, sep=' ', header=None, dtype='uint8')
    >>> d
         0    1    2    3    4    5    6    7    8    9
    0  209   65  218  242  178  213  187   63  137  145
    1  161  222   50   92  157   31   49   62  218   30
    2  182  255  146  249  115   91  160   53  200  252
    3  192  116   87   85  164   46  192  228  104  113
    4   89  137  142  188  183  199  106  128  110    1
    5  208  140  116   50   66  208  116   72  158  169
    6   50  221   82  235   16   31  222    9   95  111
    7   88   36  204   96  186  205  210  223   22  235
    8  136  221   98  191   31  174   83  208  226  150
    9   62   93  168  181   26  128  116   92   68  153
    

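    For the 8-byte integers in the question, the same round trip can be written as a plain script. This is only a sketch under assumptions: the tiny 2x2 array is a hypothetical stand-in for the real data, and it assumes Python 3 with io.StringIO.

```python
from io import StringIO

import numpy as np
import pandas as pd

# Hypothetical stand-in for the large array: values that need 64 bits.
a = np.array([[2**40, -1], [7, 2**62]], dtype="int64")
frame = pd.DataFrame(a)

buf = StringIO()
# to_csv takes sep (not delimiter) to choose the field separator.
frame.to_csv(buf, sep=" ", header=False, index=False)

# Rewind, otherwise read_csv starts at the end and finds no columns.
buf.seek(0)
restored = pd.read_csv(buf, sep=" ", header=None, dtype="int64")

# The values survive the text round trip unchanged.
assert np.array_equal(restored.to_numpy(), a)
```

    Passing dtype='int64' explicitly keeps the parsed columns at 8 bytes per value instead of leaving the choice to type inference.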