Convert Pandas DataFrame to & from In-Memory Feather

夕颜 2021-01-19 07:23

Using the IO tools in pandas it is possible to convert a DataFrame to an in-memory feather buffer:

import pandas as pd
from io import BytesIO

1 Answer
  • 2021-01-19 08:03

    With pandas==0.25.2 this can be accomplished in the following way:

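    import pandas
    import io

    df = pandas.DataFrame(data={'a': [1, 2], 'b': [3.0, 4.0]})
    buf = io.BytesIO()
    df.to_feather(buf)                 # write Feather bytes into the buffer (needs pyarrow)
    buf.seek(0)                        # rewind before reading the buffer back
    output = pandas.read_feather(buf)  # rebuild the DataFrame from memory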
    

    Then a call to output.head(2) returns:

        a    b
     0  1  3.0
     1  2  4.0
    

    If your DataFrame has a MultiIndex (or another non-default index), you may see an error like:

    ValueError: feather does not support serializing for the index; you can .reset_index()to make the index into column(s)

    In that case, call .reset_index() before to_feather, and call .set_index([...]) after read_feather to restore the index.


    One last point: if you hand the BytesIO to anything else after writing the Feather bytes (an upload, for instance), seek back to 0 first, otherwise the consumer may start reading from the end of the data. For example:

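    import boto3  # assumption: s3_client is a boto3 S3 client

    s3_client = boto3.client('s3')
    buffer = io.BytesIO()
    df.reset_index(drop=False).to_feather(buffer)
    buffer.seek(0)  # rewind so the upload starts at the first byte
    s3_client.put_object(Body=buffer, Bucket='bucket', Key='file')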
    