Read tab-delimited fields with pandas, some lines with more than one tab

粉色の甜心 2021-01-22 03:20

I am trying to read a tab-separated .txt file using pandas. The file looks like this:

data file sample

14.38   14.21   0.8951  5.386   3.312   2.462   4.9         


        
2 answers
  •  挽巷 (original poster)
    2021-01-22 03:23

    If I use this code:

    import pandas as pd
    parsed_csv_txt = pd.read_csv("tabbed.txt",sep="\t")
    print(parsed_csv_txt)
    

    On this file:

    a   b   c   d   e
    14.69   2452    982 234 12
    14.11   5435    234     12
    16.63   1       12  66
    

    I get:

           a     b      c      d   e
    0  14.69  2452  982.0  234.0  12
    1  14.11  5435  234.0    NaN  12
    2  16.63     1    NaN   12.0  66
    

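    Is there a problem with this output? With sep="\t" every single tab is a delimiter, so two consecutive tabs enclose an empty field, which pandas reports as NaN.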
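
The empty fields between consecutive tabs can be made visible by splitting the raw lines on a single tab. This is a minimal sketch, assuming the sample rows contain a doubled tab wherever the listing shows a wider gap; the `lines` list is a hypothetical stand-in for the contents of tabbed.txt.

```python
# Assumed raw rows: a doubled tab ("\t\t") marks each gap in the sample.
lines = [
    "14.69\t2452\t982\t234\t12",
    "14.11\t5435\t234\t\t12",
    "16.63\t1\t\t12\t66",
]

for line in lines:
    # Splitting on one tab keeps the empty string between two adjacent tabs,
    # which is the field that read_csv(sep="\t") turns into NaN.
    print(line.split("\t"))
```

Every row still splits into five fields, so the columns stay aligned with the header and the gap shows up as a missing value.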

    If you would rather treat a run of consecutive tabs as one delimiter, so that the values shift left like this:

           a     b    c    d     e
    0  14.69  2452  982  234  12.0
    1  14.11  5435  234   12   NaN
    2  16.63     1   12   66   NaN
    

    Use this code:

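    import pandas as pd
    # sep=r"\s+" treats any run of whitespace as one delimiter. It behaves like
    # delim_whitespace=True, which newer pandas releases deprecate.
    parsed_csv_txt = pd.read_csv("tabbed.txt", sep=r"\s+")
    print(parsed_csv_txt)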
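
The two ways of reading the file can be compared side by side without a file on disk. This is a sketch under the assumption that tabbed.txt holds doubled tabs where the sample shows wider gaps; the `raw` string and the names `by_tab` and `by_whitespace` are made up for illustration, and `sep=r"\s+"` is used as the regular-expression form of whitespace splitting.

```python
import io

import pandas as pd

# Assumed file contents: doubled tabs where the sample shows gaps.
raw = (
    "a\tb\tc\td\te\n"
    "14.69\t2452\t982\t234\t12\n"
    "14.11\t5435\t234\t\t12\n"
    "16.63\t1\t\t12\t66\n"
)

# Each tab is a delimiter: the empty field between two tabs becomes NaN
# and the remaining values stay in their original columns.
by_tab = pd.read_csv(io.StringIO(raw), sep="\t")

# Any run of whitespace is one delimiter: values shift left and rows
# that end up with fewer fields than the header are padded with NaN.
by_whitespace = pd.read_csv(io.StringIO(raw), sep=r"\s+")

print(by_tab)
print(by_whitespace)
```

Which result is right depends on the data: if a doubled tab really means a missing value, splitting on single tabs preserves that information, while splitting on whitespace discards it.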
    

    Note

    For a longer discussion of variable amounts of whitespace between values, see: Can pandas handle variable-length whitespace as column delimiters
