How do I read CSV data into a record array in NumPy?

广开言路 2020-11-22 02:55

I wonder if there is a direct way to import the contents of a CSV file into a record array, much in the way that R's read.table(), read.delim(), a

11 answers
  •  误落风尘
    2020-11-22 03:58

    I tried both NumPy and pandas on the same file, and in my test pandas had clear advantages:

    • Faster (about 3 s elapsed versus about 24 s)
    • Less CPU time (2.94 s user versus 23.29 s)
    • About 1/3 of the peak RAM used by NumPy's genfromtxt

    This is my test run (pandas first, then NumPy):

    $ for f in test_pandas.py test_numpy_csv.py ; do  /usr/bin/time python $f; done
    2.94user 0.41system 0:03.05elapsed 109%CPU (0avgtext+0avgdata 502068maxresident)k
    0inputs+24outputs (0major+107147minor)pagefaults 0swaps
    
    23.29user 0.72system 0:23.72elapsed 101%CPU (0avgtext+0avgdata 1680888maxresident)k
    0inputs+0outputs (0major+416145minor)pagefaults 0swaps
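To make the comparison explicit, the ratios can be worked out from the two `/usr/bin/time` lines above. This is only arithmetic on the figures already shown; the variable names are mine, and it assumes the first line belongs to test_pandas.py and the second to test_numpy_csv.py, as the loop order suggests.

```python
# Figures copied from the /usr/bin/time output above.
pandas_elapsed_s = 3.05    # 0:03.05elapsed (test_pandas.py)
numpy_elapsed_s = 23.72    # 0:23.72elapsed (test_numpy_csv.py)
pandas_peak_kb = 502068    # maxresident, pandas run
numpy_peak_kb = 1680888    # maxresident, NumPy run

# Wall-clock speedup and relative peak memory of pandas versus NumPy.
speedup = numpy_elapsed_s / pandas_elapsed_s
memory_ratio = pandas_peak_kb / numpy_peak_kb

print(f"pandas is about {speedup:.1f}x faster")
print(f"pandas peak memory is about {memory_ratio:.0%} of NumPy's")
```

So on this one file and these library versions, pandas finished in roughly an eighth of the time and used a bit under a third of the peak memory.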
    

    test_numpy_csv.py

    from numpy import genfromtxt
    train = genfromtxt('/home/hvn/me/notebook/train.csv', delimiter=',')
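Note that the call above, with no `names` or `dtype`, returns a plain float array rather than a record array. As a minimal sketch of what the question asks for, `genfromtxt` can build a structured array from the header row. The sample data and column names here are made up for illustration, and the `encoding` argument assumes NumPy 1.14 or newer (newer than the version I benchmarked).

```python
import io

import numpy as np

# Hypothetical stand-in for a CSV file on disk; genfromtxt also accepts a path.
csv_text = "id,name,score\n1,alice,3.5\n2,bob,4.0\n"

# names=True takes field names from the header row;
# dtype=None asks NumPy to infer a type for each column.
records = np.genfromtxt(
    io.StringIO(csv_text),
    delimiter=",",
    names=True,
    dtype=None,
    encoding="utf-8",
)

# A structured array is indexed by field name; viewing it as a
# recarray additionally allows attribute access such as rec.score.
rec = records.view(np.recarray)
print(records.dtype.names)
print(rec.score)
```

Type inference with `dtype=None` adds work per column, so I would not expect it to be faster than the plain float call timed above.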
    

    test_pandas.py

    from pandas import read_csv
    df = read_csv('/home/hvn/me/notebook/train.csv')
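`read_csv` gives a DataFrame, not a record array. If a record array is what you need, as in the question, one possible follow-up step is `DataFrame.to_records`. The sketch below uses made-up sample data in place of my train.csv, and this conversion step was not part of the timings above.

```python
import io

import pandas as pd

# Hypothetical stand-in for a CSV file on disk; read_csv also accepts a path.
csv_text = "id,name,score\n1,alice,3.5\n2,bob,4.0\n"

df = pd.read_csv(io.StringIO(csv_text))

# index=False leaves the DataFrame index out of the resulting fields.
records = df.to_records(index=False)

print(type(records).__name__)
print(records.dtype.names)
print(records.score)
```

The conversion copies the data, so it adds some time and memory on top of the `read_csv` call itself.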
    

    Data file:

    $ du -h ~/me/notebook/train.csv
    59M    /home/hvn/me/notebook/train.csv
    

    NumPy and pandas versions used:

    $ pip freeze | egrep -i 'pandas|numpy'
    numpy==1.13.3
    pandas==0.20.2
    
