(Google Datalab) How to read multiple csv files in Google Cloud Storage into pandas with read_csv()?


Question


I found the solution for reading a "single" csv file in Datalab here: How can i load my csv from google dataLab to a pandas data frame?

But I wonder how I could read "multiple" csv files in Datalab.

What I tried is this:

variable_list = ['IBM', 'SPY']
for variable in variable_list:
  file_path = "gs://chois-trader-bucket/data/" + variable + ".csv"
  %%storage read --object file_path --variable variable

But this failed because a Python variable cannot be passed into the magic command this way.

How can I deal with multiple csv files effectively?


Answer 1:


You can use variables enclosed in braces.

e.g. %storage read --object {file_path} --variable variable
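For example, a minimal sketch of the loop from the question using brace substitution. The dfs dictionary, the csv_text variable name, and the StringIO wrapping are illustrative assumptions, not part of the original answer; it assumes %storage read returns the object's contents as a string.

import pandas as pd
from io import StringIO

variable_list = ['IBM', 'SPY']
dfs = {}
for ticker in variable_list:
  file_path = "gs://chois-trader-bucket/data/" + ticker + ".csv"
  # Braces make the Datalab magic substitute the Python variable's value.
  %storage read --object {file_path} --variable csv_text
  # Assumes the object comes back as text; wrap it so pandas can parse it.
  dfs[ticker] = pd.read_csv(StringIO(csv_text))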



Answer 2:


Alternatively, you can union all the csv files in a folder with a bash one-liner (assuming the csv files have no headers):

cat *.csv > unioned_file_name.csv

Then import that one file into pandas.
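A minimal sketch of that import step; the header=None argument is an assumption, since the concatenated file has no header row.

import pandas as pd

# Read the concatenated file; there is no header row, so tell pandas not to expect one.
df = pd.read_csv("unioned_file_name.csv", header=None)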



Source: https://stackoverflow.com/questions/45532796/google-datalab-how-to-read-multiple-csv-files-existing-in-google-cloud-storage
