Python Pandas add Filename Column CSV

戏子无情 提交于 2020-05-09 22:53:10

问题


My python code works correctly in the below example. My code combines a directory of CSV files and matches the headers. However, I want to take it a step further - how do I add a column that appends the filename of the CSV that was used?

import pandas as pd
import glob

globbed_files = glob.glob("*.csv") #creates a list of all csv files

data = [] # pd.concat takes a list of dataframes as an agrument
for csv in globbed_files:
    frame = pd.read_csv(csv)
    data.append(frame)

bigframe = pd.concat(data, ignore_index=True) #dont want pandas to try an align row indexes
bigframe.to_csv("Pandas_output2.csv")

回答1:


This should work:

import os

for csv in globbed_files:
    frame = pd.read_csv(csv)
    frame['filename'] = os.path.basename(csv)
    data.append(frame)

frame['filename'] creates a new column named filename and os.path.basename() turns a path like /a/d/c.txt into the filename c.txt.




回答2:


Mike's answer above works perfectly. In case any googlers run into the following error:

>>> TypeError: cannot concatenate object of type "<type 'str'>"; 
    only pd.Series, pd.DataFrame, and pd.Panel (deprecated) objs are valid

It's possibly because the separator is not correct. I was using a custom csv file so the separator was ^. Becuase of that I needed to include the separator in the pd.read_csv call.

import os

for csv in globbed_files:
    frame = pd.read_csv(csv, sep='^')
    frame['filename'] = os.path.basename(csv)
    data.append(frame)


来源:https://stackoverflow.com/questions/41857659/python-pandas-add-filename-column-csv

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!