I have a CSV file with, say, 50 rows of data, and I would like to split it into a separate CSV file for each row, each containing the first row (header) and the relevant row.
I tried this and it works fine for your purpose. I could not reproduce the csvfile_out error, and your with statement works correctly in my Python 2.7.12 console.
import csv

counter = 1
with open('mock_data.csv', 'r') as csvfile:
    reader = csv.reader(csvfile)
    row1 = next(reader)  # save the first line (header) of the .csv file
    for row in reader:
        if row:  # if the row is not empty, write a file with this row
            filename = "file_%s" % str(counter)
            with open(filename, 'w') as csvfile_out:
                writer = csv.writer(csvfile_out)
                writer.writerow(row1)  # write the header as the first row of csvfile_out
                writer.writerow(row)
            counter = counter + 1
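For anyone on Python 3, here is a minimal sketch of the same approach wrapped in a function. The function name `split_csv_rows` and its `out_dir` and `prefix` parameters are made up for illustration. It assumes you want a `.csv` extension on the output files, and it opens files with `newline=""` as the csv module documentation recommends, which avoids blank lines between rows on Windows.

```python
import csv
import os


def split_csv_rows(src_path, out_dir=".", prefix="file_"):
    """Write each non-empty data row of src_path to its own CSV, header included.

    Hypothetical helper for illustration; returns the list of paths written.
    """
    written = []
    # newline="" lets the csv module handle line endings itself.
    with open(src_path, newline="") as src:
        reader = csv.reader(src)
        header = next(reader, None)  # first line of the file
        if header is None:
            return written  # empty input file, nothing to split
        counter = 1
        for row in reader:
            if not row:  # skip blank lines
                continue
            out_path = os.path.join(out_dir, f"{prefix}{counter}.csv")
            with open(out_path, "w", newline="") as dst:
                writer = csv.writer(dst)
                writer.writerow(header)
                writer.writerow(row)
            written.append(out_path)
            counter += 1
    return written
```

Calling `split_csv_rows("mock_data.csv")` should then leave `file_1.csv`, `file_2.csv`, and so on in the current directory, one per data row.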
Here is a solution with pandas. Assume the content of the CSV file is as follows:
Name, Age, Gender
John, 20, Male
Jack, 22, Male
Jill, 18, Female
And my code is as follows:
import pandas as pd

df = pd.read_csv("mock_data.csv")
for index, row in df.iterrows():
    file_name = row['Name'] + ".csv"  # change the column name accordingly
    # row is a Series; transpose it back into a one-row DataFrame before writing
    pd.DataFrame(row).T.to_csv(file_name, index=False)
This will create filenames based on the values of the column "Name" (i.e. John, Jack and Jill) and produce three files: John.csv, Jack.csv and Jill.csv. The content of John.csv is as follows:
Name | Age | Gender |
---------------------------
John | 20 | Male |
The content of Jack.csv is as follows:
Name | Age | Gender |
---------------------------
Jack | 22 | Male |
The content of Jill.csv is as follows:
Name | Age | Gender |
---------------------------
Jill | 18 | Female |
P.S.: If you don't want the header, just pass header=False when calling .to_csv(). Note that the keyword is lowercase. For example:
pd.DataFrame(row).T.to_csv(file_name, index=False, header=False)
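One caveat with the sample file above: it has a space after each comma, so a plain `pd.read_csv` keeps that space in the column names (`" Age"`, `" Gender"`) and in the string values. A rough sketch of one way around this, assuming the file really is formatted like the sample, is to pass `skipinitialspace=True`. The function name `split_rows_with_pandas` and its parameters are hypothetical, and it slices with `df.iloc[[i]]` instead of `iterrows()` so each row stays a one-row DataFrame.

```python
import os

import pandas as pd


def split_rows_with_pandas(csv_path, name_column="Name", out_dir="."):
    """Write one CSV per row, named after the value in name_column.

    Hypothetical helper for illustration; returns the list of paths written.
    """
    # skipinitialspace=True drops the blank that follows each comma.
    df = pd.read_csv(csv_path, skipinitialspace=True)
    written = []
    for i in range(len(df)):
        # Double brackets keep a one-row DataFrame (header and dtypes intact).
        one_row = df.iloc[[i]]
        file_name = os.path.join(out_dir, f"{one_row[name_column].iloc[0]}.csv")
        one_row.to_csv(file_name, index=False)
        written.append(file_name)
    return written
```

Note that rows sharing the same value in the name column would overwrite each other's files, so this only suits a column with unique values.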
You can use DictReader too...
import csv

counter = 1
with open('mock_data.csv', 'r') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        filename = "file_%s" % str(counter)
        with open(filename, 'w') as csvfile_out:
            writer = csv.DictWriter(csvfile_out, fieldnames=reader.fieldnames)
            writer.writeheader()  # writes the field names as the first row (Python 2.7+)
            writer.writerow(row)
        counter = counter + 1