I have a CSV file with dimensions 100*512 that I want to process further in Spark. The problem with the file is that it doesn't contain a header, i.e.
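Redirect to a new file rather than back onto data_file.csv, because the shell truncates the output file before the command reads it (combined.csv is just a placeholder name).
Unix:
cat header_file.csv data_file.csv > combined.csv && mv combined.csv data_file.csv
Windows:
type header_file.csv data_file.csv > combined.csv
move /Y combined.csv data_file.csv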
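The same concatenation can also be done from Python, which behaves the same on Unix and Windows. This is a minimal sketch, not the answerer's method: the function name prepend_header is hypothetical, and it assumes the header file holds a single header line. It writes to a temporary file and then swaps it into place, so the data file is never opened for writing while it is still being read.

```python
import os
import shutil
import tempfile


def prepend_header(header_path, data_path):
    """Prepend the contents of header_path to data_path, in place."""
    data_dir = os.path.dirname(os.path.abspath(data_path))
    # Build the combined file next to the original so the final swap
    # stays on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=data_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as out:
            with open(header_path, "rb") as header:
                header_bytes = header.read()
            out.write(header_bytes)
            # Make sure the header ends with a newline before the data starts.
            if header_bytes and not header_bytes.endswith(b"\n"):
                out.write(b"\n")
            # Stream the data across without loading it all into memory.
            with open(data_path, "rb") as data:
                shutil.copyfileobj(data, out)
        # Swap the finished file into place over the original.
        os.replace(tmp_path, data_path)
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        os.remove(tmp_path)
        raise
```

Usage would look like prepend_header("header_file.csv", "data_file.csv"). Note that the replaced file takes on the permissions of the temporary file, which may differ from those of the original.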
A bit of an old-fashioned way:
Content of demo.csv before adding the column headers:
4444,Drowsy,bit drowsy
45888,Blurred see - hazy,little seeing vision
45933,Excessive upper pain,pain problems
112397013,air,agony
76948002,pain,agony
Content of xyz.txt:
Col 1,Col 2,Col 3
Code with inline comments:
# Open the CSV file for reading and writing
with open("demo.csv", "r+") as f:
    # Open the file that holds the header
    with open("xyz.txt", "r") as fh:
        # Read the header, dropping any trailing newline
        header = fh.read().rstrip("\n")
    # Read the complete data of the CSV file
    old = f.read()
    # Move the cursor back to the start of the file
    f.seek(0)
    # Write the header followed by the old data
    f.write(header + "\n" + old)
Content of demo.csv after running the code:
Col 1,Col 2,Col 3
4444,Drowsy,bit drowsy
45888,Blurred see - hazy,little seeing vision
45933,Excessive upper pain,pain problems
112397013,air,agony
76948002,pain,agony
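First read your CSV file, passing header=None so that the first data row is not consumed as the header:
from pandas import read_csv
df = read_csv('test.csv', header=None)
If there are two columns in your dataset (column a and column b), use:
df.columns = ['a', 'b']
Write this new DataFrame to CSV, with index=False so that the row index is not written as an extra column:
df.to_csv('test_2.csv', index=False)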
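The column names can also be supplied at read time through the names parameter of read_csv. A minimal sketch, assuming pandas is installed; the in-memory buffer stands in for a header-less file on disk, and the rows and column names are borrowed from the demo.csv example above purely for illustration.

```python
import io

import pandas as pd

# Stand-in for a header-less CSV file on disk.
raw = io.StringIO(
    "4444,Drowsy,bit drowsy\n"
    "45888,Blurred see - hazy,little seeing vision\n"
)

# When names= is given, pandas treats the file as having no header row,
# so the first line is kept as data.
df = pd.read_csv(raw, names=["Col 1", "Col 2", "Col 3"])

# index=False keeps the row index out of the written CSV.
out = io.StringIO()
df.to_csv(out, index=False)
```

With a real file, the two buffers would be replaced by the input and output paths.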
You can use csv.DictWriter, which writes the header row for you:
import csv
# newline='' lets the csv module control the line endings
with open('names.csv', 'w', newline='') as csvfile:
    fieldnames = ['first_name', 'last_name']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    # Write the header row first, then the data rows
    writer.writeheader()
    writer.writerow({'first_name': 'Baked', 'last_name': 'Beans'})
    writer.writerow({'first_name': 'Lovely', 'last_name': 'Spam'})
    writer.writerow({'first_name': 'Wonderful', 'last_name': 'Spam'})
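For a file that already exists and is too wide to name by hand (the question mentions 512 columns), the csv module can also copy the rows into a new file behind a generated header. This is only a sketch under assumptions: the function name add_generated_header and the col_ prefix are hypothetical, and the width of the first row is taken as the column count.

```python
import csv


def add_generated_header(src_path, dst_path, prefix="col_"):
    """Copy a header-less CSV to dst_path with a generated header row.

    src_path and dst_path must be different files, since dst_path is
    truncated as soon as it is opened. Returns the number of columns.
    """
    with open(src_path, newline="") as src, \
            open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        first_row = next(reader, None)
        if first_row is None:
            return 0  # empty input: nothing to label
        # One generated name per column, based on the first row's width.
        fieldnames = [f"{prefix}{i}" for i in range(len(first_row))]
        writer.writerow(fieldnames)
        writer.writerow(first_row)
        # Stream the remaining rows through unchanged.
        writer.writerows(reader)
        return len(fieldnames)
```

Because the rows are streamed one at a time, the whole file never has to fit in memory.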