I am pretty new to Python. I need to create a class that loads CSV data into a dictionary.
I want to be able to control the keys and values. So let's say the following c
Have you looked at the csv module?
import csv
reader = csv.reader(open('workers.csv', newline=''), delimiter=',', quotechar='"')
workers = [ageName(row[0], row[1]) for row in reader]
workers now has a list of all the workers
>>> workers[0].name
'jon'
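If you really want a dictionary rather than a list of objects, something like the following might do. It is only a sketch: `CsvDictLoader`, `key_col` and `value_col` are names I made up, and the sample rows are invented (I'm assuming a name column followed by an age column, which may not match your file).

```python
import csv
import io


class CsvDictLoader:
    """Hypothetical helper: load CSV rows into a dict with chosen key/value columns."""

    def __init__(self, key_col, value_col):
        self.key_col = key_col      # column index used for dictionary keys
        self.value_col = value_col  # column index used for dictionary values

    def load(self, fileobj):
        reader = csv.reader(fileobj, delimiter=',', quotechar='"')
        # Blank lines are skipped; later rows overwrite earlier ones with the same key.
        return {row[self.key_col]: row[self.value_col] for row in reader if row}


# Invented sample data standing in for workers.csv (assumed layout: name, age).
sample = io.StringIO("jon,32\nmary,41\n", newline='')
workers = CsvDictLoader(key_col=0, value_col=1).load(sample)
print(workers)  # {'jon': '32', 'mary': '41'}
```

Swapping `key_col` and `value_col` flips which column becomes the key. Note that the csv module hands back every field as a string, so convert numbers yourself if you need them.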
added edit after question was altered
Is there any reason you're using old style classes? I'm using new style here.
class Student(object):
    sports = []  # class attribute, shared by every Student instance

    def __init__(self, row):
        self.lname, self.fname, self.ID, self.sport = row
        self.sports.append(self.sport)

    def get(self):
        return (self.lname, self.fname, self.ID, self.sport)

reader = csv.reader(open('copy-john.csv', newline=''), delimiter=',', quotechar='"')
print("%-14s|%-10s|%-5s|%-11s" % tuple(next(reader)))  # read header line from csv
print("-" * 45)
students = list(map(Student, reader))  # read all remaining lines
for student in students:
    print("%-14s|%-10s|%-5s|%3s" % student.get())

# Printing all sports that are specified by students
for s in set(Student.sports):  # class attribute
    print(s, Student.sports.count(s))

# Printing sports that are not picked
allsports = ['Basketball','Football','Other','Baseball','Handball','Soccer','Volleyball','I do not like sport']
for s in set(allsports) - set(Student.sports):
    print(s, 0)
Hope this gives you some ideas of the power of Python sequences. ;)
edit 2, shortened as much as possible... just to show off :P
Ladies and gentlemen, 7(.5) lines.
allsports = ['Basketball','Football','Other','Baseball','Handball',
             'Soccer','Volleyball','I do not like sport']
sports = []
reader = csv.reader(open('copy-john.csv', newline=''))
for row in reader:
    if reader.line_num > 1: sports.append(row[3])  # skip the header line
    print("%-14s|%-10s|%-5s|%-11s" % tuple(row))
for s in allsports: print(s, sports.count(s))
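One more thought on the tally: sports.count(s) rescans the whole list once per sport. If the file is large, collections.Counter might be a nicer fit, since it counts everything in a single pass. A rough sketch, where the `picked` list is invented data standing in for the sport column of the file:

```python
from collections import Counter

allsports = ['Basketball', 'Football', 'Other', 'Baseball', 'Handball',
             'Soccer', 'Volleyball', 'I do not like sport']

# Invented picks standing in for the sport column of copy-john.csv.
picked = ['Football', 'Soccer', 'Football', 'Handball']

counts = Counter(picked)  # one pass over the data
for s in allsports:
    # A Counter returns 0 for missing keys, so unpicked sports need no special case.
    print(s, counts[s])
```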
I know this is a pretty old question, but it's impossible to read this and not think of the amazing new(ish) Python library, pandas. Its main unit of analysis is a thing called a DataFrame, which is modelled after the way R handles data.
Let's say you have a (very silly) csv file called example.csv which looks like this:
day,fruit,sales
Monday,Banana,10
Monday,Orange,20
Tuesday,Banana,12
Tuesday,Orange,22
If you want to read in a csv in double-quick time, and do 'stuff' with it, you'd be hard pressed to beat the following code for either brevity or ease of use:
>>> import pandas as pd
>>> csv = pd.read_csv('example.csv')
>>> csv
       day   fruit  sales
0   Monday  Banana     10
1   Monday  Orange     20
2  Tuesday  Banana     12
3  Tuesday  Orange     22
>>> csv[csv.fruit=='Banana']
       day   fruit  sales
0   Monday  Banana     10
2  Tuesday  Banana     12
>>> csv[(csv.fruit=='Banana') & (csv.day=='Monday')]
      day   fruit  sales
0  Monday  Banana     10
In my opinion, this is really fantastic stuff. Never iterate over a csv.reader object again!
I second Mark's suggestion. In particular, look at DictReader from the csv module, which reads a comma-separated (or, more generally, delimited) file and gives you each row as a dictionary keyed by the column names.
Look at PyMotW's coverage of the csv module for a quick reference and examples of how to use DictReader and DictWriter.
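To give a feel for it, here is a minimal sketch of DictReader. I've inlined the example.csv content from the pandas answer so it runs on its own; the `sales` dictionary and its choice of (day, fruit) as the key are just an illustration of picking your own keys and values, not something from the question.

```python
import csv
import io

# The example.csv content from this thread, inlined so the sketch is self-contained.
data = io.StringIO(
    "day,fruit,sales\n"
    "Monday,Banana,10\n"
    "Monday,Orange,20\n"
    "Tuesday,Banana,12\n"
    "Tuesday,Orange,22\n",
    newline='',
)

rows = list(csv.DictReader(data))  # the header line supplies the keys
print(rows[0]['fruit'])

# Build a dictionary keyed on the columns of your choice.
# Fields arrive as strings, so the sales figures are converted explicitly.
sales = {(r['day'], r['fruit']): int(r['sales']) for r in rows}
print(sales[('Tuesday', 'Orange')])
```

With a real file you would pass `open('example.csv', newline='')` to DictReader instead of the StringIO object.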