Question
I'm trying to iterate over each row in a list of lists, append an element from each row to a new list, then find the unique elements in the new list.
I understand that I can do this easily with a for loop. I'm trying a different route because I want to learn more about classes and functions.
Here's an example of the list of lists. The first row is the header:
legislators = [
    ['last_name', 'first_name', 'birthday', 'gender', 'type', 'state', 'party'],
    ['Bassett', 'Richard', '1745-04-02', 'M', 'sen', 'DE', 'Anti-Administration'],
    ['Bland', 'Theodorick', '1742-03-21', '', 'rep', 'VA', ''],
    ['Burke', 'Aedanus', '1743-06-16', '', 'rep', 'SC', ''],
    ['Carroll', 'Daniel', '1730-07-22', 'M', 'rep', 'MD', ''],
    ['Clymer', 'George', '1739-03-16', 'M', 'rep', 'PA', ''],
    ['Contee', 'Benjamin', '', 'M', 'rep', 'MD', ''],...]
Here's my code:
import csv

f = open("legislators.csv")
csvreader = csv.reader(f)
legislators = list(csvreader)

class Dataset:
    def __init__(self, data):
        self.header = data[0]  # Isolate header from CSV file
        self.data = data[1:]   # Subset CSV data to remove header

legislators_dataset = Dataset(legislators)

def the_set_maker(dataset):
    gender = []
    for each in dataset:
        gender.append(each[3])
    return set(gender)

t=the_set_maker(legislators_dataset)
print(t)
I get the following error:
TypeError                                 Traceback (most recent call last)
<ipython-input-1-d65cb459931b> in <module>()
     20     return set(gender)
     21
---> 22 t=the_set_maker(legislators_dataset)
     23 print(t)

<ipython-input-1-d65cb459931b> in the_set_maker(dataset)
     16 def the_set_maker(dataset):
     17     gender = []
---> 18     for each in dataset:
     19         gender.append(each[3])
     20     return set(gender)

TypeError: 'Dataset' object is not iterable
I think the answer is to define a method using def __iter__(self) in my Dataset class, but I haven't been able to get this to work. Is this the right track? If not, what's a better one?
Answer 1:
According to the documentation for __iter__:
This method should return a new iterator object that can iterate over all the objects in the container.
You might try the following class definition:
class Dataset:
    def __init__(self, data):
        self.header = data[0]  # Isolate header from CSV file
        self.data = data[1:]   # Subset CSV data to remove header

    def __iter__(self):
        return iter(self.data)
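As a quick check, here is a minimal sketch of that class used with the_set_maker from the question. It assumes a few hard-coded sample rows in place of legislators.csv so that it runs on its own:

```python
# Sample rows standing in for the contents of legislators.csv (header first).
legislators = [
    ['last_name', 'first_name', 'birthday', 'gender', 'type', 'state', 'party'],
    ['Bassett', 'Richard', '1745-04-02', 'M', 'sen', 'DE', 'Anti-Administration'],
    ['Bland', 'Theodorick', '1742-03-21', '', 'rep', 'VA', ''],
    ['Carroll', 'Daniel', '1730-07-22', 'M', 'rep', 'MD', ''],
]


class Dataset:
    def __init__(self, data):
        self.header = data[0]  # first row is the header
        self.data = data[1:]   # remaining rows are the records

    def __iter__(self):
        # Delegate to the list's own iterator; each call returns a fresh one.
        return iter(self.data)


def the_set_maker(dataset):
    gender = []
    for each in dataset:  # works because Dataset defines __iter__
        gender.append(each[3])
    return set(gender)


genders = the_set_maker(Dataset(legislators))
print(genders)
```

Because __iter__ hands back a new list iterator on every call, the same Dataset object can be looped over as many times as you like.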
If you're open to trying new options, consider using Pandas:
import pandas as pd
df = pd.read_csv('legislators.csv')
t = df['gender'].unique()  # distinct values in the gender column
Or, if you really want to read in the CSV yourself,
df = pd.DataFrame(legislators[1:], columns=legislators[0])
Answer 2:
As you mentioned, you'll need to implement __iter__ in class Dataset. Note that, per your traceback, it is the for loop in the_set_maker that throws the error, since it tries to iterate over your Dataset instance.
Luckily the elements you want to iterate over are just the rows in self.data, which makes Dataset.__iter__ easy to write.
class Dataset(object):
    ...
    def __iter__(self):
        # Iterate over the rows; iter(self) here would call __iter__ again and recurse forever.
        return iter(self.data)
I would point out, however, that your the_set_maker function seems a little too specialized to be top-level. It's also a bit trivial, since it's literally set([el[3] for el in container]). I would put this in Dataset as well.
class Dataset(object):
    ...
    def to_set(self):
        # Note that this throws away your header!
        return set([el[3] for el in self.data])
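A slightly more general sketch of the same idea, assuming you would rather pick the column by its header name than by a hard-coded index. The method name unique_values and the sample rows are hypothetical and not part of the original code:

```python
class Dataset(object):
    def __init__(self, data):
        self.header = data[0]  # first row is the header
        self.data = data[1:]   # remaining rows are the records

    def unique_values(self, column_name):
        # Hypothetical helper: find the column position in the header,
        # then collect the distinct values found in that column.
        index = self.header.index(column_name)  # raises ValueError if missing
        return {row[index] for row in self.data}


# Made-up rows for illustration only.
rows = [
    ['last_name', 'gender', 'type'],
    ['Bassett', 'M', 'sen'],
    ['Bland', '', 'rep'],
    ['Carroll', 'M', 'rep'],
]
dataset = Dataset(rows)
types = dataset.unique_values('type')
print(types)
```

This keeps the header in use rather than discarding it, at the cost of one list lookup per call.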
Answer 3:
You need to change your class a little so that it implements the iterator protocol itself:
class Dataset:
    def __init__(self, data):
        self.header = data[0]  # Isolate header from CSV file
        self.data = data[1:]   # Subset CSV data to remove header
        self.i = 0             # Index of the next row to return

    def __iter__(self):
        self.i = 0  # Restart from the first row each time a loop begins
        return self

    def __next__(self):  # Python 3 name
        return self.next()

    def next(self):  # Python 2 name
        if self.i < len(self.data):
            self.i += 1
            return self.data[self.i-1]
        else:
            raise StopIteration()
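For comparison, here is a sketch of a generator-based __iter__, assuming you want every loop over the object to start from the first row. With a hand-written __next__ the position is stored on the object itself, so it has to be managed between passes; a generator keeps that state for you. The sample rows are made up for illustration:

```python
class Dataset:
    def __init__(self, data):
        self.header = data[0]  # first row is the header
        self.data = data[1:]   # remaining rows are the records

    def __iter__(self):
        # A generator function: each call builds a new, independent iterator,
        # so no position counter has to be stored on the instance.
        for row in self.data:
            yield row


dataset = Dataset([['name', 'gender'], ['Bassett', 'M'], ['Bland', '']])
first_pass = [row[1] for row in dataset]
second_pass = [row[1] for row in dataset]  # starts again from the first row
print(first_pass, second_pass)
```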
Source: https://stackoverflow.com/questions/45700016/making-objects-iterable