Disturbing odd behavior/bug in Python itertools groupby?


I am using itertools.groupby to parse a short tab-delimited text file. The text file has several columns and all I want to do is group all the entries that have a pa


3 Answers

迷失自我

2021-01-23 08:17

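I don't know what your data looks like, but my guess is that it's not sorted. groupby only merges adjacent items that share a key, so it gives one group per key only when the data is sorted by that key.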


灰色年华

2021-01-23 08:33

You're going to want to change your code to force the data to be in key order...

import csv, itertools, operator

data = csv.DictReader(open(f), delimiter="\t", fieldnames=fieldnames)
# groupby only merges adjacent rows, so sort by the key first
sorted_data = sorted(data, key=operator.itemgetter(col_name))
for name, entries in itertools.groupby(sorted_data, key=operator.itemgetter(col_name)):
    pass  # whatever

The main use for groupby, though, is when the dataset is large and already in key order. If you would have to sort anyway, using a defaultdict is more efficient, because it groups the rows in a single pass with no sort:

from collections import defaultdict
name_entries = defaultdict(list)
for row in data:
    name_entries[row[col_name]].append(row)
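
As a rough, self-contained illustration of the defaultdict approach, the sketch below reads from an in-memory string instead of a file. The sample text, the field names, and the helper dict values_by_name are all made up for this example and are not from the original question.

```python
import csv
import io
from collections import defaultdict

# Hypothetical tab-delimited input, deliberately not in key order.
text = "alice\t1\nbob\t2\nalice\t3\n"
fieldnames = ["name", "value"]  # assumed column names
col_name = "name"

data = csv.DictReader(io.StringIO(text), delimiter="\t", fieldnames=fieldnames)

# Group rows by key in one pass; no sorting is needed.
name_entries = defaultdict(list)
for row in data:
    name_entries[row[col_name]].append(row)

# Reduce each group to its "value" column for easy inspection.
values_by_name = {
    name: [r["value"] for r in rows] for name, rows in name_entries.items()
}
print(values_by_name)  # {'alice': ['1', '3'], 'bob': ['2']}
```

Both "alice" rows end up in the same list even though they are not adjacent in the input.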


梦毁少年i

2021-01-23 08:34

According to the documentation, groupby() groups only consecutive occurrences of the same key.
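
A small sketch of what "consecutive" means in practice. The sample rows and the "name" key are hypothetical, chosen only so that one key appears in two separate runs.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample rows; the key "a" appears in two separate runs.
rows = [
    {"name": "a", "value": 1},
    {"name": "b", "value": 2},
    {"name": "a", "value": 3},
]

key = itemgetter("name")

# Unsorted input: "a" is reported twice because its rows are not adjacent.
unsorted_keys = [k for k, _ in groupby(rows, key=key)]

# Sorted input: each key is reported exactly once.
sorted_groups = {
    k: [r["value"] for r in g]
    for k, g in groupby(sorted(rows, key=key), key=key)
}

print(unsorted_keys)  # ['a', 'b', 'a']
print(sorted_groups)  # {'a': [1, 3], 'b': [2]}
```

This is likely the "odd behavior" in the question: the same key showing up in more than one group because the input was not in key order.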
