defaultdict

Is the defaultdict in Python's collections module really faster than using setdefault?

ぃ、小莉子 submitted on 2019-12-03 00:22:34
I've seen other Python programmers use defaultdict from the collections module for the following use case:

    from collections import defaultdict

    s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]

    def main():
        d = defaultdict(list)
        for k, v in s:
            d[k].append(v)

I've typically approached this problem by using setdefault instead:

    def main():
        d = {}
        for k, v in s:
            d.setdefault(k, []).append(v)

The docs do in fact claim that using defaultdict is faster, but I've seen the opposite to be true when testing myself:

    $ python -mtimeit -s "from withsetdefault import main; s = [('yellow
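One way to compare the two approaches in a single script is sketched below. The function names `with_defaultdict` and `with_setdefault` and the repeat counts are assumptions chosen for illustration, not taken from the question. With only five input pairs, the cost of constructing the container may weigh as heavily as the lookups, so timings on larger inputs could look different.

```python
import timeit
from collections import defaultdict

s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]

def with_defaultdict():
    # Missing keys are created by calling list() inside the dict itself.
    d = defaultdict(list)
    for k, v in s:
        d[k].append(v)
    return d

def with_setdefault():
    # A fresh [] is built on every iteration, even when the key already exists.
    d = {}
    for k, v in s:
        d.setdefault(k, []).append(v)
    return d

# Both approaches build the same mapping.
same_result = dict(with_defaultdict()) == with_setdefault()

# repeat() returns one total per run; the minimum is the least noisy figure.
t_defaultdict = min(timeit.repeat(with_defaultdict, number=20_000, repeat=3))
t_setdefault = min(timeit.repeat(with_setdefault, number=20_000, repeat=3))
print(f"defaultdict: {t_defaultdict:.4f}s  setdefault: {t_setdefault:.4f}s")
```

The absolute numbers depend on the interpreter version and machine, so the script prints them rather than asserting which approach wins.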

Aggregate sets according to keys with defaultdict python

ε祈祈猫儿з submitted on 2019-12-02 20:31:39
Question: I have a bunch of lines of text with names and teams in this format: Team (year)|Surname1, Name1, e.g.

    Yankees (1993)|Abbot, Jim
    Yankees (1994)|Abbot, Jim
    Yankees (1993)|Assenmacher, Paul
    Yankees (2000)|Buddies, Mike
    Yankees (2000)|Canseco, Jose

and so on for several years and several teams. I would like to aggregate the names of players by team (year) combination, deleting any duplicated names (the original database may contain some redundant information). In the

Aggregate sets according to keys with defaultdict python

試著忘記壹切 submitted on 2019-12-02 07:45:58
I have a bunch of lines of text with names and teams in this format: Team (year)|Surname1, Name1, e.g.

    Yankees (1993)|Abbot, Jim
    Yankees (1994)|Abbot, Jim
    Yankees (1993)|Assenmacher, Paul
    Yankees (2000)|Buddies, Mike
    Yankees (2000)|Canseco, Jose

and so on for several years and several teams. I would like to aggregate the names of players by team (year) combination, deleting any duplicated names (the original database may contain some redundant information). In the example, my output should be:

    Yankees (1993)|Abbot, Jim|Assenmacher, Paul
    Yankees (1994)|Abbot, Jim
    Yankees
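The aggregation described above can be sketched with `defaultdict(set)`, where the set discards duplicate names automatically. The helper names `aggregate` and `format_rows` are hypothetical, the input is held in a list rather than read from a file, and the last input line is a duplicate added only to show the de-duplication. Sorting the output is also an assumption, since sets have no inherent order.

```python
from collections import defaultdict

lines = [
    "Yankees (1993)|Abbot, Jim",
    "Yankees (1994)|Abbot, Jim",
    "Yankees (1993)|Assenmacher, Paul",
    "Yankees (2000)|Buddies, Mike",
    "Yankees (2000)|Canseco, Jose",
    "Yankees (1993)|Abbot, Jim",  # deliberate duplicate, added for illustration
]

def aggregate(lines):
    """Map each 'Team (year)' key to the set of player names seen for it."""
    players_by_team = defaultdict(set)
    for line in lines:
        # Split on the first '|' only; the name itself contains a comma, not a pipe.
        team, player = line.strip().split("|", 1)
        players_by_team[team].add(player)
    return players_by_team

def format_rows(players_by_team):
    """Render 'Team (year)|Name1|Name2' rows, sorted for a stable output."""
    return [
        "|".join([team] + sorted(players))
        for team, players in sorted(players_by_team.items())
    ]

rows = format_rows(aggregate(lines))
for row in rows:
    print(row)
```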

TypeError: first argument must be callable, defaultdict

时光毁灭记忆、已成空白 submitted on 2019-12-02 06:13:05
The error comes from

    publishDB = defaultdict(defaultdict({}))

I want to make a database like {subject1:{student_id:{assignment1:marks, assignment2:marks, finals:marks}} , {student_id:{assignment1:marks, assignment2:marks, finals:marks}}, subject2:{student_id:{assignment1:marks, assignment2:marks, finals:marks}} , {student_id:{assignment1:marks, assignment2:marks, finals:marks}}}. I was trying to populate it as DB[math][10001] = a dict and later read it out as d = DB[math][10001]. Since I am on my office computer, I cannot try a different module. Am I on the right track? Such a nested dict structure
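The error arises because `defaultdict` expects a callable (or `None`) as its first argument, while `{}` and `defaultdict(...)` instances are not callable. A minimal sketch of one possible fix follows, wrapping the inner factory in a `lambda`. The name `publish_db`, the subject and student keys, and the marks are all illustrative assumptions.

```python
from collections import defaultdict

# Reproduce the error: {} is a dict instance, not a callable factory.
try:
    defaultdict(defaultdict({}))
except TypeError as exc:
    error_message = str(exc)

# One possible fix: each missing subject gets a defaultdict whose own
# missing keys (student ids) are created as empty dicts.
publish_db = defaultdict(lambda: defaultdict(dict))

# Illustrative data only; the marks below are made up.
publish_db["math"][10001] = {"assignment1": 90, "assignment2": 85, "finals": 88}
publish_db["math"][10002]["finals"] = 75  # intermediate levels appear on demand

d = publish_db["math"][10001]
print(d)
```

Note that merely reading a missing key, such as `publish_db["physics"]`, also creates it, which may or may not be wanted.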

Python how to create a dict of dict of list with defaultdict

北城以北 submitted on 2019-12-01 21:14:57
Question: How do I create a dict of dict of lists using defaultdict? I am getting the following error.

    >>> from collections import defaultdict
    >>> a=defaultdict()
    >>> a["testkey"]=None
    >>> a
    defaultdict(None, {'testkey': None})
    >>> a["testkey"]["list"]=[]
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'NoneType' object does not support item assignment

Answer 1: It's a little tricky. You make a defaultdict of defaultdicts, like so:

    defaultdict(lambda: defaultdict(list))

Answer 2
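A short sketch of how the structure from Answer 1 behaves in use is given below. It also shows why `defaultdict()` with no argument did not help: its `default_factory` is `None`, so a missing key raises `KeyError` just as in a plain dict. The variable names `plain` and `as_plain` are assumptions for illustration.

```python
from collections import defaultdict

# With no factory, a defaultdict behaves like an ordinary dict on missing keys.
plain = defaultdict()
try:
    plain["missing"]
    raised_key_error = False
except KeyError:
    raised_key_error = True

# Outer level creates inner defaultdicts; inner level creates lists.
a = defaultdict(lambda: defaultdict(list))
a["testkey"]["list"].append(1)
a["testkey"]["other"].append(2)

# Convert to plain dicts for display or serialization.
as_plain = {key: dict(inner) for key, inner in a.items()}
print(as_plain)
```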

List of values for duplicate keys in dictionary Python

旧街凉风 submitted on 2019-12-01 12:41:57
Apologies in advance if this question has already been explored here - I looked at different answers here but couldn't find what I need. My goal is to create a dictionary like this:

    {'a':[10, 9, 10, 10], 'b':[10, 9, 1, 0], 'c':[0, 5, 0, 1], and so on}

What I have is multiple dictionaries with duplicate keys (same keys in every other dictionary), something like this:

    {'a':10, 'b': 0, 'c': 2}
    {'a':7, 'b': 4, 'c': 4}
    {'a':4, 'b': 5, 'c': 3}

I have no way of knowing the number of such dictionaries, or if there are keys continuing up to 'f', or a 'g' in them but I know that the keys are duplicated
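One way to collect the values is sketched here with `defaultdict(list)`, assuming the dictionaries are available as a list (the question does not say how they arrive). The helper name `merge_values` is hypothetical. Because keys are discovered while iterating, neither the number of dictionaries nor the set of keys needs to be known in advance.

```python
from collections import defaultdict

dicts = [
    {'a': 10, 'b': 0, 'c': 2},
    {'a': 7, 'b': 4, 'c': 4},
    {'a': 4, 'b': 5, 'c': 3},
]

def merge_values(dicts):
    """Collect the values for each key, in the order the dicts are given."""
    merged = defaultdict(list)
    for d in dicts:
        for key, value in d.items():
            merged[key].append(value)
    return dict(merged)  # hand back a plain dict

merged = merge_values(dicts)
print(merged)
```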

List of values for duplicate keys in dictionary Python

六月ゝ 毕业季﹏ submitted on 2019-12-01 10:32:39
Question: Apologies in advance if this question has already been explored here - I looked at different answers here but couldn't find what I need. My goal is to create a dictionary like this:

    {'a':[10, 9, 10, 10], 'b':[10, 9, 1, 0], 'c':[0, 5, 0, 1], and so on}

What I have is multiple dictionaries with duplicate keys (same keys in every other dictionary), something like this:

    {'a':10, 'b': 0, 'c': 2}
    {'a':7, 'b': 4, 'c': 4}
    {'a':4, 'b': 5, 'c': 3}

I have no way of knowing the number of such

python defaultdict: 0 vs. int and [] vs list

做~自己de王妃 submitted on 2019-11-30 06:05:37
Is there any difference between passing int and lambda: 0 as arguments? Or between list and lambda: []? It looks like they do the same thing:

    from collections import defaultdict

    dint1 = defaultdict(lambda: 0)
    dint2 = defaultdict(int)
    dlist1 = defaultdict(lambda: [])
    dlist2 = defaultdict(list)

    for ch in 'abracadabra':
        dint1[ch] += 1
        dint2[ch] += 1
        dlist1[ch].append(1)
        dlist2[ch].append(1)

    print dint1.items()
    print dint2.items()
    print dlist1.items()
    print dlist2.items()

    ## -- Output: --
    [('a', 5), ('r', 2), ('b', 2), ('c', 1), ('d', 1)]
    [('a', 5), ('r', 2), ('b', 2), ('c', 1), ('d', 1)]
    [('a',
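For building the dictionary the two factories behave the same, since `int()` returns `0` and `list()` returns `[]`. One practical difference that can be checked directly is pickling: a `defaultdict` stores a reference to its factory, and a lambda generally cannot be pickled by the standard `pickle` module. The sketch below uses Python 3 syntax, and the variable names are assumptions.

```python
import pickle
from collections import defaultdict

d_int = defaultdict(int)
d_lambda = defaultdict(lambda: 0)

for ch in 'abracadabra':
    d_int[ch] += 1
    d_lambda[ch] += 1

# Same contents either way.
same_counts = dict(d_int) == dict(d_lambda)

# A built-in factory round-trips through pickle, factory included.
restored = pickle.loads(pickle.dumps(d_int))

# A lambda factory is looked up by name when pickling, and that lookup fails.
# The exact exception type varies by Python version, so several are caught.
try:
    pickle.dumps(d_lambda)
    lambda_pickles = True
except (pickle.PicklingError, AttributeError, TypeError):
    lambda_pickles = False

print(same_counts, restored.default_factory, lambda_pickles)
```

This matters mainly when the dictionary has to be sent to another process or saved to disk.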

How to read two lines from a file and create dynamics keys in a for-loop?

北战南征 submitted on 2019-11-29 11:17:56
In the following data, I am trying to run a simple Markov model. Say I have data with the following structure:

    pos  M1  M2  M3  M4  M5  M6  M7  M8  hybrid_block  S1  S2  S3  S4  S5  S6  S7  S8
    1    A   T   T   A   A   G   A   C   A|C           C   G   C   T   T   A   G   A
    2    T   G   C   T   G   T   T   G   T|A           A   T   A   T   C   A   A   T
    3    C   A   A   C   A   G   T   C   C|G           G   A   C   G   C   G   C   G
    4    G   T   G   T   A   T   C   T   G|T           C   T   T   T   A   T   C   T

Block M represents data from one set of categories, and so does block S. The data are strings formed by joining the letters of a column across the position lines. So the string value for M1 is A-T-C-G, and likewise for every other block. There is also one hybrid block that has two
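The column strings described above (for example M1 being A-T-C-G) can be built with keys taken from the header line rather than hard-coded. This is only a sketch of that one step, assuming whitespace-separated columns and the table held in a string instead of a file. The names `raw` and `column_strings` are hypothetical, and the Markov model itself is not attempted.

```python
from collections import defaultdict

raw = """\
pos M1 M2 M3 M4 M5 M6 M7 M8 hybrid_block S1 S2 S3 S4 S5 S6 S7 S8
1 A T T A A G A C A|C C G C T T A G A
2 T G C T G T T G T|A A T A T C A A T
3 C A A C A G T C C|G G A C G C G C G
4 G T G T A T C T G|T C T T T A T C T
"""

rows = raw.strip().splitlines()
header = rows[0].split()

# Keys come from the header, so new columns need no code changes.
letters_by_column = defaultdict(list)
for row in rows[1:]:
    fields = row.split()
    # Skip the leading 'pos' field and pair each value with its column name.
    for name, value in zip(header[1:], fields[1:]):
        letters_by_column[name].append(value)

column_strings = {name: "-".join(values) for name, values in letters_by_column.items()}
print(column_strings["M1"])
```

Reading two lines at a time, as the title asks, could be layered on top by iterating over consecutive pairs of rows.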

Is collections.defaultdict thread-safe?

Deadly submitted on 2019-11-29 05:46:44
I have not worked with threading in Python at all and am asking this question as a complete stranger. I am wondering if defaultdict is thread-safe. Let me explain: I have d = defaultdict(list), which creates a list for missing keys by default. Let's say I have multiple threads doing this at the same time:

    d['key'].append('value')

At the end, I'm supposed to end up with ['value', 'value']. However, if the defaultdict is not thread-safe, if thread 1 yields to thread 2 after checking if 'key' in dict and before d['key'] = default_factory(), it will cause interleaving, and the other
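Whether the unguarded operation is safe depends on interpreter details, so a conservative sketch is to guard the lookup-and-append with an explicit `threading.Lock`. That makes the outcome independent of how threads are scheduled. The `worker` function, the thread count, and the number of appends are assumptions for illustration.

```python
import threading
from collections import defaultdict

d = defaultdict(list)
lock = threading.Lock()

def worker(n_appends):
    for _ in range(n_appends):
        # The lock makes "create the list if missing, then append" one atomic step.
        with lock:
            d['key'].append('value')

threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = len(d['key'])
print(total)
```

The lock adds some overhead per append, which is the price of not relying on interpreter-specific behavior.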