I have a list of data that looks like the following:
// timestep,x_position,y_position
0,4,7
0,2,7
0,9,5
0,6,7
1,2,5
1,4,7
1,9,0
1,6,8
... and
If your data is not already sorted by the desired criterion, here's some code that might help to group it:
#!/usr/bin/env python3
"""
$ cat data_shuffled.txt
0,2,7
1,4,7
0,4,7
1,9,0
1,2,5
0,6,7
1,6,8
0,9,5
"""
from itertools import groupby
from operator import itemgetter

# load the data and make sure it is sorted by the first column
sortby_key = itemgetter(0)
with open('data_shuffled.txt') as f:
    # parse each line into a list of ints so it can be indexed and sliced
    data = sorted(([int(x) for x in line.split(',')] for line in f),
                  key=sortby_key)

# group by the first column
grouped_data = []
for key, group in groupby(data, key=sortby_key):
    assert key == len(grouped_data)  # assume the first column is 0,1, ...
    grouped_data.append([trio[1:] for trio in group])

# print the data
for i, pairs in enumerate(grouped_data):
    print(i, pairs)
Output:
0 [[2, 7], [4, 7], [6, 7], [9, 5]]
1 [[4, 7], [9, 0], [2, 5], [6, 8]]
Let's look at
d[t].append(c)
What is the value of d[t]? Try it.
d = {}
t = 0
d[t]
What do you get? Oh. There's nothing in d that has a key of t.
Now try this.
d[t] = []
d[t]
Ahh. Now there's something in d with a key of t.
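The two experiments above can be run as one script. This is only a sketch: the try/except is there so the script keeps going after the failed lookup, and the flag `lookup_failed` and the tuple `(4, 7)` (standing in for c) are illustrative names and values, not part of the original code.

```python
d = {}
t = 0

# Looking up a key that was never assigned raises KeyError.
try:
    d[t]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# Once the key is bound to an empty list, the lookup and the append both work.
d[t] = []
d[t].append((4, 7))  # (4, 7) stands in for the tuple c

print(lookup_failed, d)
```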
There are several things you can do.
Use setdefault: d.setdefault(t,[]).append(c).
Use a defaultdict(list) instead of a simple dictionary, {}.
Edit 1. Optimization
Given input lines from a file in the above form: ts, x, y, the grouping process is needless. There's no reason to go from a simple list of ( ts, x, y ) to a more complex list of ( ts, (x,y), (x,y), (x,y), ... ). The original list can be processed exactly as it arrived.
import collections
d = collections.defaultdict(list)
for ts, x, y in someFileOrListOrQueryOrWhatever:
    d[ts].append( (x,y) )
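As a sketch of that loop with a concrete source, assuming the input is comma-separated text like the sample in the question. The `lines` list is a stand-in for the real file, and converting each field to int is an assumption about what the later processing wants.

```python
import collections

# Stand-in for the input file: "timestep,x_position,y_position" per line.
lines = ["0,4,7", "0,2,7", "0,9,5", "0,6,7",
         "1,2,5", "1,4,7", "1,9,0", "1,6,8"]

d = collections.defaultdict(list)
for line in lines:
    # split one record into its three integer fields
    ts, x, y = (int(field) for field in line.split(","))
    d[ts].append((x, y))

print(d[0])
print(d[1])
```

No sorting or grouping pass is needed here: each (x, y) pair lands in the list for its timestep in the order it was read.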
Edit 2. Answer Question
"when initialising a dictionary, you need to tell the dictionary what the key-value data structure will look like?"
I'm not sure what the question means. Since all dictionaries are key-value structures, the question is not very clear. So I'll review the three alternatives, which may answer it.
Example 2.
Initialization
d= {}
Use
if t not in d:
    d[t] = list()
d[t].append( c )
Each dictionary value must be initialized to some useful structure. In this case, we check to see if the key is present; when the key is missing, we create the key and assign an empty list.
Setdefault
Initialization
d= {}
Use
d.setdefault(t,list()).append( c )
In this case, we exploit the setdefault method to either fetch a value associated with a key or create a new value associated with a missing key.
default dict
Initialization
import collections
d = collections.defaultdict(list)
Use
d[t].append( c )
The defaultdict uses an initializer function for missing keys. In this case, we provide the list function so that a new, empty list is created for a missing key.
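To see that the three alternatives build the same structure, here is a small side-by-side sketch. The `pairs` sample and the names `d_check`, `d_setdefault` and `d_default` are made up for illustration.

```python
import collections

# Illustrative (t, c) samples: a timestep and an (x, y) tuple.
pairs = [(0, (4, 7)), (0, (2, 7)), (1, (2, 5))]

# 1. Explicit membership check
d_check = {}
for t, c in pairs:
    if t not in d_check:
        d_check[t] = list()
    d_check[t].append(c)

# 2. setdefault
d_setdefault = {}
for t, c in pairs:
    d_setdefault.setdefault(t, list()).append(c)

# 3. defaultdict
d_default = collections.defaultdict(list)
for t, c in pairs:
    d_default[t].append(c)

print(d_check == d_setdefault == dict(d_default))
```

One practical difference: a plain lookup such as d_default[5] on a missing key silently creates an empty list there, while the plain dictionaries raise KeyError.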
I think you want to use setdefault. It's a bit weird to use but does exactly what you need.
d.setdefault(t, []).append(c)
The .setdefault method will return the element (in our case, a list) that's bound to the dict's key t if that key exists. If it doesn't, it will bind an empty list to the key t and return it. So either way, a list will be there that the .append method can then append the tuple c to.
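A short sketch of both cases, with illustrative values in place of t and c. The names `first` and `second` exist only to show that the same list object comes back on the second call.

```python
d = {}

# Key 0 is missing: setdefault binds a new empty list to it and returns that list.
first = d.setdefault(0, [])
first.append((4, 7))

# Key 0 now exists: setdefault returns the stored list and ignores the new [].
second = d.setdefault(0, [])
second.append((2, 7))

print(first is second)
print(d)
```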
dict=[]  # it's not a dict, it's a list; a dictionary would be dict={}
elem=[1,2,3]
dict.append(elem)
You can access a single element this way:
print(dict[0])  # 0 is the index
The output will be:
[1, 2, 3]