Question
I have the following path data:
   id1     p1     p2
0    1  7.935  5.103
1    1  7.934  5.112
2    1  7.936  5.102
3    1  7.938  5.145
4    2  7.930  5.191
5    2  7.945  5.161
6    2  7.954  5.127
In the above data frame, (p1, p2) are the coordinates, and all points with the same "id1" form one separate path; for example, rows 0-3 with id1 = 1 are one path, and so on.
I am trying to implement a quadtree for the analysis of these trajectories, using the "pyqtree" Python package (https://github.com/karimbahgat/Pyqtree).
In the pyqtree example code, "len(spindex)" is the total number of items, the bounding box "bbox" is in the format (xmin, ymin, xmax, ymax), "testitem" is the bounding box to intersect with, and len(matches) gives the number of items that intersect it.
I would like to use the above df to build the quadtree. How do I use the df as the "items" in the code, and how do I give a different bounding box to each trajectory? Also, how do I query the tree to find which trajectories are located in which area of the quadtree?
Answer 1:
So you want to query the location of each trajectory, which means you need to calculate and insert the bbox for each. Usually this type of data would have one row for each trajectory, with a geometry field describing the sequence of xy coordinates. But since your coordinates are spread across rows, we need a workaround: first group all xy points belonging to each trajectory id, then calculate the bbox of each group.
Here is untested example code to populate the index (my pandas is quite rusty, so there are probably some mistakes in there):
for voygid, group in df.groupby('voygid'):
    # groupby yields (key, sub-frame) pairs; bbox is (xmin, ymin, xmax, ymax)
    bbox = [group['x'].min(), group['y'].min(), group['x'].max(), group['y'].max()]
    spindex.insert(voygid, bbox)
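The grouping logic above can be checked without pandas. The sketch below is a plain-Python stand-in, not pandas or pyqtree code: `rows` and `trajectory_bboxes` are hypothetical names, and the sample values are copied from the test data posted in this thread. It collects the points of each `voygid` and reduces them to one `(xmin, ymin, xmax, ymax)` box per trajectory.

```python
from collections import defaultdict

# (voygid, x, y) rows; values copied from the sample data in this thread
rows = [
    (1, -7.935513, 5.103579),
    (1, -7.935781, 5.103300),
    (1, -7.936354, 5.102726),
    (2, -7.935306, 5.103424),
    (2, -7.945678, 5.119876),
    (2, -7.954764, 5.128738),
]


def trajectory_bboxes(rows):
    """Return {voygid: (xmin, ymin, xmax, ymax)} for each trajectory."""
    points = defaultdict(list)
    for voygid, x, y in rows:
        points[voygid].append((x, y))

    bboxes = {}
    for voygid, pts in points.items():
        xs = [p[0] for p in pts]
        ys = [p[1] for p in pts]
        bboxes[voygid] = (min(xs), min(ys), max(xs), max(ys))
    return bboxes


bboxes = trajectory_bboxes(rows)
for voygid, bbox in bboxes.items():
    print(voygid, bbox)
```

Each resulting box could then be passed to the index together with its id, in place of the pandas loop.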
Not really sure how you plan to cluster, which would be a separate question. The main purpose of a quadtree is not to ask which quad an item is located in, but rather to ask which items intersect with any arbitrary bbox region.
So if you divide your coordinate area into separate cluster regions you can then query which voygid trajectories are located in each.
for clusterbbox in clusters:
    # pyqtree's query method is intersect(); it returns the inserted items (here the voygids)
    voygids = spindex.intersect(clusterbbox)
Note that an item can span across and be located in multiple quads, so you may or may not need additional processing afterwards.
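To make the point about items spanning multiple regions concrete, here is a small sketch that uses a linear scan in place of the quadtree. It is not pyqtree's implementation: `overlaps`, `bboxes`, and `clusters` are hypothetical names, the region coordinates are made up for illustration, and treating touching edges as an overlap is an assumption. A spatial index answers the same kind of question, only faster on large inputs.

```python
def overlaps(a, b):
    """True if boxes a and b, each (xmin, ymin, xmax, ymax), share any area or edge."""
    return a[0] <= b[2] and a[2] >= b[0] and a[1] <= b[3] and a[3] >= b[1]


# One bounding box per trajectory id (illustrative, rounded values)
bboxes = {
    1: (-7.9364, 5.1027, -7.9355, 5.1036),
    2: (-7.9548, 5.1034, -7.9353, 5.1287),
}

# Hypothetical cluster regions that split the area at x = -7.94
clusters = {
    "west": (-7.96, 5.10, -7.94, 5.13),
    "east": (-7.94, 5.10, -7.93, 5.13),
}

# For every region, list the trajectory ids whose box overlaps it
matches = {
    name: sorted(vid for vid, box in bboxes.items() if overlaps(box, region))
    for name, region in clusters.items()
}
print(matches)  # {'west': [2], 'east': [1, 2]}
```

Trajectory 2 shows up in both regions because its box crosses the dividing line, which is the case that may need handling afterwards.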
Answer 2:
To turn the trajectories into a list of items per voygid, we can use pandas.groupby:
Code:
def get_items(group):
    return [Item(row[1].x, row[1].y) for row in group.iterrows()]

voygids = dict(df.groupby('voygid').apply(get_items))
How?
The groupby will collect the rows associated with a particular voygid. Then we can use pandas.DataFrame.apply() to call a function which will return a list of items for the x, y pairs in the group. The function uses a list comprehension to construct the list of Items().
Test Code:
import pandas as pd
from io import StringIO

# Fixed-width sample data; the first line is the header row
df = pd.read_fwf(StringIO(u"""\
voygid          x         y
1       -7.935513  5.103579
1       -7.935781  5.103300
1       -7.936354  5.102726
1       -7.935915  5.102802
2       -7.935306  5.103424
2       -7.945678  5.119876
2       -7.954764  5.128738"""))
print(df)

class Item:
    def __init__(self, x, y):
        # Box of +/- 1 around the point, stored as [xmin, ymin, xmax, ymax]
        left = x-1
        right = x+1
        top = y-1
        bottom = y+1
        self.bbox = [left, top, right, bottom]

    def __repr__(self):
        return '[%s]' % ' '.join('%.4f' % x for x in self.bbox)

def get_items(group):
    return [Item(row[1].x, row[1].y) for row in group.iterrows()]

voygids = dict(df.groupby('voygid').apply(get_items))

for voygid, items in voygids.items():
    print(voygid)
    for item in items:
        print(' ' + repr(item))
Results:
   voygid         x         y
0       1 -7.935513  5.103579
1       1 -7.935781  5.103300
2       1 -7.936354  5.102726
3       1 -7.935915  5.102802
4       2 -7.935306  5.103424
5       2 -7.945678  5.119876
6       2 -7.954764  5.128738
1
[-8.9355 4.1036 -6.9355 6.1036]
[-8.9358 4.1033 -6.9358 6.1033]
[-8.9364 4.1027 -6.9364 6.1027]
[-8.9359 4.1028 -6.9359 6.1028]
2
[-8.9353 4.1034 -6.9353 6.1034]
[-8.9457 4.1199 -6.9457 6.1199]
[-8.9548 4.1287 -6.9548 6.1287]
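The code above builds one box per point, while the first answer inserts one box per trajectory. If a single box per trajectory is wanted, the per-point boxes can be merged by taking their outer extents. A minimal sketch, assuming boxes are `[xmin, ymin, xmax, ymax]` lists as in `Item.bbox`; `union_bbox` is a hypothetical helper, and the inputs are the rounded boxes printed above for voygid 2.

```python
def union_bbox(boxes):
    """Smallest [xmin, ymin, xmax, ymax] box that encloses every box in `boxes`."""
    return [
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    ]


# Per-point boxes for voygid 2, copied from the printed results
point_boxes = [
    [-8.9353, 4.1034, -6.9353, 6.1034],
    [-8.9457, 4.1199, -6.9457, 6.1199],
    [-8.9548, 4.1287, -6.9548, 6.1287],
]

merged = union_bbox(point_boxes)
print(merged)  # [-8.9548, 4.1034, -6.9353, 6.1287]
```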
Source: https://stackoverflow.com/questions/44147628/kd-qtree-implementation