I have a 3D point cloud and I'd like to efficiently query all points within distance d of an arbitrary point p (which is not necessarily part of the stored point cloud).
I don't understand your API. You can round up all points in a PointCloud that lie inside an arbitrary sphere, but you also say that the point clouds are stored? In that case, shouldn't you get a list of the PointClouds that lie inside the given sphere? Otherwise, what is the point (excuse the pun) of having the PointClouds stored?
Instead of trying to define the API in advance, define it when you need it. There is no need to implement something that will never be used, let alone optimize a function that will never be called (unless it's for fun, of course :)).
I think you should implement bounding-box culling followed by the more detailed sphere test as a first implementation. Perhaps it's not as much of a bottleneck as you think, and perhaps you will have far more serious bottlenecks to consider. It's always possible to optimize later, once you can see that everything works together as planned.
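To make the "bounding box first, sphere second" idea concrete, here is a minimal brute-force sketch. The `Point` struct and the `pointsWithin` function are names I made up for illustration; they are not from the question's API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical point type, assumed for this sketch only.
struct Point { double x, y, z; };

// Returns the indices of all points in `cloud` within distance d of p.
std::vector<std::size_t> pointsWithin(const std::vector<Point>& cloud,
                                      const Point& p, double d) {
    assert(d >= 0.0);
    std::vector<std::size_t> hits;
    const double d2 = d * d;
    for (std::size_t i = 0; i < cloud.size(); ++i) {
        const double dx = cloud[i].x - p.x;
        const double dy = cloud[i].y - p.y;
        const double dz = cloud[i].z - p.z;
        // Cheap cull: skip anything outside the sphere's axis-aligned bounding box.
        if (dx > d || dx < -d || dy > d || dy < -d || dz > d || dz < -d) continue;
        // Exact test on squared distances, so no sqrt is needed.
        if (dx * dx + dy * dy + dz * dz <= d2) hits.push_back(i);
    }
    return hits;
}
```

This is O(n) per query, which may well be good enough until profiling says otherwise.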
Well, it depends on what other uses you need for the data structure.
For each stored point p, you can keep a list of the distances from p to the other points, ordered by distance, and map each point to its list with a hashmap.
map:
p1 -> [{p2, d12}, {p4, d14}, {p3, d13}]
p2 -> ...
...
You can look up the point in the map, and iterate the list until the distance is higher than required.
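A rough sketch of that lookup table, under a few assumptions of mine: points are identified by their index in a vector (so a plain vector of lists stands in for the hashmap), and `Neighbor`, `buildNeighborLists` and `neighborsWithin` are hypothetical names. Note that this only answers queries for points that are already stored, and it needs O(n^2) memory.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

struct Point { double x, y, z; };                       // assumed point type
struct Neighbor { std::size_t index; double distance; };

// For every stored point, list all other points sorted by ascending distance.
std::vector<std::vector<Neighbor>> buildNeighborLists(const std::vector<Point>& cloud) {
    std::vector<std::vector<Neighbor>> lists(cloud.size());
    for (std::size_t i = 0; i < cloud.size(); ++i) {
        for (std::size_t j = 0; j < cloud.size(); ++j) {
            if (i == j) continue;
            const double dx = cloud[i].x - cloud[j].x;
            const double dy = cloud[i].y - cloud[j].y;
            const double dz = cloud[i].z - cloud[j].z;
            lists[i].push_back({j, std::sqrt(dx * dx + dy * dy + dz * dz)});
        }
        std::sort(lists[i].begin(), lists[i].end(),
                  [](const Neighbor& a, const Neighbor& b) { return a.distance < b.distance; });
    }
    return lists;
}

// Walk one sorted list and stop at the first entry farther away than d.
std::vector<std::size_t> neighborsWithin(const std::vector<Neighbor>& sorted, double d) {
    std::vector<std::size_t> out;
    for (const Neighbor& n : sorted) {
        if (n.distance > d) break;
        out.push_back(n.index);
    }
    return out;
}
```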
Have a look at "A Template for the Nearest Neighbor Problem" (Larry Andrews at DDJ). It's only 2D, with a retrieval complexity of O(log n), but it might be adapted for 3D as well.
VTK can help:
void vtkAbstractPointLocator::FindPointsWithinRadius ( double R, double x, double y, double z, vtkIdList * result )
Subclasses of vtkAbstractPointLocator contain different data structures for search acceleration: regular buckets, kd-trees, and octrees.
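If you'd rather not pull in VTK, the "regular buckets" idea is easy to sketch by hand. The following is my own minimal illustration and not how VTK implements it; `BucketGrid`, `findWithinRadius` and the cell size parameter are assumed names. Points are binned into cubic cells, and a query only inspects the cells that overlap the sphere's bounding box.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <tuple>
#include <vector>

struct Point { double x, y, z; };  // assumed point type

class BucketGrid {
public:
    BucketGrid(const std::vector<Point>& cloud, double cellSize)
        : cloud_(cloud), cell_(cellSize) {
        assert(cellSize > 0.0);
        for (std::size_t i = 0; i < cloud_.size(); ++i)
            buckets_[cellOf(cloud_[i])].push_back(i);
    }

    // Indices of all stored points within distance r of p (p need not be stored).
    std::vector<std::size_t> findWithinRadius(const Point& p, double r) const {
        std::vector<std::size_t> hits;
        const double r2 = r * r;
        // Visit only the cells overlapping the sphere's bounding box.
        for (long long ix = coord(p.x - r); ix <= coord(p.x + r); ++ix)
            for (long long iy = coord(p.y - r); iy <= coord(p.y + r); ++iy)
                for (long long iz = coord(p.z - r); iz <= coord(p.z + r); ++iz) {
                    auto it = buckets_.find(std::make_tuple(ix, iy, iz));
                    if (it == buckets_.end()) continue;
                    for (std::size_t i : it->second) {
                        const double dx = cloud_[i].x - p.x;
                        const double dy = cloud_[i].y - p.y;
                        const double dz = cloud_[i].z - p.z;
                        if (dx * dx + dy * dy + dz * dz <= r2) hits.push_back(i);
                    }
                }
        return hits;
    }

private:
    using Cell = std::tuple<long long, long long, long long>;

    long long coord(double v) const {
        return static_cast<long long>(std::floor(v / cell_));
    }
    Cell cellOf(const Point& q) const {
        return std::make_tuple(coord(q.x), coord(q.y), coord(q.z));
    }

    std::vector<Point> cloud_;
    double cell_;
    std::map<Cell, std::vector<std::size_t>> buckets_;
};
```

Picking a cell size close to the typical query radius keeps the number of visited cells small; a much smaller cell size makes each query loop over many empty cells.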
An ordered map whose key is the distance and whose value is the Point itself would allow you to query for all Points closer than a given distance or within a given range.
What you want is a structure that decomposes space so that particular regions can be found efficiently. A properly decomposed octree or kd-tree should allow you to do this well, as you would only 'open' the section of the tree containing your point p to look for points nearby. This should let you put a fairly low asymptotic bound on how many extra points you need to compare distance to (knowing that below some level of decomposition, all points are close enough). Unfortunately, I don't know the literature in this area well enough to give more detailed pointers. My encounter with these things is from the Barnes-Hut n-body simulation algorithm.
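As an illustration of only "opening" the relevant part of the tree, here is a small kd-tree radius search. Treat it as a sketch under my own assumptions rather than a reference implementation: the tree is stored implicitly as a median-partitioned index array, and `KdTree` and `radiusSearch` are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

struct Point { double x, y, z; };  // assumed point type

class KdTree {
public:
    explicit KdTree(std::vector<Point> pts)
        : pts_(std::move(pts)), order_(pts_.size()) {
        std::iota(order_.begin(), order_.end(), std::size_t{0});
        build(0, order_.size(), 0);
    }

    // Indices of all stored points within distance r of p (p need not be stored).
    std::vector<std::size_t> radiusSearch(const Point& p, double r) const {
        std::vector<std::size_t> hits;
        search(0, order_.size(), 0, p, r, hits);
        return hits;
    }

private:
    static double coord(const Point& q, int axis) {
        return axis == 0 ? q.x : (axis == 1 ? q.y : q.z);
    }

    // The median of order_[lo, hi) is the node; the two halves are its subtrees.
    void build(std::size_t lo, std::size_t hi, int axis) {
        if (hi - lo <= 1) return;
        const std::size_t mid = lo + (hi - lo) / 2;
        std::nth_element(order_.begin() + lo, order_.begin() + mid, order_.begin() + hi,
                         [&](std::size_t a, std::size_t b) {
                             return coord(pts_[a], axis) < coord(pts_[b], axis);
                         });
        build(lo, mid, (axis + 1) % 3);
        build(mid + 1, hi, (axis + 1) % 3);
    }

    void search(std::size_t lo, std::size_t hi, int axis, const Point& p, double r,
                std::vector<std::size_t>& hits) const {
        if (lo >= hi) return;
        const std::size_t mid = lo + (hi - lo) / 2;
        const Point& q = pts_[order_[mid]];
        const double dx = q.x - p.x, dy = q.y - p.y, dz = q.z - p.z;
        if (dx * dx + dy * dy + dz * dz <= r * r) hits.push_back(order_[mid]);
        // Descend into a half only if the query sphere reaches across the split plane.
        const double delta = coord(p, axis) - coord(q, axis);
        if (delta <= r) search(lo, mid, (axis + 1) % 3, p, r, hits);
        if (delta >= -r) search(mid + 1, hi, (axis + 1) % 3, p, r, hits);
    }

    std::vector<Point> pts_;
    std::vector<std::size_t> order_;
};
```

Subtrees lying entirely on the far side of a split plane are never visited, which is where the savings over a linear scan come from when the radius is small relative to the extent of the cloud.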
Here's another question closely related to this one. And another. And a third, mentioning a data structure (Hilbert R-Trees) that I hadn't previously heard of.