Question
I have a list of XY points and I want to group them by distance: points that lie within a given distance of each other should go into the same list, and points farther apart into different lists.
For example, given A=(0,0), B=(0,1), C=(0,2) and a maxDistance of 1, I want to obtain [[A,B],[C]].
Answer 1:
I didn't fully understand your question, so I'm not sure how you want to do the grouping, but this might at least start you off in the right direction.
(Written in VB, but near-identical in C#; you didn't state your language preference):
    Dim MyPoints As New List(Of Point)
    MyPoints.Add(New Point(0, 0))
    MyPoints.Add(New Point(0, 1))
    MyPoints.Add(New Point(0, 2))
    Dim query = From pt1 In MyPoints
                From pt2 In MyPoints
                Where Not (pt1.Equals(pt2))
                Select New With {.pt1 = pt1, .pt2 = pt2, .dist = Math.Sqrt((pt1.X - pt2.X) ^ 2 + (pt1.Y - pt2.Y) ^ 2)}
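For C# readers, a rough equivalent of the VB query might look like the sketch below. It assumes value tuples stand in for `Point`, and the `PairDistances` class and its `Compute` method are names made up for illustration. Note that `^` is XOR in C#, not exponentiation, so the squares are written as multiplications.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper; (int X, int Y) tuples stand in for Point.
public static class PairDistances
{
    // Every ordered pair of distinct points, with the Euclidean distance between them.
    public static List<((int X, int Y) Pt1, (int X, int Y) Pt2, double Dist)> Compute(
        IEnumerable<(int X, int Y)> points)
    {
        var list = points.ToList();
        var query =
            from pt1 in list
            from pt2 in list
            where !pt1.Equals(pt2)
            let dx = pt1.X - pt2.X
            let dy = pt1.Y - pt2.Y
            select (Pt1: pt1, Pt2: pt2, Dist: Math.Sqrt(dx * dx + dy * dy));
        return query.ToList();
    }
}
```

Like the VB version, this only lists pairs with their distances (each pair appears twice, once in each order); it does not do any grouping yet.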
Answer 2:
What you are trying to do is called clustering: grouping a set of data (two-dimensional points in your case) into groups that share some characteristic (here, a given distance between points). I strongly recommend reading the link provided above to understand it better. You might be interested in two types of clustering:
- hierarchical clustering, which creates groups based on distance connectivity,
- centroid-based clustering, which creates groups "surrounding" the centers of the groups.
It all depends on how much data you have. For small sets you can try implementing a simple algorithm yourself. For bigger data sets, I would prefer a third-party library such as Numl, which contains methods for both of the types mentioned above.
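As an illustration of the "implement it yourself" route for small sets, here is a minimal sketch of grouping by distance connectivity: two points end up in the same group whenever a chain of hops, each no longer than the threshold, connects them. This is a plain breadth-first search, not Numl's algorithm, and the `ThresholdLinkage` class and `Group` method are hypothetical names; points are represented as `(double X, double Y)` tuples for brevity.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of any library.
public static class ThresholdLinkage
{
    // Groups points so that two points share a group whenever a chain of
    // hops, each no longer than maxDistance, connects them.
    public static List<List<(double X, double Y)>> Group(
        IReadOnlyList<(double X, double Y)> points, double maxDistance)
    {
        var groups = new List<List<(double X, double Y)>>();
        var visited = new bool[points.Count];
        double maxSq = maxDistance * maxDistance; // compare squared distances

        for (int start = 0; start < points.Count; start++)
        {
            if (visited[start]) continue;

            // A breadth-first search from an unvisited point collects one group.
            var group = new List<(double X, double Y)>();
            var queue = new Queue<int>();
            queue.Enqueue(start);
            visited[start] = true;

            while (queue.Count > 0)
            {
                int i = queue.Dequeue();
                group.Add(points[i]);
                for (int j = 0; j < points.Count; j++)
                {
                    if (visited[j]) continue;
                    double dx = points[i].X - points[j].X;
                    double dy = points[i].Y - points[j].Y;
                    if (dx * dx + dy * dy <= maxSq)
                    {
                        visited[j] = true;
                        queue.Enqueue(j);
                    }
                }
            }
            groups.Add(group);
        }
        return groups;
    }
}
```

One caveat with connectivity-based grouping: for the question's example (A=(0,0), B=(0,1), C=(0,2), threshold 1) it chains all three points into a single group, because B links A to C. If [[A,B],[C]] is the desired result, every member has to be compared against one reference point of the group instead.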
Here is some example code for clustering with Numl. Given the class:
    class Point
    {
        [Feature]
        public double X { get; set; }

        [Feature]
        public double Y { get; set; }

        public Point(double X, double Y)
        {
            this.X = X;
            this.Y = Y;
        }

        public override string ToString()
        {
            return string.Format("({0}; {1})", X, Y);
        }
    }
you can write:
    var model = new HClusterModel();
    var desc = Descriptor.Create<Point>();
    var linker = new CentroidLinker(new EuclidianDistance());
    var data = new List<Point>() { new Point(0.0, 1.0),
                                   new Point(0.0, 2.0),
                                   new Point(10.0, 0.0) };
    var result = model.Generate(desc, data, linker);
    foreach (var cluster in result.Children)
    {
        Console.WriteLine("Cluster:");
        Console.WriteLine(string.Join(", ", cluster.Members.OfType<Point>()));
    }
which results in:
Answer 3:
I had a stab at it, although this probably isn't a fantastically efficient way to do things; the link in Konrad's answer seems like a good place to explore.
I'm not entirely sure how you're defining "within range", so I assumed a simple distance calculation.
    // Set up some points
    List<Point> Points = new List<Point>();
    Points.Add(new Point(0, 0));
    Points.Add(new Point(0, 1));
    Points.Add(new Point(0, 2));

    // Distance
    int maxDistance = 1;

    // Replace as appropriate
    Func<Point, Point, int, bool> myDistanceFunction = delegate(Point p1, Point p2, int range)
    {
        // Same coordinate.
        if (p1 == p2)
            return true;

        int xDelta = p1.X - p2.X;
        int yDelta = p1.Y - p2.Y;
        double distance = Math.Sqrt(xDelta * xDelta + yDelta * yDelta);
        return (distance <= range);
    };

    // Loop through all points and calculate distance to all other points.
    var Results = Points.Select(firstPoint => new
    {
        TargetPoint = firstPoint,
        PointsInRange = Points
            .Where(secondPoint =>
                (secondPoint != firstPoint) && // Will you allow same coordinates?
                myDistanceFunction(secondPoint, firstPoint, maxDistance))
    });

    // Spit the results out.
    foreach (var result in Results)
    {
        Console.WriteLine("Point {0} - Points within {1} unit(s):", result.TargetPoint, maxDistance);
        foreach (var point in result.PointsInRange)
        {
            Console.WriteLine("\t{0}", point);
        }
    }
Output:
    Point {X=0,Y=0} - Points within 1 unit(s):
        {X=0,Y=1}
    Point {X=0,Y=1} - Points within 1 unit(s):
        {X=0,Y=0}
        {X=0,Y=2}
    Point {X=0,Y=2} - Points within 1 unit(s):
        {X=0,Y=1}
There's room for improvement: for example, it doesn't feel smart to calculate the distance for each pair of points twice, and I'm not sure whether you'll allow duplicate coordinates, but there might be something of use in there.
You could also write the distance function as a lambda expression, although I'm not sure it's clearer.
    Func<Point, Point, int, bool> myDistanceFunction =
    (
        (p1, p2, range) => Math.Sqrt(
            ((p1.X - p2.X) * (p1.X - p2.X)) +
            ((p1.Y - p2.Y) * (p1.Y - p2.Y))
        ) <= range
    );
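On the point about calculating each pair's distance twice: one possible way around it is to visit each unordered pair once (index `i` before index `j`) and record the match in both directions. The sketch below is only an illustration; the `Neighbours` class and `WithinRange` method are made-up names, `(int X, int Y)` tuples stand in for `Point`, and it works with indices rather than point values so that duplicate coordinates are not a problem.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper; (int X, int Y) tuples stand in for Point.
public static class Neighbours
{
    // Maps each point index to the indices of the points within maxDistance,
    // computing the distance for each unordered pair only once.
    public static Dictionary<int, List<int>> WithinRange(
        IReadOnlyList<(int X, int Y)> points, double maxDistance)
    {
        var result = Enumerable.Range(0, points.Count)
                               .ToDictionary(i => i, i => new List<int>());

        for (int i = 0; i < points.Count; i++)
        {
            for (int j = i + 1; j < points.Count; j++)
            {
                // Widen to long before subtracting to avoid int overflow.
                long dx = (long)points[i].X - points[j].X;
                long dy = (long)points[i].Y - points[j].Y;
                if (Math.Sqrt(dx * dx + dy * dy) <= maxDistance)
                {
                    result[i].Add(j); // j is in range of i ...
                    result[j].Add(i); // ... and therefore i is in range of j
                }
            }
        }
        return result;
    }
}
```

For the three points used above with a range of 1, index 1 (the middle point) gets neighbours 0 and 2, and the two end points each get only index 1, which matches the output listed earlier.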
Answer 4:
Sorry, everyone, my post wasn't very clear. I'm using C#, and I had ruled out full clustering for a specific reason: I have some points with their ids and I need to "cluster" them while keeping the id information, then simply compute a mean point on the X axis, because I'm only interested in grouping by that position attribute.
In the end there are at most 10 points, and keeping the ids is essential to know who is where. So I decided to collect the ids of the points that are close enough to each other and then use that list of lists of coordinates to work out the results. It's very raw because I'm in a rush, and I'm fully open to improvements; I just wasn't able to do it with LINQ :)
So I used something like this:
    // Class to hold the information for one user
    public class userObject {
        public string id;
        public Vector3D position = Vector3D.Zero;
        public userObject(string Id, Vector3D Position) {
            id = Id;
            position = Position;
        }
    }

    // List of grouped ids (nanocluster :)
    public Dictionary<int, List<userObject>> slots;

    private void forceCheck() {
        // Create the list of objects from the incoming data (ids and Vector3D positions)
        List<userObject> users = new List<userObject>();
        for (int a = 0; a < FId_In.SliceCount; a++) {
            userObject uo = new userObject(FId_In[a], FPositions_In[a]);
            users.Add(uo);
        }
        // Clear the previous result (this is different in another version I'm working on)
        slots = new Dictionary<int, List<userObject>>();
        // Check for close points. A couple of lines would need to change to get real
        // clustering, but this way I can make sure the points never chain into one
        // horizontal cluster (raw mode, as I said).
        for (int k = 0; k < users.Count; k++) {
            List<userObject> matches = new List<userObject>();
            // Check whether this id is already registered in a slot
            int isInSlot = checkIdInSlots(users[k].id);
            if (isInSlot == -1) {
                matches.Add(users[k]);
                for (int j = k + 1; j < users.Count; j++) {
                    // Call a function that checks the X distance (it could use the full Vector3D when needed)
                    if (checkClose(users[k].position, users[j].position, FXThreshold_In[0])) {
                        matches.Add(users[j]);
                    }
                }
                // Finally add an entry with the grouped ids... surely all of this is one line of LINQ :D
                addNewSlot(matches);
            }
        }
    }
It would be nice to understand better how LINQ can be used to achieve the same result; I'm sure it can be made more robust. Thank you all :)
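Since the post asks how LINQ might express this, here is one possible sketch, not a drop-in replacement for the loop above. It assumes each user is reduced to an `(Id, X)` tuple (the original `userObject` and `Vector3D` types are not reproduced), and `XGrouping` and `GroupByX` are made-up names. Unlike the loop, it sorts by X first; a user then joins the current group only while it is within the threshold of that group's leftmost member, so groups cannot chain into one long horizontal cluster.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper; (string Id, double X) stands in for userObject.
public static class XGrouping
{
    // Sorts users by X, then walks them once: a user joins the current group
    // while it lies within threshold of that group's first (leftmost) member;
    // otherwise it starts a new group.
    public static List<List<(string Id, double X)>> GroupByX(
        IEnumerable<(string Id, double X)> users, double threshold)
    {
        return users
            .OrderBy(u => u.X)
            .Aggregate(new List<List<(string Id, double X)>>(), (groups, user) =>
            {
                var current = groups.LastOrDefault();
                if (current != null && user.X - current[0].X <= threshold)
                    current.Add(user);
                else
                    groups.Add(new List<(string Id, double X)> { user });
                return groups;
            });
    }
}
```

With users at X = 0, 1 and 2 and a threshold of 1, this gives two groups: the first two users together and the third on its own. The ids stay attached to every member, and the mean X position of each group can then be taken with something like `groups.Select(g => g.Average(u => u.X))`.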
Source: https://stackoverflow.com/questions/20907567/filter-points-xy-by-distance-with-linq