I'm trying to reduce and combine a number of points to the center point of those locations. Right now I'm brute-forcing it by finding the closest pair, combining those and re
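As for an efficient way, have you considered laying a grid over the map and assigning each point to its corresponding cell? All the points that land in the same cell are then merged into one. Finding the cell for a point is a constant-time calculation, so the whole pass is linear in the number of points.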
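Here is a minimal sketch of the fixed-grid idea, in case it helps. The function name `grid_reduce` and the `cell_size` parameter (cell edge in degrees) are my own assumptions, and it treats latitude/longitude as flat coordinates, which is only reasonable over a small area.

```python
from collections import defaultdict
from math import floor

def grid_reduce(points, cell_size):
    """Merge all points that share a grid cell into their centroid.

    points: iterable of (lat, lon) tuples.
    cell_size: edge length of a grid cell, in the same units as the points.
    """
    # Per cell: running sum of lat, running sum of lon, point count.
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for lat, lon in points:
        # floor() keeps negative coordinates in the correct cell.
        key = (floor(lat / cell_size), floor(lon / cell_size))
        cell = sums[key]
        cell[0] += lat
        cell[1] += lon
        cell[2] += 1
    return [(s_lat / n, s_lon / n) for s_lat, s_lon, n in sums.values()]
```

A larger `cell_size` gives fewer merged points, a smaller one gives more.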
A better (yet slower) approach would be to have dynamic cells instead of fixed cells like the suggestion above. You start with no cells at all. Then drop the first point in the map and define a cell with some predetermined dimensions around it. Then drop the next point on the map. If it falls inside the previous cell you add it to it, and possibly recenter the cell around the two points. If the point falls outside the cell then you create a second cell for it. Now you add the third point to the map and check it against the two cells. This process continues until you have added all the points to the map. I hope you get the idea. I think you could approximately limit the number of reduced points by changing the size of the cells.
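A rough sketch of the dynamic-cell variant, under a few assumptions of mine: cells are squares of edge `cell_size` centered on the running centroid of their members, each point goes to the first cell that contains it, and the name `dynamic_cell_reduce` is made up for illustration.

```python
def dynamic_cell_reduce(points, cell_size):
    """Greedy clustering with cells that are created and recentered on the fly.

    points: iterable of (lat, lon) tuples.
    cell_size: edge length of each square cell.
    """
    half = cell_size / 2.0
    cells = []  # each cell is [sum_lat, sum_lon, count]
    for lat, lon in points:
        for cell in cells:
            # The cell center is the centroid of the points added so far.
            c_lat = cell[0] / cell[2]
            c_lon = cell[1] / cell[2]
            if abs(lat - c_lat) <= half and abs(lon - c_lon) <= half:
                # Adding the point also recenters the cell.
                cell[0] += lat
                cell[1] += lon
                cell[2] += 1
                break
        else:
            # No existing cell contains the point: start a new cell around it.
            cells.append([lat, lon, 1])
    return [(s_lat / n, s_lon / n) for s_lat, s_lon, n in cells]
```

Every point is checked against every existing cell, which is why this is slower than the fixed grid. The result also depends on the order in which the points are added.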
EDIT (based on comment from rrenaud): You can start with a big cell size and apply one of the algorithms above. If the number of cells you end up with is too low, you can repeat the algorithm on each of the cells and subdivide them even more. While this won't allow you to reduce to an exact number of points, you can get pretty close.
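The subdivision idea could look something like the sketch below. It uses fixed grid cells, halves the cell size on every round, and stops once there are at least `target` cells. The names `split_into_cells` and `reduce_to_about`, and the `max_rounds` safety limit, are assumptions of mine.

```python
from math import floor

def split_into_cells(points, cell_size):
    """Group points by the grid cell they fall into."""
    groups = {}
    for lat, lon in points:
        key = (floor(lat / cell_size), floor(lon / cell_size))
        groups.setdefault(key, []).append((lat, lon))
    return list(groups.values())

def centroid(group):
    """Average position of a non-empty group of points."""
    n = len(group)
    return (sum(p[0] for p in group) / n, sum(p[1] for p in group) / n)

def reduce_to_about(points, target, cell_size, max_rounds=20):
    """Subdivide cells until there are at least `target` of them."""
    groups = split_into_cells(points, cell_size)
    for _ in range(max_rounds):
        if len(groups) >= target:
            break
        # Too few cells: rerun the grid inside every cell at half the size.
        cell_size /= 2.0
        groups = [sub for g in groups for sub in split_into_cells(g, cell_size)]
    return [centroid(g) for g in groups]
```

Because each round can split many cells at once, the final count can overshoot `target`, which matches the "pretty close, not exact" caveat.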
Have you considered looking at k-means clustering algorithms?
These kinds of algorithms are used to group close or related objects (in your case, points) into clusters, based on their nearest mean. They are usually well optimized and built to handle large amounts of data. To go from 4000 points to 1000 points, you would run the algorithm with 1000 clusters on your data and get back 1000 groups of points, each of which can be merged into a single point.
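To make the idea concrete, here is a bare-bones sketch of the standard k-means loop (Lloyd's algorithm) in plain Python. It is only an illustration: `kmeans_reduce` and its parameters are names I made up, it treats latitude/longitude as flat coordinates, and it picks the starting centers at random with a fixed seed.

```python
import random

def kmeans_reduce(points, k, iterations=20, seed=0):
    """Reduce a list of (lat, lon) tuples to k cluster centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # requires k <= len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: (p[0] - centers[i][0]) ** 2
                + (p[1] - centers[i][1]) ** 2,
            )
            clusters[nearest].append(p)
        # Update step: move every center to the mean of its members.
        new_centers = []
        for i, members in enumerate(clusters):
            if members:
                n = len(members)
                new_centers.append(
                    (sum(m[0] for m in members) / n,
                     sum(m[1] for m in members) / n)
                )
            else:
                new_centers.append(centers[i])  # empty cluster: keep old center
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return centers
```

For 4000 points and 1000 clusters this naive version does millions of distance comparisons per iteration, so an optimized library implementation would probably be the better choice in practice.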
To speed up working out distances between points:
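If you start from the equirectangular approximation, D = R*Sqrt((Lat2 - Lat1)^2 + cos^2((Lat2 + Lat1)/2)*(Lon2 - Lon1)^2) with the angles in radians, and do some elementary algebra you get: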
D = R*Sqrt(Lat2^2 + Lat1^2 - 2*Lat1*Lat2 + cos^2((Lat2 + Lat1)/2)*(Lon2^2 + Lon1^2 - 2*Lon1*Lon2))
The first thing you can do to speed this up is normalise to the radius of the Earth (R) and compare squared distances rather than distances, thus avoiding the square root and R term, saving yourself 2 calculations per comparison. Leaving:
valToCompare = Lat2^2 + Lat1^2 - 2*Lat1*Lat2 + cos^2((Lat2 + Lat1)/2)*(Lon2^2 + Lon1^2 - 2*Lon1*Lon2)
Another thing you could do is precalculate Lat^2 and Lon^2 for each coordinate - reducing the number of calculations for each comparison by 4.
Further, if the points are all relatively close together in latitude you could take an approximation for the cos^2 term by precalculating it using the latitude of a random point or the average latitude of all the points, rather than the average latitude of the two points being compared. This reduces the number of calculations for each comparison by another 4.
Finally, you could pre-calculate 2*Lat and 2*Lon for each point cutting out 2 more calculations for each comparison.
None of this improves your algorithm per se but it should make it run faster and can be applied to any algorithm that needs to compare the distances between points.
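Putting the precalculation steps together, a sketch might look like this. The helper names (`precompute`, `mean_cos_squared`, `val_to_compare`) are hypothetical, the input is assumed to be in degrees and is converted to radians once, and the cos^2 term uses the average latitude of all the points as suggested above.

```python
from math import cos, radians

def precompute(points_deg):
    """Return per-point tuples (lat, lon, lat^2, lon^2, 2*lat, 2*lon), in radians."""
    rows = []
    for lat_deg, lon_deg in points_deg:
        lat, lon = radians(lat_deg), radians(lon_deg)
        rows.append((lat, lon, lat * lat, lon * lon, 2 * lat, 2 * lon))
    return rows

def mean_cos_squared(rows):
    """cos^2 of the average latitude, computed once for the whole data set."""
    mean_lat = sum(r[0] for r in rows) / len(rows)
    return cos(mean_lat) ** 2

def val_to_compare(a, b, cos2):
    """Squared distance with R dropped; only meaningful for comparisons."""
    lat_term = b[2] + a[2] - a[4] * b[0]  # Lat2^2 + Lat1^2 - 2*Lat1*Lat2
    lon_term = b[3] + a[3] - a[5] * b[1]  # Lon2^2 + Lon1^2 - 2*Lon1*Lon2
    return lat_term + cos2 * lon_term
```

For example, after `rows = precompute(points)` and `cos2 = mean_cos_squared(rows)`, the closer of two candidates to `rows[0]` is whichever gives the smaller `val_to_compare(rows[0], candidate, cos2)`. One caveat: subtracting nearly equal squared terms can lose some floating-point precision when two points are extremely close together.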