Question
What is an example of a data set one would use with the k-Nearest Neighbors algorithm?
I understand the concept, but I am unsure what kind of data one would use for the x, y coordinates. Can someone provide an example of a dataset (with x, y coordinates) for the k-nearest-neighbors algorithm?
Answer 1:
Put simply, NN (nearest-neighbor) search works like this:
- You have a database of elements (here, 2-dimensional points with coordinates x and y).
- A query arrives, which is of the same type as the elements of the database, so a 2D point in your case.
- The goal is to find the point in the database that is closest to the query point.
There are many algorithms that let us avoid searching the whole database and examine only the part that is relevant to the query, thus answering the query efficiently.
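The answer does not name a specific algorithm, so as one illustration here is a minimal k-d tree sketch in Python. The k-d tree is swapped in as a well-known example of such a structure; the function names `build_kdtree` and `nearest`, the sample points, and the choice of Euclidean distance are all assumptions made for this sketch.

```python
import math

def build_kdtree(points, depth=0):
    """Recursively split points on alternating axes; returns nested dicts or None."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def nearest(node, query, best=None):
    """Return the stored point closest to query, skipping subtrees that cannot win."""
    if node is None:
        return best
    if best is None or math.dist(node["point"], query) < math.dist(best, query):
        best = node["point"]
    diff = query[node["axis"]] - node["point"][node["axis"]]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best match so far.
    if abs(diff) < math.dist(best, query):
        best = nearest(far, query, best)
    return best

# Made-up sample points, for illustration only.
sample = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(sample)
print(nearest(tree, (9, 2)))
```

The pruning test is what saves work: a whole subtree is skipped whenever the splitting plane is already farther from the query than the best point found so far.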
Example:
The database has six 2D points (this is the dataset you are referring to):
0 0
1 1
2 2
3 3
4 4
5 5
A 2D query point arrives:
q = (9, 9)
The answer is the point closest to q, which in this example is (5, 5).
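The lookup above can be sketched as a brute-force scan in Python. The names `database` and `nearest` are illustrative, and Euclidean distance is assumed, since the answer does not say which distance it has in mind.

```python
import math

# The six 2D points from the example database.
database = [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5)]

def nearest(points, query):
    """Return the point with the smallest Euclidean distance to query."""
    return min(points, key=lambda p: math.dist(p, query))

q = (9, 9)
print(nearest(database, q))  # (5, 5)
```

This scans every point, which is fine for six points; larger databases are where the faster search structures become worthwhile.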
In a kNN search, the query asks for the k closest elements of the database, which in our example are the k points of the database above that are closest to the query point q.
So for k = 3, for example, the answer should be:
5 5 // the 1st closest point to q
4 4 // the 2nd closest point to q
3 3 // the 3rd closest point to q
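A minimal Python sketch of this k = 3 query, again as a brute-force scan. The helper name `k_nearest` is hypothetical and Euclidean distance is an assumption; `heapq.nsmallest` returns the results ordered from closest to farthest.

```python
import heapq
import math

# The six 2D points from the example database.
database = [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5)]

def k_nearest(points, query, k):
    """Return the k points closest to query, nearest first (Euclidean distance)."""
    return heapq.nsmallest(k, points, key=lambda p: math.dist(p, query))

q = (9, 9)
print(k_nearest(database, q, 3))  # [(5, 5), (4, 4), (3, 3)]
```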
Answer 2:
You do not understand the concept.
k-NN isn't limited to datasets of 2-dimensional points (with x and y coordinates).
Any dataset can be used with k-NN, regardless of the number of features, and you can use many different distance measures (even ones that are not technically valid metrics).
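To illustrate, here is a small Python sketch with three features per row and two interchangeable distance functions. The rows are made up for this example, and the names `manhattan`, `chebyshev`, and `k_nearest` are hypothetical helpers, not part of any library.

```python
import heapq

# Made-up rows with three numeric features each (illustrative only).
rows = [(2, 2, 2), (3, 0, 0), (5, 5, 5), (0, 4, 1)]

def manhattan(a, b):
    """Sum of absolute per-feature differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    """Largest absolute per-feature difference."""
    return max(abs(x - y) for x, y in zip(a, b))

def k_nearest(points, query, k, distance):
    """Return the k points closest to query under the given distance function."""
    return heapq.nsmallest(k, points, key=lambda p: distance(p, query))

query = (0, 0, 0)
print(k_nearest(rows, query, 2, manhattan))  # [(3, 0, 0), (0, 4, 1)]
print(k_nearest(rows, query, 2, chebyshev))  # [(2, 2, 2), (3, 0, 0)]
```

The same data and query give different neighbors under the two distances, so the choice of distance is part of the model and not just an implementation detail.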
Source: https://stackoverflow.com/questions/23728813/example-data-set-for-the-k-nearest-neighbors-algorithm