Saving and incrementally updating nearest-neighbor model in R

谎友^ 2021-02-01 14:07

There are several nearest neighbor R packages (e.g., FNN, RANN, yaImpute) but none of them seem to allow saving off the NN data structure (cover tree, KD tree etc.) so that the

2 answers
  • 2021-02-01 14:46

    There is a good reason why no NN package does that.

    The reason is that the "NN data structure" necessarily includes all the input data points (for example, stored in the nodes of a KD-tree), so it saves no space compared with the input data. It might seem that keeping it around would at least save time, since the KD-tree would not have to be re-created for each new input, but this is not the case, alas.

    The reason is that the time to build a KD-tree is, in general, worse than linearithmic (a naive build that re-sorts the points at every level costs O(n log² n)). This means that, for large inputs, it makes sense to sort the data before building the KD-tree: the tree is built faster and comes out better balanced, which improves the search too (search is also worse than logarithmic, in general). This approach speeds up modeling and evaluation but, of course, discourages incremental updates.

    Your best bet, I think, is to find a generic KD-tree package and use it instead.
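
    To make the points above concrete, here is a minimal base-R sketch of a KD-tree stored as nested lists. It is an illustration only, not the data structure used by FNN, RANN, or yaImpute; `build_kdtree` and `count_points` are hypothetical helpers written for this example. It re-sorts at every level (the naive, slower build), and it shows that the tree holds every input point and that a plain R object like this can be written to disk with `saveRDS()`.

    ```r
    # Hypothetical helper: build a KD-tree as nested lists by splitting on the
    # median of one coordinate per level (naive build: re-sorts at each level).
    build_kdtree <- function(pts, depth = 0L) {
      n <- nrow(pts)
      if (n == 0L) return(NULL)
      axis <- depth %% ncol(pts) + 1L            # cycle through the dimensions
      pts <- pts[order(pts[, axis]), , drop = FALSE]
      mid <- (n + 1L) %/% 2L                     # row index of the median point
      list(
        point = pts[mid, ],
        axis  = axis,
        left  = build_kdtree(pts[seq_len(mid - 1L), , drop = FALSE], depth + 1L),
        right = build_kdtree(pts[seq_len(n - mid) + mid, , drop = FALSE], depth + 1L)
      )
    }

    # Hypothetical helper: count the points stored in the tree.
    count_points <- function(node) {
      if (is.null(node)) return(0L)
      1L + count_points(node$left) + count_points(node$right)
    }

    set.seed(1)
    pts  <- matrix(runif(200), ncol = 2)         # 100 random 2-D points
    tree <- build_kdtree(pts)

    # The tree is an ordinary R list, so it can be saved and restored.
    path <- tempfile(fileext = ".rds")
    saveRDS(tree, path)
    restored <- readRDS(path)

    n_stored <- count_points(restored)           # every input point is in the tree
    same     <- identical(tree, restored)        # round trip preserves the object
    ```

    A tree built by a compiled package is a different matter: it usually lives in C/C++ memory rather than in an R list, which is part of why saving it is not straightforward.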

  • 2021-02-01 15:00

    The nabor package lets you build a tree and subsequently perform queries on it. But I don't think it lets you update the tree incrementally.
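
    A minimal sketch of that build-once, query-many pattern, assuming the `WKNND` wrapper class and its `query()` method behave as described in the nabor documentation (the data here is made up for illustration). Adding points is shown as a full rebuild, which is an assumption about the simplest workaround rather than a feature of the package.

    ```r
    library(nabor)

    set.seed(42)
    train  <- matrix(rnorm(1000 * 3), ncol = 3)  # made-up reference points
    batch1 <- matrix(rnorm(5 * 3), ncol = 3)     # made-up query batches
    batch2 <- matrix(rnorm(5 * 3), ncol = 3)

    # Build the tree once, then reuse it for several queries.
    tree <- WKNND(train)
    res1 <- tree$query(batch1, k = 3, eps = 0)
    res2 <- tree$query(batch2, k = 3, eps = 0)

    # No incremental update: to add points, rebuild from the combined data.
    extra <- matrix(rnorm(10 * 3), ncol = 3)
    tree2 <- WKNND(rbind(train, extra))
    res3  <- tree2$query(batch1, k = 3, eps = 0)
    ```

    Each result should be a list of neighbor indices and distances with one row per query point. Whether the tree object survives `saveRDS()` across sessions is worth testing before relying on it, since it wraps compiled-side state.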
