Minimize the sum of errors of representative integers

走了就别回头了 2021-02-07 20:13

Given n integers in the range [0, 10000], D1, D2, ..., Dn, where duplicates are allowed and n can be huge:

I want to find k distinct represent

8 answers
  • 2021-02-07 20:45

    This is similar to one-dimensional k-medians clustering.

    The DP I suggested previously won't work; I think we need a table from (n', k', i) to the optimal solution on D1 ≤ … ≤ Dn' with k' representatives of which the greatest is i. Given the bounds on D, the running time is on the order of n^2 k with a very large constant, so you should probably adapt one of the heuristics that people use for k-means.
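
    To make the k-medians comparison concrete, here is a minimal sketch of the textbook dynamic program for one-dimensional k-medians. It is not the (n', k', i) table described above. It assumes the error is the sum of absolute distances from each value to its nearest representative, which may differ from the error defined in the question, and it does not enforce distinct representatives when the data contains many duplicates. The name `k_medians_1d` is made up for this illustration.

```python
from itertools import accumulate


def k_medians_1d(data, k):
    """Return (total_error, representatives) for 1-D k-medians.

    Assumption: error = sum of |d - r| over each value d and the
    representative r of the contiguous cluster that d falls into.
    """
    xs = sorted(data)
    n = len(xs)
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= len(data)")
    prefix = [0] + list(accumulate(xs))  # prefix[t] = sum of xs[:t]

    def cost(i, j):
        # Error of covering xs[i..j] (inclusive) with its median xs[m].
        m = (i + j) // 2
        left = xs[m] * (m - i) - (prefix[m] - prefix[i])
        right = (prefix[j + 1] - prefix[m + 1]) - xs[m] * (j - m)
        return left + right

    inf = float("inf")
    # best[c][j]: minimal error for the first j values using c clusters.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):  # last cluster is xs[i..j-1]
                cand = best[c - 1][i] + cost(i, j - 1)
                if cand < best[c][j]:
                    best[c][j] = cand
                    cut[c][j] = i

    # Walk the stored cut points backwards to recover one median per cluster.
    reps = []
    j = n
    for c in range(k, 0, -1):
        i = cut[c][j]
        reps.append(xs[(i + j - 1) // 2])
        j = i
    return best[k][n], reps[::-1]
```

    For example, `k_medians_1d([1, 2, 3, 10, 11, 12], 2)` splits the data into {1, 2, 3} and {10, 11, 12} and returns `(4, [2, 11])`. The three nested loops make this sketch O(k·n^2), so for huge n it would probably be worth collapsing duplicates into counts per value first, since the values are bounded.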

  • 2021-02-07 20:46

    If the distribution is close to random and n is large enough, you are generally wasting time trying to optimize: the real cost in computation time buys diminishing percentage improvements over the expected average. The fastest average-case solution is to place the lower k-1 representatives at the low ends of intervals of width M/(k-1), where M is the least upper bound minus the greatest lower bound (i.e., M = the maximum possible number - 0), and the last, k-th one at M+1. Working out those values takes O(k) time, which is the best we can do with the information given in this problem. Of course, stating this is not a proof.

    My point is this. The approach above is a simplification that I think is very practical for one large class of inputs. At the other extreme, it is straightforward to compute the error for every possible selection of representatives and pick the smallest, but the running time makes that solution intractable in many cases. That the person asking expects more than the most direct, exact (and intractable) answer leaves a lot open-ended. We could trim at the edges from here to eternity, trying to quantify all sorts of properties across the huge solution space for every possible combination of n numbers and every value of k.
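
    The two extremes in this answer can be compared on a toy input with a rough sketch like the one below. It rests on two assumptions that are not in the answer: the error is taken to be the absolute distance from each value to its nearest representative, and the spacing rule is read as k values spread uniformly over [0, M], with the last one at M rather than M+1 so that it stays in range. The function names are invented for this illustration.

```python
from itertools import combinations


def total_error(data, reps):
    # Assumed error: distance from each value to its nearest representative.
    return sum(min(abs(d - r) for r in reps) for d in data)


def evenly_spaced(k, upper=10000):
    # One reading of the spacing heuristic: k points uniformly over [0, upper].
    # Takes O(k) time and never looks at the data.
    if k < 2:
        raise ValueError("this sketch needs k >= 2")
    return [i * upper // (k - 1) for i in range(k)]


def exhaustive(data, k, upper):
    # Try every set of k distinct integers in [0, upper] and keep the best.
    # Only feasible for tiny ranges: there are C(upper + 1, k) candidates.
    return min(
        combinations(range(upper + 1), k),
        key=lambda reps: total_error(data, reps),
    )


data = [4, 4, 5, 5]                       # tightly clustered, far from uniform
spaced = evenly_spaced(2, upper=10)       # [0, 10]
exact = exhaustive(data, 2, upper=10)     # (4, 5)
print(spaced, total_error(data, spaced))  # error 18
print(exact, total_error(data, exact))    # error 0
```

    On clustered data like this the fixed spacing does badly, while on data that really is spread uniformly over the range the gap should be much smaller, which is the trade-off this answer argues about.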
