What .NET dictionary supports a “find nearest key” operation?

冷暖自知 提交于 2019-12-29 07:31:09

问题


I'm converting some C++ code to C# and it calls std::map::lower_bound(k) to find an entry in the map whose key is equal to or greater than k. However, I don't see any way to do the same thing with .NET's SortedDictionary. I suspect I could implement a workaround using SortedList, but unfortunately SortedList is too slow (O(n) for inserting and deleting keys). What can I do?

Note: I found a workaround using that takes advantage of my particular scenario... Specifically, my keys are a dense population of integers starting at just over 0, so I used a List<TValue> as my dictionary with the list index serving as the key, and searching for a key equal or greater than k can be done in only a few loop iterations. But it would still be nice to see the original question answered.


回答1:


It took a couple months of work, but at last I can offer at least a partial solution to this problem... I call it the Compact Patricia Trie, a sorted dictionary that offers a "find next larger key" operation.

http://www.codeproject.com/KB/recipes/cptrie.aspx

It's only a partial solution since only certain kinds of keys are supported, namely byte[], string, and all primitive integer types (Int8..UInt64). Also, string sorting is case-sensitive.




回答2:


The problem is that a dictionary/hash table is designed to arrive at a unique memory location based on an input value, so you'll need a data structure that is designed to accommodate a range related to each value you store, and at the same time update each interval correctly

I think skip lists (or balanced binary trees) can help you. Although they cannot perform lookups in O(n), they can do logarithmically and still faster than trees.

I know this is not a proper answer since I cannot say that the .NET BCL already contains such a class, you'll unfortunately have to implement one yourself, or find a 3rd party assembly that supports it for you. There seems to be a nice example over at The CodeProject here, though.




回答3:


You can try the code i wrote below. it using binary search, therefore assuming the list/array is pre-sorted.

public static class ListExtensions
{
    public static int GetAtMostIndex<TItem, TValue>(/*this*/ IList<TItem> list, TValue value, Func<TItem, TValue, int> comparer)
    {
        return GetAtMostIndex(list, value, comparer, 0, list.Count);
    }

    public static int GetAtLeastIndex<TItem, TValue>(/*this*/ IList<TItem> list, TValue value, Func<TItem, TValue, int> comparer)
    {
        return GetAtLeastIndex(list, value, comparer, 0, list.Count);
    }

    public static int GetAtMostIndex<TItem, TValue>(/*this*/ IList<TItem> list, TValue value, Func<TItem, TValue, int> comparer, int index, int count)
    {
        if (count == 0)
        {
            return -1;
        }

        int startIndex = index;
        int endIndex = index + count - 1;
        int middleIndex = 0;
        int compareResult = -1;

        while (startIndex < endIndex)
        {
            middleIndex = (startIndex + endIndex) >> 1; //  / 2
            compareResult = comparer.Invoke(list[middleIndex], value);

            if (compareResult > 0)
            {
                endIndex = middleIndex - 1;
            }
            else if (compareResult < 0)
            {
                startIndex = middleIndex + 1;
            }
            else
            {
                return middleIndex;
            }
        }

        if (startIndex == endIndex)
        {
            compareResult = comparer.Invoke(list[startIndex], value);

            if (compareResult <= 0)
            {
                return startIndex;
            }
            else
            {
                int returnIndex = startIndex - 1;

                if (returnIndex < index)
                {
                    return -1;
                }
                else
                {
                    return returnIndex;
                }
            }
        }
        else
        {
            //todo: test
            return startIndex - 1;
        }
    }

    public static int GetAtLeastIndex<TItem, TValue>(/*this*/ IList<TItem> list, TValue value, Func<TItem, TValue, int> comparer, int index, int count)
    {
        if (count == 0)
        {
            return -1;
        }

        int startIndex = index;
        int endIndex = index + count - 1;
        int middleIndex = 0;
        int compareResult = -1;

        while (startIndex < endIndex)
        {
            middleIndex = (startIndex + endIndex) >> 1; //  / 2
            compareResult = comparer.Invoke(list[middleIndex], value);

            if (compareResult > 0)
            {
                endIndex = middleIndex - 1;
            }
            else if (compareResult < 0)
            {
                startIndex = middleIndex + 1;
            }
            else
            {
                return middleIndex;
            }
        }

        if (startIndex == endIndex)
        {
            compareResult = comparer.Invoke(list[startIndex], value);

            if (compareResult >= 0)
            {
                return startIndex;
            }
            else
            {
                int returnIndex = startIndex + 1;

                if (returnIndex >= index + count)
                {
                    return -1;
                }
                else
                {
                    return returnIndex;
                }
            }
        }
        else
        {
            return endIndex + 1;
        }
    }
}



回答4:


I created three data structures related to B+ trees that provide this functionality for any data type: BList<T>, BDictionary<K,V> and BMultiMap<K,V>. Each of these data structures provide FindLowerBound() and FindUpperBound() methods that work like C++'s lower_bound and upper_bound.




回答5:


find nearest to K:

dict.Keys.Where(i => i >= K).OrderBy(i => i).First();

or much faster:

public int? GetNearestKey(dict, K) 
{
    int? lowerK = null;
    foreach (int key in dict.Keys)
    {
        if (key == K) 
        {
            lowerK = K;
            break; 
        }
        else if (key >= K && (!lowerK.HasValue || key < lowerK))
        {
            lowerK = key;
        }
    }
    return lowerK;
}



回答6:


There isn't a binary search tree collection implementation in the base framework, so you'll either have to build one or find an implementation. As you noted, SortedList is closest in terms of searching but is slower (due to its underlying array implementation) for insertion/deletion.




回答7:


I think there's a mistake in the question about SortedList complexity.

SortedList has O(log(n)) amortized complexity for inserting new item. If you know in advance the capacity it can be done in O(Log(n)) in the worst case.




回答8:


You can do this for SortedSet<T> with following extension methods:

public static class SortedSetExtensions
{
    public static bool FindLowerOrEqualThan<T>(this SortedSet<T> set, T value, out T first)
    {
        if(set.Count == 0)
        {
            first = default(T);
            return false;
        }

        var minimum = set.Min;

        if(set.Comparer.Compare(minimum, value) > 0)
        {
            first = default(T);
            return false;
        }

        first = set.GetViewBetween(minimum, value).Max;
        return true;
    }

    public static bool FindGreaterOrEqualThan<T>(this SortedSet<T> set, T value, out T first)
    {
        if (set.Count == 0)
        {
            first = default(T);
            return false;
        }

        var maximum = set.Max;

        if (set.Comparer.Compare(maximum, value) < 0)
        {
            first = default(T);
            return false;
        }

        first = set.GetViewBetween(value, maximum).Min;
        return true;
    }
}


来源:https://stackoverflow.com/questions/1690929/what-net-dictionary-supports-a-find-nearest-key-operation

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!