Prefix search through list/dictionary using .NET StringDictionary?

情歌与酒 2021-01-15 09:06

I was wondering if .NET offers any standard functionality for doing a prefix search through a list or a dictionary object. I came across the StringDictionary, b

5 Answers
  • 2021-01-15 09:39

    Below is a basic implementation of a set of strings that can be searched efficiently by prefix.

    The idea is to keep all the words of the set in a trie. To find all words that start with a given prefix, we walk down to the node corresponding to the last character of the prefix, then run a depth-first search from that node and collect every word stored at it or below it.

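    using System.Collections.Generic;
    using System.Linq;

    // A set of strings backed by a trie; returns all words that start with a given prefix.
    public class PrefixSearchableSet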
    {
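        // Root level of the trie: maps the first character of each word to its node.
        private readonly Dictionary<char, TrieNode> _letterToNode = new Dictionary<char, TrieNode>();
        // The empty string has no node in the trie, so it is tracked separately.
        private bool _isEmptyWordIncluded;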
    
        public PrefixSearchableSet(IEnumerable<string> words = null)
        {
            if (words is null) return;
            foreach (string word in words)
            {
                AddWord(word);
            }
        }
    
        public void AddWord(string word)
        {
            if (word is null) return;
    
            if (word is "")
            {
                _isEmptyWordIncluded = true;
            }
            else
            {
                TrieNode node = FindOrAdd(_letterToNode, word[0]);
                foreach (char c in word.Skip(1))
                {
                    node = FindOrAdd(node.Children, c);
                }
    
                node.Word = word;
            }
        }
    
        public List<string> GetWords(string prefix)
        {
            List<string> words = new List<string>();
    
            if (prefix is null) return words;
    
            if (prefix is "")
            {
                if (_isEmptyWordIncluded) words.Add("");
                foreach (TrieNode trieNode in _letterToNode.Values)
                {
                    trieNode.CollectWords(words);
                }
                return words;
            }
    
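            // Walk down the trie one character at a time; node ends up null if the prefix is absent.
            _letterToNode.TryGetValue(prefix[0], out TrieNode node);
            foreach (char c in prefix.Skip(1))
            {
                if (node is null) break;
                node.Children.TryGetValue(c, out node);
            }
            // Collect every word stored at or below the node the prefix leads to.
            node?.CollectWords(words);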
    
            return words;
        }
    
        private static TrieNode FindOrAdd(Dictionary<char, TrieNode> letterToNode, char key)
        {
            if (letterToNode.TryGetValue(key, out TrieNode node)) return node;
            return letterToNode[key] = new TrieNode();
        }
    
        private class TrieNode
        {
            public Dictionary<char, TrieNode> Children { get; } = new Dictionary<char, TrieNode>();
    
            // The complete word that ends at this node, or null if no word ends here.
            public string Word { get; set; }
    
            public void CollectWords(List<string> words)
            {
                if (Word != null) words.Add(Word);
                foreach (TrieNode child in Children.Values)
                {
                    child.CollectWords(words);
                }
            }
        }
    }
    
  • 2021-01-15 09:42

    StringDictionary is merely a hash table whose keys and values are strings. It predates generics, from a time when Dictionary<string, string> was not yet possible.

    The data structure that you want here is a trie. There are implementations on CodeProject:

    1. Phone Directory Implementation Using TRIE
    2. A Reusable Prefix Tree using Generics in C# 2.0

    Or, if you're that kind of guy, roll your own (see CLRS).

  • 2021-01-15 09:48

    I made a generic implementation of this available here.

    Since string implements IEnumerable<char>, you can use it with char as the type argument for TKeyElement.

  • 2021-01-15 09:50

    I think StringDictionary is old school (pre-generics). You should probably use a Dictionary(Of String, String) instead, because it implements the generic IEnumerable(Of T) and so works directly with LINQ. One extremely lame thing about StringDictionary is that it's case-insensitive: keys are converted to lowercase before they are stored.
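
    As a rough illustration of the LINQ route, here is a minimal C# sketch. The class name PrefixScanDemo, the helper KeysWithPrefix, and the sample data are made up for this example. It is a plain linear scan over the keys, so each query costs O(n); it is fine for small dictionaries but it is not an indexed prefix search.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Collections.Specialized;
    using System.Linq;

    public static class PrefixScanDemo
    {
        // Hypothetical helper: filters the keys with a case-sensitive, ordinal StartsWith.
        // The result is sorted only to make the output order predictable.
        public static List<string> KeysWithPrefix(Dictionary<string, string> map, string prefix)
        {
            return map.Keys
                .Where(k => k.StartsWith(prefix, StringComparison.Ordinal))
                .OrderBy(k => k, StringComparer.Ordinal)
                .ToList();
        }

        public static void Main()
        {
            var map = new Dictionary<string, string>
            {
                ["apple"] = "fruit",
                ["apricot"] = "fruit",
                ["Apex"] = "brand",
                ["banana"] = "fruit",
            };

            // Dictionary keys keep their case, so "Apex" does not match the prefix "ap".
            Console.WriteLine(string.Join(", ", KeysWithPrefix(map, "ap"))); // apple, apricot

            // StringDictionary lowercases keys on insertion, so the original casing is lost.
            var legacy = new StringDictionary();
            legacy.Add("Apex", "brand");
            Console.WriteLine(legacy.ContainsKey("apex")); // True
        }
    }
    ```

    If case-insensitive matching is actually wanted, passing StringComparison.OrdinalIgnoreCase to StartsWith gives it without losing the original key casing.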

  • 2021-01-15 09:52

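    I don't believe StringDictionary supports a prefix search, but if you use a SortedList<,> you can binary search its sorted keys for the first key that is not less than your prefix, then scan forward for as long as the keys still start with that prefix. All matching keys sit next to each other in that range.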
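
    The binary-search idea could be sketched in C# roughly as follows. The names SortedPrefixSearch, LowerBound and KeysWithPrefix are hypothetical, and the sketch assumes the SortedList was created with StringComparer.Ordinal, so that the list's sort order agrees with the ordinal comparison used in the search.

    ```csharp
    using System;
    using System.Collections.Generic;

    public static class SortedPrefixSearch
    {
        // Hypothetical helper: index of the first key that is >= prefix (ordinal order).
        private static int LowerBound(IList<string> keys, string prefix)
        {
            int lo = 0, hi = keys.Count;
            while (lo < hi)
            {
                int mid = lo + (hi - lo) / 2;
                if (string.CompareOrdinal(keys[mid], prefix) < 0) lo = mid + 1;
                else hi = mid;
            }
            return lo;
        }

        // Returns all keys that start with prefix, in sorted order.
        // Assumes the list was constructed with StringComparer.Ordinal.
        public static List<string> KeysWithPrefix(SortedList<string, string> list, string prefix)
        {
            IList<string> keys = list.Keys;
            var result = new List<string>();
            // Matching keys are contiguous, so stop at the first key that does not match.
            for (int i = LowerBound(keys, prefix);
                 i < keys.Count && keys[i].StartsWith(prefix, StringComparison.Ordinal);
                 i++)
            {
                result.Add(keys[i]);
            }
            return result;
        }

        public static void Main()
        {
            var list = new SortedList<string, string>(StringComparer.Ordinal)
            {
                ["banana"] = "3",
                ["apple"] = "1",
                ["apricot"] = "2",
                ["ape"] = "0",
            };

            Console.WriteLine(string.Join(", ", KeysWithPrefix(list, "ap"))); // ape, apple, apricot
        }
    }
    ```

    The lookup costs O(log n) for the binary search plus the number of matches, but note that inserting into a SortedList is O(n) in the general case, so this suits data that is loaded once and queried often.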
