Comparison of search speed for B-Tree and Trie

Question

I am trying to find out which is more efficient in terms of search speed: a trie or a B-tree. I have a dictionary of English words and I want to locate a word in that dictionary efficiently.

Answer 1:

If by "more efficient in time of search" you mean theoretical time complexity, then a B-tree offers O(log n * |S|)¹ search time, while a trie offers O(|S|), where |S| is the length of the searched string and n is the number of elements in the dictionary.

If by "more efficient in time of search" you refer to actual real life run time, that depends on the actual implementation, actual data and actual search behavior. Some examples that might influence the answer:

Size of data
Storage system (for example: RAM/Flash/disk/distributed filesystem/...)
Distribution of searches
Code optimizations of each implementation
(and much more)

(1) There are O(log n) comparisons, and each comparison takes O(|S|) time, since in the worst case you need to traverse the entire string to decide which one is greater.
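To make the two bounds concrete, here is a minimal Python sketch. It is not a real B-tree: a sorted list searched with `bisect_left` stands in for the O(log n) comparison-based lookup, and the names `TrieNode`, `trie_insert`, `trie_contains` and `sorted_contains` are made up for this illustration. The trie lookup touches one node per character of the query, while the sorted lookup performs O(log n) string comparisons that can each cost up to O(|S|).

```python
from bisect import bisect_left


class TrieNode:
    """One trie node: a map from character to child, plus an end-of-word flag."""

    def __init__(self):
        self.children = {}
        self.is_word = False


def trie_insert(root, word):
    """Insert a word, creating one node per character where needed."""
    node = root
    for ch in word:
        node = node.children.setdefault(ch, TrieNode())
    node.is_word = True


def trie_contains(root, word):
    """Follow one child link per character: O(|S|) steps, independent of n."""
    node = root
    for ch in word:
        node = node.children.get(ch)
        if node is None:
            return False
    return node.is_word


def sorted_contains(sorted_words, word):
    """Binary search: O(log n) comparisons, each up to O(|S|) characters.

    Assumption: the sorted list is only a stand-in for a B-tree's
    comparison-based search, not an actual B-tree.
    """
    i = bisect_left(sorted_words, word)
    return i < len(sorted_words) and sorted_words[i] == word


words = ["apple", "apply", "banana", "band", "bandit"]
root = TrieNode()
for w in words:
    trie_insert(root, w)
sorted_words = sorted(words)
```

Both lookups give the same answers (for example, "band" is found and the bare prefix "ban" is not); only the amount of work per query differs.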

Answer 2:

It depends on what you need. If you want to retrieve a whole subtree, a B+ tree is your best choice because it is space efficient. The branching factor b of the B+ tree also affects its performance, since it determines the number of intermediate nodes. If h is the height of the tree and n_max is the maximum number of keys it can hold, then n_max ≈ b^h. Therefore h ≈ log(n_max) / log(b).

With n = 1 000 000 000 and b = 100, we have h ≈ 5 (log(10^9) / log(100) = 4.5, rounded up). That means only about 5 pointer dereferences to go from the root to a leaf. It's more cache-friendly than a trie.
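The arithmetic can be checked with a short Python sketch. The helper name `estimated_height` is hypothetical, and it assumes an idealized tree in which every node has exactly b children; it uses integer multiplication rather than floating-point logarithms to avoid rounding surprises.

```python
def estimated_height(n, b):
    """Smallest h such that b**h >= n, i.e. roughly log(n) / log(b) rounded up.

    Assumption: an idealized, completely full tree with branching factor b.
    """
    h = 0
    capacity = 1  # number of keys reachable with h levels
    while capacity < n:
        capacity *= b
        h += 1
    return h


print(estimated_height(1_000_000_000, 100))  # 5
print(estimated_height(1_000_000_000, 2))    # 30
```

The second call shows why the branching factor matters: with b = 2 the same billion keys need about 30 levels instead of 5.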

But if you want to get the first N children of a subtree, then a trie is the best choice because you simply visit fewer nodes than in the B+ tree scenario. Word prefix completion is also handled well by a trie.
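Prefix completion limited to the first N results could look like the following Python sketch. The dict-based trie, the `END` marker, and the function names `build_trie`, `iter_words` and `complete` are all assumptions made for illustration. The walk descends to the node for the prefix and then lazily enumerates words below it, stopping as soon as `limit` results have been produced.

```python
from itertools import islice

# Hypothetical end-of-word marker; assumes "$" never occurs in the words.
END = "$"


def build_trie(words):
    """Build a nested-dict trie: each key is a character, END marks a word."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = True
    return root


def iter_words(node, prefix):
    """Lazily yield every word below this node in lexicographic order."""
    if END in node:
        yield prefix
    for ch in sorted(k for k in node if k != END):
        yield from iter_words(node[ch], prefix + ch)


def complete(root, prefix, limit):
    """Return at most `limit` words that start with `prefix`."""
    node = root
    for ch in prefix:
        node = node.get(ch)
        if node is None:
            return []  # no word has this prefix
    # islice stops the generator early, so the rest of the subtree is not visited.
    return list(islice(iter_words(node, prefix), limit))


trie = build_trie(["band", "bandit", "banana", "bank", "apple"])
print(complete(trie, "ban", 3))  # ['banana', 'band', 'bandit']
```

Because `iter_words` is a generator, asking for 3 completions never touches the "bank" branch in this example.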

Source: https://stackoverflow.com/questions/43309232/comparison-of-search-speed-for-b-tree-and-trie

Tags

algorithm

data-structures

trie

b-tree