T-Tree or B-Tree

Posted by ≯℡__Kan透↙ on 2019-12-12 17:30:43

Question


The T-tree algorithm is described in this paper. The T*-tree is an improvement on the T-tree that handles query operations better, including range queries, while keeping all the other good features of the T-tree.
That algorithm is described in the paper "T*-tree: A Main Memory Database Index Structure for Real-Time Applications".
According to this research paper, the T-tree is faster than the B-tree/B+tree when the dataset fits in memory. I implemented the T-tree and T*-tree as described in these papers and compared their performance with the B-tree/B+tree, but the B-tree/B+tree performed better than the T-tree/T*-tree in all test cases (insertion, deletion, searching).
I read that the T-tree is an efficient index structure for in-memory databases and that it is used by Oracle TimesTen, but my results did not show that.
If anyone knows the reason or has any comments, I would be glad to hear them.


Answer 1:


T-Trees are not a fundamental data structure in the same sense that AVL trees or B-trees are. They are just a hacked version of balanced binary trees and as such there may or may not be niche applications where they offer decent performance.

In this day and age they are bound to suffer horribly because of their poor locality, both in the sense of expected block/page transfer counts and in the sense of cache locality. The latter is evident since in all node accesses of a search except for the very last one, only the boundary values will be checked against the search key - all the rest is paged in or cached for nought.
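To make the wasted-work argument concrete, here is a minimal sketch of a T-tree lookup. The `TNode` layout and the `ttree_search` helper are my own illustration (not taken from the papers), and balancing is ignored entirely. It counts how many keys get pulled in along the search path versus how many boundary comparisons are actually made.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TNode:
    """Hypothetical T-tree node: a sorted run of keys plus two child links."""
    keys: List[int]
    left: Optional["TNode"] = None
    right: Optional["TNode"] = None


def ttree_search(root, key):
    """Return (found, boundary_compares, keys_loaded) for one lookup."""
    boundary_compares = 0
    keys_loaded = 0
    node = root
    while node is not None:
        keys_loaded += len(node.keys)      # the whole node is brought in
        boundary_compares += 1
        if key < node.keys[0]:             # only the minimum is looked at...
            node = node.left
            continue
        boundary_compares += 1
        if key > node.keys[-1]:            # ...and the maximum
            node = node.right
            continue
        # Bounding node reached: only now are the inner keys used.
        i = bisect_left(node.keys, key)
        return node.keys[i] == key, boundary_compares, keys_loaded
    return False, boundary_compares, keys_loaded


# Tiny hand-built tree, 8 keys per node (sizes chosen only for illustration).
root = TNode(
    keys=list(range(40, 48)),
    left=TNode(keys=list(range(10, 18)), right=TNode(keys=list(range(20, 28)))),
    right=TNode(keys=list(range(70, 78))),
)
print(ttree_search(root, 25))   # (True, 5, 24)
```

In this toy run three nodes (24 keys) are touched, but on the first two nodes only the two boundary keys matter; the other keys in those nodes were loaded for nothing.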

Compare this to the excellent access locality of B-trees in general and B+trees in particular (not to mention cache-oblivious and cache-conscious versions that were designed explicitly with memory performance characteristics in mind).

Similar problems exist with the rebalancing. In the B-tree world many variations - starting with B+ and B-link - have been developed and perfected in order to achieve desired amortised performance characteristics, including aspects like concurrency (locking/latching) or the absence thereof. So most of the time you can simply go out and find a B-tree variation that fits your performance profile - or use the simple classic B+tree and be certain of decent results.

T-trees are more complicated than comparable B-trees and it seems that they have nothing to offer in the way of performance in general, given that the times of commodity hardware with a single-level memory 'hierarchy' have been gone for decades. Not only is the hard disk the new memory, the converse is also true and main memory is the new hard disk now. I.e. even without NUMA the cost of bringing data from main memory into the cache hierarchy is so high that it pays to minimise page transfers - which is precisely what B-trees and their variations do and the T-tree doesn't. Closer to the processor core it's the number of cache line accesses/transfers that matters but the picture remains the same.
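A rough back-of-the-envelope sketch of why the node count per lookup differs so much. The numbers are assumptions of mine: one million keys, 15 keys per node for both the T-tree and the B-tree, completely full trees, and pointer overhead ignored. `levels_needed` is a hypothetical helper, not something from a library.

```python
def levels_needed(n_keys, keys_per_node, fanout):
    """Height of the shallowest full tree with this node shape that holds n_keys."""
    capacity = 0
    nodes_on_level = 1
    height = 0
    while capacity < n_keys:
        capacity += nodes_on_level * keys_per_node
        nodes_on_level *= fanout
        height += 1
    return height


N = 1_000_000  # assumed key count

binary_height = levels_needed(N, keys_per_node=1, fanout=2)    # plain balanced binary tree
ttree_height = levels_needed(N, keys_per_node=15, fanout=2)    # many keys per node, still binary
btree_height = levels_needed(N, keys_per_node=15, fanout=16)   # many keys AND many children

print(binary_height, ttree_height, btree_height)   # 20 17 5
```

Under these assumptions, packing 15 keys into a T-tree node only shaves three levels off a plain binary tree, because the fanout is still two. The B-tree uses the same 15 keys to choose among 16 children, so a lookup visits roughly five nodes instead of seventeen.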

In fact, if you take the idea of binary search - which is provably optimal - and think about ways of arranging the search keys in a manner that plays well with memory hierarchies (caches) then you invariably end up with something that looks uncannily like a B-tree...
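As an illustration of the starting point of that thought experiment, the sketch below counts how many distinct cache lines a plain binary search over a sorted array touches. The assumed geometry (16 keys per cache line, e.g. 4-byte keys in 64-byte lines) and the helper name `binary_search_footprint` are mine.

```python
def binary_search_footprint(sorted_keys, key, keys_per_line=16):
    """Return (found, probes, distinct_lines) for a leftmost binary search."""
    lo, hi = 0, len(sorted_keys)
    probes = 0
    lines = set()
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1
        lines.add(mid // keys_per_line)   # which cache line this probe lands in
        if sorted_keys[mid] < key:
            lo = mid + 1
        else:
            hi = mid
    found = lo < len(sorted_keys) and sorted_keys[lo] == key
    return found, probes, len(lines)


# range() supports indexing, so it stands in for a sorted array of 2**20 keys.
keys = range(1 << 20)
print(binary_search_footprint(keys, 0))   # (True, 21, 17)
```

Only the last few probes share a cache line; every earlier probe lands on a line of its own. Grouping the keys that successive probes need into the same line is exactly the rearrangement that turns the sorted array into something B-tree-shaped, where the same lookup touches on the order of five lines under the same assumptions.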

If you program for performance then you'll find that winners are almost always located somewhere in the triangle between sorted arrays, B-trees and hashing. Even balanced binary trees are only competitive if their comparatively poor performance takes the back seat in the face of other considerations and key counts are fairly small, i.e. not more than a couple million.



Source: https://stackoverflow.com/questions/40150637/t-tree-or-b-tree
