What would be the performance difference (reads/writes) between some C++ implementation of in-memory B-Tree (for example google btree) and the LMDB (without taking into consideration all the feacures of LMDB such as transactions, isolation, shared access etc.)?
This 2014 lmdb design presentation by its architect Howard Chu covers the design and tradeoffs of lmdb
.
To summarize: lmdb
is a copy-on-write, bottom-up updated, double-buffered, b-tree where the implementation always favors simplicity whenever it clashes with other considerations.
The smart design choices make it one of the highest performance and corruption-resistant B-tree implementations out there.
- copy-on-write means that data is never overwritten avoiding many possible corruption scenarios
- bottom-up updates from leaf to root make the root update equivalent to a commit
- double buffering of past versions keeps only the last-two roots in the db file
- complex multi-level caching schemes are avoided -
lmdb
relies on the underlying OS for caching - The whole code base is very small compared to other DBs avoiding CPU cache misses
Obviously, these choices mean that lmdb
is not friendly to complex scenarios such as:
- multi-version DB rollbacks (only last 2 are available)
- long-lived transactions and delayed commits: these lead to append-only behavior and potentially unlimited growth of the db file
- multiple concurrent writers:
lmdb
favors simpler multiple readers and single writer schemes
There's much more in the full (over 100 pages) presentation. The above is just a summary of the spirit of lmdb
.
lmdb
is used as the core storage engine in prominent open source projects such as Open LDAP and Memcached and in both cases speed-ups of orders of magnitude have been observed compared to alternatives as can be seen in micro-benchmark results.
来源:https://stackoverflow.com/questions/35279756/what-is-special-about-internal-design-of-lmdb