Data structure/algorithm for variable-length record storage and lookup on disk with search only on primary keys

感动是毒 2021-02-10 01:05

I am looking for an algorithm / data structure that works well on large block-based devices (eg a mechanical hard drive) and is optimised for insert, get, update and delete of variable-length records, where lookups are done only on a primary key.

5 Answers
  •  太阳男子 2021-02-10 01:55

    The easy way: Use something like Berkeley DB. It provides a key-value store for arbitrary byte strings, and does all the hard work for you. It even provides 'secondary databases' for indexing, if you want it.
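
    For concreteness, here is a minimal sketch using the classic Berkeley DB C API (libdb); error checking is omitted, and the file name records.db and the key/value contents are placeholders, not part of the original answer.

        #include <db.h>       /* Berkeley DB C API */
        #include <string.h>

        int main(void) {
            DB *dbp;
            DBT key, data;

            /* Create a handle and open (or create) a B-Tree keyed database file. */
            db_create(&dbp, NULL, 0);
            dbp->open(dbp, NULL, "records.db", NULL, DB_BTREE, DB_CREATE, 0664);

            /* Keys and values are arbitrary byte strings wrapped in DBTs. */
            memset(&key, 0, sizeof key);
            memset(&data, 0, sizeof data);
            key.data  = "user:42";
            key.size  = sizeof "user:42";
            data.data = "variable length payload";
            data.size = sizeof "variable length payload";

            dbp->put(dbp, NULL, &key, &data, 0);   /* insert or update */

            memset(&data, 0, sizeof data);
            dbp->get(dbp, NULL, &key, &data, 0);   /* lookup by primary key */

            dbp->del(dbp, NULL, &key, 0);          /* delete */
            dbp->close(dbp, 0);
            return 0;
        }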

    The do-it-yourself way: Use Protocol Buffers (or the binary format of your choice) to define B-Tree node and data item structures. Use an append-only file for your database. To write a new record or modify an existing one, append the record itself to the end of the file, then append any modified B-Tree nodes (eg, the record's parent node, that node's parent, and so on up to the root). Finally, write the location of the new root of the tree to the header block at the beginning of the file. To read the database, find the most recent root from the header and traverse the B-Tree as you would in any other B-Tree file. This approach has several advantages (a rough layout sketch follows the list):

    • Since written data is never modified, readers don't need to take locks, and get a 'snapshot' view of the DB based on the root node at the time they started reading.
    • By adding 'previous version' fields to your nodes and records, you get the ability to access previous versions of the DB essentially for free.
    • It's really easy to implement and debug compared to most on-disk file formats that support modification.
    • Compacting the database consists of simply reading out the latest version of the data and B-Tree and writing it to a new file.
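
    To make the layout concrete, here is a rough C sketch of one possible on-disk format, assuming plain structs and stdio appends instead of Protocol Buffers; all names, field widths and the ORDER constant are illustrative, and a real implementation would also checksum nodes and fsync appended data to disk before rewriting the header.

        #include <stdint.h>
        #include <stdio.h>

        #define ORDER 32             /* max children per B-Tree node (illustrative) */

        struct header {              /* fixed-size block at offset 0; the only part ever rewritten */
            uint64_t magic;          /* file format identifier */
            uint64_t root_offset;    /* byte offset of the current root node */
            uint64_t prev_root;      /* previous root: historical snapshots for free */
        };

        struct record {              /* variable-length data item, always appended */
            uint64_t prev_version;   /* offset of the older version of this record, or 0 */
            uint32_t key_len;
            uint32_t value_len;
            /* key bytes, then value bytes, follow immediately */
        };

        struct node {                /* B-Tree node; children are file offsets */
            uint16_t num_keys;
            uint16_t is_leaf;
            uint64_t child_offset[ORDER];   /* node offsets, or record offsets at leaves */
            /* serialized keys follow (variable length) */
        };

        /* Append a buffer to the end of the file and return its offset. */
        uint64_t append_blob(FILE *f, const void *buf, size_t len) {
            fseek(f, 0, SEEK_END);
            uint64_t off = (uint64_t)ftell(f);
            fwrite(buf, 1, len, f);
            return off;
        }

        /* Publish a new root. Readers holding the old header keep a consistent
         * snapshot, because nothing reachable from it is ever modified.
         * A real implementation would fsync the appended data before this step. */
        void commit_root(FILE *f, struct header *h, uint64_t new_root) {
            h->prev_root   = h->root_offset;
            h->root_offset = new_root;
            fflush(f);
            fseek(f, 0, SEEK_SET);
            fwrite(h, sizeof *h, 1, f);
            fflush(f);
        }

    The prev_root and prev_version fields are what give the 'previous versions for free' property above, and compaction is just a walk from root_offset that re-appends the live nodes and records into a fresh file.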
