Store a list in a key-value database

小鲜肉 2021-01-24 19:25

I am looking for the best way to store lists associated with a key in a key-value database (like BerkeleyDB or LevelDB).

For example: I have users and orders

4 Answers
  • 2021-01-24 19:33

    One way to model this in a key-value store that supports scans, like LevelDB, is to add the order id to the key for each user. The new keys would be userId_orderId, one per order. To get the orders for a particular user, you can then do a simple prefix scan: scan(userId*). This makes range queries over user ids slow. In that case you can maintain another table just for userIds, or use another key convention, Id_userId, for getting userIds between [5000-5050].
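    As a rough illustration of the prefix scan, here is a minimal sketch. The `OrderedStore` class and the `order_key` helper are hypothetical stand-ins for a real ordered store such as LevelDB, not part of any library; only the key layout is taken from the description above.

```python
from bisect import bisect_left


class OrderedStore:
    """Hypothetical in-memory stand-in for an ordered key-value store."""

    def __init__(self):
        self._keys = []   # keys kept in sorted (byte) order
        self._data = {}

    def put(self, key, value):
        if key not in self._data:
            self._keys.insert(bisect_left(self._keys, key), key)
        self._data[key] = value

    def scan_prefix(self, prefix):
        """Yield (key, value) pairs whose key starts with prefix, in key order."""
        i = bisect_left(self._keys, prefix)
        while i < len(self._keys) and self._keys[i].startswith(prefix):
            yield self._keys[i], self._data[self._keys[i]]
            i += 1


def order_key(user_id, order_id):
    # The separator matters: scanning with the bare prefix b"4" instead of
    # b"4_" would also match the keys of user 42.
    return f"{user_id}_{order_id}".encode()


store = OrderedStore()
store.put(order_key(42, 1), b"tomato")
store.put(order_key(42, 2), b"celery")
store.put(order_key(4, 1), b"apple")
store.put(order_key(7, 1), b"pie")

orders_42 = [value for _, value in store.scan_prefix(b"42_")]
orders_4 = [value for _, value in store.scan_prefix(b"4_")]
```

    Note that ids written as decimal text sort as strings, so order 10 comes before order 2 unless the ids are zero-padded or packed as fixed-width binary.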

    Recently I have seen HyperDex add data type support on top of LevelDB (for example http://hyperdex.org/doc/04.datatypes/#lists), so you could give that a try too.

  • 2021-01-24 19:38

    In BerkeleyDB you can store multiple values per key, in either sorted or unsorted order. This would be the most natural solution. LevelDB has no such feature. You should look into LMDB (http://symas.com/mdb/) though: it also supports sorted multi-value keys, and is smaller, faster, and more reliable than either of the others.

  • 2021-01-24 19:50

    Let's start with a single list. You can work with a single hashmap:

    1. Store the count of the user's orders in row 0.
    2. For each new order, increment the count and store a new row keyed by the new count.

    So your hashmap looks like the following:

    key | value
    -------------
     0  |   5
     1  | tomato
     2  | celery
     3  | apple
     4  | pie
     5  | meat
    

    The steady increment of the key ensures that every key is unique. Because the database is ordered by key, and provided the pack function translates integers into byte arrays that sort in the same order as the integers, you can fetch slices of the list. To fetch orders between 5000 and 5050, you can use bsddb's Cursor.set_range or LevelDB's createReadStream (JS API).
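    To make the ordering requirement concrete, here is a small sketch using Python's standard struct module. The `pack_int` name and the 64-bit width are assumptions chosen for illustration; the point is only that fixed-width big-endian bytes sort like the integers they encode, while decimal strings and little-endian bytes do not.

```python
import struct


def pack_int(n):
    # Big-endian unsigned 64-bit: byte order matches numeric order.
    return struct.pack(">Q", n)


ids = [5, 40, 256, 5000, 5050]

# Fixed-width big-endian keys sort the same way as the integers themselves.
big_endian_sorted = sorted(ids, key=pack_int)

# Decimal strings compare character by character, so "256" sorts before "40".
decimal_sorted = sorted(ids, key=lambda n: str(n).encode())

# Little-endian puts the least significant byte first, which also breaks order.
little_endian_sorted = sorted(ids, key=lambda n: struct.pack("<Q", n))
```

    With big-endian keys, a range scan from pack_int(5000) to pack_int(5050) visits exactly the rows numbered 5000 through 5050 that exist.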

    Now let's expand to orders for multiple users. If you can open several hashmaps, you can apply the scheme above with one hashmap per user. You may hit system limits that way (maximum number of open file descriptors, or maximum number of files per directory), so you can instead share a single hashmap among several users.

    What I explain in the following works for both LevelDB and bsddb, provided that you pack keys so that they sort correctly in lexicographic (byte) order. So I will assume that you have a pack function. In bsddb you have to build the pack function yourself. Have a look at wiredtiger.packing or bytekey for inspiration.

    The principle is to namespace the keys using the user's id. It's also called key composition.
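    A possible pack/unpack pair for such composite keys is sketched below. It assumes that both the user id and the per-user counter fit in an unsigned 64-bit integer; the layout is an illustrative choice, not what wiredtiger.packing or bytekey actually produce.

```python
import struct

# Assumption: (user_id, n) are both unsigned 64-bit integers.
# Big-endian, fixed width, so the bytes sort by user_id first and then by n.
_KEY = struct.Struct(">QQ")


def pack(user_id, n):
    return _KEY.pack(user_id, n)


def unpack(key):
    return _KEY.unpack(key)


keys = [pack(32, 1), pack(1, 2), pack(32, 0), pack(1, 0), pack(1, 1)]

# Sorting the raw bytes groups each user's rows together, counter row first.
ordered = [unpack(k) for k in sorted(keys)]
```

    Because all rows of one user are contiguous in key order, a single range scan starting at pack(user_id, 1) reads that user's list without touching other users' rows.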

    Say your database looks like the following:

       key   |  value
    -------------------
      1  | 0 |    2       <--- count column for user 1
      1  | 1 |  tomato
      1  | 2 |  orange 
        ...      ...
      32 | 0 |    1       <--- count column for user 32
      32 | 1 |  banana
        ...  |   ...
    

    You create this database with the following (pseudo) code:

    db.put(pack(1, make_uid(1)), 'tomato')
    db.put(pack(1, make_uid(1)), 'orange')
    ...
    db.put(pack(32, make_uid(32)), 'banana')
    

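    The make_uid implementation looks like this:

    def make_uid(user_uid):
        # retrieve the current count, stored in row 0 of the user's namespace
        counter_key = pack(user_uid, 0)
        value = db.get(counter_key)
        # assuming db.get returns None for a missing key:
        # no counter exists before the user's first order
        count = 0 if value is None else int(value)
        count += 1  # increment
        # save new count (encoded as text here; any reversible encoding works)
        db.put(counter_key, str(count))
        return count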
    

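    Then you have to do the correct range lookup, which is similar to the single-list lookup but with a composite key. Using the bsddb API cursor.set_range(key), we can retrieve all items between 5000 and 5050 for user 42:

    def user_orders_slice(user_id, start, end):
        cursor = db.cursor()
        # assuming the cursor returns None when there is no further record
        record = cursor.set_range(pack(user_id, start))
        while record is not None:
            key, value = record
            current_user, order_id = unpack(key)
            # stop when leaving this user's keys or passing the end of the slice
            if current_user != user_id or order_id > end:
                break
            # the value is probably packed somehow...
            yield value
            record = cursor.next()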
    

    No error checks are done. Among other things, slicing with user_orders_slice(42, 5000, 5050) is not guaranteed to return 51 items if you delete items from the list. A correct way to query, say, 50 items is to implement a user_orders_query(user_id, start, limit).
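    One way such a limit-based query could look is sketched below. To keep it self-contained, a sorted list of (key, value) pairs stands in for the database cursor, so the extra `sorted_items` parameter and the 64-bit key layout are assumptions for illustration only.

```python
import struct
from bisect import bisect_left

# Assumed key layout: (user_id, order_id) as big-endian unsigned 64-bit integers.
_KEY = struct.Struct(">QQ")


def user_orders_query(sorted_items, user_id, start, limit):
    """Return up to `limit` values for `user_id` with order id >= start.

    `sorted_items` is a hypothetical stand-in for a cursor: a list of
    (key, value) pairs sorted by key.
    """
    keys = [key for key, _ in sorted_items]
    # bisect_left plays the role of cursor.set_range: first key >= target.
    i = bisect_left(keys, _KEY.pack(user_id, start))
    result = []
    while i < len(sorted_items) and len(result) < limit:
        key, value = sorted_items[i]
        current_user, _order_id = _KEY.unpack(key)
        if current_user != user_id:
            break  # left this user's key range
        result.append(value)
        i += 1
    return result


# Orders 1..6 for user 42 with order 3 deleted, plus one order for user 43.
items = sorted(
    [(_KEY.pack(42, n), f"order-{n}") for n in (1, 2, 4, 5, 6)]
    + [(_KEY.pack(43, 1), "other-user")]
)

page = user_orders_query(items, 42, 2, 3)
tail = user_orders_query(items, 42, 5, 10)
```

    Counting returned items instead of comparing order ids means a deleted order does not shrink the page. Callers should pass start >= 1 so that the counter stored in row 0 is skipped.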

    I hope you get the idea.

  • 2021-01-24 19:51

    You can use Redis to store the list in a zset (sorted set), like this:

    // this line is called whenever a user places an order
    $redis->zadd($user_1_orders, time(), $order_id);
    // list orders of the user
    $redis->zrange($user_1_orders, 0, -1);
    

    Redis is fast enough. But one thing you should know about Redis is that it stores all data in memory, so if the data eventually exceeds physical memory, you have to shard the data yourself.

    You can also use SSDB (https://github.com/ideawu/ssdb), which is a wrapper around LevelDB with APIs similar to Redis, but it stores most data on disk and uses memory only for caching. That means SSDB's capacity is about 100 times that of Redis, up to terabytes.
