Store a list in a key-value database

小鲜肉 2021-01-24 19:25

I am looking for the best way to store lists associated with a key in a key-value database (like Berkeley DB or LevelDB).

For example: I have users and orders

4 answers
  •  不思量自难忘°
    2021-01-24 19:50

    Let's start with a single list. You can work with a single hashmap:

    1. store the count of the user's orders in row 0
    2. for each new order, increment the count and store the order in a new row keyed by the new count

    So your hashmap looks like the following:

    key | value
    -------------
     0  |   5
     1  | tomato
     2  | celery
     3  | apple
     4  | pie
     5  | meat
    

    Steadily incrementing the key makes sure that every key is unique. Because the database is ordered by key, and provided the pack function translates integers into byte strings that sort in the same order as the integers, you can fetch slices of the list. To fetch orders between 5000 and 5050 you can use bsddb's Cursor.set_range or leveldb's createReadStream (JS API).
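
    As a rough illustration of why the packing matters, here is a minimal sketch. The names pack_index, unpack_index and fetch_slice are hypothetical, the fixed 8-byte big-endian encoding is only one possible choice, and a sorted Python list stands in for the ordered store; none of this is the bsddb or leveldb API.

```python
import bisect
import struct


def pack_index(n: int) -> bytes:
    """Encode a non-negative integer as 8 big-endian bytes.

    Fixed width plus big-endian means byte order equals numeric order.
    """
    return struct.pack(">Q", n)


def unpack_index(key: bytes) -> int:
    """Inverse of pack_index."""
    return struct.unpack(">Q", key)[0]


# A sorted list of (key, value) pairs stands in for the key-ordered database.
items = ["tomato", "celery", "apple", "pie", "meat"]
rows = sorted((pack_index(i), name.encode()) for i, name in enumerate(items, start=1))


def fetch_slice(rows, start, end):
    """Return the values whose index lies in [start, end], like a range scan."""
    keys = [key for key, _ in rows]
    lo = bisect.bisect_left(keys, pack_index(start))
    hi = bisect.bisect_right(keys, pack_index(end))
    return [value.decode() for _, value in rows[lo:hi]]


print(fetch_slice(rows, 2, 4))  # ['celery', 'apple', 'pie']
```

    Keys written as decimal text would not work here, since b"10" sorts before b"9"; the fixed-width encoding avoids that.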

    Now let's expand to orders for multiple users. If you can open several hashmaps, you can apply the above with one hashmap per user. You may hit system limits, though (the maximum number of open file descriptors or of files per directory). So instead you can share a single hashmap among several users.

    What I explain in the following works for both leveldb and bsddb, provided that you pack keys so that they sort correctly in lexicographic (byte) order. So I will assume that you have a pack function. In bsddb you have to build a pack function yourself. Have a look at wiredtiger.packing or bytekey for inspiration.

    The principle is to namespace the keys using the user's id. It's also called key composition.
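
    To make key composition concrete, here is one possible pack/unpack pair. It is only a sketch: it assumes both ids are non-negative integers that fit in 64 bits, and it is not how wiredtiger.packing or bytekey encode keys.

```python
import struct

# Assumed layout: user id, then order id, each an unsigned 64-bit big-endian integer.
_KEY = struct.Struct(">QQ")


def pack(user_id: int, order_id: int) -> bytes:
    """Build a composite key whose byte order matches (user_id, order_id) order."""
    return _KEY.pack(user_id, order_id)


def unpack(key: bytes) -> tuple:
    """Split a composite key back into (user_id, order_id)."""
    return _KEY.unpack(key)


# Sorting the raw bytes groups each user's rows together, counter row first.
keys = [pack(32, 1), pack(1, 2), pack(32, 0), pack(1, 0), pack(1, 1)]
print([unpack(key) for key in sorted(keys)])
# [(1, 0), (1, 1), (1, 2), (32, 0), (32, 1)]
```

    Because the user id comes first in the key, all rows of one user are contiguous in the store, which is what makes per-user range scans possible.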

    Say your database looks like the following:

        key    |  value
    --------------------
      1  |  0  |    2       <--- order count for user 1
      1  |  1  |  tomato
      1  |  2  |  orange
        ...    |   ...
      32 |  0  |    1       <--- order count for user 32
      32 |  1  |  banana
        ...    |   ...
    

    You create this database with the following (pseudo) code:

    db.put(pack(1, make_uid(1)), 'tomato')
    db.put(pack(1, make_uid(1)), 'orange')
    ...
    db.put(pack(32, make_uid(32)), 'banana')
    

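    The make_uid implementation looks like this:

    def make_uid(user_uid):
        # retrieve the current count (stored under order id 0)
        counter_key = pack(user_uid, 0)
        value = db.get(counter_key)
        # no counter row yet means the user has no orders
        # (the count is shown as a plain integer; a real store needs it packed to bytes)
        if value is None:
            value = 0
        value += 1  # increment
        # save new count
        db.put(counter_key, value)
        return value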
    

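    Then you have to do the correct range lookup; it is similar to the single-list case. Using the bsddb API cursor.set_range(key), we can retrieve, for example, all items between 5000 and 5050 for user 42:

    def user_orders_slice(user_id, start, end):
        # position the cursor on the first key >= (user_id, start);
        # assumes the cursor returns None when there is no such record
        record = cursor.set_range(pack(user_id, start))
        while record is not None:
            key, value = record
            owner_id, order_id = unpack(key)
            # stop when we leave this user's keys or pass the end of the slice
            if owner_id != user_id or order_id > end:
                break
            # the value is probably packed somehow...
            yield value
            record = cursor.next()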
    

    No error checks are done. Among other things, user_orders_slice(42, 5000, 5050) is not guaranteed to return 51 items if you delete items from the list. A correct way to query, say, 50 items is to implement a user_orders_query(user_id, start, limit).
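
    A limit-based query could be sketched as follows. This is only an illustration: a sorted list of (key, value) pairs stands in for the cursor, and the pack/unpack layout (two 64-bit big-endian integers) is an assumption of this example.

```python
import bisect
import struct

_KEY = struct.Struct(">QQ")  # assumed composite key layout: (user_id, order_id)


def pack(user_id, order_id):
    return _KEY.pack(user_id, order_id)


def unpack(key):
    return _KEY.unpack(key)


def user_orders_query(rows, user_id, start, limit):
    """Yield up to `limit` values of `user_id` whose order id is >= start.

    `rows` is a list of (key, value) pairs sorted by key.
    """
    start = max(start, 1)  # order id 0 holds the counter, not an order
    keys = [key for key, _ in rows]
    pos = bisect.bisect_left(keys, pack(user_id, start))
    found = 0
    while pos < len(rows) and found < limit:
        key, value = rows[pos]
        owner_id, _order_id = unpack(key)
        if owner_id != user_id:
            break  # walked past this user's last row
        yield value
        found += 1
        pos += 1


# Orders 3 and 4 of user 42 were deleted, so the ids have a gap.
data = [
    (42, 0, "6"), (42, 1, "tomato"), (42, 2, "orange"),
    (42, 5, "apple"), (42, 6, "pie"),
    (43, 0, "1"), (43, 1, "banana"),
]
rows = sorted(((pack(u, o), v) for u, o, v in data), key=lambda row: row[0])

print(list(user_orders_query(rows, 42, 2, 3)))  # ['orange', 'apple', 'pie']
```

    Counting returned rows instead of comparing order ids is what makes the result size independent of gaps left by deletions.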

    I hope you get the idea.
