How to structure data in Riak?

终归单人心 2020-12-15 08:48

I'm trying to figure out how to model data in Riak. Let's say you are building something like a CMS with two features, news and products. You need to be able to store this

2 Answers
  • 2020-12-15 09:36

    Secondary Indexes (2i) or Riak Search are a more efficient way to model this scenario than key filtering (Kev Burns's recommendation).

    Take a look at my answers to "Which clustered NoSQL DB for a Message Storing purpose?" and "Links in Riak: what can they do/not do, compared to graph databases?" for a discussion of similar cases.

    You have several decisions to make, depending on your use case. In all cases, you would start out with a company bucket, so that each company has a unique key.

    1) Whether to store the items of interest in two separate buckets (news and products) or in one (something like items_of_interest) depends on your preference and ease of querying. If you will always query for both news and products for a company at once, you might as well store them in a single bucket. I recommend two separate buckets, though, because they are easier to keep track of, especially if you have separate tabs or pages for "Company X - Products" and "Company X - News". If you need to combine them into a single feed, make two queries (one for news, one for products) and merge the results in client code (by date, for example).

    2) If a news or product item belongs to exactly one company, add a secondary index on company_key to each item. You can then fetch all news or products for a company with a single secondary index (2i) query on that company's key.

    3) If the relationship is many-to-many (a news or product item can belong to several companies, for example a news item about a joint venture between two companies), I recommend modeling the relationship as a separate Riak object. For example, you could create a mentions bucket and, for each company mentioned in a news story, insert a Mention object with its own unique key, a secondary index on company_key, and a value containing a type ('news' or 'product') and an item_key (the news key or product key). Extracting relationships into separate Riak objects like this lets you do a lot of interesting things: tag them arbitrarily using Riak Search, query them for subscription event notifications, and so on.
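
Options 2) and 3) can be sketched against Riak's HTTP interface. This is a minimal sketch, not a client library: it only builds the request pieces, namely the `x-riak-index-<field>_bin` header sent when storing an object and the `/buckets/<bucket>/index/<field>_bin/<value>` path for an exact-match 2i query. The helper names, the `story_42` key, and the choice of a string (`_bin`) index are illustrative assumptions, not part of the answer above.

```python
import json
from typing import Dict, Tuple
from urllib.parse import quote


def index_header(field: str, value: str) -> Dict[str, str]:
    """Header that attaches a string (_bin) secondary index to an object on store."""
    return {f"x-riak-index-{field}_bin": value}


def index_query_path(bucket: str, field: str, value: str) -> str:
    """Path for an exact-match 2i query over the HTTP interface."""
    return f"/buckets/{quote(bucket)}/index/{quote(field)}_bin/{quote(value, safe='')}"


def mention(company_key: str, item_type: str, item_key: str) -> Tuple[Dict[str, str], str]:
    """Headers and JSON body for one hypothetical Mention object (option 3)."""
    if item_type not in ("news", "product"):
        raise ValueError("item_type must be 'news' or 'product'")
    headers = {"Content-Type": "application/json"}
    headers.update(index_header("company_key", company_key))
    body = json.dumps({"type": item_type, "item_key": item_key}, sort_keys=True)
    return headers, body


# Option 2: each item belongs to one company, so the index lives on the item.
news_headers = index_header("company_key", "acme")
news_query = index_query_path("news", "company_key", "acme")

# Option 3: a joint-venture story gets one Mention object per company.
mentions = [mention(company, "news", "story_42") for company in ("acme", "bigcorp")]
mentions_query = index_query_path("mentions", "company_key", "bigcorp")
```

With option 3, the 2i query on the mentions bucket returns Mention keys; the client then reads each Mention and fetches the item named by its item_key from the news or products bucket.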

  • 2020-12-15 09:41

    I'd create two buckets: news and products. Then I'd prefix the keys in each bucket with the client name. I'd probably also include dates in news keys to make date-range queries easy.

    news/acme_2011-02-23_01
    news/acme_2011-02-23_02
    news/bigcorp_2011-02-21_01
    

    And optionally prefix product names with category names:

    products/acme_blacksmithing_anvil
    products/bigcorp_databases_oracle
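
The key layout above could be generated with small helpers like the following. This is a sketch under assumptions: the function names and the two-digit sequence suffix are inferred from the sample keys, and the underscore check on client names is my own addition, since an underscore inside a client name would shift the token positions that a later filter on `_` relies on.

```python
from datetime import date
from typing import Tuple


def news_key(client: str, published: date, seq: int) -> str:
    """Build a key such as 'acme_2011-02-23_01' for the news bucket."""
    if "_" in client:
        raise ValueError("client name must not contain '_'")
    return f"{client}_{published.isoformat()}_{seq:02d}"


def product_key(client: str, category: str, name: str) -> str:
    """Build a key such as 'acme_blacksmithing_anvil' for the products bucket."""
    if "_" in client:
        raise ValueError("client name must not contain '_'")
    return "_".join((client, category, name))


def split_news_key(key: str) -> Tuple[str, date, int]:
    """Recover (client, published date, sequence number) from a news key."""
    client, day, seq = key.split("_")
    return client, date.fromisoformat(day), int(seq)
```

Because the date is in ISO format, plain string comparison on that token orders keys chronologically, which is what makes a range filter on it workable.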
    

    Then in your map/reduce you could use key filtering:

    // BigCorp News items
    {
      "inputs":{
         "bucket":"news",
         "key_filters":[["starts_with", "bigcorp"]]
      },
      // ... rest of mapreduce job
    }
    
    // Acme Blacksmithing items
    {
      "inputs":{
         "bucket":"products",
         "key_filters":[["starts_with", "acme_blacksmithing"]]
      },
      // ... rest of mapreduce job
    }
    
    // News for all clients from Feb 12th to 19th
    {
      "inputs":{
         "bucket":"news",
         "key_filters":[["tokenize", "_", 2],
                        ["between", "2011-02-12", "2011-02-19"]]
      },
      // ... rest of mapreduce job
    }
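
To check which keys a filter chain would select before sending a job, the three filters used above can be emulated locally. This is only a rough client-side model, not Riak's implementation: it assumes `tokenize` positions are 1-based and that `between` includes both endpoints, and the key `bigcorp_2011-02-15_01` is made up for the example.

```python
from typing import List, Sequence


def matches(key: str, filters: Sequence[Sequence]) -> bool:
    """Apply a key-filter chain to one key; transforms feed the final predicate."""
    value = key
    for name, *args in filters:
        if name == "tokenize":
            separator, position = args
            tokens = value.split(separator)
            if position > len(tokens):
                return False
            value = tokens[position - 1]  # assumed 1-based position
        elif name == "starts_with":
            return value.startswith(args[0])
        elif name == "between":
            low, high = args
            return low <= value <= high  # assumed inclusive on both ends
        else:
            raise ValueError(f"filter not modeled in this sketch: {name}")
    raise ValueError("filter chain has no predicate")


def select(keys: List[str], filters: Sequence[Sequence]) -> List[str]:
    """Return the keys that pass the filter chain, in their original order."""
    return [key for key in keys if matches(key, filters)]


news_keys = [
    "acme_2011-02-23_01",
    "acme_2011-02-23_02",
    "bigcorp_2011-02-21_01",
    "bigcorp_2011-02-15_01",  # hypothetical extra key
]

bigcorp_news = select(news_keys, [["starts_with", "bigcorp"]])
mid_february = select(
    news_keys,
    [["tokenize", "_", 2], ["between", "2011-02-12", "2011-02-19"]],
)
```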
    