I am trying to learn NoSQL with Google Datastore, but I am running into a problem with uniqueness.
Consider an ecommerce store that has categories and products.
Create a new kind called 'sku'. When you create a new product, you'll want to do a transactional insert of both the product entity and the sku entity.
For example, let's say you want to add a new product with the kind name "product" and the id "abc":
"product/abc" = {"sku": 1234, "product_name": "Test product"}
To ensure uniqueness on the property "sku", you'll always want to insert an entity with the kind name "sku" and an id that equals the property's value:
"sku/1234" = {"created": "2017-05-11"}
The above example entity has a property for created date - just something optional I threw in as part of the example.
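Now, as long as you insert both of these as part of the same transaction, you will be ensuring that the "sku" property has a unique value. This works because an entity key (kind plus id) can exist only once: inserting a second "sku/1234" fails, and since a transaction is all-or-nothing, the product insert fails with it. Note that this relies on an insert, not an upsert, which would silently overwrite the existing entity.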
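As a rough illustration of that all-or-nothing insert, here is a toy in-memory model. It is not the Datastore client API: `ToyStore`, `insert_all`, `add_product` and `DuplicateKeyError` are hypothetical names made up for this sketch, and a real implementation would use the transaction support of whichever client library you are on.

```python
class DuplicateKeyError(Exception):
    """Raised when an insert targets a key that already exists."""


class ToyStore:
    """Hypothetical in-memory stand-in for a store with unique entity keys."""

    def __init__(self):
        self.entities = {}  # maps (kind, id) -> dict of properties

    def insert_all(self, rows):
        """Insert every (key, properties) pair, or none of them."""
        # Check all keys first so nothing is written if any key is taken.
        for key, _ in rows:
            if key in self.entities:
                raise DuplicateKeyError(key)
        for key, props in rows:
            self.entities[key] = dict(props)


def add_product(store, product_id, sku, name):
    """Insert the product together with its 'sku' marker entity."""
    store.insert_all([
        (("product", product_id), {"sku": sku, "product_name": name}),
        (("sku", sku), {"created": "2017-05-11"}),
    ])


store = ToyStore()
add_product(store, "abc", 1234, "Test product")

try:
    # Different product id, same sku: the "sku/1234" key is already taken.
    add_product(store, "xyz", 1234, "Another product")
    duplicate_rejected = False
except DuplicateKeyError:
    duplicate_rejected = True
```

The second call is rejected as a whole, so "product/xyz" is never stored even though its own key was free.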
You can use "sku" as an "id" (if it's a number) or "name" (if it's a string) for your entity, instead of storing "sku" as a property. Then it's guaranteed to be unique as it becomes part of the unique entity key.
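A small sketch of that idea, again with a plain dictionary standing in for the datastore (the `product_key` and `insert_product` helpers are hypothetical, not library calls). Keep in mind that in the real Datastore a plain put overwrites an existing entity with the same key, so you would still check for an existing entity inside a transaction before writing.

```python
products = {}  # maps (kind, id_or_name) -> dict of properties


def product_key(sku):
    """Build the key from the sku: an int acts as an id, a str as a name."""
    if isinstance(sku, bool) or not isinstance(sku, (int, str)):
        raise TypeError("sku must be an int (id) or a str (name)")
    return ("product", sku)


def insert_product(sku, name):
    """Store the product under its sku, refusing a sku that is already used."""
    key = product_key(sku)
    if key in products:
        raise KeyError("sku already in use: %r" % (sku,))
    products[key] = {"product_name": name}
    return key


first_key = insert_product(1234, "Test product")

try:
    insert_product(1234, "Another product")
    second_insert_failed = False
except KeyError:
    second_insert_failed = True
```

With the sku in the key there is no separate "sku" kind to maintain, and looking a product up by sku becomes a direct key lookup rather than a query.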
Data modeling is a big subject, but IMO there are two approaches you can choose from. This is more fundamental than specific to your question, but it should give you some ideas.
The first approach – storing a reference as a property
Think of it as a product that contains product variants.
This approach is much the same as in the RDBMS world. You create products separately, and each product variant holds a reference to the product it belongs to, similar to how foreign keys work in relational databases. So the product variant entities get a new property, product, which contains the key of an entity of the Product kind. If that sounds confusing, here is how you can dissect it. I will use Python as an example:
# Assumes the App Engine NDB library; with Cloud NDB the import is
# `from google.cloud import ndb` and the calls must run inside a client context.
from google.appengine.ext import ndb

# product model
class Product(ndb.Model):
    name = ndb.StringProperty()

# product variant model
class ProductVariant(ndb.Model):
    name = ndb.StringProperty()
    price = ndb.IntegerProperty()
    # key of the Product this variant belongs to
    product = ndb.KeyProperty(kind=Product)

hugoboss = Product(name="Hugo Boss", key=ndb.Key(Product, 'hugoboss'))
# the kind must be Product here; there is no Gap class
gap = Product(name="Gap", key=ndb.Key(Product, 'gap'))
# store the products themselves as well
hugoboss.put()
gap.put()

pants1 = ProductVariant(name="Black pants", price=300, product=hugoboss.key)
pants2 = ProductVariant(name="Grey pants", price=200, product=hugoboss.key)
tshirt = ProductVariant(name="White graphic tshirt", price=10, product=gap.key)
pants1.put()
pants2.put()
tshirt.put()

# so let's say: give me all pants that have the label hugoboss
for pants in ProductVariant.query(ProductVariant.product == hugoboss.key).fetch(10):
    print(pants.name)

# You should get something like:
# Black pants
# Grey pants
The second approach – a product within the key
To take full advantage of this approach, you need to know how Bigtable sorts its row keys (Datastore is built on top of Bigtable) and how data is organized around that ordering. If you want a deep dive, there is a great paper: Bigtable: A Distributed Storage System for Structured Data.
# Assumes the App Engine NDB library, as in the first example.
from google.appengine.ext import ndb

# product model
class Product(ndb.Model):
    name = ndb.StringProperty()

# product variant model
class ProductVariant(ndb.Model):
    name = ndb.StringProperty()
    price = ndb.IntegerProperty()

hugoboss = ndb.Key(Product, 'hugoboss')
gap = ndb.Key(Product, 'gap')
Product(name="Hugo Boss", key=hugoboss).put()
Product(name="Gap", key=gap).put()

# the product key becomes the parent (ancestor) of each variant's key
pants1 = ProductVariant(name="Black pants", price=300, parent=hugoboss)
pants2 = ProductVariant(name="Grey pants", price=200, parent=hugoboss)
tshirt = ProductVariant(name="White graphic tshirt", price=10, parent=gap)
pants1.put()
pants2.put()
tshirt.put()

# so let's say: give me all pants that have the label hugoboss
for pants in ProductVariant.query(ancestor=hugoboss).fetch(10):
    print(pants.name)

# You should get something like:
# Black pants
# Grey pants
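To give a feel for why sorted row keys matter here, this is a simplified picture, not how Datastore actually encodes its keys or indexes. It assumes each entity is stored under its full key path and that rows are kept in sorted order, so all descendants of a product sit next to each other and an ancestor query becomes a short range scan. The `rows` list and `ancestor_scan` function are hypothetical names for this sketch.

```python
import bisect

# Each row key is the full path from the root entity down to the entity.
rows = sorted([
    (("Product", "gap"),),
    (("Product", "gap"), ("ProductVariant", 3)),
    (("Product", "hugoboss"),),
    (("Product", "hugoboss"), ("ProductVariant", 1)),
    (("Product", "hugoboss"), ("ProductVariant", 2)),
])


def ancestor_scan(sorted_rows, ancestor, kind):
    """Return the key paths of the given kind stored under `ancestor`."""
    # Jump straight to the first row that could share the ancestor prefix.
    start = bisect.bisect_left(sorted_rows, ancestor)
    matches = []
    for path in sorted_rows[start:]:
        # Rows are sorted, so the first non-matching prefix ends the scan.
        if path[:len(ancestor)] != ancestor:
            break
        if path[-1][0] == kind:
            matches.append(path)
    return matches


hugoboss_variants = ancestor_scan(rows, (("Product", "hugoboss"),), "ProductVariant")
variant_ids = [path[-1][1] for path in hugoboss_variants]
```

The scan touches only the rows under "hugoboss" and never looks at the "gap" rows, which is the property the parent-key layout is buying you.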
The second approach is very powerful! I hope this helps.