I'm trying to find my way around DynamoDB and NoSQL.
What is the best (right?) approach for modeling a student table and class tables with respect to the fact that I need
A very simple design (without range keys) would use two tables, one per query type. This is not unusual in NoSQL databases.
In your case we'd have:
Student, with attribute StudentId as the (hash type) primary key. Each item might then have an attribute named Attends, the value of which is a list of the Ids of the classes the student attends.
Class, with attribute ClassId as the (hash type) primary key. Each item might then have an attribute named AttendedBy, the value of which is a list of the Ids of the students attending the class.
Performing your queries would be simple. Updating the database with one "attends" relationship between a student and a class requires two separate writes, one to each table.
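To make the two-table design concrete, here is a minimal sketch in plain Python. It does not call DynamoDB at all: the two dicts merely stand in for the Student and Class tables, and the function names and sample Ids are invented for illustration.

```python
# In-memory stand-ins for the two DynamoDB tables (hypothetical data).
# Each dict maps the hash key to the item stored under it.
student_table = {}  # StudentId -> {"StudentId": ..., "Attends": [...]}
class_table = {}    # ClassId   -> {"ClassId": ..., "AttendedBy": [...]}


def add_attendance(student_id, class_id):
    """Record one "attends" relationship: one write per table."""
    # Write 1: add the class to the student's Attends list.
    student = student_table.setdefault(
        student_id, {"StudentId": student_id, "Attends": []})
    if class_id not in student["Attends"]:
        student["Attends"].append(class_id)

    # Write 2: add the student to the class's AttendedBy list.
    cls = class_table.setdefault(
        class_id, {"ClassId": class_id, "AttendedBy": []})
    if student_id not in cls["AttendedBy"]:
        cls["AttendedBy"].append(student_id)


def classes_of(student_id):
    """Single lookup by hash key in the Student table."""
    item = student_table.get(student_id)
    return list(item["Attends"]) if item else []


def students_in(class_id):
    """Single lookup by hash key in the Class table."""
    item = class_table.get(class_id)
    return list(item["AttendedBy"]) if item else []


add_attendance("s1", "math")
add_attendance("s1", "physics")
add_attendance("s2", "math")

print(classes_of("s1"))     # ['math', 'physics']
print(students_in("math"))  # ['s1', 's2']
```

Both queries are a single key lookup, which is the appeal of this layout. The price is that the two writes are independent, so a real application would have to decide what to do if the first succeeds and the second fails.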
Another design would have one table, Attends, with a hash and range primary key. Each item would represent the attendance of one student at one class. The hash attribute could be the Id of the class and the range key could be the Id of the student. Supplementary data on the class and the student would then reside in other tables.
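The single-table design can be sketched the same way. Again this is plain Python rather than DynamoDB code, with invented names and data; the dict key plays the role of the (hash, range) primary key. It also shows a trade-off worth keeping in mind: a DynamoDB query needs the hash key value, so listing the students of a class is direct, while listing the classes of a student would need a scan or a secondary index.

```python
# In-memory stand-in for the Attends table (hypothetical data).
# The dict key is the pair (hash key, range key) = (ClassId, StudentId).
attends_table = {}


def record_attendance(class_id, student_id):
    """One write: a single item per student/class pair."""
    attends_table[(class_id, student_id)] = {
        "ClassId": class_id,
        "StudentId": student_id,
    }


def students_attending(class_id):
    """Like a query on the hash key; results are ordered by range key."""
    return sorted(s for (c, s) in attends_table if c == class_id)


def classes_attended_by(student_id):
    """No hash key is known here, so every item is examined, like a scan."""
    return sorted(c for (c, s) in attends_table if s == student_id)


record_attendance("math", "s2")
record_attendance("math", "s1")
record_attendance("physics", "s1")

print(students_attending("math"))  # ['s1', 's2']
print(classes_attended_by("s1"))   # ['math', 'physics']
```

Compared with the two-table design, each relationship is now a single write, at the cost of making one of the two lookups more expensive unless an index is added.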
To join two Amazon DynamoDB tables
The following example maps two Hive tables to data stored in Amazon DynamoDB. It then calls a join across those two tables. The join is computed on the cluster and returned. The join does not take place in Amazon DynamoDB. This example returns a list of customers and their purchases for customers that have placed more than two orders.
-- Map a Hive table onto the DynamoDB table "Purchases".
CREATE EXTERNAL TABLE hive_purchases(customerId bigint, total_cost double, items_purchased array<String>)
STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
TBLPROPERTIES ("dynamodb.table.name" = "Purchases",
"dynamodb.column.mapping" = "customerId:CustomerId,total_cost:Cost,items_purchased:Items");

-- Map a Hive table onto the DynamoDB table "Customers".
CREATE EXTERNAL TABLE hive_customers(customerId bigint, customerName string, customerAddress array<String>)
STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
TBLPROPERTIES ("dynamodb.table.name" = "Customers",
"dynamodb.column.mapping" = "customerId:CustomerId,customerName:Name,customerAddress:Address");

-- Join on the cluster and keep customers with more than two purchases.
SELECT c.customerId, c.customerName, count(*) AS count FROM hive_customers c
JOIN hive_purchases p ON c.customerId=p.customerId
GROUP BY c.customerId, c.customerName HAVING count > 2;
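For readers less familiar with HiveQL, the result the query computes can be sketched in plain Python. The sample rows and the function name are invented for illustration, and the sketch assumes each customerId appears only once among the customers.

```python
from collections import Counter

# Hypothetical sample rows standing in for the two mapped tables.
customers = [
    {"customerId": 1, "customerName": "Ann"},
    {"customerId": 2, "customerName": "Bob"},
]
purchases = [
    {"customerId": 1, "total_cost": 10.0},
    {"customerId": 1, "total_cost": 5.5},
    {"customerId": 1, "total_cost": 7.25},
    {"customerId": 2, "total_cost": 3.0},
]


def frequent_customers(customers, purchases, min_orders=2):
    """Return (id, name, purchase count) for customers with more
    than min_orders purchases, mirroring JOIN + GROUP BY + HAVING."""
    # Count purchases per customer (the GROUP BY / count(*) step).
    counts = Counter(p["customerId"] for p in purchases)
    # Keep only customers above the threshold (the HAVING step).
    return [
        (c["customerId"], c["customerName"], counts[c["customerId"]])
        for c in customers
        if counts[c["customerId"]] > min_orders
    ]


print(frequent_customers(customers, purchases))  # [(1, 'Ann', 3)]
```

As in the Hive example, none of this work happens inside DynamoDB: the rows are read out first and the join is computed elsewhere.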