I'm learning about data modeling in DocumentDB. Here's where I need some advice.
Please see what my documents look like down below.
I can take two approaches he
I believe you're on the right track in considering the trade-offs between normalizing or de-normalizing your project and employee data. As you've mentioned:
Scenario 1) If you de-normalize your data model (couple projects and employee data together) - you may find yourself having to update many projects when you update an employee.
Scenario 2) If you normalize your data model (decouple projects and employee data) - you would have to query for projects to retrieve employeeIds and then query for the employees if you wanted to get the list of employees belonging to a project.
I would pick the appropriate trade-off given your application's use case. In general, I prefer de-normalizing when you have a read-heavy application and normalizing when you have a write-heavy application.
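To make the trade-off concrete, here is a minimal sketch in plain JavaScript with no DocumentDB calls. The document shapes are hypothetical: only `projectTeam` and `id` come from the discussion here, and the `name` field and sample values are made up for illustration. It counts how many documents have to be rewritten when one employee's name changes under each model.

```javascript
// Hypothetical de-normalized model: employee data is embedded in each project.
var denormalizedProjects = [
    { id: 'p1', projectTeam: [{ id: 'e1', name: 'Ann' }, { id: 'e2', name: 'Bob' }] },
    { id: 'p2', projectTeam: [{ id: 'e1', name: 'Ann' }] }
];

// Hypothetical normalized model: projects hold only employee ids.
var normalizedProjects = [
    { id: 'p1', projectTeam: [{ id: 'e1' }, { id: 'e2' }] },
    { id: 'p2', projectTeam: [{ id: 'e1' }] }
];
var normalizedEmployees = [
    { id: 'e1', name: 'Ann' },
    { id: 'e2', name: 'Bob' }
];

// De-normalized: every project that embeds the employee must be rewritten.
// Returns the number of project documents touched.
function renameDenormalized(projects, employeeId, newName) {
    var touched = 0;
    projects.forEach(function (project) {
        var changed = false;
        project.projectTeam.forEach(function (member) {
            if (member.id === employeeId) {
                member.name = newName;
                changed = true;
            }
        });
        if (changed) {
            touched++;
        }
    });
    return touched;
}

// Normalized: only the single employee document changes.
// Returns the number of employee documents touched.
function renameNormalized(employees, employeeId, newName) {
    var touched = 0;
    employees.forEach(function (employee) {
        if (employee.id === employeeId) {
            employee.name = newName;
            touched++;
        }
    });
    return touched;
}
```

With this sample data, renaming `e1` touches two project documents in the de-normalized model and one employee document in the normalized one. The read side is the mirror image: the de-normalized project already carries the names, while the normalized one needs a second lookup per employee.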
Note that you can avoid making multiple roundtrips between your application and the database by leveraging DocumentDB's stored procedures (the queries are performed server-side, inside DocumentDB).
Here's an example stored procedure for retrieving employees belonging to a specific projectId:
    function(projectId) {
        /* getContext() can be called inside stored procedures and triggers */
        var context = getContext();
        /* access all database operations - CRUD, query against documents in the current collection */
        var collection = context.getCollection();
        /* access HTTP response body and headers from the procedure */
        var response = context.getResponse();
        /* employees found so far; the response body is refreshed as each lookup returns */
        var employees = [];

        /* Callback for processing query on projectId */
        var projectHandler = function(documents) {
            var team = documents[0].projectTeam;
            for (var i = 0; i < team.length; i++) {
                // Query for each employee on the project team
                queryOnId(team[i].id, employeeHandler);
            }
        };

        /* Callback for processing query on employeeId */
        var employeeHandler = function(documents) {
            employees.push(documents[0]);
            // Serialize the whole array so the body is always valid JSON
            response.setBody(JSON.stringify(employees));
        };

        /* Query on a single id and call back */
        var queryOnId = function(id, callbackHandler) {
            collection.queryDocuments(collection.getSelfLink(),
                'SELECT * FROM c WHERE c.id = "' + id + '"', {},
                function(err, documents) {
                    if (err) {
                        throw new Error('Error: ' + err.message);
                    }
                    if (documents.length < 1) {
                        throw new Error('Unable to find id: ' + id);
                    }
                    callbackHandler(documents);
                }
            );
        };

        // Query on the projectId
        queryOnId(projectId, projectHandler);
    }
Even though DocumentDB supports only limited OR statements during the preview, you can still get relatively good performance by splitting the employeeId lookups into a set of asynchronous server-side queries.
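The fan-out pattern behind that idea can be sketched in plain JavaScript, independent of DocumentDB. Everything named here is an assumption for illustration: `fetchAll`, `mockLookup`, and the sample `store` are hypothetical, and `mockLookup` merely stands in for a per-id, callback-style query. The sketch issues one lookup per id, counts outstanding callbacks, and reports once when the last one returns.

```javascript
// Issue one lookup per id and call done(err, docs) exactly once.
// `lookup(id, callback)` is a hypothetical callback-style query;
// its callback receives (err, document).
function fetchAll(ids, lookup, done) {
    var results = new Array(ids.length);
    var pending = ids.length;
    var failed = false;

    if (pending === 0) {
        done(null, results);
        return;
    }

    ids.forEach(function (id, index) {
        lookup(id, function (err, doc) {
            if (failed) {
                return; // an earlier lookup already reported an error
            }
            if (err) {
                failed = true;
                done(err);
                return;
            }
            results[index] = doc; // keep results in request order
            pending--;
            if (pending === 0) {
                done(null, results);
            }
        });
    });
}

// Hypothetical in-memory stand-in for the database, for demonstration only.
var store = {
    e1: { id: 'e1', name: 'Ann' },
    e2: { id: 'e2', name: 'Bob' }
};

function mockLookup(id, callback) {
    if (store[id]) {
        callback(null, store[id]);
    } else {
        callback(new Error('Unable to find id: ' + id));
    }
}
```

For example, `fetchAll(['e2', 'e1'], mockLookup, function (err, docs) { /* use docs */ })` hands back both employee documents in the order the ids were requested. Indexing results by position rather than pushing them on arrival keeps the output stable even when the callbacks complete out of order.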