Should I denormalize or run multiple queries in DocumentDB?

慢半拍i 2021-02-20 12:09

I'm learning about data modeling in DocumentDB. Here's where I need some advice:

Please see what my documents look like down below.

I can take two approaches he

1 answer
  •  长发绾君心
    2021-02-20 12:51

    I believe you're on the right track in considering the trade-offs between normalizing or de-normalizing your project and employee data. As you've mentioned:

    Scenario 1) If you de-normalize your data model (couple projects and employee data together) - you may find yourself having to update many projects when you update an employee.

    Scenario 2) If you normalize your data model (decouple projects and employee data) - you would have to query for projects to retrieve employeeIds and then query for the employees if you wanted to get the list of employees belonging to a project.

    I would pick the appropriate trade-off given your application's use case. In general, I prefer de-normalizing when you have a read-heavy application and normalizing when you have a write-heavy application.
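    To make the two scenarios concrete, here is a minimal sketch using plain in-memory objects instead of a real collection. The documents are hypothetical: only `projectTeam` and its `id` field come from the stored procedure further down, and `name` plus the helper functions `renameDenormalized` and `readTeamNormalized` are made up for illustration.

```javascript
// Hypothetical documents; every field except projectTeam[].id is illustrative.

// Scenario 1: denormalized. Employee details are copied into each project.
const denormalizedProjects = [
  { id: "p1", name: "Apollo", projectTeam: [{ id: "e1", name: "Ann" }, { id: "e2", name: "Bob" }] },
  { id: "p2", name: "Gemini", projectTeam: [{ id: "e1", name: "Ann" }] },
];

// Scenario 2: normalized. Projects hold only employee ids.
const normalizedProjects = [
  { id: "p1", name: "Apollo", projectTeam: [{ id: "e1" }, { id: "e2" }] },
  { id: "p2", name: "Gemini", projectTeam: [{ id: "e1" }] },
];
const employees = [
  { id: "e1", name: "Ann" },
  { id: "e2", name: "Bob" },
];

// Renaming an employee in the denormalized model touches every project
// that embeds a copy. Returns the number of project documents rewritten.
function renameDenormalized(projects, employeeId, newName) {
  let writes = 0;
  for (const project of projects) {
    let touched = false;
    for (const member of project.projectTeam) {
      if (member.id === employeeId) {
        member.name = newName;
        touched = true;
      }
    }
    if (touched) writes++;
  }
  return writes;
}

// Reading a team in the normalized model needs two lookups:
// one for the project, one for the employees it references.
function readTeamNormalized(projects, allEmployees, projectId) {
  const project = projects.find((p) => p.id === projectId); // lookup 1
  const ids = new Set(project.projectTeam.map((m) => m.id));
  return allEmployees.filter((e) => ids.has(e.id)); // lookup 2
}

console.log(renameDenormalized(denormalizedProjects, "e1", "Anna"));
console.log(readTeamNormalized(normalizedProjects, employees, "p1").map((e) => e.name));
```

    The write count grows with the number of projects an employee belongs to, while the normalized read always costs an extra lookup. Which cost matters more depends on your read/write mix.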

    Note that you can avoid making multiple round trips between your application and the database by using DocumentDB's stored procedures (the queries are then performed server-side, inside DocumentDB).

    Here's an example stored procedure for retrieving the employees that belong to a specific projectId:

    function getProjectEmployees(projectId) {
      /* getContext() is available inside stored procedures and triggers */
      var context = getContext();
      /* Collection object: CRUD and query operations on the current collection */
      var collection = context.getCollection();
      /* Response object: body and headers returned to the caller */
      var response = context.getResponse();
      /* Employee documents gathered so far */
      var employees = [];
    
      /* Callback for the project query: look up every team member */
      var projectHandler = function(documents) {
        var team = documents[0].projectTeam || [];
        /* Set the body up front so a project with no team returns [] */
        response.setBody(JSON.stringify(employees));
        for (var i = 0; i < team.length; i++) {
          queryOnId(team[i].id, employeeHandler);
        }
      };
    
      /* Callback for each employee query: keep the body valid JSON */
      var employeeHandler = function(documents) {
        employees.push(documents[0]);
        response.setBody(JSON.stringify(employees));
      };
    
      /* Query on a single id and call back with the matching documents */
      var queryOnId = function(id, callbackHandler) {
        var accepted = collection.queryDocuments(collection.getSelfLink(),
          { query: 'SELECT * FROM c WHERE c.id = @id',
            parameters: [{ name: '@id', value: id }] }, {},
          function(err, documents) {
            if (err) {
              throw new Error('Error: ' + err.message);
            }
            if (documents.length < 1) {
              throw new Error('Unable to find id: ' + id);
            }
            callbackHandler(documents);
          }
        );
        /* queryDocuments returns false when the request could not be queued */
        if (!accepted) {
          throw new Error('Query for id ' + id + ' was not accepted');
        }
      };
    
      // Query on the projectId
      queryOnId(projectId, projectHandler);
    }
    

    Even though DocumentDB supports only limited OR statements during the preview, you can still get relatively good performance by splitting the employeeId lookups into a batch of asynchronous server-side queries.
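    If the service does accept a few OR clauses per query, a middle ground is to group the ids into small batches and issue one query per batch. The sketch below only builds the parameterized query specs. `buildIdQueries` and `maxClausesPerQuery` are hypothetical names, and the actual clause limit is an assumption you would need to check against the service.

```javascript
// Build parameterized query specs that look up several ids per query.
// maxClausesPerQuery is a hypothetical limit, not a documented value.
function buildIdQueries(ids, maxClausesPerQuery) {
  if (!Number.isInteger(maxClausesPerQuery) || maxClausesPerQuery < 1) {
    throw new Error("maxClausesPerQuery must be a positive integer");
  }
  const specs = [];
  for (let start = 0; start < ids.length; start += maxClausesPerQuery) {
    const batch = ids.slice(start, start + maxClausesPerQuery);
    // One equality clause per id, joined with OR.
    const clauses = batch.map((_, i) => "c.id = @id" + i);
    specs.push({
      query: "SELECT * FROM c WHERE " + clauses.join(" OR "),
      parameters: batch.map((id, i) => ({ name: "@id" + i, value: id })),
    });
  }
  return specs;
}

// Three ids with at most two clauses per query yields two query specs.
const specs = buildIdQueries(["e1", "e2", "e3"], 2);
console.log(JSON.stringify(specs, null, 2));
```

    Each spec has the `{ query, parameters }` shape that `queryDocuments` accepts, so inside a stored procedure you would pass one spec per call in place of a single-id query.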
