MongoDB : How to design schema based on application access patterns?

徘徊边缘 posted on 2020-12-04 12:00:09

Question


As someone coming from DynamoDB, I find modeling a MongoDB schema that really fits my application kind of confusing, especially since it has the concept of references and, from what I have read, it is not recommended to keep duplicated data to accommodate your queries.

Take the following example (modeled in mongoengine, but shouldn't matter) :

    from mongoengine import (Document, EmailField, ListField,
                             ReferenceField, StringField)

    #User
    class User(Document):
        email = EmailField(primary_key=True)
        pswd_hash = StringField()
        #This also makes it easier to find the Projects the user has a Role
        roles = ListField(ReferenceField('Role'))

    #Project
    class Project(Document):
        name = StringField()
        #This is probably unnecessary as the Role id is already the project id
        roles = ListField(ReferenceField('Role'))

    #Roles in project
    class Role(Document):
        project = ReferenceField('Project', primary_key=True)
        #List of permissions
        permissions = ListField(StringField())
        users = ListField(ReferenceField('User'))

There are Projects and Users.

Each Project can have many Roles in it.

Each User can have one Role in a Project.


So, it's a Many-Many between Users and Projects

A Many-One between Users and Roles

A Many-One between Roles and Projects


The problem comes when I try to fit the schema to the access pattern, because on every page refresh of the application I need:

  1. Project (the id is in the url)
  2. User (email is in session)
  3. User permissions in that project (server-side security checks)

So, considering this is the most common query, how should I model my schema to accommodate it?

Or is the way I'm doing at the moment okay already?


Answer 1:


Generally, you can model permissions in two ways. Either there are static roles, which carry implicit permissions to do certain things, or there are roles which are mere containers for explicit permissions.

Implicit permissions

There is a 16MB size limit on documents, so unless you have a lot of users AND a lot of roles, normalizing is not necessary.

{
 "_id": new ObjectId(),
 "name": "My Project",
 "roles": [
   {
     "role": "admin",
     "members": ["foo","bar"]
   },
   {
     "role": "user",
     "members": ["baz","foo"]
   }
 ]
}
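A minimal sketch of how an application could read a user's roles out of such an embedded document once it has been fetched. It is plain Python over a dict shaped like the document above; the function name `roles_of` is made up for this example and is not part of any driver.

```python
# Project document shaped like the example above (the _id is omitted here).
project = {
    "name": "My Project",
    "roles": [
        {"role": "admin", "members": ["foo", "bar"]},
        {"role": "user", "members": ["baz", "foo"]},
    ],
}


def roles_of(project_doc, user):
    """Return the names of all roles in project_doc that list user as a member."""
    return [
        entry["role"]
        for entry in project_doc.get("roles", [])
        if user in entry.get("members", [])
    ]


print(roles_of(project, "foo"))  # ['admin', 'user']
print(roles_of(project, "baz"))  # ['user']
```

Since the whole project document is loaded on every page view anyway, this check costs no extra round trip to the database.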

Another way of having a simple data model here is to have one document per relation:

{"project":someObjectId,"role":"admin","user":"foo"}
{"project":someObjectId,"role":"admin","user":"bar"}
{"project":someObjectId,"role":"user","user":"baz"}

Now, you presumably know your project, so you can query the role of a specific user as easily as:

db.roles.find({"project":currentProjectId,"user":currentUser})

In case a user can have multiple roles, you can do an aggregation, for example:

// Add to above data
// db.roles.insert({"project":ObjectId("5d2f6f0fd2c6b42117ecbbe5"),role:"user",user:"foo"})
db.roles.aggregate([{
  $match:{
    user:"foo",
    project:ObjectId("5d2f6f0fd2c6b42117ecbbe5")
  }},{
  $group:{
    "_id":"$user",
    roles:{$addToSet:"$role"}
  }}
])

// Result
{ "_id" : "foo", "roles" : [ "user", "admin" ] }

With a compound index on user and project (order matters!), this aggregation query should be quite efficient.
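A sketch of the same pipeline and index keys written as plain Python data, in the shape that PyMongo's `aggregate` and `create_index` accept. The helper name `roles_pipeline` and the constant `ROLES_INDEX_KEYS` are made up for this example, and the project id is passed as a plain string placeholder where a real application would pass an ObjectId.

```python
# Keys for the compound index: user first, then project (both ascending).
# With PyMongo this list could be passed to the collection's create_index().
ROLES_INDEX_KEYS = [("user", 1), ("project", 1)]


def roles_pipeline(user, project_id):
    """Build the $match/$group pipeline shown above for one user and project."""
    return [
        {"$match": {"user": user, "project": project_id}},
        {"$group": {"_id": "$user", "roles": {"$addToSet": "$role"}}},
    ]


pipeline = roles_pipeline("foo", "5d2f6f0fd2c6b42117ecbbe5")
print(pipeline[0]["$match"])
```

Building the pipeline in a function keeps the user and project values as parameters instead of hard-coding them into every call site.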

Explicit permissions

First, we have to define how we want to set up our explicit permissions. A robust way is to use

domain:action[,action...]:instance

(blatantly taken from Apache Shiro's permission model). It is pretty hard to model that without knowing exactly what you want to achieve with your application, but for the sake of an example, let's assume a permission for changing the title of any project. So the abstract description would be:

project:editTitle:*

If you do not need instance level permissions, it gets even easier:

project:editTitle
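As a rough illustration, a string in this format could be split into its parts as follows. The names `Permission` and `parse_permission` are hypothetical, and treating a missing instance part as "any instance" is an assumption of this sketch, not something Shiro or MongoDB prescribes.

```python
from typing import NamedTuple, Tuple


class Permission(NamedTuple):
    domain: str
    actions: Tuple[str, ...]
    instance: str


def parse_permission(text):
    """Split 'domain:action[,action...][:instance]' into its parts."""
    parts = text.split(":")
    if len(parts) not in (2, 3) or not all(parts):
        raise ValueError(f"malformed permission: {text!r}")
    actions = tuple(parts[1].split(","))
    if not all(actions):
        raise ValueError(f"empty action in permission: {text!r}")
    # Assumption: a permission without an instance part applies to any instance.
    instance = parts[2] if len(parts) == 3 else "*"
    return Permission(parts[0], actions, instance)


print(parse_permission("project:editTitle:*"))
print(parse_permission("project:editTitle,stop"))
```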

That format is easy enough to parse, and roles could be defined as

{
  "_id":"editor",
  "permissions":[
    "project:editTitle",
    "project:addUser",
    "project:stop",
    "project:andSoOnAndSoForth",
    "comment:dlete"
  ]
}

Hey, wait, there is a typo! Let us correct it:

db.permissions.update(
  {permissions:"comment:dlete"},
  {$set:{"permissions.$":"comment:delete"}}
)

(Handy if you want to rephrase a permission, too – just do not forget to add {multi:true} as a third parameter).

Now given roles like

{ "project" : ObjectId("5d2f6f0fd2c6b42117ecbbe5"), "role" : "admin", "user" : "foo" }
{ "project" : ObjectId("5d2f6f0fd2c6b42117ecbbe5"), "role" : "admin", "user" : "bar" }
{ "project" : ObjectId("5d2f6f0fd2c6b42117ecbbe5"), "role" : "user", "user" : "baz" }
{ "project" : ObjectId("5d2f6f0fd2c6b42117ecbbe5"), "role" : "user", "user" : "foo" }
{ "project" : ObjectId("5d2f6f0fd2c6b42117ecbbe5"), "role" : "editor", "user" : "baz" }

and permissions like

{ "_id" : "editor", "permissions" : [ "project:editTitle", "project:addUser", "project:stop", "project:andSoOnAndSoForth", "comment:delete" ] }
{ "_id" : "user", "permissions" : [ "*:read" ] }
{ "_id" : "admin", "permissions" : [ "*:*" ] }

you can get a user's explicit permissions for a project via

db.roles.aggregate([
    // we only want to get the roles of the current user for a certain project
    { $match: { user: "baz", project: ObjectId("5d2f6f0fd2c6b42117ecbbe5") } },
    // We get the permissions associated with the role
    { $lookup: { from: "permissions", localField: "role", foreignField: "_id", as: "permissionDocs" } },
    // We pull the permissions into the root document...
    { $replaceRoot: { newRoot: { $mergeObjects: [{ $arrayElemAt: ["$permissionDocs", 0] }, "$$ROOT"] } } },
    // ... and get rid of all the stuff we do not need
    { $project: { permissionDocs: 0, role: 0, project: 0 } },
    // We flatten the various permission arrays of the result documents...
    { $unwind: "$permissions" },
    // ... and finally construct our set of permissions
    { $group: { "_id": "$user", permissions: { $addToSet: "$permissions" } } }
])

// Result:
{ "_id" : "baz", "permissions" : [ "comment:delete", "project:andSoOnAndSoForth", "*:read", "project:editTitle", "project:addUser", "project:stop" ] }

With that result, you can simply iterate over the set of permissions and allow the deletion for a comment, for example, if either one of the permissions *:*, comment:* or comment:delete is present.
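A minimal sketch of that check in Python, assuming the two-part `domain:action` form without instances. The function name `is_allowed` is made up for this example, and accepting `*:action` as well is an assumption based on the `*:read` entry in the sample data.

```python
def is_allowed(granted, domain, action):
    """Return True if any granted 'domain:action' string covers the request."""
    candidates = {
        f"{domain}:{action}",  # exact match, e.g. comment:delete
        f"{domain}:*",         # any action within the domain
        f"*:{action}",         # this action in any domain (assumed from "*:read")
        "*:*",                 # everything
    }
    return not candidates.isdisjoint(granted)


# The permission set returned for user "baz" above.
granted = [
    "comment:delete", "project:andSoOnAndSoForth", "*:read",
    "project:editTitle", "project:addUser", "project:stop",
]

print(is_allowed(granted, "comment", "delete"))  # True
print(is_allowed(granted, "project", "read"))    # True, via *:read
print(is_allowed(granted, "comment", "edit"))    # False
```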

Note that I did not normalize the permissions of roles. This saves us an additional lookup for a quite common use case at the expense that a rather rare use case (changing the permission domain or action) is slower.

EDIT:

You can wrap that into a function like:

function hasPermission(user, project, permission) {
    var has = db.roles.aggregate([{
        $match: {
            user: user,
            project: project
        }}, {
        $lookup: {
            from: "permissions",
            localField: "role",
            foreignField: "_id",
            as: "permissionDocs"
        }}, {
        $replaceRoot: {
            newRoot: {
                $mergeObjects: [{
                    $arrayElemAt: ["$permissionDocs", 0]
                }, "$$ROOT"]
            }
        }}, {
        $project: {
            permissionDocs: 0,
            role: 0,
            project: 0
        }}, {
        $unwind: "$permissions"
        }, {
        $group: {
            "_id": "$user",
            permissions: {
                $addToSet: "$permissions"
            }
        }
    }, {
        $match: {
            "permissions": permission
        }
    }]);
    return has.toArray().length > 0
}

so that something like:

> if ( hasPermission("baz",ObjectId("5d2f6f0fd2c6b42117ecbbe5"),"comment:delete") ) {
    print("Yay")
  } else {
    print("Nay")
  }

results in Yay. (Note that you need to expand the function to match the wildcard permissions comment:* and *:*.)




Answer 2:


There are various ways of modelling your requirement in the current form.

You can use embedded documents if you don't have much duplication and you always need the embedded data when you request the document.

In your case I would use references. Your structure overall looks good to me.

I'll try to show you one such way, which uses $lookup with references. You should try three separate collections, one each for projects, roles and users, like below.

Another option is to use DBRef, which will eagerly load all the roles in the project when you fetch the project collection. This option depends on the mongoengine driver, and I'm sure the driver supports that.

Project Document (Removed roles from project )

{ "_id": ObjectId("5857e7d5aceaaa5d2254aea2"),
  "name": "newProject"
}

Role Document

{ "_id" : "role1",
  "project": ObjectId("5857e7d5aceaaa5d2254aea2"),
  "users": ["email1", "email2"],
  "permissions": ["delete","update"]
}
{ "_id" : "role2",
  "project": ObjectId("5857e7d5aceaaa5d2254aea2"),
  "users": ["email1"],
  "permissions": ["add"]
}

User Document

{ "email" : "email1",
  "roles": ["role1", "role2"]
}
{ "email" : "email2",
  "roles": ["role1"]
}

Show All Projects

db.project.find({})

Get All Roles in a Project

db.role.aggregate([
 {$match: {project:ObjectId("5857e7d5aceaaa5d2254aea2")} },
])

Response

{ "_id" : "role1",
  "project": ObjectId("5857e7d5aceaaa5d2254aea2"),
  "users": ["email1", "email2"],
  "permissions": ["delete","update"]
}
{ "_id" : "role2",
  "project": ObjectId("5857e7d5aceaaa5d2254aea2"),
  "users": ["email1"],
  "permissions": ["add"]
}

Get All Roles for a User

db.user.aggregate([ 
  {$match: {email:"email1"}},
  {$lookup: {
     from: "role",
     localField: "roles",
     foreignField: "_id",
     as: "roles"
   }}
])

Response

{
    "email": "email1",
    "roles": [
       { "_id" : "role1",
         "users": ["email1", "email2"]
       },
       { "_id" : "role2",
         "users": ["email1"]
       }
    ]
}

Get user permissions for a project id and email id (With current structure)

db.role.aggregate([
  {$match: {project:ObjectId("5857e7d5aceaaa5d2254aea2")}},
  {$match: {"$expr": {"$in": ["email1", "$users"]}}},
  {$project:{"permissions":1}}
 ])

Response

[
  {
      "permissions": ["delete","update"]
  },
  {
      "permissions": ["add"]
  }
]

As the number of users is going to keep increasing, you could remove the users from the role collection and use $lookup to join the user to the role collection to identify the project. Something like:

Role Document (Removed users from role)

{ "_id" : "role1",
  "project": ObjectId("5857e7d5aceaaa5d2254aea2"),
  "permissions": ["delete","update"]
}
{ "_id" : "role2",
  "project": ObjectId("5857e7d5aceaaa5d2254aea2"),
  "permissions": ["add"]
}

User Document

{ "email" : "email1",
  "roles": ["role1", "role2"]
}
{ "email" : "email2",
  "roles": ["role1"]
}

Get user permissions for a project id and email id (With updated structure) (Preferred)

db.user.aggregate([
  {$match: {email:"email1"}},
  {$lookup: {
     from: "role",
     localField: "roles",
     foreignField: "_id",
     as: "roles"
   }},
   {$unwind: "$roles"},
   {$match: {"roles.project": ObjectId("5857e7d5aceaaa5d2254aea2")}},
   {$project:{"permissions":"$roles.permissions"}}
 ])

Response

[
  {
      "permissions": ["delete","update"]
  },
  {
      "permissions": ["add"]
  }
]
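Because the response contains one document per role, the application still has to merge the arrays before checking a permission. A minimal sketch in plain Python, where the function name `flatten_permissions` is made up for this example:

```python
def flatten_permissions(result_docs):
    """Merge the 'permissions' arrays of all result documents into one set."""
    merged = set()
    for doc in result_docs:
        merged.update(doc.get("permissions", []))
    return merged


# The response shown above, as it would arrive in the application.
response = [
    {"permissions": ["delete", "update"]},
    {"permissions": ["add"]},
]

permissions = flatten_permissions(response)
print("delete" in permissions)   # True
print("archive" in permissions)  # False
```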



Answer 3:


There are different ways of modeling this. For this particular use case, I'd suggest nesting the roles/permissions inside the project documents.

In fact from what I understand, your roles aren't shared between projects and so there is an opportunity to embed that, as well as the mapping between the project-roles and users. Here is my proposal (using simplified classes):

from mongoengine import (Document, EmbeddedDocument, EmbeddedDocumentListField,
                         ListField, ReferenceField, StringField)

class User(Document):
    name = StringField()

class RoleDefinition(EmbeddedDocument):
    users = ListField(ReferenceField(User))
    permissions = ListField(StringField())

class Project(Document):
    role_definitions = EmbeddedDocumentListField(RoleDefinition)

    def has_user_permission(self, user_id, permission):
        for role_def in self.role_definitions:
            # keep looking if this role lacks either the permission or the user
            if permission in role_def.permissions and \
                    user_id in [us.id for us in role_def._data['users']]:    # optimization to avoid to dereference all the users
                return True
        return False

# save a sample (assumes a connection was already opened with mongoengine.connect())
bob = User(name='Bob').save()
hulk = User(name='hulk').save()
project = Project(
    role_definitions=[
        RoleDefinition(permissions=['read_file', 'delete_file'], users=[bob]),
        RoleDefinition(permissions=['upload_file'], users=[hulk])
    ]
).save()

# Check if a user has a certain permission in a project
assert project.has_user_permission(bob.id, 'read_file') is True

This saves a document with the following structure:

{  
   '_id':ObjectId('5d2cd78cd97f1cc85d0b7b28'),
   'role_definitions':[  
      {  
         'permissions':['read_file', 'delete_file'],
         'users':[ObjectId('5d2cd5d6d97f1cc85d0b7b26')]
      },
      {  
         'permissions':['upload_file'],
         'users':[ObjectId('5d2cd5d9d97f1cc85d0b7b27')]
      }
   ]
}

You can then verify if a user with a certain ID has a certain permission in a project with the following query:

def user_has_permission_in_project(project_id, user_id, permission):
    qry = Project.objects(id=project_id,
                          role_definitions__elemMatch={'users': user_id, 'permissions': permission})
    return qry.count() > 0

assert user_has_permission_in_project(project.id, bob.id, 'read_file') is True
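The query relies on $elemMatch requiring both conditions to hold within the same array element. A minimal sketch of that rule in plain Python, over a dict shaped like the saved document; the function name `matches_role_definition` is hypothetical, and the user ids are plain strings here instead of ObjectIds.

```python
def matches_role_definition(project_doc, user_id, permission):
    """Mimic $elemMatch: one role definition must hold both the user and the permission."""
    return any(
        user_id in role_def.get("users", [])
        and permission in role_def.get("permissions", [])
        for role_def in project_doc.get("role_definitions", [])
    )


project_doc = {
    "role_definitions": [
        {"permissions": ["read_file", "delete_file"], "users": ["bob"]},
        {"permissions": ["upload_file"], "users": ["hulk"]},
    ]
}

print(matches_role_definition(project_doc, "bob", "read_file"))    # True
print(matches_role_definition(project_doc, "bob", "upload_file"))  # False
```

The second call returns False even though both "bob" and "upload_file" appear somewhere in the document, because they never appear in the same role definition.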

Assuming it fits your constraints, you should be able to adapt this to your needs.



Source: https://stackoverflow.com/questions/57008603/mongodb-how-to-design-schema-based-on-application-access-patterns
