I\'m interested in document-oriented databases, and I\'d like to play with MongoDB. So I started a fairly simple project (an issue tracker), but am having hard times thinking in
Redid this answer since the original answer took the relation the wrong way round due to reading incorrectly.
issue = {code:"asdf-11", title:"asdf", reporter:{username:"qwer", role:"manager"}}
As to whether embedding some important information about the user (creator) of the ticket is a wise decision or not depends upon the system specifics.
Are you giving these users the ability to login and report issues they find? If so then it is likely you might want to factor that relation off to a user collection.
On the other hand, if that is not the case then you could easily get away with this schema. The one problem I see here is if you wish to contact the reporter and their job role has changed, that's somewhat awkward; however, that is a real world dilemma, not one for the database.
Since the subdocument represents a single one-to-one relation to a reporter you also should not suffer fragmentation problems mentioned in my original answer.
There is one glaring problem with this schema and that is duplication of changing repeating data (Normalised Form stuff).
Let's take an example. Imagine you hit the real world dilemma I spoke about earlier and a user called Nigel
wants his role to reflect his new job position from now on. This means you have to update all rows where Nigel
is the reporter and change his role
to that new position. This can be a lengthy and resource consuming query for MongoDB.
To contradict myself again, if you were to only have maybe 100 tickets (aka something manageable) per user then the update operation would likely not be too bad and would, in fact, by manageable for the database quite easily; plus due to the lack of movement (hopefully) of the documents this would be a completely in place update.
So whether this should be embedded or not depends heavily upn your querying and documents etc, however, I would say this schema isn't a good idea; specifically due to the duplication of changing data across many root documents. Technically, yes, you could get away with it but I would not try.
I would instead split the two out.
If I have objects (subdocuments) in a document, can I update them all in a single query?
Just like the relation style in my original answer, yes and easily.
For example, let's update the role of Nigel
to MD
(as hinted earlier) and change the ticket status to completed:
db.tickets.update({'reporter.username':'Nigel'},{$set:{'reporter.role':'MD', status: 'completed'}})
So a single document schema does make CRUD easier in this case.
One thing to note, stemming from your English, you cannot use the positional operator to update all subdocuments under a root document. Instead it will update only the first found.
Again hopefully that makes sense and I haven't left anything out. HTH
here I have a user related to the issue). Should I create another document 'user' and reference it in 'issue' document by its id (like in relational databases), or should I leave all the user's data in the subdocument?
This is a considerable question and requires some background knowledge before continuing.
First thing to consider is the size of a issue:
issue = {code:"asdf-11", title:"asdf", reporter:{username:"qwer", role:"manager"}}
Is not very big, and since you no longer need the reporter
information (that would be on the root document) it could be smaller, however, issues are never that simple. If you take a look at the MongoDB JIRA for example: https://jira.mongodb.org/browse/SERVER-9548 (as a random page that proves my point) the contents of a "ticket" can actually be quite considerable.
The only way you would gain a true benefit from embedding the tickets would be if you could store ALL user information in a single 16 MB block of contigious sotrage which is the maximum size of a BSON document (as imposed by the mongod
currently).
I don't think you would be able to store all tickets under a single user.
Even if you was to shrink the ticket to, maybe, a code, title and a description you could still suffer from the "swiss cheese" problem caused by regular updates and changes to documents in MongoDB, as ever this: http://www.10gen.com/presentations/storage-engine-internals is a good reference for what I mean.
You would typically witness this problem as users add multiple tickets to their root user document. The tickets themselves will change as well but maybe not in a drastic or frequent manner.
You can, of course, remedy this problem a bit by using power of 2 sizes allocation: http://docs.mongodb.org/manual/reference/command/collMod/#usePowerOf2Sizes which will do exactly what it says on the tin.
Ok, hypothetically, if you were to only have code
and title
then yes, you could store the tickets as subdocuments in the root user without too many problems, however, this is something that comes down to specifics that the bounty assignee has not mentioned.
If I have objects (subdocuments) in a document, can I update them all in a single query?
Yes, quite easily. This is one thing that becomes easier with embedding. You could use a query like:
db.users.update({user_id:uid,'tickets.code':'asdf-1'}, {$set:{'tickets.$.title':'Oh NOES'}})
However, to note, you can only update ONE subdocument at a time using the positional operator. As such this means you cannot, in a single atomic operation, update all ticket dates on a single user to 5 days in the future.
As for adding a new ticket, that is quite simple:
db.users.update({user_id:uid},{$push:{tickets:{code:asdf-1,title:"Whoop"}}})
So yes, you can quite simply, depending on your queries, update the entire users data in a single call.
That was quite a long answer so hopefully I haven't missed anything out, hope it helps.