Am a beginner in mongoDB. I have two collections Book and author. [name and workswritten] are the common column respectively. Using inner join I have to emit the some columns in
I need how to do it using mapreduce functions.
The db
object has been deprecated for a long time in MRs as such it is impossible to source two tables at once within an MR.
There is another solution though: two MRs. You run a MR on the first collection first outputting to the needed collection and then you use a second MR to output to that very same collection using an out
option like reduce
or merge
to "join" to two collections together.
Of course this is slow so the better way is to not do it. As for:
select book.name,book.editions,book.characters,author.name
from dbo.book book
inner join dbo.author author on book.name=author.works_written
This query can be with streaming a cursor from the book collection and then pining the DB very quickly each book you iterate through (it's ok to do this in MongoDB) grabbing the authors details.
You can also get a set of author ids from the books and then query the authors collection all at once and sort the two out on client side.
You can compare below codes for SQL and mongoDB (NoSQL):
SQL Code:
SELECT *, [output array field]
FROM collection
WHERE [output array field] IN (SELECT *
FROM [collection to join]
WHERE [foreignField]= [collection.localField]);
mongoDB (NoSQL):
{
$lookup:
{
from: [collection to join],
localField: [field from the input documents],
foreignField: [field from the documents of the "from" collection],
as: [output array field]
}
}
The mongodb is not relational database -- so it is not possible to any kind of joins here. Joins is hard to scale.
The common way in mongodb to achieve join is data denormalization. In your case you could denormalize author name into book table. Then your query will not require join. This is schema example:
book
{
_id,
name,
editions,
characters,
author_name
}
Keep in the mind that you will need to update author_name
in books collection each time when you update author collection.
Another solution -- additional request for name of author for each book, but it will work much slower.
MongoDB does not support JOIN operations. When you need this functionality, you have to implement it yourself by querying both collections.
For that reason it's often a good strategy to use embedding instead of linking.
A Map/Reduce job is usually a very expensive operation. It should only be used occasionally for data mining purposes.
Try this:-
db.book.aggregate([
{
$lookup:
{
from: "author",
localField: "name",
foreignField: "works_written",
as: "nameWorkData"
}
}
])