Question
If I wanted to count foobar.relationships.friend.count, how would I use map/reduce against this document structure so that the count equals 22?
[
[0] {
"rank" => nil,
"profile_id" => 3,
"20130913" => {
"foobar" => {
"relationships" => {
"acquaintance" => {
"count" => 0
},
"friend" => {
"males_count" => 0,
"ids" => [],
"females_count" => 0,
"count" => 10
}
}
}
},
"20130912" => {
"foobar" => {
"relationships" => {
"acquaintance" => {
"count" => 0
},
"friend" => {
"males_count" => 0,
"ids" => [
[0] 77,
[1] 78,
[2] 79
],
"females_count" => 0,
"count" => 12
}
}
}
}
}
]
Answer 1:
In JavaScript, this query gets you the result you expect:
r.db('test').table('test').get(3).do( function(doc) {
return doc.keys().map(function(key) {
return r.branch(
doc(key).typeOf().eq('OBJECT'),
doc(key)("foobar")("relationships")("friend")("count").default(0),
0
)
}).reduce( function(left, right) {
return left.add(right)
})
})
In Ruby, it should be:
r.db('test').table('test').get(3).do{ |doc|
doc.keys().map{ |key|
r.branch(
doc.get_field(key).type_of().eq('OBJECT'),
doc.get_field(key)["foobar"]["relationships"]["friend"]["count"].default(0),
0
)
}.reduce{ |left, right|
left+right
}
}
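To see what the map and reduce steps compute without a running server, the same logic can be sketched in plain Ruby on a local Hash. This is only an illustration, not ReQL: the DOC constant and the friend_count_total helper are made-up names, and the document is trimmed to the fields the query touches.

```ruby
# Local stand-in for the document from the question (trimmed to the relevant fields).
DOC = {
  "rank" => nil,
  "profile_id" => 3,
  "20130913" => { "foobar" => { "relationships" => { "friend" => { "count" => 10 } } } },
  "20130912" => { "foobar" => { "relationships" => { "friend" => { "count" => 12 } } } }
}

# Hypothetical helper mirroring the query: map every top-level key to its
# friend count (0 for non-object values or missing paths), then reduce the
# mapped values by addition.
def friend_count_total(doc)
  doc.keys.map { |key|
    value = doc[key]
    value.is_a?(Hash) ? (value.dig("foobar", "relationships", "friend", "count") || 0) : 0
  }.reduce(0) { |left, right| left + right }
end

puts friend_count_total(DOC)  # prints 22
```

The is_a?(Hash) check plays the role of the OBJECT type test in r.branch, and the `|| 0` fallback plays the role of default(0).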
I also think the schema you use is not well suited to this kind of query; it would be better to use something like:
{
rank: null,
profile_id: 3,
people: [
{
id: 20130913,
foobar: { ... }
},
{
id: 20130912,
foobar: { ... }
}
]
}
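With that layout the per-date entries live in one array, so the count becomes a plain map and sum over people. Here is a rough sketch in plain Ruby (not ReQL), assuming the date keys become integer ids as in the outline above; ORIGINAL, restructure and total_friend_count are hypothetical names used only for illustration.

```ruby
# Trimmed copy of the original, date-keyed document.
ORIGINAL = {
  "rank" => nil,
  "profile_id" => 3,
  "20130913" => { "foobar" => { "relationships" => { "friend" => { "count" => 10 } } } },
  "20130912" => { "foobar" => { "relationships" => { "friend" => { "count" => 12 } } } }
}

# Hypothetical helper: move the date-keyed entries into a "people" array,
# keeping "rank" and "profile_id" at the top level.
def restructure(doc)
  meta_keys = ["rank", "profile_id"]
  people = doc.reject { |key, _| meta_keys.include?(key) }
              .map { |key, value| { "id" => key.to_i, "foobar" => value["foobar"] } }
  doc.slice(*meta_keys).merge("people" => people)
end

# With the array layout there is no need to inspect top-level keys at all.
def total_friend_count(doc)
  doc["people"].sum { |person| person.dig("foobar", "relationships", "friend", "count") || 0 }
end

puts total_friend_count(restructure(ORIGINAL))  # prints 22
```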
Edit: A simpler way to do it, without using r.branch, is to remove the fields that are not objects with the without command.
For example:
r.db('test').table('test').get(3).without('rank', 'profile_id').do{ |doc|
doc.keys().map{ |key|
doc.get_field(key)["foobar"]["relationships"]["friend"]["count"].default(0)
}.reduce{ |left, right|
left+right
}
}.run
Answer 2:
I think you will need your own input reader. This site has a tutorial on how it can be done: http://bigdatacircus.com/2012/08/01/wordcount-with-custom-record-reader-of-textinputformat/
Then you run MapReduce with a mapper such as:
Mapper<LongWritable, ClassRepresentingMyRecords, Text, IntWritable>
In your map function, you extract the value for count and emit it as the value. I am not sure whether you need a key.
In the reducer, you add together all the values with the same key ('count' in your case).
This should get you on your way, I think.
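The mapper and reducer roles described above can be sketched in plain Ruby to show the data flow. This is not Hadoop code: Record is a hypothetical stand-in for ClassRepresentingMyRecords, and map_record, reduce_values and run_job are illustrative names that only simulate the emit, group and sum steps in memory.

```ruby
# Hypothetical record type: one entry per date with its friend count.
Record = Struct.new(:date, :friend_count)

RECORDS = [Record.new("20130913", 10), Record.new("20130912", 12)]

# Mapper: emit one [key, value] pair per record, using the constant key "count".
def map_record(record)
  [["count", record.friend_count]]
end

# Reducer: add together all values that share a key.
def reduce_values(_key, values)
  values.sum
end

# Simulated job: map every record, group the emitted pairs by key
# (the shuffle step), then reduce each group.
def run_job(records)
  emitted = records.flat_map { |record| map_record(record) }
  grouped = emitted.group_by { |key, _value| key }
  grouped.to_h { |key, pairs| [key, reduce_values(key, pairs.map { |_key, value| value })] }
end

puts run_job(RECORDS)["count"]  # prints 22
```

Using a single constant key sends every value to the same reducer call, which is what makes the result one grand total.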
Source: https://stackoverflow.com/questions/18763866/how-would-you-use-map-reduce-on-this-document-structure