I have a collection whose documents look like this:
{
    "_id" : ObjectId("5491d65bf315c2726a19ffe0"),
    "tweetID" : NumberLong(535063274220687360),
    "tweetText" : "19 RT Toronto @SunNewsNetwork: WATCH: When it comes to taxes, regulations, and economic freedom, is Canada more \"American\" than America? http://t.co/D?",
    "retweetCount" : 1,
    "source" : "<a href=\"http://twitter.com\" rel=\"nofollow\">Twitter Web Client</a>",
    "Date" : ISODate("2014-11-19T04:00:00.000Z"),
    "Added" : ISODate("2014-11-19T04:00:00.000Z"),
    "tweetLat" : 0,
    "tweetLon" : 0,
    "url" : "http://t.co/DH0xj0YBwD ",
    "sentiment" : 18,
    "quality" : 0.4,
    "intensity" : 10,
    "happiness" : 0,
    "calmness" : 0,
    "kindness" : 0,
    "sureness" : 0,
    "Hashtags" : [ ],
    "authorID" : NumberLong(49067869),
    "authorName" : "Fran Walker",
    "authorFollowers" : 93,
    "authorFollowing" : 133,
    "authorFavourites" : 50,
    "authorTweets" : 13667,
    "authorVerified" : false,
    "screenName" : "snickeringcrow",
    "profileImageURL" : "http://pbs.twimg.com/profile_images/2180546952/smilinkitty.asp_-_Copy_normal.jpg",
    "profileLocation" : "",
    "timezone" : "Eastern Time (US & Canada)",
    "gender" : "M",
    "Entities" : [
        {
            "id" : 6,
            "name" : "Harper, Stephen",
            "frequency" : 0,
            "partyId" : 6
        }
    ],
    "Topics" : [
        {
            "id" : 8,
            "name" : "Employment",
            "frequency" : 1,
            "Subtopics" : [
                {
                    "id" : 34,
                    "name" : "Economic",
                    "frequency" : 1
                }
            ]
        },
        {
            "id" : 11,
            "name" : "Economy",
            "frequency" : 1,
            "Subtopics" : [
                {
                    "id" : 43,
                    "name" : "Economic",
                    "frequency" : 1
                }
            ]
        }
    ]
}
I am trying to group by date and, for each group, compute the sum of sentiment divided by (number of items in the group - 1). Because of that -1, I cannot use MongoDB's $avg operator, so I do it manually as follows:
DBCollection collectionG;
collectionG = db.getCollection("TweetCachedCollection");
ArrayList<EntityEpochData> results = new ArrayList<EntityEpochData>();
List<DBObject> stages = new ArrayList<DBObject>();
ArrayList<DBObject> andArray = null;
DBObject groupFields = new BasicDBObject("_id", "$Added");
groupFields.put("value",
new BasicDBObject("$sum", "$" + sType.toLowerCase()));
groupFields.put("count", new BasicDBObject("$sum", 1));
DBObject groupBy = new BasicDBObject("$group", groupFields);
stages.add(groupBy);
DBObject project = new BasicDBObject("_id", 0);
project.put("count", new BasicDBObject("$subtract", new Object[] {
"$count", 1 }));
project.put("value", new BasicDBObject("$divide", new Object[] {
"$value", "$count" }));
project.put("Date", "$_id");
stages.add(new BasicDBObject("$project", project));
DBObject sort = new BasicDBObject("$sort", new BasicDBObject("Date", 1));
stages.add(sort);
AggregationOutput output = collectionG.aggregate(stages);
Now everything works properly except for one thing:
Say the count for a group is 3. After the subtraction I expect count to be 2, and it is, but the division on the next line still uses 3.
To be more specific: if the sum is 6 and the count is 3, I want sum/(count-1) to return 3, but it returns 2. So it seems that this line returns 2:
project.put("count",new BasicDBObject("$subtract", new Object[] {"$count", 1 }));
but the next line still divides 6 by 3 instead of by 2:
project.put("value", new BasicDBObject("$divide", new Object[] {
"$value", "$count" }));
It seems that count in the last line still refers to the old value of count instead of the updated one.
Can anyone help me?
I think that if I stage the subtraction first and then do the division it would work, but I do not know how to do that.
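You need to make a slight modification to your $project object. Within a single $project stage, "$count" always refers to the count field of the incoming document, not to the count field you compute in that same stage. So pass the expression that subtracts 1 from count directly to $divide, rather than referring to the previous value of count: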
DBObject project = new BasicDBObject("_id", 0);
DBObject countAfterSubtraction = new BasicDBObject("$subtract",
new Object[] {"$count", 1});
DBObject value = new BasicDBObject("$divide",
new Object[] {"$value",countAfterSubtraction});
project.put("value", value);
project.put("Date", "$_id");
stages.add(new BasicDBObject("$project", project));
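To make the difference concrete, here is a minimal plain-Java sketch that only models the behaviour described above; it does not use the MongoDB driver. The class `ProjectSemantics` and its methods are hypothetical names introduced for illustration, and the input values (sum 6, count 3) are taken from the question.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of a $project stage, for illustration only.
final class ProjectSemantics {

    // Models the original $project: every "$field" reference reads the INPUT document,
    // so the division still sees the count that came out of $group.
    static Map<String, Double> singleStage(Map<String, Double> in) {
        Map<String, Double> out = new LinkedHashMap<>();
        out.put("count", in.get("count") - 1);               // $subtract: ["$count", 1]
        out.put("value", in.get("value") / in.get("count")); // $divide: ["$value", "$count"]
        return out;
    }

    // Models the modified $project: the subtraction is an operand of $divide.
    static Map<String, Double> nested(Map<String, Double> in) {
        Map<String, Double> out = new LinkedHashMap<>();
        out.put("value", in.get("value") / (in.get("count") - 1));
        return out;
    }

    public static void main(String[] args) {
        Map<String, Double> group = new LinkedHashMap<>();
        group.put("value", 6.0); // sum produced by $group
        group.put("count", 3.0); // count produced by $group
        System.out.println(singleStage(group)); // {count=2.0, value=2.0}
        System.out.println(nested(group));      // {value=3.0}
    }
}
```

In the first method the projected count is 2, yet the division still yields 6 / 3 = 2; nesting the subtraction gives the intended 6 / 2 = 3.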
The modified $project stage works only for groups that have at least 2 records. If any group has only one record, the count after subtraction is zero, which results in a divide-by-zero error.
So you could modify your code to include a $cond that checks whether the count after subtraction is 0. If it is, default it to 1; otherwise keep the subtracted value of count:
DBObject project = new BasicDBObject("_id", 0);
DBObject countAfterSubtraction = new BasicDBObject("$subtract",
new Object[] {"$count", 1});
DBObject eq = new BasicDBObject("$eq",
new Object[]{countAfterSubtraction,0});
DBObject cond = new BasicDBObject("$cond",
new Object[]{eq,1,countAfterSubtraction});
DBObject value = new BasicDBObject("$divide",
new Object[] {"$value",cond});
project.put("value", value);
project.put("Date", "$_id");
stages.add(new BasicDBObject("$project", project));
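As a quick sanity check of the arithmetic, the guarded division can be sketched in plain Java, outside MongoDB. This is only an illustration of what the $cond expression is meant to compute; `GuardedAverage` and `sumOverCountMinusOne` are hypothetical names, not part of any driver API.

```java
// Hypothetical helper mirroring the $cond-guarded $divide, for illustration only.
final class GuardedAverage {

    // Returns sum / (count - 1), falling back to a divisor of 1 when count - 1 is 0.
    static double sumOverCountMinusOne(double sum, long count) {
        long divisor = count - 1;
        if (divisor == 0) {
            divisor = 1; // mirrors $cond: [ { $eq: [ count - 1, 0 ] }, 1, count - 1 ]
        }
        return sum / divisor;
    }

    public static void main(String[] args) {
        System.out.println(sumOverCountMinusOne(6, 3));  // 3.0
        System.out.println(sumOverCountMinusOne(18, 1)); // 18.0
    }
}
```

Note that with this default a group containing a single record simply reports its own sum, which may or may not be the value you want for that case.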