Calculated group-by fields in MongoDB

懵懂的女人 提交于 2019-11-27 02:22:04

问题


For this example from the MongoDB documentation, how do I write the query using MongoTemplate?

db.sales.aggregate(
   [
      {
        $group : {
           _id : { month: { $month: "$date" }, day: { $dayOfMonth: "$date" }, year: { $year: "$date" } },
           totalPrice: { $sum: { $multiply: [ "$price", "$quantity" ] } },
           averageQuantity: { $avg: "$quantity" },
           count: { $sum: 1 }
        }
      }
   ]
)

Or in general, how do I group by a calculated field?


回答1:


You can actually do something like this with "project" first, but to me it's a little counter-intuitive to require a $project stage before hand:

    Aggregation agg = newAggregation(
        project("quantity")
            .andExpression("dayOfMonth(date)").as("day")
            .andExpression("month(date)").as("month")
            .andExpression("year(date)").as("year")
            .andExpression("price * quantity").as("totalAmount"),
        group(fields().and("day").and("month").and("year"))
            .avg("quantity").as("averavgeQuantity")
            .sum("totalAmount").as("totalAmount")
            .count().as("count")
    );

Like I said, counter-intuitive as you should just be able to declare all of this under $group stage, but the helpers don't seem to work this way. The serialization comes out a bit funny ( wraps the date operator arguments with arrays ) but it does seem to work. But still, this is two pipeline stages rather than one.

What is the problem with this? Well by separating the stages the stages the "project" portion forces the processing of all of the documents in the pipeline in order to get the calculated fields, that means it passes through everything before moving on to the group stage.

The difference in processing time can be clearly seen by running the queries in both forms. With a separate project stage, on my hardware takes three times longer to execute than the query where all fields are calculated during the "group" operation.

So it seems the only present way to construct this properly is by building the pipeline object yourself:

    ApplicationContext ctx =
            new AnnotationConfigApplicationContext(SpringMongoConfig.class);
    MongoOperations mongoOperation = (MongoOperations) ctx.getBean("mongoTemplate");

    BasicDBList pipeline = new BasicDBList();
    String[] multiplier = { "$price", "$quantity" };

    pipeline.add(
        new BasicDBObject("$group",
            new BasicDBObject("_id",
                new BasicDBObject("month", new BasicDBObject("$month", "$date"))
                    .append("day", new BasicDBObject("$dayOfMonth", "$date"))
                    .append("year", new BasicDBObject("$year", "$date"))
            )
            .append("totalPrice", new BasicDBObject(
                "$sum", new BasicDBObject(
                    "$multiply", multiplier
                )
            ))
            .append("averageQuantity", new BasicDBObject("$avg", "$quantity"))
            .append("count",new BasicDBObject("$sum",1))
        )
    );

    BasicDBObject aggregation = new BasicDBObject("aggregate","collection")
        .append("pipeline",pipeline);

    System.out.println(aggregation);

    CommandResult commandResult = mongoOperation.executeCommand(aggregation);

Or if all of that seems to terse to you, then you can always work with the JSON source and parse that. But of course, it has to be valid JSON:

    String json = "[" +
        "{ \"$group\": { "+
            "\"_id\": { " +
                "\"month\": { \"$month\": \"$date\" }, " +
                "\"day\": { \"$dayOfMonth\":\"$date\" }, " +
                "\"year\": { \"$year\": \"$date\" } " +
            "}, " +
            "\"totalPrice\": { \"$sum\": { \"$multiply\": [ \"$price\", \"$quantity\" ] } }, " +
            "\"averageQuantity\": { \"$avg\": \"$quantity\" }, " +
            "\"count\": { \"$sum\": 1 } " +
        "}}" +
    "]";

    BasicDBList pipeline = (BasicDBList)com.mongodb.util.JSON.parse(json);



回答2:


Another alternative is to use a custom aggregation operation class defined like so:

public class CustomAggregationOperation implements AggregationOperation {
    private DBObject operation;

    public CustomAggregationOperation (DBObject operation) {
        this.operation = operation;
    }

    @Override
    public DBObject toDBObject(AggregationOperationContext context) {
        return context.getMappedObject(operation);
    }
}

Then implement it in the pipeline like so:

Aggregation aggregation = newAggregation(
    new CustomAggregationOperation(
        new BasicDBObject(
            "$group",
            new BasicDBObject("_id",
                new BasicDBObject("day", new BasicDBObject("$dayOfMonth", "$date" ))
                .append("month", new BasicDBObject("$month", "$date"))
                .append("year", new BasicDBObject("$year", "$date"))
            )
            .append("totalPrice", new BasicDBObject(
                "$sum", new BasicDBObject(
                    "$multiply", Arrays.asList("$price","$quantity")
                )
            ))
            .append("averageQuantity", new BasicDBObject("$avg", "$quantity"))
            .append("count",new BasicDBObject("$sum",1))
        )
    )
)

So it's basically just an interface that is consistent with that used by the existing pipeline helpers, but instead of using other helper methods it directly takes a DBObject definition to return in pipeline construction.



来源:https://stackoverflow.com/questions/25437823/calculated-group-by-fields-in-mongodb

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!