How do I write this Mongo aggregation query in Spring?

12,870

The $project pipeline is not necessary since you can still do an $unwind on the nested field, thus this aggregation pipeline can yield the same results as your current:

db.MyCollection.aggregate([
    {
        "$match": {
            "_id": { "$in": ['A', 'B', 'C'] }
        }
    },
    { "$unwind": "$nested.field" },
    {
        "$group": {
            "_id": "$_id", 
            "maxVersion": { "$max": "$nested.field.version" }
        }
    }
])

The Spring Data MongoDB aggregation equivalent:

Aggregation agg = newAggregation(
        match(Criteria.where("_id").in(ids)),
        unwind("nested.field"),        
        group("_id").max("nested.field.version").as("maxVersion")
    );

Back to your current aggregation, you need to $unwind on nested.field array, not nested.field.version field since that is a String, not array:

db.MyCollection.aggregate([
    {$match: {_id: {$in: ['A', 'B', 'C']}}},
    {$project: {"fields": "$nested.field"}},
    {$unwind: "$fields"},
    {$group: {_id: "$_id", "maxVersion": {$max: "$fields.version"}}}
])

the Sprind Data MongoDB equivalent would look like:

Aggregation agg = newAggregation(
        match(Criteria.where("_id").in(ids)),
        project().and("nested.field").as("fields")
        unwind("fields"),        
        group("_id").max("fields.version").as("maxVersion")
    );
Share:
12,870
donok
Author by

donok

I program things in Python. I'm AWS certified. I love Terraform. In the past I've programmed things in C#, Java, PHP, and more.

Updated on July 11, 2022

Comments

  • donok
    donok almost 2 years

    I've got an aggregation query in MongoDB that works when I run it directly in the shell. Here's the shell query:

    db.MyCollection.aggregate([
        {$match: {_id: {$in: ['A', 'B', 'C']}}},
        {$project: {"versions": "$nested.field.version"}},
        {$unwind: "$versions"},
        {$group: {_id: "$_id", "maxVersion": {$max: "$versions"}}}
    ])
    

    So as you can see, this does the following:

    1. Matches only certain documents with specified IDs
    2. Projects a nested field down to a base-level field (and effectively filters out all other fields from the pipeline, but still retaining the IDs)
    3. Unwinds the array elements of the $versions field that we projected into individual documents in the pipeline
    4. Finds the max value of those $versions per ID

    Like I said, that query above already works. My question is how to translate that into Spring MongoDB syntax. Here's my first attempt, which does not work:

    Aggregation aggregation = newAggregation(
        match(Criteria.where("_id").in(listOfIds))
        ,project().and("versions").nested(bind("versions", "nested.field.version"))
        ,unwind("versions")
        ,group("_id").max("versions").as("maxVersion")
    );
    

    When I try to run the code in debug mode, I can see that I actually get an IllegalArgumentException on newAggregation saying it cannot evaluate. If I comment out the line with the $group clause, then I can see this toString() representation of aggregation variable, which reveals a problem with the $project clause:

    {
      "aggregate" : "__collection__" ,
      "pipeline" : [
        { "$match" : { "_id" : { "$in" : [ "A" , "B" , "C"]}}} ,
        { "$project" : { "versions" : { "versions" : "$nested.field.version"}}} ,
        { "$unwind" : "$versions"}
      ]
    }
    

    Clearly this doesn't match up with what I was intending so I'm not getting the syntax correct. But TBH I don't find the Spring MongoOps syntax very intuitive, and their documentation isn't great either.

    I don't see any way to call that nested() method without first including the call to and(). I think that's the main problem since it's doubling up the nesting there. Are there any Spring MongoOps heroes here who can help me write the equivalent Java code properly?

    EDIT: Here's a snapshot of the collection I'm using: robomongo display

  • donok
    donok over 8 years
    Well, your answer here was correct. Unfortunately it's not going to work in my particular case though because of a bug with Spring Data. I always sanitize my field names before posting on SO, but in reality I have underscores in some of the nested field names, and it looks like Spring Data does some sort of split on underscores when doing some referential integrity checks that cause it to fail. So thanks for the effort, but unfortunately Spring Data is too buggy to use right now for my use case.
  • chridam
    chridam over 8 years
    @SoaperGEM No worries, too bad that didn't work out for you.