Mongo aggregation with paginated data and totals

23,128

Solution 1

If you have a lot of events, {$ push: "$$ ROOT"}, will make Mongo return an error, I have solved it with $facet (Only works with version 3.4+)

aggregate([
    { $match: options },
    {
      $facet: {
        edges: [
          { $sort: sort },
          { $skip: skip },
          { $limit: limit },
        ],
        pageInfo: [
          { $group: { _id: null, count: { $sum: 1 } } },
        ],
      },
    },
  ])

Solution 2

Did this in two steps instead of one:

// Get the totals
db.mongoAuditEvent.aggregate([{$group: {_id: "$corrId"}}, {$group: {_id: 1, total: {$sum: 1}}}]);

// Get the data
db.mongoAuditEvent.aggregate([
  {$group: {
    _id : "$corrId", 
    currentEvent: {"$last": "$event.status"}, 
    "events": { $push: "$$ROOT"}
  }},
  {$sort: {"events.timestamp": -1} }, // Latest first
  {$skip: 0 },
  {$limit: 10}
], {allowDiskUse: true}).pretty();

I would be very happy if anybody got a better solution to this though.

Solution 3

A performance optimization tip:

When you use $facet stage for pagination, Try to add it as soon as it's possible.

For example: if you want to add $project or $lookup stage, add them after $facet, not before it.

it will have impressive effect in aggregation speed. because $project stage require MongoDB to explore all documents and get involve with all fields(which is not necessary).

Share:
23,128
Øystein Amundsen
Author by

Øystein Amundsen

I am a senior full-stack developer working with server/client development for nearly 20 years. I've been working on most technologies and in every phase of a project. Although I might not be an expert in anything, I know more of some things than others and I've come to be absolutely in love with standards and front end development. I hope I can be of help to anyone.

Updated on November 30, 2020

Comments

  • Øystein Amundsen
    Øystein Amundsen over 3 years

    I've crawled all over stack overflow, and have not found any info on how to return proper pagination data included in the resultset.

    I'm trying to aggregate some data from my mongo store. What I want, is to have something return:

    {
      total: 5320,
      page: 0,
      pageSize: 10,
      data: [
        {
          _id: 234,
          currentEvent: "UPSTREAM_QUEUE",
          events: [
            { ... }, { ... }, { ... }
          ]
        },
        {
          _id: 235,
          currentEvent: "UPSTREAM_QUEUE",
          events: [
            { ... }, { ... }, { ... }
          ]
        }
      ]
    }
    

    This is what I have so far:

    // page and pageSize are variables
    db.mongoAuditEvent.aggregate([
      // Actual grouped data
      {"$group": {
        "_id" : "$corrId", 
        "currentEvent": {"$last": "$event.status"}, 
        "events": { $push: "$$ROOT"}
      }},
      // Pagination group
      {"$group": {
        "_id": 0,
        "total": { "$sum": "corrId" },
        "page": page,
        "pageSize": pageSize,
        "data": {
          "$push": {
            "_id": "$_id",
            "currentEvent": "$currentEvent",
            "events": "$events"
          }
        }
      }},
      {"$sort": {"events.timestamp": -1} }, // Latest first
      {"$skip": page },
      {"$limit": pageSize }
    ], {allowDiskUse: true});
    

    I'm trying to have a pagination group as root, containing the actual grouped data inside (so that I get actual totals, whilst still retaining skip and limits).

    The above code will return the following error in mongo console: The field 'page' must be an accumulator object

    If I remove the page and pageSize from the pagination group, I still get the following error:

    BSONObj size: 45707184 (0x2B96FB0) is invalid. Size must be between 0 and 16793600(16MB) First element: id: 0

    If I remove the pagination group alltogether, the query works fine. But I really need to return how many documents I have stored total, and allthough not actually necessary, page and pageSize would be nice to return as well.

    Can somebody please tell me what I am doing wrong? Or tell me if it is at all possible to do this in one go?

  • Petrogad
    Petrogad over 6 years
  • insivika
    insivika over 3 years
    why are you setting _id: null? Great answer tho, thank you
  • T Tse
    T Tse over 3 years
    Why is $count not used? It's also 3.4+. Are there performance issues?
  • migue diaz
    migue diaz over 3 years
    @insivika _id: null aggroup all match, you can use _id: "any string" too.
  • Stefan
    Stefan over 2 years
    This won't work if your match returns 10 results and your skip is 100. If I am on page 2 of a list and I search for something, you will have to reset the skip to 0. Just a quick thing I saw, could be what you want
  • Olivier D.
    Olivier D. over 2 years
    This issue with this kind of group is that you may well end up building too large of a document (> 16Gb)