How to query for distinct results in mongodb with python?


Solution 1

First of all, distinct values can only be retrieved for one field at a time, as explained in the MongoDB documentation on Distinct.

Mongoengine's QuerySet class supports a distinct() method that does the job.

So you might try something like this to get results:

Students.objects(name="Tom").distinct(field="class")

This query returns a list of the classes Tom attends; the server delivers the values in a single BSON document.

Attention: because the distinct values are returned in a single document, you will get an error if the result exceeds the maximum document size (16 MB). In that case you have to switch to a map/reduce approach.
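
To make the semantics concrete, here is a minimal pure-Python sketch of what a distinct query computes for the sample documents from the question. It needs no MongoDB connection. The helper name distinct_values is hypothetical and is not part of Mongoengine or PyMongo, and the first-seen ordering is an artifact of this sketch rather than a guarantee of the database.

```python
# Sample documents copied from the question; no MongoDB connection is needed.
documents = [
    {"name": "Tom", "year": 2012, "class": "History", "Teacher": "Forester"},
    {"name": "Tom", "year": 2011, "class": "Math", "Teacher": "Sumpra"},
    {"name": "Tom", "year": 2012, "class": "History", "Teacher": "Reiser"},
]


def distinct_values(docs, field, **filters):
    """Hypothetical helper: unique values of `field` among matching docs."""
    seen = []
    for doc in docs:
        # Keep only documents that match every equality filter.
        if not all(doc.get(key) == value for key, value in filters.items()):
            continue
        # Skip documents that lack the field, then record unseen values.
        if field in doc and doc[field] not in seen:
            seen.append(doc[field])
    return seen


classes = distinct_values(documents, "class", name="Tom")
print(classes)  # ['History', 'Math']
```

"History" appears once even though two of Tom's documents carry it, which is the behaviour the question asks for.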

Solution 2

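import pymongo

# Connect to the local server and select the collection.
posts = pymongo.MongoClient('localhost', 27017)['db']['colection']

# Case-insensitive match on "europe". The pattern is a plain string here:
# surrounding slashes would be matched literally.
res = posts.find({"geography": {"$regex": 'europe', "$options": 'i'}}).distinct('geography')
print(type(res))
res.sort()
for line in res:
    print(line)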

See http://docs.mongodb.org/manual/reference/method/db.collection.distinct/ for details. distinct returns a list, which the type printout confirms. res.sort() sorts that list in place, and the loop then prints the sorted values.

You can also filter the posts with a query before selecting the distinct values.
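
As a sketch of that last point: PyMongo's Collection.distinct also accepts a filter as its second argument, so the query and the distinct step can be combined in one call. The function name distinct_geographies and its pattern parameter are made up for illustration; the field name and regex come from the code above.

```python
def distinct_geographies(collection, pattern="europe"):
    """Return the sorted distinct 'geography' values matching `pattern`.

    `collection` is assumed to be a pymongo Collection (or any object
    exposing a compatible distinct(key, filter) method).
    """
    # Case-insensitive regex filter applied before the distinct step.
    query = {"geography": {"$regex": pattern, "$options": "i"}}
    return sorted(collection.distinct("geography", query))


# Usage, assuming a MongoDB server on localhost:
#   import pymongo
#   posts = pymongo.MongoClient('localhost', 27017)['db']['colection']
#   for value in distinct_geographies(posts):
#       print(value)
```

Passing the collection in as an argument keeps the helper independent of any particular connection, so it can be tried against a stand-in object as well.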

Author: Rolando

A learner.

Updated on June 15, 2022

Comments

  • Rolando
    Rolando almost 2 years

    I have a mongo collection with multiple documents, suppose the following (assume Tom had two teachers for History in 2012 for whatever reason)

    {
    "name" : "Tom",
    "year" : 2012,
    "class" : "History",
    "Teacher" : "Forester"
    }

    {
    "name" : "Tom",
    "year" : 2011,
    "class" : "Math",
    "Teacher" : "Sumpra"
    }
    
    
    {
    "name" : "Tom",
    "year" : 2012,
    "class" : "History",
    "Teacher" : "Reiser"
    }
    

    I want to query for all the distinct classes "Tom" has ever had. Even though Tom has had multiple "History" classes with multiple teachers, I want the query to return the minimal set of documents such that Tom is in all of them and "History" shows up only once, rather than a result containing multiple documents with "History" repeated.

    I took a look at: http://mongoengine-odm.readthedocs.org/en/latest/guide/querying.html

    and want to be able to try something like:

    student_users = Students.objects(name = "Tom", class = "some way to say distinct?")
    

    However, this does not appear to be documented. If this is not the syntactically correct way to do it, is it possible in mongoengine, or is there some way to accomplish it with another library like pymongo? Or do I have to query for all documents with Tom and then do some post-processing to get the unique values? Syntax would be appreciated for any case.