Django: Retrieving IDs of manyToMany fields quickly

17,386

Solution 1

According you have Three instances. You can use the values_list method to retrieve just the results and from this result get just the ID's of their related instances. I use the pk field to be my filter because i don't know your scheme, but you can use anything, just must be a QuerySet.

>>> result = A.objects.filter(pk=1)
>>> result.values('related__id')
[{'id': 2}, {'id': 3}]
>>> result.values_list('related__id')
[(2,), (3,)]
>>> result.values_list('related__id', flat=True)
[2, 3]

Solution 2

You can get pretty close like this:

qs = A.objects.prefetch_related(Prefetch(
                      'related', 
                      queryset=A.objects.only('pk'), 
                      to_attr='related_insts')).in_bulk(my_list_of_pks)

This will give a mapping from pks of the current object to the instance itself, so you can iterate through as follows:

for pk, inst in qs.iteritems():
  related_ids = (related.pk for related in inst.related_insts)

Or given an instance, you can do a fast lookup like so:

related_ids = (related.pk for related in qs[instance.pk]).

This method maps the instance ids to the related ids (indirectly) since you specifically requested a dictionary. If you aren't doing lookups, you may want the following instead:

qs = A.objects.prefetch_related(Prefetch(
        'related', 
        queryset=A.objects.only('pk'), 
        to_attr='related_insts')).filter(pk__in=my_list_of_pks)
for inst in qs:
  related_ids = (related.pk for related in inst.related_insts)

You may take note of the use of only to only pull the pks from the db. There is an open ticket to allow the use of values and (I presume) values_list in Prefetch queries. This would allow you to do the following.

qs = A.objects.prefetch_related(Prefetch(
        'related', 
        queryset=A.objects.values_list('pk', flat=True), 
        to_attr='related_ids')).filter(pk__in=my_list_of_pks)
for inst in qs:
  related_ids = inst.related_ids

You could of course optimize further, for example by using qs.only('related_insts') on the primary queryset, but make sure you aren't doing anything with these instances-- they're essentially just expensive containers to hold your related_ids.

I believe this is the best that's available for now (without custom queries). To get to exactly what you want, two things are needed:

  1. The feature above is implemented
  2. values_list is made to work with Prefetch to_attr like it does for annotations.

With these two things in place (and continuing the above example) you could do the following to get exactly what you requested:

d = qs.values_list('related_ids', flat=True).in_bulk()
for pk, related_pks in d.items():
    print 'Containing Objects %s' % pk
    print 'Related objects %s' % related_pks
# And lookups
print 'Object %d has related objects %s' % (20, d[20])

I've left off some details explaining things, but it should be pretty clear from the documentation. If you need any clarification, don't hesitate!

Share:
17,386

Related videos on Youtube

Noah Gilmore
Author by

Noah Gilmore

Full stack web and iOS dev. https://noahgilmore.com/blog, https://cifilter.io, https://getfluency.io https://twitter.com/trestleapp, and more

Updated on July 01, 2022

Comments

  • Noah Gilmore
    Noah Gilmore almost 2 years

    I have the following model schema in Django (with Postgres).

    class A(Models.model):
        related = models.ManyToManyField("self", null=True)
    

    Given a QuerySet of A, I would like to return a dictionary mapping each instance of A in the QuerySet to a list of ids of its related instances as quickly as possible.

    I can surely iterate through each A and query the related field, but is there a more optimal way?

    • cazgp
      cazgp almost 10 years
      Did you ever get a solution to this?
  • Noah Gilmore
    Noah Gilmore almost 11 years
    That's what I've been doing, but the problem is that I need to filter on more than one pk (e.g. A.objects.filter(pk__in=[1,2,6])). I'd like a list of lists, where the nth inner list is like the values_list here, but corresponding to the nth pk.
  • alexcasalboni
    alexcasalboni almost 8 years
    You can also do instance.related.values('id', flat=True), if you have a single model instance already.
  • n_moen
    n_moen over 2 years
    result.values_list appears to return a QuerySet these days. It can easily be converted back to a list with list(result.values_list)