Django filter queryset on "tuples" of values for multiple columns

Solution 1

I don't see many options other than a big OR clause:

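import operator
from itertools import izip  # Python 2; on Python 3 use the built-in zip

from django.db.models import Q

# OR together one Q(firstname=..., lastname=...) per (first, last) pair.
query = reduce(
    operator.or_,
    (Q(firstname=fn, lastname=ln) for fn, ln in izip(first_list, last_list))
)

Person.objects.filter(query)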
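
To see what kind of WHERE clause this approach amounts to, here is a minimal sketch that builds the same "(firstname = ... AND lastname = ...) OR ..." shape by hand against an in-memory SQLite database. It uses only the standard library's sqlite3 module rather than Django, and the person table and its sample rows are made up for illustration. It assumes the pair list is non-empty; with no pairs the WHERE clause would be empty and the statement invalid.

```python
import sqlite3

first_list = ["Bob", "Rob"]
last_list = ["Williams", "Williamson"]
pairs = list(zip(first_list, last_list))

# Hypothetical table and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (firstname TEXT, lastname TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [
        ("Bob", "Williams"),
        ("Rob", "Williamson"),
        ("Bob", "Williamson"),
        ("Rob", "Williams"),
    ],
)

# One "(firstname = ? AND lastname = ?)" group per pair, joined with OR.
where = " OR ".join("(firstname = ? AND lastname = ?)" for _ in pairs)
# Flatten the pairs so the parameters line up with the placeholders.
params = [value for pair in pairs for value in pair]

rows = conn.execute(
    "SELECT firstname, lastname FROM person WHERE " + where
    + " ORDER BY firstname",
    params,
).fetchall()
print(rows)  # [('Bob', 'Williams'), ('Rob', 'Williamson')]
```

Only the exact combinations come back; Bob Williamson and Rob Williams are excluded.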

Solution 2

bruno's answer (Solution 1 above) works, but it feels dirty to me - both on the Python level and on the SQL level (a large concatenation of ORs). In MySQL at least, you can use the following SQL syntax:

SELECT id FROM table WHERE (first_name, last_name) IN
       (('John','Doe'),('Jane','Smith'),('Bill','Clinton'))

Django's ORM doesn't provide a direct way to do this, so I use raw SQL:

User.objects.raw('SELECT * FROM table WHERE (first_name, last_name) IN %s',
      [ (('John','Doe'),('Jane','Smith'),('Bill','Clinton')) ])

(This is a list with one element, matching the single %s in the query. The element is an iterable of tuples, so the %s will be converted to an SQL list of tuples).

Notes:

  1. As I said, this works for MySQL. I'm not sure what other backends support this syntax.
  2. A bug in python-mysql, related to this behavior, was fixed in November 2013 / MySQLdb 1.2.4, so make sure your Python MySQLdb libraries aren't older than that.
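
If you would rather not depend on the driver expanding a nested tuple into a single %s, one alternative is to generate one placeholder group per pair and pass the values as a flat list. The sketch below tries this with the standard library's sqlite3 module as a stand-in backend, so some details are assumptions: SQLite uses ? placeholders, needs version 3.15 or later for row values, and wants the right-hand side written as a VALUES list, whereas the MySQL form shown above omits VALUES. The users table and its rows are hypothetical.

```python
import sqlite3

pairs = [("John", "Doe"), ("Jane", "Smith"), ("Bill", "Clinton")]

# Hypothetical table and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    "id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)
conn.executemany(
    "INSERT INTO users (first_name, last_name) VALUES (?, ?)",
    [
        ("John", "Doe"),      # id 1
        ("Jane", "Smith"),    # id 2
        ("John", "Smith"),    # id 3, not one of the wanted pairs
        ("Bill", "Clinton"),  # id 4
    ],
)

# One "(?, ?)" group per pair; the values are passed separately,
# so nothing is interpolated into the SQL string itself.
placeholders = ", ".join("(?, ?)" for _ in pairs)
sql = (
    "SELECT id FROM users "
    "WHERE (first_name, last_name) IN (VALUES " + placeholders + ") "
    "ORDER BY id"
)
params = [value for pair in pairs for value in pair]

ids = [row[0] for row in conn.execute(sql, params)]
print(ids)  # [1, 2, 4]
```

The same placeholder-building idea should carry over to Model.objects.raw() with %s placeholders, but check the exact tuple syntax your backend accepts.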

Solution 3

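The same approach for Python 3.5:

import functools
import operator

from django.db.models import Q

# OR together one Q(firstname=..., lastname=...) per (first, last) pair.
query = functools.reduce(
    operator.or_,
    (Q(firstname=fn, lastname=ln) for fn, ln in zip(first_list, last_list))
)

Person.objects.filter(query)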

Author by

8one6

Updated on June 05, 2022

Comments

  • 8one6
    8one6 almost 2 years

    Say I have a model:

    class Person(models.Model):
        # CharField needs a max_length; 100 is an arbitrary choice here.
        firstname = models.CharField(max_length=100)
        lastname = models.CharField(max_length=100)
        birthday = models.DateField()
        # etc...
    

    and say I have a list of two first names, first_list = ['Bob', 'Rob'], and a list of two last names, last_list = ['Williams', 'Williamson']. Then if I wanted to select everyone whose first name was in first_list I could run:

    Person.objects.filter(firstname__in=first_list)
    

    and if I wanted to select everyone whose last name was in last_list, I could do:

    Person.objects.filter(lastname__in=last_list)
    

    So far, so good. If I want to run both of those restrictions at the same time, that's easy...

    Person.objects.filter(firstname__in=first_list, lastname__in=last_list)
    

    If I wanted to do an OR-style search instead of the AND-style search, I can do that with Q objects:

    Person.objects.filter(Q(firstname__in=first_list) | Q(lastname__in=last_list))
    

    But what I have in mind is something a bit more subtle. What if I just want a queryset that returns specific combinations of first and last names? I.e. I want to return the Person objects for which (Person.firstname, Person.lastname) is in zip(first_list, last_list). In other words, I want to get back anyone named Bob Williams or Rob Williamson (but not anyone named Bob Williamson or Rob Williams).

    In my actual use case, first_list and last_list would both have ~100 elements.

    At the moment, I need to solve this problem in a Django app. But I am also curious about the best way to handle this in a more general SQL context.

    Thanks! (And please let me know if I can clarify anything.)

  • 8one6
    8one6 over 10 years
    I didn't realize you could programmatically combine Q statements like that! This seems perfect. Do you know if this sort of query is efficient, from a SQL point of view?
  • bruno desthuilliers
    bruno desthuilliers over 10 years
    The generated SQL will be the same as what you'd write by hand: SELECT <fieldnames here...> FROM <tablename> WHERE (firstname="x1" AND lastname="y1") OR (firstname="x2" AND lastname="y2") OR <etc....>. For more efficiency you may want to add an index on (firstname, lastname). On a large dataset with real-life data it should be discriminating enough to speed things up (assuming your db server is smart enough to use it), but first try it as is and check whether it really needs any optimisation.
  • bruno desthuilliers
    bruno desthuilliers over 10 years
    Oh and if you know of another way to write a SQL query getting the same results please let me know, I'm always willing to learn new things.
  • 8one6
    8one6 over 10 years
    I'm a SQL newbie myself. The other general approach that came to mind was to dynamically create a merged column in the query itself: SELECT <fieldnames> FROM <tablename> WHERE CONCAT(firstname, lastname) IN <list of concatted first and last names>. (EDIT: I think your approach might be faster because a well-designed SQL engine should first filter on the acceptable firstnames OR the acceptable lastnames and then check whether the pairs are ok, while mine would need to make the concatted col for each row). Curious what others have to add!
  • bruno desthuilliers
    bruno desthuilliers over 10 years
    @DJ_8one6: I'm not a SQL guru in any way, but my experience with SQL databases is that, just like with C compilers, if you use a good one and know the right flags / indexes / whatever tricks, you'll have a very hard time getting better optimisation "by hand". So it's better to write good but simple code and let the db engine / compiler do the job (with just a few hints). My 2 cents...
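
A note on the CONCAT idea raised in the comments: apart from speed, concatenating the two columns without a separator can match rows that were never asked for, because different (first, last) splits can produce the same string. The sketch below shows this with the standard library's sqlite3 module, where || is the string concatenation operator. The person table and the contrived name ("Bo", "bWilliams") are made up for illustration.

```python
import sqlite3

wanted = [("Bob", "Williams"), ("Rob", "Williamson")]

# Hypothetical table and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (firstname TEXT, lastname TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [
        ("Bob", "Williams"),
        ("Bo", "bWilliams"),  # concatenates to the same "BobWilliams"
        ("Rob", "Williamson"),
    ],
)

# Approach from the comments: compare the concatenated columns.
concatenated = [fn + ln for fn, ln in wanted]
in_list = ", ".join("?" for _ in concatenated)
concat_rows = conn.execute(
    "SELECT firstname, lastname FROM person "
    "WHERE firstname || lastname IN (" + in_list + ") ORDER BY rowid",
    concatenated,
).fetchall()

# Pairwise comparison: one (firstname = ? AND lastname = ?) group per pair.
where = " OR ".join("(firstname = ? AND lastname = ?)" for _ in wanted)
pair_rows = conn.execute(
    "SELECT firstname, lastname FROM person WHERE " + where
    + " ORDER BY rowid",
    [value for pair in wanted for value in pair],
).fetchall()

print(len(concat_rows), len(pair_rows))  # 3 2
```

The concatenated comparison also returns ("Bo", "bWilliams"), while the pairwise comparison returns only the two requested people. A separator that cannot occur in the data would reduce the risk, but comparing the columns individually avoids the problem entirely.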