SQLAlchemy filter in_ operator

Solution 1

If the table your rsids come from lives in the same database, I'd use a subquery to feed them into your Genotypes query rather than passing a million entries through your Python code.

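# RSID_Source stands for whichever mapped table holds the rsids (its rsid column name is assumed);
# the subquery must select a single column for IN to work
sq = session.query(RSID_Source.rsid).subquery()
q = session.query(Genotypes).filter(Genotypes.rsid.in_(sq))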
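To make the subquery approach concrete, here is a minimal self-contained sketch against an in-memory SQLite database. It assumes SQLAlchemy 1.4 or later; the RSIDSource class, the column names, and the sample rows are all hypothetical stand-ins for your real tables. It passes a select() straight to in_(), which newer SQLAlchemy versions prefer over a .subquery() object.

```python
from sqlalchemy import Column, String, create_engine, select
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Genotypes(Base):
    __tablename__ = "genotypes"
    rsid = Column(String, primary_key=True)
    genotype = Column(String)


class RSIDSource(Base):
    # Hypothetical table holding the rsids you want to look up.
    __tablename__ = "rsid_source"
    rsid = Column(String, primary_key=True)


engine = create_engine("sqlite://")  # in-memory database, for illustration only
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        Genotypes(rsid="rs1", genotype="AA"),
        Genotypes(rsid="rs2", genotype="AG"),
        Genotypes(rsid="rs3", genotype="GG"),
    ])
    session.add_all([RSIDSource(rsid="rs1"), RSIDSource(rsid="rs3")])
    session.commit()

    # The id list never leaves the database: IN (SELECT ...) uses no bound variables.
    wanted = select(RSIDSource.rsid)
    stmt = select(Genotypes).where(Genotypes.rsid.in_(wanted)).order_by(Genotypes.rsid)
    matched = [g.rsid for g in session.execute(stmt).scalars()]

print(matched)
```

However many rows the source table holds, the statement sent to SQLite stays the same size.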

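The issue is that in order to pass that list to SQLite (or any database, really), SQLAlchemy has to bind each entry in your IN clause as a separate variable, and SQLite caps the number of variables per statement (SQLITE_MAX_VARIABLE_NUMBER, 999 by default before SQLite 3.32.0). The SQL translates roughly to: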

-- Not valid SQLite SQL
DECLARE @Param1 TEXT;
SET @Param1 = ?;
DECLARE @Param2 TEXT;
SET @Param2 = ?;
-- snip 999,998 more

SELECT field1, field2, -- etc.
FROM Genotypes G
WHERE G.rsid IN (@Param1, @Param2, /* snip */)
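You can see the one-variable-per-value behaviour for yourself by capturing what SQLAlchemy hands to the database driver. This is only a sketch, assuming SQLAlchemy 1.4 or later and an in-memory SQLite database; the table definition and the three sample ids are made up for illustration.

```python
from sqlalchemy import Column, MetaData, String, Table, create_engine, event, select

engine = create_engine("sqlite://")
metadata = MetaData()
# Hypothetical stand-in for the real genotypes table.
genotypes = Table("genotypes", metadata, Column("rsid", String, primary_key=True))
metadata.create_all(engine)

captured = []  # (statement, parameters) pairs as passed to the DBAPI cursor


@event.listens_for(engine, "before_cursor_execute")
def capture(conn, cursor, statement, parameters, context, executemany):
    captured.append((statement, parameters))


inall = ["rs1", "rs2", "rs3"]
with engine.connect() as conn:
    conn.execute(select(genotypes.c.rsid).where(genotypes.c.rsid.in_(inall))).fetchall()

statement, parameters = captured[-1]
print(statement)   # the IN clause holds one "?" placeholder per list entry
print(parameters)  # and each value travels as its own bound variable
```

With a million entries in inall, that is a million placeholders, which is what trips SQLite's variable limit.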

Solution 2

The below workaround worked for me:

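from sqlalchemy import text  # SQLAlchemy 2.0 requires raw SQL strings to be wrapped in text()

q = session.query(Genotypes).filter(Genotypes.rsid.in_(inall))
query_as_string = str(q.statement.compile(compile_kwargs={"literal_binds": True}))
session.execute(text(query_as_string)).first()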

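This forces the query to compile to a string with the values rendered inline before execution, so no bound variables are involved. Because the values are no longer passed as parameters, use this only with values you trust. SQLAlchemy's FAQ has more detail in its section on rendering statements with bound parameters inlined.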
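The effect of literal_binds is easiest to see by printing the compiled string. The following is a minimal sketch using SQLAlchemy Core (1.4 or later assumed) with an in-memory SQLite database; the table and sample values are hypothetical.

```python
from sqlalchemy import Column, MetaData, String, Table, create_engine, select, text

engine = create_engine("sqlite://")
metadata = MetaData()
# Hypothetical stand-in for the real genotypes table.
genotypes = Table("genotypes", metadata, Column("rsid", String, primary_key=True))
metadata.create_all(engine)

inall = ["rs1", "rs3"]
stmt = (
    select(genotypes.c.rsid)
    .where(genotypes.c.rsid.in_(inall))
    .order_by(genotypes.c.rsid)
)

# Render the list values directly into the SQL text instead of as placeholders.
sql = str(stmt.compile(engine, compile_kwargs={"literal_binds": True}))
print(sql)

with engine.begin() as conn:
    conn.execute(genotypes.insert(), [{"rsid": "rs1"}, {"rsid": "rs2"}, {"rsid": "rs3"}])
    rows = conn.execute(text(sql)).fetchall()

found = [row[0] for row in rows]
print(found)
```

Note that text() treats a colon followed by a name as a bind parameter, so values containing something like ":foo" would need escaping; plain rsids are unaffected.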

BTW, if you're not using SQLite, you can use the ANY operator to pass the whole list as a single parameter; I describe that in my answer to a related question.
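As a rough illustration of the ANY idea, the sketch below builds such a query and compiles it for the PostgreSQL dialect without connecting to a server. It assumes SQLAlchemy 1.4 or later; the table and values are hypothetical, and this is one way to express it, not necessarily the form used in the answer mentioned above.

```python
from sqlalchemy import Column, MetaData, String, Table, any_, literal, select
from sqlalchemy.dialects import postgresql

metadata = MetaData()
# Hypothetical stand-in for the real genotypes table.
genotypes = Table("genotypes", metadata, Column("rsid", String, primary_key=True))

inall = ["rs1", "rs2", "rs3"]

# Bind the whole Python list as one array-typed parameter.
rsid_array = literal(inall, postgresql.ARRAY(String))
stmt = select(genotypes.c.rsid).where(genotypes.c.rsid == any_(rsid_array))

# Compile only; no database connection or driver is needed for this step.
compiled = stmt.compile(dialect=postgresql.dialect())
print(compiled)
print(compiled.params)
```

The compiled statement carries a single parameter whose value is the entire list, so the number of variables does not grow with the list.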

Author: user1988705

Updated on March 07, 2020

Comments

  • user1988705
    user1988705 over 4 years

    I am trying to do a simple filter operation on a query in sqlalchemy, like this:

    q = session.query(Genotypes).filter(Genotypes.rsid.in_(inall))
    

    where inall is a list of strings and Genotypes is mapped to a table:

    class Genotypes(object): pass

    Genotypes.mapper = mapper(Genotypes, kg_table, properties={'rsid': getattr(kg_table.c, 'rs#')})
    

    This seems pretty straightforward to me, but I get the following error when I execute the above query by doing q.first():

    "sqlalchemy.exc.OperationalError: (OperationalError) too many SQL variables u'SELECT", followed by the 1M items in the inall list. But they aren't supposed to be SQL variables, just a list whose membership is the filtering criterion.

    Am I doing the filtering incorrectly?

    (the db is sqlite)

  • Dan M.
    Dan M. about 6 years
    Why doesn't it inline them before making a query? It doesn't make sense to me why the above couldn't be just IN (values...) instead of SET @params = values...; ...; ... IN (@params...).
  • TheRealChx101
    TheRealChx101 almost 5 years
    Any implications, security or otherwise?
  • Jeff Bluemel
    Jeff Bluemel over 4 years
    I've tried sub-queries and have not been able to get them to work. The .subquery() at the end seems to make all the difference. Thanks.