I found out that SQLAlchemy translates
db.query(...).filter(A.id.in_(ids))
into
SELECT ...
FROM a
WHERE a.id != a.id
Be aware what you are asking for:
A.id
is comparable any comparison can actually succeed. The non-existing value is not comparable to anything, all comparisons will result in a non-existing value which in turn is evaluated as False. That is, if A.ID
is NULL
then A.ID == anything
is False and A.ID != anything
is also False: A.ID == A.ID || A.ID != A.ID
is False if A.ID
is NULL
.IN
-clause with an empty sequence asks if the value is part of the empty list. The non-existing value is part of no list, not even the empty one.IS NOT NULL
and something that is part of nothing. This is a condition that has to be checked for. The non-existing value is not something; only some value that is not NULL
can not be be a member of the empty list...IN
-clause if the sequence is empty.For a concrete example this sqlfiddle
For a more philosophical approach see What is the nature of void
I'm using:
if len(ids) > 0:
db.query(...).where(A.id.in_(ids))
else:
db.query(...).where(False)
I tried a .limit(0)
instead of the .where(false)
without success. There is some behind-the-scenes difference in the empty querysets that broke other stuff down the pipeline. This workaround, while could be faster, at least avoids your mentioned warning.
To answer the OP's question of "why", here is the FAQ entry (which is always surprisingly difficult for me to find):
Why does
.col.in_([])
producecol != col
? Why not1=0
?A little introduction to the issue. The
IN
operator in SQL, given a list of elements to compare against a column, generally does not accept an empty list, that is while it is valid to say:column IN (1, 2, 3)
it’s not valid to say:
column IN ()
SQLAlchemy's
Operators.in_()
operator, when given an empty list, produces this expression:column != column
As of version 0.6, it also produces a warning stating that a less efficient comparison operation will be rendered. This expression is the only one that is both database agnostic and produces correct results.
For example, the naive approach of "just evaluate to false, by comparing 1=0 or 1!=1", does not handle nulls properly. An expression like:
NOT column != column
will not return a row when
column IS NULL
, but an expression which does not take the column into account, such as:NOT 1=0
will return a row.
As shown in this post, you may be able to use the ANY function to avoid this, since it's syntactically valid even for an empty list (but not supported on SQLite apparently). It's probably faster for large lists too, since it does less string mangling to build the query.
The performance issue with the in_
operator has recently been fixed and the fix will probably be in SQLAlchemy 1.2.0.
When I ran into this, it was because I was using an Enum type for one of my database table columns. When I changed it to a String, the problem went away. That's not a real solution, since I would have preferred an Enum, but it did avoid the problem.
use subquery, if ids
is empty, will (never executed).
example:
subquery = db.query(SomeTable.id).filter(...).subquery()
db.query(...).filter(A.id.in_(subquery))
into:
SELECT ...
FROM a
WHERE a.id IN (SELECT ...)