SQLAlchemy: Find difference between array columns

后端 未结 1 1383
误落风尘
误落风尘 2021-01-27 22:51

I have a table called UnitOfWork which has 3 columns cases_identified, cases_completed and cases_double_check, all of which a

相关标签:
1条回答
  • 2021-01-27 23:09

    For this answer I'll assume that cls.cases_identified, cls.cases_completed, and cls.cases_double_check are of postgresql.ARRAY(Integer) type in Python side.

    Your @todo.expression should return just that: an SQL expression. Currently it is trying to return a python list. The exception is raised since a postgresql.ARRAY does not support the in operator, though it does have a method contains, which maps to the @> operator in Postgresql and tests "if elements are a superset of the elements of the argument array expression". This on the other hand is not what you want. You're lucky that you had the if x not in ... in there, as plain

    [x for x in cls.cases_identified]
    

    seems to result in an infinite loop instead of an exception.

    Getting the difference between arrays in Postgresql has been covered here extensively, but here's how you would apply that using SQLAlchemy, first using an array constructor:

    from sqlalchemy import func
    
    ...
    
    class UnitOfWork(...):
    
        @todo.expression
        def todo(cls):
            # Force correlation of `cls` from outer FROM objects. Use
            # the `select()` shorthand method of `FunctionElement`.
            identified = func.unnest(cls.cases_identified).select().correlate(cls)
            completed = func.unnest(cls.cases_completed).select().correlate(cls)
            double_check = func.unnest(cls.cases_double_check).select().correlate(cls)
            # Create the subquery statement
            stmt = identified.except_(completed.union(double_check))
            # Uses the `func` generic for ARRAY constructor from
            # subquery syntax. The subquery statement has to be
            # converted to a "scalar subquery" in the eyes of SQLA
            # (really just a subquery returning 1 col) in order
            # to remove it from the FROM clause hierarchy.
            return func.array(stmt.as_scalar())
    

    This has a downside of not providing any FROM objects (as it correlates everything from the enclosing query), so you'd have to issue the original query like this:

    test = Session.query(UnitOfWork.todo).select_from(UnitOfWork).first()
    

    You could also use the Postgresql intarray module that provides special functions and operators for null-free arrays of integers:

    class UnitOfWork(...):
    
        @todo.expression
        def todo(cls):
            return (cls.cases_identified - cls.cases_completed - 
                    cls.cases_double_check)
    

    Note that you have to install the extension first in Postgresql:

    CREATE EXTENSION intarray;
    
    0 讨论(0)
提交回复
热议问题