How to explicitly load relationships on an existing object?

Question

I have a SQLAlchemy model Foo which contains a lazy-loaded relationship bar which points to another model that also has a lazy-loaded relationship foobar.

When querying normally I would use this code to ensure that all objects are loaded with a single query:

session.query(Foo).options(joinedload('bar').joinedload('foobar'))

However, now I have a case where a base class already provides me a Foo instance that was retrieved using session.query(Foo).one(), so the relationships are lazy-loaded (which is the default, and I don't want to change that).

For a single level of nesting I wouldn't mind it being loaded once I access foo.bar, but since I also need to access foo.bar[x].foobar I really prefer to avoid sending queries in a loop (which would happen whenever I access foobar).

I'm looking for a way to make SQLAlchemy load the foo.bar relationship while also using the joinedload strategy for foobar.

Answer 1:

I ran into a similar situation recently, and ended up doing the following:

eager_loaded = (
    db.session.query(Bar)
    .options(joinedload('foobar'))
    .filter_by(bar_fk=foo.foo_pk)
    .all()
)

Assuming you can recreate the bar join condition in the filter_by arguments, all the objects in the collection will be loaded into the identity map, and foo.bar[x].foobar will not need to go to the database.

One caveat: the session's identity map holds only weak references to unmodified objects, so the loaded entities can be garbage-collected once nothing else references them strongly. That is why the result is assigned to eager_loaded.
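To check this behaviour, here is a minimal self-contained sketch. The models, the foo_id and bar_id columns, the in-memory SQLite engine, and the statement counter are all assumptions made for illustration, not part of the original setup, and the code assumes SQLAlchemy 1.4 or later.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine, event
from sqlalchemy.orm import Session, declarative_base, joinedload, relationship

Base = declarative_base()


# Hypothetical models mirroring the Foo -> bar -> foobar chain from the question.
class Foo(Base):
    __tablename__ = "foo"
    id = Column(Integer, primary_key=True)
    bar = relationship("Bar")  # lazy-loaded collection (the default)


class Bar(Base):
    __tablename__ = "bar"
    id = Column(Integer, primary_key=True)
    foo_id = Column(ForeignKey("foo.id"))
    foobar = relationship("FooBar")  # lazy-loaded collection (the default)


class FooBar(Base):
    __tablename__ = "foobar"
    id = Column(Integer, primary_key=True)
    bar_id = Column(ForeignKey("bar.id"))


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

statements = []


@event.listens_for(engine, "before_cursor_execute")
def record_statement(conn, cursor, statement, parameters, context, executemany):
    # Record every SQL statement sent to the database.
    statements.append(statement)


# Store one Foo with three Bar children, each having two FooBar children.
with Session(engine) as session:
    session.add(Foo(bar=[Bar(foobar=[FooBar(), FooBar()]) for _ in range(3)]))
    session.commit()

with Session(engine) as session:
    foo = session.query(Foo).one()  # relationships are still unloaded here

    # Pre-load the children together with their foobar collections and keep
    # a strong reference so they stay in the identity map.
    eager_loaded = (
        session.query(Bar)
        .options(joinedload(Bar.foobar))
        .filter_by(foo_id=foo.id)
        .all()
    )

    statements.clear()
    foobar_counts = [len(bar.foobar) for bar in foo.bar]
    queries_after_preload = len(statements)

print(foobar_counts, queries_after_preload)
```

In this sketch, accessing foo.bar still emits one SELECT for the collection itself, but the Bar rows it returns resolve to the objects already held in the identity map, so their foobar collections are not queried again.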

Answer 2:

The SQLAlchemy wiki contains the Disjoint Eager Loading recipe. A query is issued for the parent collection, then the children are queried and combined. For the most part, this was implemented in SQLAlchemy as the subquery strategy, but the recipe covers the case where you explicitly need to make the query later, not just separately.

The idea is that you order the child query and group the results by the remote columns linking the relationship, then populate the attribute for each parent item with the group of children. The following is slightly modified from the recipe to allow passing in a custom child query with extra options, rather than building it from the parent query. This does mean that you have to construct the child query more carefully: if your parent query has filters, then the child should join and filter as well, to prevent loading unneeded rows.

from itertools import groupby
from sqlalchemy.orm import attributes

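def disjoint_load(parents, rel, q):
    # column pairs linking the parent (local) to the child (remote)
    local_cols, remote_cols = zip(*rel.prop.local_remote_pairs)
    # groupby only merges adjacent rows, so order by the grouping columns
    q = q.join(rel).order_by(*remote_cols)

    if rel.prop.order_by:
        q = q.order_by(*rel.prop.order_by)

    # map each remote key tuple to its list of children
    collections = dict(
        (k, list(v))
        for k, v in groupby(q, lambda x: tuple([getattr(x, c.key) for c in remote_cols])))

    for p in parents:
        # set the collection without marking the parent as modified
        attributes.set_committed_value(
            p, rel.key,
            collections.get(tuple([getattr(p, c.key) for c in local_cols]), ()))

    return parents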

# load the parents
devices = session.query(Device).filter(Device.active).all()

# build the child query with extras, use the same filter
findings = (
    session.query(Finding)
    .join(Device.findings)
    .filter(Device.active)
    .options(db.joinedload(Finding.scans))
)

for d in disjoint_load(devices, Device.findings, findings):
    print(d.cn, len(d.findings))
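The ordering step in disjoint_load matters because itertools.groupby only merges adjacent items with equal keys. The following pure-Python sketch illustrates this; the (parent_id, child_name) rows are hypothetical stand-ins for the child query results, not data from the original answer.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical (parent_id, child_name) rows standing in for child query results.
rows = [(1, "a"), (2, "b"), (1, "c"), (2, "d")]
parent_key = itemgetter(0)

# Without ordering, groupby starts a new group at every key change, and
# building a dict keeps only the last run seen for each key.
unsorted_groups = {
    k: [name for _, name in group]
    for k, group in groupby(rows, key=parent_key)
}

# Ordering by the grouping key first (the role of ORDER BY on the remote
# columns) yields one complete group per parent.
sorted_groups = {
    k: [name for _, name in group]
    for k, group in groupby(sorted(rows, key=parent_key), key=parent_key)
}

print(unsorted_groups)  # {1: ['c'], 2: ['d']}
print(sorted_groups)    # {1: ['a', 'c'], 2: ['b', 'd']}
```

If the child query were not ordered by the remote columns, parents could silently end up with incomplete collections in the same way.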

Source: https://stackoverflow.com/questions/32337933/how-to-explicitly-load-relationships-on-an-existing-object

Tags

python

sqlalchemy