So I placed this question without too much context, and got downvoted, let\'s try again...
For one, I don\'t follow the logic behind SQLAlchemy\'s session.add
The reason why flushes in bauble end up emitting a COMMIT is the line 133 in db.py where they handle their history table:
table.insert(dict(table_name=mapper.local_table.name,
table_id=instance.id, values=str(row),
operation=operation, user=user,
timestamp=datetime.datetime.today())).execute()
Instead of issuing the additional SQL in the event handler using the passed in transactional connection, as they should, they execute the statement itself as is, which means it ends up using the engine as the bind (found through the table's metadata). Executing using the engine has autocommit behaviour. Since bauble always uses a SingletonThreadPool, there's just one connection per thread, and so that statement ends up committing the flushed changes as well. I wonder if this bug is the reason why bauble disables autoflush...
The fix is to change the event handling to use the transactional connection:
class HistoryExtension(orm.MapperExtension):
"""
HistoryExtension is a
:class:`~sqlalchemy.orm.interfaces.MapperExtension` that is added
to all clases that inherit from bauble.db.Base so that all
inserts, updates, and deletes made to the mapped objects are
recorded in the `history` table.
"""
def _add(self, operation, mapper, connection, instance):
"""
Add a new entry to the history table.
"""
... # a ton of code here
table = History.__table__
stmt = table.insert(dict(table_name=mapper.local_table.name,
table_id=instance.id, values=str(row),
operation=operation, user=user,
timestamp=datetime.datetime.today()))
connection.execute(stmt)
def after_update(self, mapper, connection, instance):
self._add('update', mapper, connection, instance)
def after_insert(self, mapper, connection, instance):
self._add('insert', mapper, connection, instance)
def after_delete(self, mapper, connection, instance):
self._add('delete', mapper, connection, instance)
It's worth a note that MapperExtension has been deprecated since version 0.7.
Regarding your views about the session I quote "Session Basics", which you really should read through:
In the most general sense, the
Session
establishes all conversations with the database and represents a “holding zone” for all the objects which you’ve loaded or associated with it during its lifespan. It provides the entrypoint to acquire aQuery
object, which sends queries to the database using theSession
object’s current database connection, ...
and "Is the Session a cache?":
Yeee…no. It’s somewhat used as a cache, in that it implements the identity map pattern, and stores objects keyed to their primary key. However, it doesn’t do any kind of query caching. This means, if you say
session.query(Foo).filter_by(name='bar')
, even ifFoo(name='bar')
is right there, in the identity map, the session has no idea about that. It has to issue SQL to the database, get the rows back, and then when it sees the primary key in the row, then it can look in the local identity map and see that the object is already there. It’s only when you sayquery.get({some primary key})
that theSession
doesn’t have to issue a query.
So:
My expectation from something which reads session.query is that it queries the session...
Your expectations are wrong. The Session
handles talking to the DB – among other things.
There's two reasons why I don't want to do session.flush before my session.query,
- one based on efficiency fears (why should I write to the database, and query the database if I am still within a session which the user may want to rollback?);
Because your DB may do validation, have triggers, and generate values for some columns – primary keys, timestamps, and the like. The data you thought you're inserting may end up something else in the DB and the Session
has absolutely no way to know about that.
Also, why should SQLAlchemy implement a sort of an in-memory DB in itself, with its own query engine, and all the problems that come with synchronizing 2 databases? How would SQLAlchemy support all the different operations and functions of different DBs you query against? Your simple equality predicate example just scratches the surface.
When you rollback, you roll back the DB's transaction (along with the session's unflushed changes).
- two is because I've adopted a fairly large program, and it manages to define its own Session whose instances causes flush to also commit.
Caused by the event handling bug.