Question
I am trying to use psycopg2's connection pool with python's multiprocess library.
Currently, attempting to share the connection pool amongst processes in the manner described below causes:
psycopg2.OperationalError: SSL error: decryption failed or bad record mac
The following code should reproduce the error, with the caveat that the reader has to set up a simple postgres database.
from multiprocessing import Pool
from psycopg2 import pool
import psycopg2
import psycopg2.extras

connection_pool = pool.ThreadedConnectionPool(1, 200, database='postgres',
                                              user='postgres', password='postgres',
                                              host='localhost')

class ConnectionFromPool:
    """
    Class to establish a connection with the local PostgreSQL database

    To use:
        query = "SELECT * FROM ticker_metadata"
        with ConnectionFromPool() as cursor:
            cursor.execute(query)
            results = cursor.fetchall()

    Returns:
        A list of dictionaries, one per row:
        [{...}, {...}, {...}]
    """
    def __init__(self):
        self.cursor = None
        self.connection = None

    def __enter__(self):
        self.connection = connection_pool.getconn()
        self.cursor = self.connection.cursor(cursor_factory=psycopg2.extras.RealDictCursor)
        return self.cursor

    def __exit__(self, exc_type, exc_val, exc_tb):
        if exc_val is not None:
            self.connection.rollback()
        else:
            self.connection.commit()
        self.cursor.close()
        connection_pool.putconn(self.connection)

def test_query(col_attribute):
    """
    Simple SQL query
    """
    # Value interpolated directly only to keep the repro short;
    # real code should use query parameters.
    query = f"""SELECT *
    FROM col
    WHERE col = {col_attribute}
    ;"""
    with ConnectionFromPool() as cursor:
        cursor.execute(query)
        result = cursor.fetchall()
    return result

def multiprocessing(func, args, n_workers=2):
    """Spawns multiple processes.

    Args:
        func: function to be performed
        args: list of args to be passed to each call of func
        n_workers: number of processes to be spawned

    Return:
        A list containing the results of each process
    """
    with Pool(processes=n_workers) as executor:
        res = executor.starmap(func, args)
        return list(res)

def main():
    args = [[i] for i in range(1000)]
    results = multiprocessing(test_query, args, 2)

if __name__ == "__main__":
    main()
What I have already tried:
- Having each process open and close its own connection to the database, instead of attempting to use a connection pool. This is slow.
- Having each process use its own connection pool, this is also slow.
- Passing a psycopg2 connection object to each process, instead of acquiring one implicitly via the with statement in the SQL query. This throws an error claiming that the connection object is not pickle-able.
Note: If I put a sleep operation in all but one of the processes, the non-sleeping process runs fine and executes its query, until the remaining processes un-sleep, and then I get the above error.
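The pickling failure in the last attempt is easy to see in isolation: a database connection wraps a live OS-level socket, and sockets cannot be serialized. A minimal sketch, using a plain socket as a stand-in for the psycopg2 connection (multiprocessing ships arguments to workers via pickle, so anything holding a socket fails the same way):

```python
import pickle
import socket

# A raw socket stands in for a psycopg2 connection, which wraps one internally.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

try:
    pickle.dumps(sock)
    picklable = True
except TypeError:
    # This is the same class of error multiprocessing raises when a
    # connection object is passed as an argument to a worker.
    picklable = False
finally:
    sock.close()

print(picklable)  # False: sockets (and therefore connections) are not picklable
```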
What I have already read:
- Share connection to postgres db across processes in Python
- Python: decryption failed or bad record mac when calling from Thread
- Connection problems with SQLAlchemy and multiple processes
Finally:
How can I use a connection pool (psycopg2) with Python's multiprocessing library? I am open to using other libraries so long as they work with Python and PostgreSQL databases.
Source: https://stackoverflow.com/questions/57214664/sharing-a-postgres-connection-pool-between-python-multiproccess