Question
I am trying to build an architecture in which a main parent process creates new child processes. The parent process runs in a continuous loop, checking whether any child process is available.
I used ThreadedConnectionPool from the psycopg2.pool module so that all child processes share a common database connection pool. The idea is that the program connects to the database once and runs the SQL queries of every child process over those connections, so there is no need to reconnect each time a query is executed.
The code is as follows:
from multiprocessing import Process, Lock
import time, os, psycopg2
from psycopg2 import pool

def child(dbConnection, lock, num, pid, sleepTime, query):
    lock.acquire()
    start = time.time()
    print("Child Process {} - Process ID: {}".format(num + 1, str(os.getpid())))
    db_cursor = dbConnection.cursor()
    db_cursor.execute(query)
    records = db_cursor.fetchmany(2)
    print("Displaying rows from User Master Table")
    for row in records:
        print(row)
    print("Executed Query:", query)
    print("Child Process {} - Process ID {} Completed.".format(num + 1, str(os.getpid())))
    end = time.time()
    print("Time taken:", str(end - start), "seconds")
    lock.release()
    time.sleep(sleepTime)

if __name__ == "__main__":
    try:
        connectionPool = psycopg2.pool.ThreadedConnectionPool(5, 21, user = "dwhpkg", password = "dwhpkg", host = "127.0.0.1", port = "5432", database = "dwhdb")
        while True:
            processes = []
            print("Main Process ID: {}".format(str(os.getpid())))
            lock = Lock()
            # 21 Times Process Execution
            for count in range(21):
                if connectionPool :
                    print("Connection Pool Successfully Created")
                    # Getting DB Connection From Connection Pool
                    dbConnection = connectionPool.getconn()
                    if dbConnection:
                        sql_execute_process = Process(target = child, args = (dbConnection, lock, count, os.getpid(), 4, 'SELECT * FROM public."USER_MASTER"',))
                        sql_execute_process.start()
                        processes.append(sql_execute_process)
                        print("Parent Process:", os.getpid())
                        print(processes)
                        time.sleep(5)
            for process in processes:
                process.join()
    except (Exception, psycopg2.DatabaseError) as error:
        print("Error Connecting To PostgreSQL", error)
    finally:
        # Closing DB Connection
        if connectionPool:
            connectionPool.closeall()
            print("Connection Pool is closed")
When I try to run the above code, it gives the following error:
Main Process ID: 46700
Connection Pool Successfully Created
Error Connecting To PostgreSQL can't pickle psycopg2.extensions.connection objects
Connection Pool is closed
(task_env) C:\Users\sicuser\Desktop\ジート\03_作業案件\タスク機能プロトタイプ作成\開発>
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\sicuser\AppData\Local\Programs\Python\Python37\lib\multiprocessing\spawn.py", line 99, in spawn_main
    new_handle = reduction.steal_handle(parent_pid, pipe_handle)
  File "C:\Users\sicuser\AppData\Local\Programs\Python\Python37\lib\multiprocessing\reduction.py", line 82, in steal_handle
    _winapi.PROCESS_DUP_HANDLE, False, source_pid)
OSError: [WinError 87] The parameters are incorrect.
To troubleshoot, I also ran the program in debug mode to locate the error. The debugger shows that the error is raised by the line below:
sql_execute_process.start()
【Error Message】
Main Process ID: 47708
Connection Pool Successfully Created
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\sicuser\AppData\Local\Programs\Python\Python37\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "C:\Users\sicuser\AppData\Local\Programs\Python\Python37\lib\multiprocessing\spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
EOFError: Ran out of input
Error Connecting To PostgreSQL can't pickle psycopg2.extensions.connection objects
The OS is Windows and the Python version is 3.7.4.
Looking forward to support from the experts.
Answer 1:
In your solution above you're using a ThreadedConnectionPool with multiprocessing.Process instances (thread != process).
Multiple processes cannot safely share the same connection; see the section on thread and process safety in the psycopg2 documentation for details.
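The "can't pickle psycopg2.extensions.connection objects" message in the question follows from this restriction. On Windows, multiprocessing starts children with the spawn method, so every argument given to Process has to be pickled before it reaches the child. The sketch below illustrates the check with a sqlite3 connection from the standard library as a stand-in for the psycopg2 connection (an assumption, made so that the example runs without a PostgreSQL server); is_picklable and params are hypothetical names.

```python
import pickle
import sqlite3


def is_picklable(obj):
    """Return True if obj can be serialized with pickle.dumps."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, TypeError, AttributeError):
        return False


# Stand-in for a psycopg2 connection: a live handle to a database.
conn = sqlite3.connect(":memory:")

# Plain connection settings (values copied from the question).
params = {"user": "dwhpkg", "host": "127.0.0.1", "port": "5432", "dbname": "dwhdb"}

print("connection picklable:", is_picklable(conn))
print("settings picklable:", is_picklable(params))

conn.close()
```

A live connection wraps an open socket and session state, which is why it cannot travel to another process, whereas the plain settings needed to open a new connection can.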
You're also using a Lock around the critical code in the child, which basically prevents the tasks from running in parallel; even if it worked, the performance would be much the same as that of a single-process solution.
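To illustrate the point about the lock: when it is held for the whole task, only one worker can run at a time. The sketch below is a simplified illustration that uses threads and time.sleep as a stand-in for the query, so that it runs without a database; multiprocessing.Lock provides the same mutual exclusion between processes. The names task, run_tasks, overlaps and intervals are hypothetical.

```python
import threading
import time

lock = threading.Lock()
intervals = []  # (start, end) of each task, appended while the lock is held


def task(duration):
    # The lock covers the entire body, as in the question's child() function.
    with lock:
        start = time.monotonic()
        time.sleep(duration)  # stand-in for executing the SQL query
        end = time.monotonic()
        intervals.append((start, end))


def run_tasks(count, duration):
    """Run count tasks in separate threads and return their sorted intervals."""
    intervals.clear()
    threads = [threading.Thread(target=task, args=(duration,)) for _ in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(intervals)


def overlaps(sorted_intervals):
    """Return True if any task started before the previous one finished."""
    pairs = zip(sorted_intervals, sorted_intervals[1:])
    return any(nxt[0] < prev[1] for prev, nxt in pairs)


result = run_tasks(4, 0.05)
print("tasks overlapped:", overlaps(result))
```

Because every interval is recorded while the lock is held, no two intervals can overlap, so the total run time is at least the sum of the individual task times.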
The solution depends on how CPU-intensive and long-lived the child processes will be:
- if the children are light / short-lived, just do all the work in a single (main) thread
- for heavy / long-lived child processes, connect to the database from inside the child (don't share the connection with the main process)
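The second option could look roughly like the sketch below, in which each child opens and closes its own connection and only picklable values cross the process boundary. This is a minimal sketch, not a definitive implementation: DB_PARAMS reuses the settings from the question, and pg_connect, child and run_children are hypothetical helpers. The connect parameter is an added assumption that lets another DB-API connection factory be swapped in.

```python
from contextlib import closing
from multiprocessing import Process

# Connection settings taken from the question; adjust for your environment.
DB_PARAMS = {
    "user": "dwhpkg",
    "password": "dwhpkg",
    "host": "127.0.0.1",
    "port": "5432",
    "dbname": "dwhdb",
}


def pg_connect():
    """Open a new PostgreSQL connection; meant to be called inside the child."""
    # Imported here so this module loads even where psycopg2 is absent;
    # a top-level import works just as well.
    import psycopg2
    return psycopg2.connect(**DB_PARAMS)


def child(num, query, connect=pg_connect):
    """Run query on a connection that this process opens and closes itself."""
    with closing(connect()) as conn, closing(conn.cursor()) as cur:
        cur.execute(query)
        rows = cur.fetchmany(2)
    print("Child {}: {}".format(num + 1, rows))
    return rows


def run_children(count, query, connect=pg_connect):
    """Start count children and wait for them; returns their exit codes."""
    # Only an int, a str and a module-level function are passed to the child,
    # all of which can be pickled.
    processes = [Process(target=child, args=(i, query, connect)) for i in range(count)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    return [p.exitcode for p in processes]
```

On Windows, call run_children(3, 'SELECT * FROM public."USER_MASTER"') from under an if __name__ == "__main__": guard, as the question's script already does, because the spawn start method re-imports the main module in every child.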
Source: https://stackoverflow.com/questions/60923178/error-connecting-to-postgresql-cant-pickle-psycopg2-extensions-connection-objec