Error Connecting To PostgreSQL can't pickle psycopg2.extensions.connection objects

こ雲淡風輕ζ submitted on 2020-05-17 09:05:26

Question


I am trying to build an architecture with a main parent process that can create new child processes. The parent process runs in a loop, checking whether any child process is available.

I use ThreadedConnectionPool from the psycopg2.pool module so that all child processes share a common database connection. The idea is that the program connects to the database once and runs the SQL queries for every child process over that connection, so there is no need to reconnect each time a query is executed.

The code is as follows:

from multiprocessing import Process, Lock
import time, os, psycopg2
from psycopg2 import pool

def child(dbConnection, lock, num, pid, sleepTime, query):
    lock.acquire()

    start = time.time()

    print("Child Process {} - Process ID: {}".format(num + 1, str(os.getpid())))

    db_cursor = dbConnection.cursor()
    db_cursor.execute(query)
    records = db_cursor.fetchmany(2)

    print("Displaying rows from User Master Table")

    for row in records:
        print(row)

    print("Executed Query:", query)
    print("Child Process {} - Process ID {} Completed.".format(num + 1, str(os.getpid())))

    end = time.time()
    print("Time taken:", str(end - start), "seconds")

    lock.release()
    time.sleep(sleepTime)

if __name__ == "__main__":
    try:
        connectionPool = psycopg2.pool.ThreadedConnectionPool(5, 21, user = "dwhpkg", password = "dwhpkg", host = "127.0.0.1", port = "5432", database = "dwhdb")

        while True:

            processes = []

            print("Main Process ID: {}".format(str(os.getpid())))
            lock = Lock()


            # 21 Times Process Execution
            for count in range(21):
                if connectionPool :
                    print("Connection Pool Successfully Created")

                # Getting DB Connection From Connection Pool
                dbConnection = connectionPool.getconn()

                if dbConnection:
                    sql_execute_process = Process(target = child, args = (dbConnection, lock, count, os.getpid(), 4, 'SELECT * FROM public."USER_MASTER"',))

                    sql_execute_process.start()

                    processes.append(sql_execute_process)
                    print("Parent Process:", os.getpid())

                    print(processes)

                    time.sleep(5)

            for process in processes:
                process.join()

    except (Exception, psycopg2.DatabaseError) as error:
        print("Error Connecting To PostgreSQL", error)

    finally:
        # Closing DB Connection
        if connectionPool:
            connectionPool.closeall()
        print("Connection Pool is closed")

When I try to run the above code, it gives the following error:

Main Process ID: 46700
Connection Pool Successfully Created
Error Connecting To PostgreSQL can't pickle psycopg2.extensions.connection objects
Connection Pool is closed

(task_env) C:\Users\sicuser\Desktop\ジート\03_作業案件\タスク機能プロトタイプ作成\開発>Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\sicuser\AppData\Local\Programs\Python\Python37\lib\multiprocessing\spawn.py", line 99, in spawn_main
    new_handle = reduction.steal_handle(parent_pid, pipe_handle)
  File "C:\Users\sicuser\AppData\Local\Programs\Python\Python37\lib\multiprocessing\reduction.py", line 82, in steal_handle
    _winapi.PROCESS_DUP_HANDLE, False, source_pid)
OSError: [WinError 87] The parameters are incorrect.

While troubleshooting, I also stepped through the program in a debugger to locate the error. It is raised by the line below:

sql_execute_process.start()

【Error Message】

Main Process ID: 47708
Connection Pool Successfully Created
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\sicuser\AppData\Local\Programs\Python\Python37\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "C:\Users\sicuser\AppData\Local\Programs\Python\Python37\lib\multiprocessing\spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
EOFError: Ran out of input
Error Connecting To PostgreSQL can't pickle psycopg2.extensions.connection objects

The environment is Windows with Python 3.7.4.

Any help would be appreciated.


Answer 1:


In your solution above you are using a ThreadedConnectionPool with multiprocessing.Process instances, but a thread is not a process.
Multiple processes cannot safely share the same connection; see the section on thread and process safety in the psycopg documentation.
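
The error in the question comes from pickling: on Windows, multiprocessing starts children with the spawn method, so everything in the Process args is pickled and sent to the new interpreter. A minimal sketch of that failure mode follows. It uses a plain socket as a stand-in for the database connection (an assumption made so the example runs without a database), and can_pickle is a hypothetical helper, not part of any library.

```python
import pickle
import socket


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, which spawn applies to Process args."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, pickle.PicklingError):
        return False


# Plain data such as a query string and a sleep time crosses the process boundary.
print(can_pickle(("SELECT 1", 4)))  # True

# An object wrapping an OS-level handle does not; a socket stands in here
# for the psycopg2 connection from the question.
with socket.socket() as sock:
    print(can_pickle(sock))  # False
```

The same check applied to the dbConnection object from the question would be expected to fail in the same way, which is why sql_execute_process.start() raises.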

You are also holding a Lock around the critical code in the child, which basically prevents the tasks from running in parallel; even if it worked, performance would be largely the same as with a single-process solution.

The solution depends on how CPU-intensive and long-lived the child processes will be:

  • if the children are light / short-lived, just do all the work in a single (main) thread
  • for heavy / long-lived child processes, connect to the database from inside the child (don't share the connection with the main process)
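
The second option could look roughly like the sketch below, in which each worker opens and closes its own connection and only strings cross the process boundary. The names run_query and run_in_children are hypothetical, and the connect parameter is an illustrative hook (not a psycopg2 feature) that lets another DB-API connect function be substituted; by default the sketch assumes psycopg2.connect is called with a DSN string.

```python
import multiprocessing as mp
import os


def run_query(dsn, query, connect=None):
    """Open a connection inside the calling process, run one query, close it."""
    if connect is None:
        # Imported lazily so the module loads even where psycopg2 is absent.
        import psycopg2
        connect = psycopg2.connect
    conn = connect(dsn)
    try:
        cur = conn.cursor()
        cur.execute(query)
        rows = cur.fetchmany(2)  # same two-row preview as in the question
        cur.close()
        return os.getpid(), rows
    finally:
        conn.close()


def run_in_children(dsn, query, n_children=4):
    """Run the query in n_children worker processes; only strings are pickled."""
    with mp.Pool(processes=n_children) as pool:
        return pool.starmap(run_query, [(dsn, query)] * n_children)
```

On Windows, run_in_children must be called from under an if __name__ == "__main__": guard. With the settings from the question, that call would be run_in_children("dbname=dwhdb user=dwhpkg password=dwhpkg host=127.0.0.1 port=5432", 'SELECT * FROM public."USER_MASTER"'). This sketch opens a new connection for every task; a long-lived child could instead connect once at startup and reuse that connection for all of its queries.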


Source: https://stackoverflow.com/questions/60923178/error-connecting-to-postgresql-cant-pickle-psycopg2-extensions-connection-objec
