Question
I'm using multiple threads in my Python program. I have 3 queues. In one of them I insert data into a PostgreSQL database, but first I need to check whether the database already contains a row with a specific domain name. So I have:
class AnotherThread(threading.Thread):
    def __init__(self, another_queue):
        threading.Thread.__init__(self)
        self.another_queue = another_queue
    def run(self):
        while True:
            chunk = self.another_queue.get()
            if chunk is not '':
                dane = chunk[0].split(',',2)
                cur.execute("SELECT exists(SELECT 1 FROM global where domain = %s ) ", (domena,))
                jest = cur.fetchone()
                print(jest)
This is part of the code for my third queue. I connect to the database here (in the main() function):
queue = Queue.Queue()
out_queue = Queue.Queue()
another_queue = Queue.Queue()
for i in range(50):
    t = ThreadUrl(queue, out_queue)
    t.setDaemon(True)
    t.start()
for host in hosts:
    queue.put(host)
for i in range(50):
    dt = DatamineThread(out_queue,another_queue)
    dt.setDaemon(True)
    dt.start()
conn_str = "dbname='{db}' user='user' host='localhost' password='pass'"
conn = psycopg2.connect(conn_str.format(db='test'))
conn.autocommit = True
cur = conn.cursor()
for i in range(50):
    dt = AnotherThread(another_queue)
    dt.setDaemon(True)
    dt.start()
queue.join()
out_queue.join()
another_queue.join()
cur.close()
conn.close()
When I run my script I get:
(False,)
(False,)
(False,)
(False,)
(False,)
(False,)
(False,)
(False,)
(False,)
Exception in thread Thread-128:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "domains.py", line 242, in run
    jest = cur.fetchone()
ProgrammingError: no results to fetch
Exception in thread Thread-127:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "domains.py", line 242, in run
    jest = cur.fetchone()
ProgrammingError: no results to fetch
(False,)
(False,)
(False,)
Why am I getting an error for some of them?
Answer 1:
This may be because all threads share the same connection and cursor. I could imagine a case where cur.execute() is run, then cur.fetchone() by another thread, then cur.fetchone() again by yet another (or the same, or the previous) thread, with no cur.execute() in between. The GIL does not prevent this: CPython can switch threads between any two statements (in fact, between bytecode instructions). Thus, the second time fetchone() is run, there are no results anymore: there was only one row to fetch initially, and it has already been consumed.
You probably want to isolate each cursor, or somehow make the cur.execute(...); cur.fetchone() commands atomic.
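One way to make the two calls atomic is to guard them with a lock, so that no other thread can touch the shared cursor between execute() and fetchone(). This is only a minimal sketch: db_lock and domain_exists are hypothetical names that do not appear in the question, and cur is assumed to be the shared psycopg2 cursor created in main().

```python
import threading

# Hypothetical module-level lock (not part of psycopg2): every thread that
# uses the shared cursor must go through it.
db_lock = threading.Lock()


def domain_exists(cur, domain):
    """Return True if the global table has a row for this domain.

    `cur` is assumed to be the single cursor shared by all worker threads.
    Holding the lock keeps execute() and fetchone() together, so another
    thread cannot run its own query on the cursor in between.
    """
    with db_lock:
        cur.execute(
            "SELECT exists(SELECT 1 FROM global WHERE domain = %s)",
            (domain,),
        )
        return cur.fetchone()[0]
```

Inside run(), the two cursor calls would then become a single call such as jest = domain_exists(cur, domena). The trade-off is that all database checks are serialized, so the 50 threads of the third stage effectively take turns.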
The answer to the question "Are transactions in PostgreSQL via psycopg2 per cursor or per connection?" (on DBA StackExchange) mentions that transactions are per connection, so isolating cursors probably won't help you.
Source: https://stackoverflow.com/questions/48630476/multi-thread-python-psycopg2