Accessing a MySQL connection pool from Python multiprocessing

前端 未结 3 886
感动是毒
感动是毒 2020-12-05 11:56

I\'m trying to set up a MySQL connection pool and have my worker processes access the already established pool instead of setting up a new connection each time.

I\'m

相关标签:
3条回答
  • 2020-12-05 12:46
    #!/usr/bin/python
    # -*- coding: utf-8 -*-
    import time
    import mysql.connector.pooling
    
    
    dbconfig = {
        "host":"127.0.0.1",
        "port":"3306",
        "user":"root",
        "password":"123456",
        "database":"test",
    }
    
    
    class MySQLPool(object):
        """
        create a pool when connect mysql, which will decrease the time spent in 
        request connection, create connection and close connection.
        """
        def __init__(self, host="172.0.0.1", port="3306", user="root",
                     password="123456", database="test", pool_name="mypool",
                     pool_size=3):
            res = {}
            self._host = host
            self._port = port
            self._user = user
            self._password = password
            self._database = database
    
            res["host"] = self._host
            res["port"] = self._port
            res["user"] = self._user
            res["password"] = self._password
            res["database"] = self._database
            self.dbconfig = res
            self.pool = self.create_pool(pool_name=pool_name, pool_size=pool_size)
    
        def create_pool(self, pool_name="mypool", pool_size=3):
            """
            Create a connection pool, after created, the request of connecting 
            MySQL could get a connection from this pool instead of request to 
            create a connection.
            :param pool_name: the name of pool, default is "mypool"
            :param pool_size: the size of pool, default is 3
            :return: connection pool
            """
            pool = mysql.connector.pooling.MySQLConnectionPool(
                pool_name=pool_name,
                pool_size=pool_size,
                pool_reset_session=True,
                **self.dbconfig)
            return pool
    
        def close(self, conn, cursor):
            """
            A method used to close connection of mysql.
            :param conn: 
            :param cursor: 
            :return: 
            """
            cursor.close()
            conn.close()
    
        def execute(self, sql, args=None, commit=False):
            """
            Execute a sql, it could be with args and with out args. The usage is 
            similar with execute() function in module pymysql.
            :param sql: sql clause
            :param args: args need by sql clause
            :param commit: whether to commit
            :return: if commit, return None, else, return result
            """
            # get connection form connection pool instead of create one.
            conn = self.pool.get_connection()
            cursor = conn.cursor()
            if args:
                cursor.execute(sql, args)
            else:
                cursor.execute(sql)
            if commit is True:
                conn.commit()
                self.close(conn, cursor)
                return None
            else:
                res = cursor.fetchall()
                self.close(conn, cursor)
                return res
    
        def executemany(self, sql, args, commit=False):
            """
            Execute with many args. Similar with executemany() function in pymysql.
            args should be a sequence.
            :param sql: sql clause
            :param args: args
            :param commit: commit or not.
            :return: if commit, return None, else, return result
            """
            # get connection form connection pool instead of create one.
            conn = self.pool.get_connection()
            cursor = conn.cursor()
            cursor.executemany(sql, args)
            if commit is True:
                conn.commit()
                self.close(conn, cursor)
                return None
            else:
                res = cursor.fetchall()
                self.close(conn, cursor)
                return res
    
    
    if __name__ == "__main__":
        mysql_pool = MySQLPool(**dbconfig)
        sql = "select * from store WHERE create_time < '2017-06-02'"
        p = Pool()
        for i in range(5):
            p.apply_async(mysql_pool.execute, args=(sql,))
    

    Code above creates a connection pool at the beginning, and get connections from it in execute(), once the connection pool has been created, the work is to remain it, since the pool is created only once, it will save the time to request for a connection every time you would like to connect to MySQL. Hope it helps!

    0 讨论(0)
  • 2020-12-05 12:49

    First, you're creating a different connection pool for each instance of your DB class. The pools having the same name doesn't make them the same pool

    From the documentation:

    It is not an error for multiple pools to have the same name. An application that must distinguish pools by their pool_name property should create each pool with a distinct name.

    Besides that, sharing a database connection (or connection pool) between different processes would be a bad idea (and i highly doubt it would even work correctly), so each process using it's own connections is actually what you should aim for.

    You could just initialize the pool in your init initializer as a global variable and use that instead.
    Very simple example:

    from multiprocessing import Pool
    from mysql.connector.pooling import MySQLConnectionPool
    from mysql.connector import connect
    import os
    
    pool = None
    
    def init():
        global pool
        print("PID %d: initializing pool..." % os.getpid())
        pool = MySQLConnectionPool(...)
    
    def do_work(q):
        con = pool.get_connection()
        print("PID %d: using connection %s" % (os.getpid(), con))
        c = con.cursor()
        c.execute(q)
        res = c.fetchall()
        con.close()
        return res
    
    def main():
        p = Pool(initializer=init)
        for res in p.map(do_work, ['select * from test']*8):
            print(res)
        p.close()
        p.join()
    
    if __name__ == '__main__':
        main()
    

    Or just use a simple connection instead of a connection pool, as only one connection will be active in each process at a time anyway.
    The number of concurrently used connections is implicitly limited by the size of the multiprocessing.Pool.

    0 讨论(0)
  • 2020-12-05 12:53

    You created multiple DB object instance. In mysql.connector.pooling.py, pool_name is only a attribute to let you make out which pool it is. There is no mapping in the mysql pool.

    So, you create multiple DB instance in def readfile(), then you will have several connection pool.

    A Singleton is useful in this case.

    (I spent several hours to find it out. In Tornado framework, each http get create a new handler, which leads to making a new connection.)

    0 讨论(0)
提交回复
热议问题