How to create a new table in a MySQL DB from a pandas dataframe

面向向阳花 2021-01-06 19:57

I recently transitioned from using SQLite for most of my data storage and management needs to MySQL. I think I've finally gotten the correct libraries installed to work wit

2 answers
  • 2021-01-06 20:18

    This

    connection = engine.connect()
    df.to_sql(con=connection, name='TBL_NAME', schema='SCHEMA', index=False, if_exists='replace')
    

    works with an Oracle DB in a specific schema without errors, but it will not work if you have limited permissions. Also note that table names are case-sensitive.
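
As a rough illustration of what `if_exists` and `index=False` do in the call above, here is a minimal sketch. It assumes an in-memory SQLite database as a stand-in for the real server (so there is no `schema` argument), and the table and column names are made up for the example.

```python
import sqlite3

import pandas as pd

# Assumption: in-memory SQLite stands in for the real database server.
conn = sqlite3.connect(":memory:")

df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})

# The first write creates the table; index=False keeps the DataFrame index out of it.
df.to_sql("tbl_name", conn, index=False, if_exists="replace")

# 'append' adds the rows to the existing table.
df.to_sql("tbl_name", conn, index=False, if_exists="append")
rows_after_append = pd.read_sql("SELECT COUNT(*) AS n FROM tbl_name", conn)["n"].iloc[0]

# 'replace' drops the table and recreates it from the DataFrame.
df.to_sql("tbl_name", conn, index=False, if_exists="replace")
rows_after_replace = pd.read_sql("SELECT COUNT(*) AS n FROM tbl_name", conn)["n"].iloc[0]

# Only the DataFrame's own columns end up in the table.
columns = list(pd.read_sql("SELECT * FROM tbl_name", conn).columns)
```

Because `'replace'` drops whatever table already exists, it also discards any column types or keys that were defined by hand beforehand.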

  • 2021-01-06 20:22

    I took an approach suggested by aws_apprentice above which was to create the table first, then write data to the table.

    The code below first auto-generates a MySQL table from a df (automatically defining the column names and data types), then writes the df data to that table.

    There were a couple of hiccups I had to overcome, such as unnamed CSV columns and determining the correct data type for each field in the MySQL table.

    I'm sure there are multiple other (better?) ways to do this, but this seems to work.

    import pandas as pd
    from sqlalchemy import create_engine
    
    infile = r'path/to/file.csv'
    db = 'a001_db'
    db_tbl_name = 'a001_rd004_db004'
    
    def csv_to_df(infile, headers=None):
        """Load a CSV file into a DataFrame.

        If the CSV has no header row, pass the column names via `headers`.
        Columns that pandas names 'Unnamed: N' are renamed to 'UnnamedN'
        to conform to MySQL column-name requirements.
        """
        if headers:
            df = pd.read_csv(infile, header=None)
            df.columns = headers
        else:
            df = pd.read_csv(infile)
        # Rename every 'Unnamed: N' column, not only the first ten
        df.columns = [str(c).replace('Unnamed: ', 'Unnamed') for c in df.columns]
        return df
    
    def dtype_mapping():
        """Map pandas dtype names to MySQL data types (not perfect, but close enough)."""
        # Keys must match str(dtype), e.g. 'datetime64[ns]' and 'timedelta64[ns]'
        return {'object': 'TEXT',
                'int64': 'INT',
                'float64': 'FLOAT',
                'datetime64[ns]': 'DATETIME',
                'bool': 'TINYINT',
                'category': 'TEXT',
                'timedelta64[ns]': 'TEXT'}
    
    def mysql_engine(user='root', password='abc', host='127.0.0.1', port='3306', database='a001_db'):
        """Create a SQLAlchemy engine for the given MySQL database."""
        return create_engine("mysql://{0}:{1}@{2}:{3}/{4}?charset=utf8".format(user, password, host, port, database))
    
    def mysql_conn(engine):
        """Get a raw DBAPI connection from a SQLAlchemy engine."""
        return engine.raw_connection()
    
    def gen_tbl_cols_sql(df):
        """Build the column-definition part of a CREATE TABLE statement."""
        dmap = dtype_mapping()
        sql = "pi_db_uid INT AUTO_INCREMENT PRIMARY KEY"
        df1 = df.rename(columns={"": "nocolname"})
        for hdr in df1.columns:
            # Unmapped dtypes fall back to TEXT; backticks guard against reserved words
            sql += ", `{0}` {1}".format(hdr, dmap.get(str(df1[hdr].dtype), 'TEXT'))
        return sql
    
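    def create_mysql_tbl_schema(df, conn, db, tbl_name):
        """Create a MySQL table whose columns match the df."""
        tbl_cols_sql = gen_tbl_cols_sql(df)
        # Qualify the table with the database name rather than sending
        # "USE db; CREATE TABLE ..." as two statements in one execute() call,
        # which not every driver accepts
        sql = "CREATE TABLE `{0}`.`{1}` ({2})".format(db, tbl_name, tbl_cols_sql)
        cur = conn.cursor()
        cur.execute(sql)
        cur.close()
        conn.commit()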
    
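    def df_to_mysql(df, engine, tbl_name):
        """Write df data to the newly created MySQL table."""
        # 'append' keeps the table created beforehand ('replace' would drop and recreate it);
        # index=False because that table has no column for the DataFrame index
        df.to_sql(tbl_name, engine, if_exists='append', index=False)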
    
    df = csv_to_df(infile)
    engine = mysql_engine()  # create one engine and reuse it for both steps
    create_mysql_tbl_schema(df, mysql_conn(engine), db, db_tbl_name)
    df_to_mysql(df, engine, db_tbl_name)
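
The two hiccups mentioned above (awkward column names and choosing a data type per field) can also be looked at in isolation. The following is only a sketch using the standard library: `DTYPE_TO_MYSQL`, `clean_column_name` and `column_definitions` are hypothetical names, and the type choices (for example `BIGINT` for `int64`) are assumptions, not something taken from the answer's code.

```python
import re

# Hypothetical mapping from pandas dtype strings to MySQL column types.
DTYPE_TO_MYSQL = {
    "object": "TEXT",
    "int64": "BIGINT",
    "float64": "DOUBLE",
    "bool": "TINYINT(1)",
    "datetime64[ns]": "DATETIME",
}


def clean_column_name(name, position):
    """Turn an arbitrary CSV header into a MySQL-friendly identifier."""
    # Collapse every run of non-word characters into one underscore
    cleaned = re.sub(r"\W+", "_", str(name)).strip("_")
    # Empty headers get a positional placeholder name
    return cleaned or "col_{0}".format(position)


def column_definitions(dtypes):
    """Build the column part of CREATE TABLE from (name, dtype string) pairs."""
    parts = []
    for position, (name, dtype) in enumerate(dtypes):
        sql_type = DTYPE_TO_MYSQL.get(dtype, "TEXT")  # unknown dtypes fall back to TEXT
        parts.append("`{0}` {1}".format(clean_column_name(name, position), sql_type))
    return ", ".join(parts)


example = [("Unnamed: 0", "int64"), ("unit price", "float64"), ("", "object")]
result = column_definitions(example)
print(result)
```

With a real DataFrame, the pairs could be produced with `((c, str(t)) for c, t in df.dtypes.items())`.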
    