python-pandas and databases like mysql

有刺的猬 2020-12-02 03:51

The documentation for Pandas has numerous examples of best practices for working with data stored in various formats.

However, I am unable to find any good examples

13 answers
  • 2020-12-02 04:03

    Import the modules:

    import pandas as pd
    import oursql
    

    Connect and run the query:

    conn = oursql.connect(host="localhost", user="me", passwd="mypassword", db="classicmodels")
    sql = "SELECT customerName, city, country FROM customers ORDER BY customerName, country, city"
    df_mysql = pd.read_sql(sql, conn)
    print(df_mysql)
    

    That works just fine, and pandas.io.sql.frame_query also works (with a deprecation warning). The database used is the sample database from the MySQL tutorial.

  • 2020-12-02 04:05

    I prefer to create queries with SQLAlchemy, and then make a DataFrame from it. SQLAlchemy makes it easier to combine SQL conditions Pythonically if you intend to mix and match things over and over.

    from sqlalchemy.ext.declarative import declarative_base
    from sqlalchemy import Table
    from sqlalchemy import create_engine
    from sqlalchemy.orm import sessionmaker
    from pandas import DataFrame
    import datetime
    
    # We are connecting to an existing service
    engine = create_engine('dialect://user:pwd@host:port/db', echo=False)
    Session = sessionmaker(bind=engine)
    session = Session()
    Base = declarative_base()
    
    # And we want to query an existing table
    tablename = Table('tablename', 
        Base.metadata, 
        autoload=True, 
        autoload_with=engine, 
        schema='ownername')
    
    # These are the "Where" parameters, but I could as easily 
    # create joins and limit results
    us = tablename.c.country_code.in_(['US','MX'])
    dc = tablename.c.locn_name.like('%DC%')
    dt = tablename.c.arr_date >= datetime.date.today() # Give me convenience or...
    
    q = session.query(tablename).\
                filter(us & dc & dt) # That's where the magic happens!!!
    
    def querydb(query):
        """
        Function to execute query and return DataFrame.
        """
        df = DataFrame(query.all())
        df.columns = [x['name'] for x in query.column_descriptions]
        return df
    
    querydb(q)
    
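    As a variation on the approach above, pandas can execute a SQLAlchemy Core statement directly, which avoids building the DataFrame and its column names by hand. The sketch below is only illustrative: it assumes SQLAlchemy 1.4 or newer, uses an in-memory SQLite engine in place of the real service, and the `arrivals` table and its rows are invented for the demo.

```python
import pandas as pd
from sqlalchemy import Column, Integer, MetaData, String, Table, create_engine, select

# In-memory SQLite stands in for the existing service (assumption for the demo).
engine = create_engine("sqlite://")
metadata = MetaData()

# Hypothetical table; the column names echo the filters used in the answer.
arrivals = Table(
    "arrivals",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("country_code", String),
    Column("locn_name", String),
)
metadata.create_all(engine)

# Insert a few made-up rows inside a transaction.
with engine.begin() as conn:
    conn.execute(
        arrivals.insert(),
        [
            {"country_code": "US", "locn_name": "Reno DC"},
            {"country_code": "MX", "locn_name": "Tijuana"},
            {"country_code": "CA", "locn_name": "Toronto DC"},
        ],
    )

# Combine conditions with &, as in the answer, then build the SELECT.
us_mx = arrivals.c.country_code.in_(["US", "MX"])
dc = arrivals.c.locn_name.like("%DC%")
stmt = select(arrivals).where(us_mx & dc)

# pandas runs the statement and takes the column names from the result.
with engine.connect() as conn:
    df = pd.read_sql(stmt, conn)

print(df)
```

    With the demo rows, only the US row whose location name contains "DC" passes both filters.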
  • 2020-12-02 04:06

    pandas.io.sql.frame_query is deprecated. Use pandas.read_sql instead.

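    A minimal sketch of the replacement call, assuming an in-memory SQLite database so that it runs without a server. The `data` table and its rows are made up for illustration; the only point is that pd.read_sql takes the query and the connection in the same positions frame_query did.

```python
import sqlite3

import pandas as pd

# Throwaway in-memory database (assumption for the demo).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE data (id INTEGER, name TEXT)")
con.executemany("INSERT INTO data VALUES (?, ?)", [(1, "a"), (2, "b")])

# Old style: frame_query("SELECT * FROM data", con)
df = pd.read_sql("SELECT * FROM data", con)
con.close()

print(df)
```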
  • 2020-12-02 04:07

    For the record, here is an example using a SQLite database:

    import pandas as pd
    import sqlite3
    
    with sqlite3.connect("whatever.sqlite") as con:
        sql = "SELECT * FROM table_name"
        df = pd.read_sql_query(sql, con)
        print(df.shape)
    
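    One detail worth knowing about the pattern above: using a sqlite3 connection as a context manager commits or rolls back the transaction, but it does not close the connection. The sketch below wraps the connection in contextlib.closing so it is closed on exit, and passes a filter value through `params` instead of formatting it into the SQL string. The table contents are invented for the demo.

```python
import sqlite3
from contextlib import closing

import pandas as pd

# closing() guarantees con.close() runs when the block exits.
with closing(sqlite3.connect(":memory:")) as con:
    con.execute("CREATE TABLE table_name (city TEXT, population INTEGER)")
    con.executemany(
        "INSERT INTO table_name VALUES (?, ?)",
        [("Lyon", 500000), ("Nice", 340000), ("Paris", 2100000)],
    )
    # sqlite3 uses "?" placeholders; the values come from params.
    sql = "SELECT city, population FROM table_name WHERE population > ? ORDER BY city"
    df = pd.read_sql_query(sql, con, params=(400000,))

print(df.shape)
```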
  • 2020-12-02 04:10

    As Wes says, read_sql (from pandas.io.sql) will do it once you have a database connection from a DB-API compatible library. Here are two short examples that use the cx_Oracle and MySQLdb libraries to connect to Oracle and MySQL and query their data dictionaries. Here is the example for cx_Oracle:

    import pandas as pd
    import cx_Oracle
    
    ora_conn = cx_Oracle.connect('your_connection_string')
    df_ora = pd.read_sql('select * from user_objects', con=ora_conn)    
    print('loaded dataframe from Oracle. # Records:', len(df_ora))
    ora_conn.close()
    

    And here is the equivalent example for MySQLdb:

    import pandas as pd
    import MySQLdb

    mysql_cn = MySQLdb.connect(host='myhost',
                    port=3306, user='myusername', passwd='mypassword',
                    db='information_schema')
    df_mysql = pd.read_sql('select * from VIEWS;', con=mysql_cn)
    print('loaded dataframe from MySQL. records:', len(df_mysql))
    mysql_cn.close()
    
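    Both examples follow the same pattern of connect, read, and close, so it can be wrapped in a small helper that closes the connection even if the query fails. This is a sketch under assumptions: `load_frame` and `demo_connect` are hypothetical names, and an in-memory SQLite database stands in for Oracle or MySQL so the code runs anywhere. Note also that recent pandas releases emit a warning for DB-API connections other than sqlite3 and recommend a SQLAlchemy connectable instead.

```python
import sqlite3
from contextlib import closing

import pandas as pd


def load_frame(connect, sql):
    """Open a connection with connect(), run sql, and always close the connection."""
    with closing(connect()) as conn:
        return pd.read_sql(sql, con=conn)


def demo_connect():
    """Hypothetical connection factory; swap in cx_Oracle or MySQLdb here."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE demo_views (table_name TEXT)")
    conn.executemany(
        "INSERT INTO demo_views VALUES (?)",
        [("v_orders",), ("v_customers",)],
    )
    return conn


df_demo = load_frame(demo_connect, "SELECT * FROM demo_views")
print("loaded dataframe. records:", len(df_demo))
```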
  • 2020-12-02 04:11

    MySQL example:

    import MySQLdb as db
    import pandas as pd
    
    # frame_query is deprecated in favor of pd.read_sql
    database = db.connect('localhost', 'username', 'password', 'database')
    data     = pd.read_sql("SELECT * FROM data", database)
    