How can I populate a pandas DataFrame with the result of a Snowflake SQL query?

Using the Python Connector I can query Snowflake:

import snowflake.connector

# Open the connection
ctx = snowflake.connector.connect(
    user=USER,
    password=P

2 answers

遥遥无期, 2021-02-13 16:27

There is now a method, .fetch_pandas_all(), for this, so SQLAlchemy is no longer needed.

Note that you need to install the Snowflake connector with its pandas extra:

pip install "snowflake-connector-python[pandas]"


See the Snowflake Connector for Python documentation for full details.

import pandas as pd
import snowflake.connector

conn = snowflake.connector.connect(
    user="xxx",
    password="xxx",
    account="xxx",
    warehouse="xxx",
    database="MYDB",
    schema="MYSCHEMA",
)

cur = conn.cursor()

# Execute a statement that will generate a result set.
sql = "select * from MYTABLE limit 10"
cur.execute(sql)
# Fetch the result set from the cursor and deliver it as a pandas DataFrame.
df = cur.fetch_pandas_all()
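
The snippet above never closes the cursor. As a minimal sketch of one way to handle that, the helper below wraps the same calls in try/finally. The name fetch_dataframe is hypothetical (not part of the connector), and conn is assumed to be an open connection like the one created above.

```python
def fetch_dataframe(conn, sql):
    """Run `sql` on an open connection and return the result as a DataFrame.

    Assumption: `conn` is a snowflake.connector connection, i.e. its cursor
    offers execute(), fetch_pandas_all() and close().
    """
    cur = conn.cursor()
    try:
        cur.execute(sql)
        return cur.fetch_pandas_all()
    finally:
        # Release the cursor even if the query or the fetch raises.
        cur.close()
```

It would be called as df = fetch_dataframe(conn, "select * from MYTABLE limit 10"). Closing the connection itself is left to the caller.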

一生所求, 2021-02-13 16:28

You can use DataFrame.from_records() with the plain connector, or pandas.read_sql() with snowflake-sqlalchemy. The snowflake-sqlalchemy option has a simpler API.

pd.DataFrame.from_records(iter(cur), columns=[x[0] for x in cur.description])


This returns a DataFrame whose column names are taken from the SQL result. iter(cur) turns the cursor into an iterator over the result rows, and cur.description gives the names and types of the columns.
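
To make those two arguments concrete, here is a small sketch that uses Python's built-in sqlite3 module as a stand-in for Snowflake. That substitution is an assumption made only so the example runs without an account. Both drivers follow the DB-API, so the cursor is iterable and exposes description in the same way. The table and its rows are made up.

```python
import sqlite3

# sqlite3 is only a stand-in here; no Snowflake connection is involved.
conn = sqlite3.connect(":memory:")
conn.execute("create table MYTABLE (ID integer, NAME text)")
conn.executemany("insert into MYTABLE values (?, ?)", [(1, "a"), (2, "b")])

cur = conn.cursor()
cur.execute("select ID, NAME from MYTABLE order by ID")

# Each description entry is a sequence whose first item is the column name.
# (sqlite3 leaves the remaining items as None; other drivers may put type
# information there.)
columns = [x[0] for x in cur.description]

# Iterating the cursor yields one tuple per result row.
rows = list(iter(cur))

print(columns)  # ['ID', 'NAME']
print(rows)     # [(1, 'a'), (2, 'b')]
conn.close()
```

These are the values that DataFrame.from_records() receives as its data and its columns argument.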

So the complete code is:

import snowflake.connector
import pandas as pd

# Open the connection
ctx = snowflake.connector.connect(
    user=USER,
    password=PASSWORD,
    account=ACCOUNT,
    authenticator='https://XXXX.okta.com',
    )
ctx.cursor().execute('USE warehouse MY_WH')
ctx.cursor().execute('USE MYDB.MYSCHEMA')


query = '''
select * from MYDB.MYSCHEMA.MYTABLE
LIMIT 10;
'''

cur = ctx.cursor().execute(query)
df = pd.DataFrame.from_records(iter(cur), columns=[x[0] for x in cur.description])


If you prefer pandas.read_sql, you can use snowflake-sqlalchemy:

import pandas as pd

from sqlalchemy import create_engine
from snowflake.sqlalchemy import URL


url = URL(
    account='xxxx',
    user='xxxx',
    password='xxxx',
    database='xxx',
    schema='xxxx',
    warehouse='xxx',
    role='xxxxx',
    authenticator='https://xxxxx.okta.com',
)
engine = create_engine(url)


connection = engine.connect()

query = '''
select * from MYDB.MYSCHEMA.MYTABLE
LIMIT 10;
'''

df = pd.read_sql(query, connection)
