Question
I'm using Python 3 with the pymysql package to query raw data from AWS Aurora, running on an EC2 instance with Amazon Linux, and I'd like to improve the performance significantly.
The query works, but it takes 150 seconds to return 2.3 million rows with the following code:
import pandas as pd
import pymysql

# host, user, port, password and dbname are defined elsewhere
conn = pymysql.connect(host=host, user=user, port=port,
                       password=password, database=dbname)

myQuery = '''
SELECT * FROM fEvents f
LEFT JOIN fParams fp
ON f.id = fp.id
WHERE f.DateTime BETWEEN '2019-01-24' AND '2019-02-28'
'''

# Load the whole result set into a DataFrame
df = pd.read_sql(myQuery, con=conn)
When we ran the same query from the same EC2 instance using Node.js, we got an object with the 2.3 million results in only 20 seconds!
The rest of the code is in Python 3, and I'm struggling to improve the performance of my Python API.
I'd appreciate any suggestions or explanations.
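To narrow down where the 150 seconds go, it may help to time the raw driver fetch separately from the DataFrame construction. The following is only a minimal sketch: `timed_fetch` and `batch_rows` are hypothetical names, and the in-memory SQLite database is a stand-in so the snippet runs without access to Aurora. The function relies only on the standard DB-API cursor methods, so a pymysql connection should be usable in place of the SQLite one.

```python
import sqlite3
import time


def timed_fetch(conn, query, batch_rows=50_000):
    """Run `query` on a DB-API connection and time the raw fetch only.

    Hypothetical helper: returns (column names, list of row tuples,
    elapsed seconds). No DataFrame is built here.
    """
    start = time.perf_counter()
    cur = conn.cursor()
    try:
        cur.execute(query)
        # The first field of each description entry is the column name.
        columns = [col[0] for col in cur.description]
        rows = []
        while True:
            # Pull rows in batches until the cursor is exhausted.
            batch = cur.fetchmany(batch_rows)
            if not batch:
                break
            rows.extend(batch)
    finally:
        cur.close()
    elapsed = time.perf_counter() - start
    return columns, rows, elapsed


# Stand-in demo: an in-memory SQLite table replaces the Aurora connection.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE fEvents (id INTEGER, DateTime TEXT)")
demo.executemany(
    "INSERT INTO fEvents VALUES (?, ?)",
    [(i, "2019-02-01") for i in range(1000)],
)
columns, rows, elapsed = timed_fetch(demo, "SELECT * FROM fEvents", batch_rows=256)
print(f"{len(rows)} rows, columns={columns}, fetch took {elapsed:.4f}s")
```

With the real connection, `timed_fetch(conn, myQuery)` could be compared against the total `pd.read_sql` time. If the fetch alone accounts for most of the 150 seconds, the bottleneck is more likely in the driver or the network than in pandas.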
Source: https://stackoverflow.com/questions/56116451/how-to-improve-query-time-from-aws-aurora-rds-when-using-python-3-executed-on