Question
When reading large relations from a SQL database into a pandas dataframe, it would be nice to have a progress bar, because the number of tuples is known statically and the I/O rate could be estimated. It looks like the tqdm module has a function tqdm_pandas which will report progress on mapping functions over columns, but by default calling it does not report progress on I/O like this. Is it possible to use tqdm to make a progress bar on a call to pd.read_sql?
Answer 1:
Edit: This answer may be misleading. chunksize has no effect on the database side of the operation. See comments below.
You could use the chunksize parameter to do something like this:
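import pandas as pd
from tqdm import tqdm

# conn is an existing database connection; each chunk is a dataframe of up to 100 rows
chunks = pd.read_sql('SELECT * FROM table', con=conn, chunksize=100)
# tqdm wraps the chunk iterator, so the bar advances once per chunk
frames = [chunk for chunk in tqdm(chunks)]
df = pd.concat(frames, ignore_index=True)  # concatenate once instead of inside the loop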
I think this would use less memory as well, although the final dataframe still has to hold every row.
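Since the question points out that the number of tuples is known up front, one way to get a percentage and a time estimate is to count the rows first and pass that count to tqdm as total. The following is only a rough sketch under some assumptions: the helper name read_sql_with_progress and the demo table are made up for illustration, the query is assumed to be a plain SELECT that can be wrapped in a subquery, and an in-memory SQLite database stands in for the real one. It reports progress on the client-side chunking only, so the caveat in the edit note above still applies.

```python
import sqlite3

import pandas as pd
from tqdm import tqdm


def read_sql_with_progress(query, conn, chunksize=1000):
    """Read `query` into one dataframe, showing a row-based progress bar.

    Hypothetical helper, not part of pandas or tqdm.
    """
    # Count the rows first so tqdm knows the total and can estimate time left.
    # Assumes `query` is a plain SELECT that can be wrapped as a subquery.
    count_sql = f"SELECT COUNT(*) AS n FROM ({query}) AS q"
    total = int(pd.read_sql(count_sql, conn)["n"].iloc[0])

    frames = []
    with tqdm(total=total, unit="rows") as bar:
        for chunk in pd.read_sql(query, conn, chunksize=chunksize):
            frames.append(chunk)
            bar.update(len(chunk))  # advance by the number of rows just read

    # Concatenate once at the end instead of growing a dataframe in the loop.
    return pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()


# Demo data: an in-memory SQLite table named `demo` (made up for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (id INTEGER, value REAL)")
conn.executemany(
    "INSERT INTO demo VALUES (?, ?)", [(i, i * 0.5) for i in range(2500)]
)
conn.commit()

df = read_sql_with_progress("SELECT * FROM demo", conn, chunksize=1000)
conn.close()
```

The extra COUNT(*) costs a second round trip to the database, which may or may not be worth it depending on how expensive the query is. Without it, tqdm can still show the number of chunks read and the rate, just not a percentage.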
Source: https://stackoverflow.com/questions/40282478/can-tqdm-be-used-with-database-reads