Right way to implement pandas.read_sql with ClickHouse

Question

I am trying to use pandas.read_sql with ClickHouse.

I created a ClickHouse table and populated it:

CREATE TABLE regions
(
    date DateTime DEFAULT now(),
    region String
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(date)
ORDER BY tuple()
SETTINGS index_granularity = 8192;

INSERT INTO regions (region) VALUES ('Asia'), ('Europe');

Then I ran this Python code:

import pandas as pd 
from sqlalchemy import create_engine


uri = 'clickhouse://default:@localhost/default'
engine = create_engine(uri)
query = 'select * from regions'
pd.read_sql(query, engine)

I expected to get a DataFrame with the columns date and region, but all I get is an empty DataFrame whose column labels are the values of the first row:

Empty DataFrame
Columns: [2021-01-08 09:24:33, Asia]
Index: []

UPD: It turned out that using the clickhouse+native scheme in the URI solves the problem.
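
For reference, the +native variant could be sketched roughly as follows. The helper names to_native_uri and read_regions are hypothetical and only illustrate the scheme change; the sketch assumes the clickhouse-sqlalchemy and clickhouse-driver packages are installed and that the server accepts native-protocol connections (port 9000 by default, so a URI carrying an explicit HTTP port would need that port changed as well).

```python
def to_native_uri(uri: str) -> str:
    """Rewrite a 'clickhouse://...' SQLAlchemy URI to use the '+native' driver."""
    scheme, sep, rest = uri.partition('://')
    if not sep:
        raise ValueError(f'not a URI: {uri!r}')
    # Drop any existing driver suffix (e.g. '+http') before adding '+native'.
    dialect = scheme.split('+', 1)[0]
    return f'{dialect}+native://{rest}'


def read_regions(uri: str):
    """Hypothetical helper: read the regions table through the native driver."""
    # Imported lazily so to_native_uri() can be used without these packages.
    import pandas as pd
    from sqlalchemy import create_engine

    engine = create_engine(to_native_uri(uri))
    return pd.read_sql('select * from regions', engine)
```

Calling read_regions('clickhouse://default:@localhost/default') would then connect through clickhouse+native://default:@localhost/default.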

Can it be solved without +native?

Answer 1:

There is a long-standing issue about this: https://github.com/xzkostyan/clickhouse-sqlalchemy/issues/10. It also contains a hint suggesting that you add FORMAT TabSeparatedWithNamesAndTypes at the end of the query. The initial query would then look like this:

select * 
from regions 
FORMAT TabSeparatedWithNamesAndTypes
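
Applied to the Python code from the question, the hint might look like the sketch below. The helpers with_format and read_sql_http are hypothetical names, not part of pandas or clickhouse-sqlalchemy. A plausible reading of the empty result, offered here as an assumption rather than confirmed behaviour, is that the HTTP driver expects the first two response lines to hold column names and types, so without the FORMAT clause the two data rows get consumed as headers.

```python
FORMAT_CLAUSE = 'FORMAT TabSeparatedWithNamesAndTypes'


def with_format(query: str) -> str:
    """Append the FORMAT clause to a raw query unless it is already there."""
    # Strip trailing whitespace and a trailing semicolon before appending.
    stripped = query.rstrip().rstrip(';').rstrip()
    if stripped.upper().endswith(FORMAT_CLAUSE.upper()):
        return stripped
    return f'{stripped}\n{FORMAT_CLAUSE}'


def read_sql_http(query: str, uri: str = 'clickhouse://default:@localhost/default'):
    """Hypothetical helper: run a raw query through the default (HTTP) driver."""
    # Imported lazily so with_format() can be used without these packages.
    import pandas as pd
    from sqlalchemy import create_engine

    engine = create_engine(uri)
    return pd.read_sql(with_format(query), engine)
```

With this in place, read_sql_http('select * from regions') sends the same query as shown above while keeping the plain clickhouse:// URI.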

Source: https://stackoverflow.com/questions/65627583/right-way-to-implement-pandas-read-sql-with-clickhouse

Tags

pandas

ClickHouse
