问题
We are revamping our existing system, which uses MYSQL DB to deal with the following type of data.
- transaction and order related data
- customers information
- products information
We need to query on these data and pull in statistical data, and also filter, facet and segment list and KPIs.
We tried ClickHouse, Druid, DGraph did a few tests on sample data to benchmark and to check which DB fits our needs.
Few things I liked about Druid DB are,
- Druid Search Queries: Which lists down all the matches along with the dimensions(column names) and count/occurrence for the same. Link: http://druid.io/docs/latest/querying/searchquery.html
- utf8mb4 support
- Full text search
- Case insensitive search
We found ClickHouse to be faster when compared to MYSQL and Druid databases. But have the following problems.
- Unable to do druid-like-search queries (which return dimension and occurrences). Any workaround to achieve this?
- Case insensitive search. How do we handle this? ClickHouse is case-sensitive, right?
- utf8mb4 support? How do we save/store special characters or few emoji's which are not supported on utf8?
We had similar issues in MYSQL, and changing the collation to utf8mb4 solved it. What do we in ClickHouse to achieve this?
Your suggestions can help us overcome these challenges and make a better decision.
Thanks in advance.
回答1:
Unable to do druid-like-search queries (which return dimension and occurrences). Any workaround to achieve this?
That feature sounds to work roughly like:
SELECT interval, dim1, COUNT(*) FROM my_table WHERE condition GROUP BY interval, dim1
UNION ALL
SELECT interval, dim2, COUNT(*) FROM my_table WHERE condition GROUP BY interval, dim2
UNION ALL
...
Case insensitive search. How do we handle this? ClickHouse is case-sensitive, right?
There are multiple options, for example positionCaseInsensitiveUTF8(haystack, needle)
function or match with regular expressions: https://clickhouse.yandex/docs/en/query_language/functions/string_search_functions/#match-haystack-pattern
utf8mb4 support? How do we save/store special characters or few emoji's which are not supported on utf8?
Strings in ClickHouse are arbitrary byte sequences, so you can store whatever you want there, but you should probably check whether the available functions match your usecase.
来源:https://stackoverflow.com/questions/55535770/problems-using-clickhouse