ClickHouse JSON parse exception: Cannot parse input: expected ',' before

别等时光非礼了梦想. Submitted on 2021-02-05 08:34:06

Question


I'm trying to load JSON data into ClickHouse from Kafka. Here's a simplified JSON message:

{
  ...
   "sendAddress":{
      "sendCommChannelTypeId":4,
      "sendCommChannelTypeCode":"SMS",
      "sendAddress":"789345345945"},
   ...
}

Here are the steps I followed: create a table in ClickHouse, create a second table using the Kafka engine, and create a MATERIALIZED VIEW that connects the two tables and so connects ClickHouse with Kafka.

Creating the first table

CREATE TABLE tab 
(
    ...

    sendAddress Tuple (sendCommChannelTypeId Int32, sendCommChannelTypeCode String, sendAddress String),
     ...

) ENGINE = MergeTree()
PARTITION BY applicationId
ORDER BY (applicationId);

Creating a second table with Kafka Engine SETTINGS:

CREATE TABLE tab_kfk
(
    ...
    sendAddress Tuple (sendCommChannelTypeId Int32, sendCommChannelTypeCode String, sendAddress String),
    ...
) ENGINE = Kafka
SETTINGS kafka_broker_list = 'localhost:9092',
       kafka_topic_list = 'topk2',
       kafka_group_name = 'group1',
       kafka_format = 'JSONEachRow',
       kafka_row_delimiter = '\n';

Creating the MATERIALIZED VIEW:

CREATE MATERIALIZED VIEW tab_mv TO tab AS
SELECT ... sendAddress, ...
FROM tab_kfk;

Then I try to SELECT all or specific columns from the first table, tab, and get nothing. The logs are as follows:

OK. I then wrapped the curly braces of sendAddress in '[]', like this:

"authkey":"some_value",
   "sendAddress":[{
      "sendCommChannelTypeId":4,
      "sendCommChannelTypeCode":"SMS",
      "sendAddress":"789345345945"
   }]

I still get an error, though a slightly different one. What should I do to fix this problem? Thanks!


Answer 1:


There are 3 ways to fix it:

  1. Don't use nested objects: flatten the messages before inserting them into the Kafka topic. For example:
{
    ..
    "authkey":"key",
    "sendAddress_CommChannelTypeId":4,
    "sendAddress_CommChannelTypeCode":"SMS",
    "sendAddress":"789345345945",
    ..
}
  2. Use the Nested data structure, which requires changing both the JSON message schema and the table schema:
{
    ..
    "authkey":"key",
    "sendAddress.sendCommChannelTypeId":[4],
    "sendAddress.sendCommChannelTypeCode":["SMS"],
    "sendAddress.sendAddress":["789345345945"],
    ..
}
CREATE TABLE tab_kfk
(
    applicationId Int32,
    ..
    sendAddress Nested(
        sendCommChannelTypeId Int32,
        sendCommChannelTypeCode String,
        sendAddress String),
    ..
)
ENGINE = Kafka
SETTINGS kafka_broker_list = 'localhost:9092',
       kafka_topic_list = 'topk2',
       kafka_group_name = 'group1',
       kafka_format = 'JSONEachRow',
       kafka_row_delimiter = '\n',
       input_format_import_nested_json = 1 /* <--- */

Note the input_format_import_nested_json setting.

  3. Read the input JSON message as a string and parse it manually (see GitHub issue #16969):
CREATE TABLE tab_kfk
(
    message String
)
ENGINE = Kafka
SETTINGS 
    ..
    kafka_format = 'JSONAsString', /* <--- */
    ..

CREATE MATERIALIZED VIEW tab_mv TO tab 
AS
SELECT 
    ..
    JSONExtractString(message, 'authkey') AS authkey,
    JSONExtract(message, 'sendAddress', 'Tuple(Int32,String,String)') AS sendAddress,
    ..
FROM tab_kfk;
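
The first two options both reshape the message on the producer side, before it reaches the Kafka topic. A minimal Python sketch of that reshaping, assuming the message is already available as a dict, could look like the following. The function names (flatten_message, to_nested_columns, to_json_line) and the FLAT_NAMES mapping are hypothetical helpers written for illustration; they are not part of ClickHouse or of any Kafka client, and publishing to the topic is left out on purpose.

```python
import json

# Hypothetical mapping from the inner keys of "sendAddress" to the flat
# column names used in the flattened example message (option 1).
FLAT_NAMES = {
    "sendCommChannelTypeId": "sendAddress_CommChannelTypeId",
    "sendCommChannelTypeCode": "sendAddress_CommChannelTypeCode",
    "sendAddress": "sendAddress",
}


def flatten_message(message):
    """Option 1: replace the nested object with top-level scalar fields."""
    flat = {k: v for k, v in message.items() if k != "sendAddress"}
    nested = message.get("sendAddress", {})
    for inner, outer in FLAT_NAMES.items():
        if inner in nested:
            flat[outer] = nested[inner]
    return flat


def to_nested_columns(message):
    """Option 2: emit "sendAddress.<field>" keys holding one-element arrays."""
    out = {k: v for k, v in message.items() if k != "sendAddress"}
    for inner, value in message.get("sendAddress", {}).items():
        out["sendAddress." + inner] = [value]
    return out


def to_json_line(record):
    """Serialize to a single line, one JSON object per row (JSONEachRow)."""
    return json.dumps(record, separators=(",", ":"))


if __name__ == "__main__":
    message = {
        "authkey": "key",
        "sendAddress": {
            "sendCommChannelTypeId": 4,
            "sendCommChannelTypeCode": "SMS",
            "sendAddress": "789345345945",
        },
    }
    print(to_json_line(flatten_message(message)))
    print(to_json_line(to_nested_columns(message)))
```

Either output line can then be sent to the topic as one message; the table definition has to match whichever shape is chosen.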


Source: https://stackoverflow.com/questions/65104109/clickhouse-json-parse-exception-cannot-parse-input-expected-before
