Question
I have a stream derived from a topic that contains 271 messages in total, and the stream also contains 271 messages. But when I create another stream from that stream to flatten it, I get a total of 542 messages (271 * 2).
This is the stream derived from the topic:
Name : TRANSACTIONSPURE
Type : STREAM
Key field :
Key format : STRING
Timestamp field : Not set - using <ROWTIME>
Value format : JSON
Kafka topic : mongo_conn.digi.transactions (partitions: 1, replication: 1)
Field | Type
ROWTIME | BIGINT (system)
ROWKEY | VARCHAR(STRING) (system)
PAYLOAD | STRUCT<SENDER VARCHAR(STRING), RECEIVER VARCHAR(STRING), RECEIVERWALLETID VARCHAR(STRING), STATUS VARCHAR(STRING), TYPE VARCHAR(STRING), AMOUNT DOUBLE, TOTALFEE DOUBLE, CREATEDAT VARCHAR(STRING), UPDATEDAT VARCHAR(STRING), ID VARCHAR(STRING), ORDERID VARCHAR(STRING), __V VARCHAR(STRING), TXID VARCHAR(STRING), SENDERWALLETID VARCHAR(STRING)>
Local runtime statistics
------------------------
consumer-messages-per-sec: 0 consumer-total-bytes: 361356 consumer-total-messages: 271 last-message: 2019-09-02T10:44:14.003Z
And this is my flattened stream, derived from the previous stream:
Name : TRANSACTIONSRAW
Type : STREAM
Key field :
Key format : STRING
Timestamp field : Not set - using <ROWTIME>
Value format : JSON
Kafka topic : TRANSACTIONSRAW (partitions: 4, replication: 1)
Field | Type
----------------------------------------------
ROWTIME | BIGINT (system)
ROWKEY | VARCHAR(STRING) (system)
SENDER | VARCHAR(STRING)
RECEIVER | VARCHAR(STRING)
RECEIVERWALLETID | VARCHAR(STRING)
STATUS | VARCHAR(STRING)
TYPE | VARCHAR(STRING)
AMOUNT | DOUBLE
TOTALFEE | DOUBLE
CREATEDAT | VARCHAR(STRING)
UPDATEDAT | VARCHAR(STRING)
ID | VARCHAR(STRING)
ORDERID | VARCHAR(STRING)
__V | VARCHAR(STRING)
TXID | VARCHAR(STRING)
SENDERWALLETID | VARCHAR(STRING)
----------------------------------------------
Queries that write into this STREAM
-----------------------------------
CSAS_TRANSACTIONSRAW_10 : CREATE STREAM transactionsraw WITH (value_format='JSON') AS
SELECT payload->sender AS sender,
       payload->receiver AS receiver,
       payload->receiverWalletId AS receiverWalletId,
       payload->status AS status,
       payload->type AS type,
       payload->amount AS amount,
       payload->totalFee AS totalFee,
       payload->createdAt AS createdAt,
       payload->updatedAt AS updatedAt,
       payload->id AS id,
       payload->orderId AS orderId,
       payload->__v AS __v,
       payload->txId AS txId,
       payload->senderWalletId AS senderWalletId
FROM transactionspure;
For query topology and execution plan please run: EXPLAIN <QueryId>
Local runtime statistics
------------------------
consumer-messages-per-sec: 0 consumer-total-bytes: 315500 consumer-total-messages: 542
messages-per-sec: 0 total-messages: 271 last-message: 2019-09-02T10:44:15.493Z
Source: https://stackoverflow.com/questions/57770983/data-is-duplicated-when-i-create-a-flattened-stream