Why does Complete output mode require aggregation?

北荒 2021-02-13 05:42

I work with the latest Structured Streaming in Apache Spark 2.2 and got the following exception:

org.apache.spark.sql.AnalysisException: Complete output m

2 answers

难免孤独 (original poster)

2021-02-13 06:15

The Structured Streaming Programming Guide says this about other queries (those without aggregations, mapGroupsWithState, or flatMapGroupsWithState):

Complete mode not supported as it is infeasible to keep all unaggregated data in the Result Table.

To answer the question:

What would happen if Spark allowed Complete output mode with no aggregations in a streaming query?

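Probably an out-of-memory error (OOM): with nothing aggregated, every input row would have to stay in the Result Table for as long as the query runs.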
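
To make the memory argument concrete, here is a toy model in plain Python. It is not Spark code and does not use any Spark API; the function name complete_mode_sizes and the sample batches are made up for illustration. It only counts how many rows a Complete-mode Result Table would have to hold after each micro-batch, with and without an aggregation.

```python
from collections import Counter

def complete_mode_sizes(batches, aggregate):
    """Return the Result Table size after each micro-batch.

    Toy model only (hypothetical helper, not a Spark API): `batches` is a
    list of lists of keys, and the "Result Table" is whatever would have to
    be re-emitted in full on every trigger in Complete mode.
    """
    counts = Counter()  # aggregated state: one entry per distinct key
    rows = []           # unaggregated state: every row ever seen
    sizes = []
    for batch in batches:
        if aggregate:
            counts.update(batch)       # rough analogue of groupBy(key).count()
            sizes.append(len(counts))
        else:
            rows.extend(batch)         # nothing can ever be discarded
            sizes.append(len(rows))
    return sizes

# Hypothetical stream: 5 micro-batches of 4 rows each, drawn from 3 keys.
batches = [["a", "b", "c", "a"] for _ in range(5)]

with_agg = complete_mode_sizes(batches, aggregate=True)
without_agg = complete_mode_sizes(batches, aggregate=False)

print(with_agg)     # [3, 3, 3, 3, 3]
print(without_agg)  # [4, 8, 12, 16, 20]
```

In this sketch the aggregated table is bounded by the number of distinct keys, while the unaggregated one grows with every row that arrives. On an unbounded stream the second case has no upper limit, which is the infeasibility the guide refers to.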

The puzzling part is why dropDuplicates("id") is not marked as aggregation.
