Spark SQL broadcast hint intermediate tables

Submitted by 雨燕双飞 on 2019-12-25 01:48:51

Question


I have a problem using broadcast hints (possibly due to a gap in my SQL knowledge).

I have a query like:

SELECT /*+ broadcast(a) */ *
FROM a 
INNER JOIN b
ON ....
INNER JOIN c
on ....

I would like to do something like:

SELECT /*+ broadcast(a) */ *
FROM a 
INNER JOIN b 
ON ....
INNER JOIN c /* broadcast(AjoinedwithB) */
on ....

That is, I want to force a broadcast join (I would prefer not to change Spark parameters to force it everywhere), but I don't know how to refer to the intermediate result of joining a and b, called AjoinedwithB above.

Of course, I could split the SQL, use the DataFrame API, and so on, but I would like to do it in a single SQL query.


Answer 1:


You can use either a subquery:

SELECT /*+ broadcast(a_b) */ *
FROM 
    (SELECT /*+ broadcast(a) */ * FROM a JOIN b ON ...) AS a_b 
    JOIN c ON ...

or CTE:

WITH a_b AS (SELECT /*+ broadcast(a) */ * FROM a JOIN b ON ...)
SELECT /*+ broadcast(a_b) */ * FROM a_b JOIN c ON ...
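
To check whether both hints are honored, one option is to prefix the query with EXPLAIN and inspect the physical plan; if the hints took effect, both joins should appear as BroadcastHashJoin. The following is only a sketch: the join column `id` is hypothetical (the original elides the ON conditions), and the inner query selects a single column because `SELECT *` over two tables that share a column name would make `a_b.id` ambiguous.

```sql
-- Sketch only: `id` is an assumed join column, not taken from the question.
EXPLAIN
WITH a_b AS (
    -- Hint the small table `a` for the first join.
    SELECT /*+ BROADCAST(a) */ a.id
    FROM a JOIN b ON a.id = b.id
)
-- Hint the intermediate result by its CTE name for the second join.
SELECT /*+ BROADCAST(a_b) */ *
FROM a_b JOIN c ON a_b.id = c.id;
```

If the plan still shows SortMergeJoin for one of the joins, the corresponding hint was probably not matched to a relation or alias name in that query block.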


Source: https://stackoverflow.com/questions/54493069/spark-sql-broadcast-hint-intermediate-tables
