Question
I have a problem using broadcast hints (maybe due to a gap in my SQL knowledge).
I have a query like:
SELECT * /* broadcast(a) */
FROM a
INNER JOIN b
ON ....
INNER JOIN c
on ....
I would like to do something like:
SELECT * /* broadcast(a) */
FROM a
INNER JOIN b
ON ....
INNER JOIN c /* broadcast(AjoinedwithB) */
on ....
That is, I want to force a broadcast join for the second join as well (I would prefer not to change Spark parameters to force it everywhere), but I don't know how to refer to the intermediate result of joining a with b, called AjoinedwithB above.
Of course I could split the SQL, use the DataFrame API, and so on, but I would like to do it in a single SQL query.
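Answer 1:
A hint must be written as /*+ ... */ immediately after SELECT; a plain /* ... */ comment is ignored. To hint the intermediate result, give it a name. You can use either a subquery: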
SELECT /*+ broadcast(a_b) */ *
FROM
(SELECT /*+ broadcast(a) */ * FROM a JOIN b ON ...) AS a_b
JOIN c ON ...
or a CTE:
WITH a_b AS (SELECT /*+ broadcast(a) */ * FROM a JOIN b ON ...)
SELECT /*+ broadcast(a_b) */ * FROM a_b JOIN c ON ...
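To check that both hints take effect, one option is to prefix the query with EXPLAIN and look for BroadcastHashJoin on both joins in the physical plan. The sketch below is only an illustration: the columns (a.id, b.a_id, b.c_id, c.id) are hypothetical stand-ins for the join conditions elided above, and explicit column lists are used so that the intermediate result has no duplicate column names.

```sql
-- Hypothetical schema, for illustration only: a(id), b(a_id, c_id), c(id).
-- EXPLAIN prints the plan instead of running the query.
EXPLAIN
WITH a_b AS (
  -- Inner hint: ask Spark to broadcast table a for the a-b join.
  SELECT /*+ BROADCAST(a) */ a.id AS a_id, b.c_id
  FROM a
  JOIN b ON a.id = b.a_id
)
-- Outer hint: refer to the intermediate result by its CTE name.
SELECT /*+ BROADCAST(a_b) */ a_b.a_id, c.id AS c_id
FROM a_b
JOIN c ON a_b.c_id = c.id;
```

If one of the joins shows up as SortMergeJoin instead, the corresponding hint was probably not matched to a relation name, so it is worth checking that the name inside the hint is exactly the alias used in the FROM clause.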
Source: https://stackoverflow.com/questions/54493069/spark-sql-broadcast-hint-intermediate-tables