If I have a query in hive
which employs JOIN, lets say a LEFT OUTER JOIN
or an INNER JOIN
on two tables ON
any column, then h
Use explain select ...
and check the plan. It explains what exactly map and reduce will do. Also during execution you can check logs on job tracker and see what mapper or reducer processes are doing.
For example the following piece of explain plan says that it is map-side join (Note Map Join Operator in the plan):
Stage: Stage-33
Map Reduce
Map Operator Tree:
TableScan
**alias: s**
filterExpr: (col is not null) (type: boolean)
Statistics: Num rows: 85 Data size: 78965 Basic stats: COMPLETE Column stats: NONE
Filter Operator
predicate: (col is not null) (type: boolean)
Statistics: Num rows: 22 Data size: 20438 Basic stats: COMPLETE Column stats: NONE
**Map Join Operator
condition map:
Inner Join 0 to 1**