How to use NOT IN in Hive

Posted by 不羁岁月 on 2021-02-06 04:37:27

Question


Suppose I have the two tables shown below. I want the result that SQL would give with something like insert into B select * from A where id not in (select id from B), which would insert the row 3 George into Table B.

How can I implement this in Hive?

Table A

id  name      
1   Rahul     
2   Keshav    
3   George

Table B

id  name      
1   Rahul     
2   Keshav    
4   Yogesh   

Answer 1:


NOT IN in the WHERE clause with uncorrelated subqueries has been supported since Hive 0.13, which was released on 21 April 2014.

select * from A where id not in (select id from B where id is not null);

+----+--------+
| id |  name  |
+----+--------+
|  3 | George |
+----+--------+
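
Since the question asks for an insert rather than a select, the query above can be wrapped in an INSERT ... SELECT. This is a minimal sketch, assuming B has the same two columns as A (id, name) in the same order. It has not been run against a specific Hive version here.

```sql
-- Append to B every row of A whose id is not already present in B.
-- A.id is qualified so the statement also parses on versions that
-- reject unqualified outer columns in the subquery predicate.
insert into table B
select A.id, A.name
from A
where A.id not in (select id from B where id is not null);
```

With the sample tables, the only row of A that qualifies is 3 George, so that is the row appended to B.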

On earlier versions, the column of the outer table must be qualified with the table name or alias.

hive> select * from A where id not in (select id from B where id is not null);
FAILED: SemanticException [Error 10249]: Line 1:22 Unsupported SubQuery Expression 'id': Correlating expression cannot contain unqualified column references.

hive> select * from A where A.id not in (select id from B where id is not null);
OK
3   George
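
For Hive versions older than 0.13, which do not accept subqueries in the WHERE clause at all, a common workaround is an anti-join written as a LEFT OUTER JOIN. This is not part of the answer above. It is a sketch of an alternative technique that uses the same tables A and B.

```sql
-- Keep only the rows of A that found no matching id in B.
select A.id, A.name
from A
left outer join B on A.id = B.id
where B.id is null;
```

A NULL id in B never matches the join condition, so it cannot empty the result the way it can with NOT IN. One difference to keep in mind: a row of A whose own id is NULL is returned by this join, whereas NOT IN would filter it out.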

P.S.
When using NOT IN, add is not null to the inner query unless you are 100% sure that the relevant column contains no NULL values.
A single NULL value is enough to make the query return no rows.
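
To illustrate why, here is a sketch that assumes, hypothetically, that B contains one extra row whose id is NULL (the sample table B in the question does not).

```sql
-- Hypothetical data: B holds the ids 1, 2, 4 and NULL.
-- For George, the predicate is 3 NOT IN (1, 2, 4, NULL).
-- The comparison 3 <> NULL is unknown, so the whole predicate is NULL
-- rather than true, and the row is filtered out. No rows come back.
select * from A where A.id not in (select id from B);

-- Filtering out the NULL ids first leaves 3 NOT IN (1, 2, 4),
-- which is true, so the row 3 George is returned.
select * from A where A.id not in (select id from B where id is not null);
```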



Source: https://stackoverflow.com/questions/44714625/how-to-use-not-in-in-hive
