How to use NOT IN in Hive

小蘑菇 2021-02-04 17:26

Suppose I have two tables, as shown below. I want to get the result that SQL would give with: insert into B where id not in (select id from A), which will inse

1 answer
  • 2021-02-04 17:45

    NOT IN in the WHERE clause with uncorrelated subqueries has been supported since Hive 0.13, which was released more than 3 years ago, on 21 April 2014.

    select * from A where id not in (select id from B where id is not null);
    

    +----+--------+
    | id |  name  |
    +----+--------+
    |  3 | George |
    +----+--------+
    

    On earlier versions, the column of the outer table must be qualified with the table name or alias; otherwise the query fails:

    hive> select * from A where id not in (select id from B where id is not null);
    FAILED: SemanticException [Error 10249]: Line 1:22 Unsupported SubQuery Expression 'id': Correlating expression cannot contain unqualified column references.
    

    hive> select * from A where A.id not in (select id from B where id is not null);
    OK
    3   George
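
    To turn the working SELECT into the insert the question is after, a minimal sketch follows. It assumes that B has the same two columns as A (id and name, as in the result above) and that the goal is to copy into B the rows of A whose id is not yet in B. The question is cut off, so that reading is an assumption.

    ```sql
    -- Assumption: A and B share the column layout (id, name).
    -- A.id is qualified so the statement also passes on versions
    -- that reject unqualified outer columns.
    INSERT INTO TABLE B
    SELECT A.id, A.name
    FROM A
    WHERE A.id NOT IN (SELECT id FROM B WHERE id IS NOT NULL);
    ```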
    

    P.S.
    When using NOT IN, add an "is not null" filter to the inner query unless you are 100% sure that the relevant column contains no null values.
    A single null value is enough to make the query return no results, because comparing any id with null yields unknown rather than true.
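
    If the null filter is easy to forget, the same rows can usually be fetched without NOT IN. This is a rough sketch, not a drop-in replacement, reusing tables A and B from above; the NOT EXISTS form relies on subquery support in the WHERE clause, so it also needs Hive 0.13 or later.

    ```sql
    -- Anti-join: keep the rows of A that have no matching id in B.
    -- Null ids in B never satisfy A.id = B.id, so they cannot empty the result.
    SELECT A.*
    FROM A
    LEFT OUTER JOIN B ON A.id = B.id
    WHERE B.id IS NULL;

    -- The same idea written as a correlated NOT EXISTS.
    SELECT *
    FROM A
    WHERE NOT EXISTS (SELECT 1 FROM B WHERE B.id = A.id);
    ```

    One difference to keep in mind: both forms keep rows of A whose own id is null, while NOT IN typically drops them.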
