How to write a subquery and use the “IN” clause in Hive

滥情空心 2021-01-31 19:49

How can I use the IN clause in Hive? I want to write something like this: select x from y where y.z in (select distinct z from y) order by x; But I am not finding any way o

7 Answers
  • 2021-01-31 20:26

    According to https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Select:

    "Hive does not support IN, EXISTS or subqueries in the WHERE clause."

    You might want to look at: https://issues.apache.org/jira/browse/HIVE-801

    https://issues.apache.org/jira/browse/HIVE-1799

  • 2021-01-31 20:28

    Hive 0.13 now supports IN/EXISTS in the WHERE clause. The issue https://issues.apache.org/jira/browse/HIVE-784 has been resolved after 4 years :)
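
    For illustration, an IN subquery on Hive 0.13 or later might look roughly like the following. This is a sketch rather than a tested query; the tables t1(id, name) and t2(id, name) and the aliases a and b are assumptions chosen for the example.

    ```sql
    -- Uncorrelated IN subquery in the WHERE clause (assumes Hive 0.13+).
    -- The subquery must select exactly one column.
    SELECT a.id
    FROM t1 a
    WHERE a.id IN (SELECT b.id FROM t2 b);
    ```

    On older releases the same filter has to be expressed as a join instead.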

  • 2021-01-31 20:28

    Hive supports IN with a list of values perfectly well; what it does not support is a subquery in the WHERE clause.

    There has been an open feature ticket from Facebook engineers for 4 years: https://issues.apache.org/jira/browse/HIVE-784?focusedCommentId=13579059

  • 2021-01-31 20:29

    Assume tables t1(id, name) and t2(id, name).

    To list only those ids from t1 that also exist in t2 (basically an IN clause):

    hive> select a.id from t1 a left semi join t2 b on (a.id = b.id);
    

    To list only those ids that exist in t1 but not in t2 (basically a NOT IN clause):

    hive> select a.id from t1 a left outer join t2 b on (a.id = b.id) where b.id is null;
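
    As a sketch of how the LEFT SEMI JOIN approach might be applied to the query from the question (table y and columns x and z come from the question; the aliases a and b are only illustrative, and the query has not been run against a specific Hive release):

    ```sql
    -- Rewrite of:
    --   select x from y where y.z in (select distinct z from y) order by x;
    -- The subquery moves into the FROM clause, where it must be given an alias.
    SELECT a.x
    FROM y a
    LEFT SEMI JOIN (SELECT DISTINCT z FROM y) b
      ON (a.z = b.z)
    ORDER BY x;
    ```

    With LEFT SEMI JOIN, the right-hand side may be referenced only in the ON clause, not in the SELECT or WHERE clauses. The DISTINCT is kept to mirror the original query, but a semi join already returns each left-hand row at most once.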
    
  • 2021-01-31 20:30

    Hive has supported IN/EXISTS subqueries since Hive 0.13, with a few limitations. Please refer to https://cwiki.apache.org/confluence/display/Hive/LanguageManual+SubQueries for more details.
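
    A minimal sketch of the EXISTS and NOT EXISTS forms, assuming Hive 0.13+ and two hypothetical tables t1(id, name) and t2(id, name). Two of the documented limitations show up here: EXISTS/NOT EXISTS subqueries need at least one correlated predicate, and the outer table may be referenced only in the subquery's WHERE clause.

    ```sql
    -- ids from t1 that have a match in t2.
    -- The predicate b.id = a.id correlates the subquery with the outer row.
    SELECT a.id
    FROM t1 a
    WHERE EXISTS (SELECT b.id FROM t2 b WHERE b.id = a.id);

    -- ids from t1 that have no match in t2.
    SELECT a.id
    FROM t1 a
    WHERE NOT EXISTS (SELECT b.id FROM t2 b WHERE b.id = a.id);
    ```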

  • 2021-01-31 20:37

    I'm using Hive version 0.7.1, and this works: SELECT * FROM MYTABLE WHERE MYCOLUMN IN ('thisThing','thatThing');

    I tested this on a column of type STRING, so I am not sure whether it works for all data types. As Wawrzyniec mentioned above, the Hive Language Manual says that it is not supported and to use LEFT SEMI JOIN instead, but it worked fine in my test.
