Does Hive have a String split function?

佛祖请我去吃肉 2020-12-13 05:48

I am looking for a built-in string split function in Hive. For example, if the string is:

A|B|C|D|E

Then I want to have a function like:



3 Answers
  • 2020-12-13 06:31

    Another interesting use case for split in Hive: suppose a column ipname in the table holds a value such as "abc11.def.ghft.com" and you want to pull out "abc11":

    SELECT split(ipname,'[\.]')[0] FROM tablename;
    
  • 2020-12-13 06:35

    Hive does have a split function based on regular expressions. It is not listed in the tutorial, but it is documented in the language manual on the wiki:

    split(string str, string pat)
       Split str around pat (pat is a regular expression) 
    

    In your case, the delimiter "|" has a special meaning in a regular expression (it denotes alternation), so it must be escaped and written as "\\|".
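
    Applied to the string from the question, this could look like the sketch below. It assumes the client passes the literal '\\|' to the regex engine as \| (an escaped pipe); whether that holds can depend on the Hive version and on how the query is submitted, so check the result before relying on it.

    ```sql
    -- Split the question's example string on a literal pipe.
    -- Assumption: '\\|' reaches the regex engine as \| (escaped pipe).
    SELECT split('A|B|C|D|E', '\\|');

    -- split returns an array; index it (0-based) to pick one field.
    SELECT split('A|B|C|D|E', '\\|')[0];
    ```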

  • 2020-12-13 06:41

    Just a clarification on the answer given by Bkkbrad.

    I tried this suggestion and it did not work for me.

    For example,

    split('aa|bb','\\|')
    

    produced:

    ["","a","a","|","b","b",""]
    

    But,

    split('aa|bb','[|]')
    

    produced the desired result:

    ["aa","bb"]
    

    Including the metacharacter '|' inside the square brackets causes it to be interpreted literally, as intended, rather than as a metacharacter.

    For elaboration of this behaviour of regexp, see: http://www.regular-expressions.info/charclass.html
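
    Applied to the string from the question, the character-class form can be sketched as follows. The literals are illustrative only, and the 0-based index mirrors the [0] usage in the ipname answer.

    ```sql
    -- '[|]' matches a literal pipe, so no backslash escaping is involved.
    SELECT split('A|B|C|D|E', '[|]');     -- ["A","B","C","D","E"]

    -- Pick a single field by 0-based array index.
    SELECT split('A|B|C|D|E', '[|]')[2];  -- C
    ```

    One plausible explanation for the '\\|' result, offered as an assumption rather than a confirmed cause: if both backslashes reach the regex engine, the pattern means "a literal backslash, or the empty string", and the empty alternative matches between every pair of characters. The character class avoids the question of how many backslashes survive.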
