Using Regex in Pig in Hadoop

广开言路 2021-01-28 18:28

I have a CSV file containing user (tweetid, tweets, userid).

396124436476092416,\"Think about the life you livin but don\'t think so hard it hurts Life is truly          
1 Answer
  • 2021-01-28 18:38

    I can't comment, but from looking at this and testing it, the quote characters in your regex appear to differ from those in the CSV: the CSV uses the straight double quote ("), while the regex code uses a different quote character, so the pattern cannot match.

    To get the tweetid try this:

    -- Capture the leading digits before the first comma as the tweetid.
    B = FOREACH A GENERATE FLATTEN(REGEX_EXTRACT(line,'^([0-9]+),',1)) AS (tweetid:long);
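
A quick way to sanity-check both points outside Pig is a small Python sketch. It is only an illustration: the sample row is a shortened version of the one in the question, the curly-quote variant is an assumed example of a mismatched quote character, and `extract_tweetid` and the digit-anchored pattern are hypothetical choices rather than anything from the original script.

```python
import re

# Shortened sample row from the question, using the straight double quote.
STRAIGHT_ROW = '396124436476092416,"Think about the life you livin'

# Same row with a curly opening quote (assumed example of a mismatch).
CURLY_ROW = '396124436476092416,\u201cThink about the life you livin'


def has_comma_straight_quote(line):
    """Return True if the line contains a comma followed by a straight quote."""
    return re.search(r',"', line) is not None


def extract_tweetid(line):
    """Return the leading run of digits before the first comma, or None."""
    match = re.search(r'^([0-9]+),', line)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    print(has_comma_straight_quote(STRAIGHT_ROW))
    print(has_comma_straight_quote(CURLY_ROW))
    print(extract_tweetid(STRAIGHT_ROW))
```

Anchoring on the leading digits avoids depending on the quote character at all, which may be the safer choice if the file mixes quote styles.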
    