Hive: Filtering Data between Specified Dates when Date is a String

后端 未结 6 1399
旧巷少年郎
旧巷少年郎 2021-02-04 02:08

I\'m trying to filter data between September 1st, 2010 and August 31st, 2013 in a Hive table. The column containing the date is in string format (yyyy-mm-dd). I can use month()

6条回答
  •  醉梦人生
    2021-02-04 02:49

    Hive has a lot of good date parsing UDFs: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions

    Just doing the string comparison as Nigel Tufnel suggests is probably the easiest solution, although technically it's unsafe. But you probably don't need to worry about that unless your tables have historical data about the medieval ages (dates with only 3 year digits) or dates from scifi novels (dates with more than 4 year digits).

    Anyway, if you ever find yourself in a situation where you would want to do fancier date comparisons, or if your date format is not in a "biggest to smallest" order, e.g. the American convention of "mm/dd/yyyy", then you could use unix_timestamp with two arguments:

    select *
    from your_table
    where unix_timestamp(your_date_column, 'yyyy-MM-dd') >= unix_timestamp('2010-09-01', 'yyyy-MM-dd')
    and unix_timestamp(your_date_column, 'yyyy-MM-dd') <= unix_timestamp('2013-08-31', 'yyyy-MM-dd')
    

提交回复
热议问题