Local Time Convert To UTC Time In Hive

Posted by 拜拜、爱过 on 2020-01-29 02:07:31

Question


I searched a lot on the Internet but couldn't find the answer. Here is my question:

I'm writing some queries in Hive. I have a Unix timestamp (seconds since the epoch) and would like to convert it to a UTC time string; e.g., given the timestamp 1349049600, I would like to get the UTC time 2012-10-01 00:00:00. However, if I use the built-in function from_unixtime(1349049600) in Hive, I get the local PDT time 2012-09-30 17:00:00.

I noticed there is a built-in function called from_utc_timestamp(timestamp, string timezone). I tried from_utc_timestamp(1349049600, "GMT"), but the output is 1970-01-16 06:44:09.6, which is totally incorrect.

I don't want to change the time zone of Hive permanently because there are other users. So is there any way to convert 1349049600 to the UTC timestamp string "2012-10-01 00:00:00"? Thanks a lot!


Answer 1:


As far as I can tell, from_utc_timestamp() needs a date string argument, like "2014-01-15 11:21:15", not a unix seconds-since-epoch value. That might be why it is giving odd results when you pass an integer?

The only Hive function that deals with epoch seconds seems to be from_unixtime(), which gives you a timestamp string in the server timezone. I found my server's timezone in /etc/sysconfig/clock: "America/Montreal" in my case.

So you can get a UTC timestamp string via to_utc_timestamp(from_unixtime(1389802875),'America/Montreal'), and then convert to your target timezone with from_utc_timestamp().

It all seems very tortuous, particularly having to wire your server TZ into your SQL. Life would be easier if there were a from_unixtime_utc() function or something.


Update: from_utc_timestamp() does deal with a milliseconds argument as well as a string, but then gets the conversion wrong.

When I try from_utc_timestamp(1389802875000, 'America/Los_Angeles') it gives "2014-01-15 03:21:15" which is wrong.
The correct answer is "2014-01-15 08:21:15" which you can get (for a server in Montreal) via from_utc_timestamp(to_utc_timestamp(from_unixtime(1389802875),'America/Montreal'), 'America/Los_Angeles')
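The arithmetic behind these values can be checked outside Hive. The following is a minimal Python sketch, not Hive code. It assumes fixed offsets of UTC-5 for America/Montreal and UTC-8 for America/Los_Angeles (standard time, which applies in mid-January 2014), and the helper names wall_clock and reinterpret are illustrative only. The last step shows one reading that is consistent with the wrong value reported above (the server-local wall clock being treated as if it were UTC); the source does not confirm that this is what Hive does internally.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed standard-time offsets, valid for January 2014.
MONTREAL = timezone(timedelta(hours=-5))     # EST
LOS_ANGELES = timezone(timedelta(hours=-8))  # PST
FMT = "%Y-%m-%d %H:%M:%S"


def wall_clock(epoch_seconds, tz):
    """Render epoch seconds as a zone-less wall-clock string in tz."""
    return datetime.fromtimestamp(epoch_seconds, tz).strftime(FMT)


def reinterpret(wall, source_tz, target_tz):
    """Read a zone-less wall-clock string as source_tz, render it in target_tz."""
    parsed = datetime.strptime(wall, FMT).replace(tzinfo=source_tz)
    return parsed.astimezone(target_tz).strftime(FMT)


epoch = 1389802875
local = wall_clock(epoch, MONTREAL)                # server-local string
utc = reinterpret(local, MONTREAL, timezone.utc)   # local -> UTC
la = reinterpret(utc, timezone.utc, LOS_ANGELES)   # UTC -> target zone

# Hypothetical mistake: the Montreal wall clock is treated as UTC.
wrong = reinterpret(local, timezone.utc, LOS_ANGELES)

print(local)  # 2014-01-15 11:21:15
print(utc)    # 2014-01-15 16:21:15
print(la)     # 2014-01-15 08:21:15
print(wrong)  # 2014-01-15 03:21:15
```

The two-step chain mirrors the nested to_utc_timestamp / from_utc_timestamp call: the intermediate string carries no zone information, so each step has to be told which zone to assume.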




Answer 2:


Just wanted to add a little here: I'd suggest "automating" the system timezone. So instead of hardcoding it:

-- static TZ declaration
to_utc_timestamp(from_unixtime(1389802875),'America/Montreal')

give this a shot:

-- dynamic TZ
select to_utc_timestamp(from_unixtime(1389802875), from_unixtime(unix_timestamp(), "z"));

This uses the format argument of from_unixtime with the pattern "z" (lowercase) to return the server's timezone string.




Answer 3:


Use it like this:

to_utc_timestamp(from_unixtime(timestamp),"PDT")




Answer 4:


This example avoids hardwiring the system time zone into your Hive code. It was run using Hive 0.10.0 in a CentOS environment with OpenJDK Java 1.6; because it involves time manipulation, those precise software versions might matter. The system is currently operating in EDT. The table tblFiniteZahl is like DUAL but with about a million rows of, you guessed it, finite numbers; you can substitute any table with at least one row. The trick is to format the time in the local time zone with the z pattern so that the output includes the time zone, then extract that value at runtime and pass it to the to_utc_timestamp function.

select D1,
       D1E,
       D1L,
       D1LT,
       D1LZ,
       to_utc_timestamp(D1LT, D1LZ) as D1UTC
from (
select D1,
       D1E,
       D1L,
       regexp_extract(D1L, '^([^ ]+[ ][^ ]+)[ ](.+)$', 1) as D1LT,
       regexp_extract(D1L, '^([^ ]+[ ][^ ]+)[ ](.+)$', 2) as D1LZ
from (
select D1,
       D1E,
       from_unixtime(D1E, 'yyyy-MM-dd HH:mm:ss z') as D1L
from (
select D1,
       unix_timestamp(D1,'yyyy-MM-dd HH:mm:ss Z') as D1E
from (
select '2015-08-24 01:15:23 UTC' as D1
from tblFiniteZahl
limit 1
      ) T1
      ) T2
      ) T3
      ) T4
;

The result is

D1 = 2015-08-24 01:15:23 UTC
D1E = 1440378923
D1L = 2015-08-23 21:15:23 EDT
D1LT = 2015-08-23 21:15:23
D1LZ = EDT
D1UTC = 2015-08-23 21:15:23

This illustrates that to_utc_timestamp does accept EDT as its second argument.
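As a sanity check on the output above, the same steps can be reproduced with plain arithmetic. The following is a minimal Python sketch, not Hive code, and it assumes EDT is a fixed UTC-4 offset. Note that the D1UTC printed above is identical to D1LT, whereas shifting 21:15:23 EDT to UTC gives 01:15:23 on the next day, so it may be worth verifying on your own Hive version how the abbreviation "EDT" is actually handled.

```python
from datetime import datetime, timedelta, timezone

# Assumption: EDT modeled as a fixed UTC-4 offset.
EDT = timezone(timedelta(hours=-4), "EDT")
FMT = "%Y-%m-%d %H:%M:%S"

# D1: the input string, read as UTC.
d1 = datetime.strptime("2015-08-24 01:15:23", FMT).replace(tzinfo=timezone.utc)

# D1E: epoch seconds for D1.
d1e = int(d1.timestamp())

# D1LT and D1LZ: local wall-clock part and zone abbreviation.
d1_local = d1.astimezone(EDT)
d1lt = d1_local.strftime(FMT)
d1lz = d1_local.tzname()

# UTC value expected when D1LT is interpreted as EDT.
expected_utc = (
    datetime.strptime(d1lt, FMT)
    .replace(tzinfo=EDT)
    .astimezone(timezone.utc)
    .strftime(FMT)
)

print(d1e)           # 1440378923
print(d1lt, d1lz)    # 2015-08-23 21:15:23 EDT
print(expected_utc)  # 2015-08-24 01:15:23
```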




Answer 5:


I went to currentmillis.com and pasted 1349049600 without realizing it was actually in seconds. It returned a date of 1970-01-16, which suggests that the function you mentioned, from_utc_timestamp, actually takes milliseconds as its first parameter. Maybe you can try again with from_utc_timestamp(1349049600000, "GMT")?
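The milliseconds hypothesis can be checked with plain arithmetic. The following is a minimal Python sketch, not Hive code. It assumes the server is on US Pacific time, as the question's PDT output suggests, modeled here as fixed offsets: UTC-8 (PST) for January 1970 and UTC-7 (PDT) for late September 2012.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the server is on US Pacific time, modeled as fixed offsets.
PST = timezone(timedelta(hours=-8))  # applies in January 1970
PDT = timezone(timedelta(hours=-7))  # applies in late September 2012

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
value = 1349049600

# Reading the value as seconds gives the intended instant.
as_seconds = EPOCH + timedelta(seconds=value)
# Reading the same value as milliseconds lands in January 1970.
as_millis = EPOCH + timedelta(milliseconds=value)

seconds_utc = as_seconds.strftime("%Y-%m-%d %H:%M:%S")
seconds_local = as_seconds.astimezone(PDT).strftime("%Y-%m-%d %H:%M:%S")
millis_utc = as_millis.strftime("%Y-%m-%d %H:%M:%S.%f")
millis_local = as_millis.astimezone(PST).strftime("%Y-%m-%d %H:%M:%S.%f")

print(seconds_utc)    # 2012-10-01 00:00:00
print(seconds_local)  # 2012-09-30 17:00:00
print(millis_utc)     # 1970-01-16 14:44:09.600000
print(millis_local)   # 1970-01-16 06:44:09.600000
```

The last value matches the 1970-01-16 06:44:09.6 reported in the question, which is consistent with the integer having been read as milliseconds and rendered in the server's local zone.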



Source: https://stackoverflow.com/questions/18278786/local-time-convert-to-utc-time-in-hive
