Why is unix_timestamp parsing this incorrectly, off by 12 hours?

痞子三分冷 submitted on 2019-12-13 02:25:31

Question


The following appears to be incorrect (spark.sql):

select unix_timestamp("2017-07-03T12:03:56", "yyyy-MM-dd'T'hh:mm:ss")
-- 1499040236

Compared to:

select unix_timestamp("2017-07-03T00:18:31", "yyyy-MM-dd'T'hh:mm:ss")
-- 1499041111

Clearly the first timestamp comes after the second, yet it parses to a smaller value. The second result appears to be correct:

# ** R Code **
# establish constants
one_day = 60 * 60 * 24
one_year = 365 * one_day
one_year_leap = 366 * one_day
one_quad = 3 * one_year + one_year_leap

# to 2014-01-01
11 * one_quad +
  # to 2017-01-01
  2 * one_year + one_year_leap + 
  # to 2017-07-01
  (31 + 28 + 31 + 30 + 31 + 30) * one_day + 
  # to 2017-07-03 00:18:31
  2 * one_day + 18 * 60 + 31
# [1] 1499041111

A similar calculation shows the first should be 1499083436 (confirmed by as.integer(as.POSIXct('2017-07-03 12:03:56', tz = 'UTC')) in R), and that 1499040236 should correspond to 2017-07-03 00:03:56.
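The hand calculation above can be cross-checked with a short script. This is only a sketch in Python rather than R or Spark, and it assumes the timestamps are meant as UTC; `epoch_utc` is a hypothetical helper name introduced for illustration.

```python
from datetime import datetime, timezone

def epoch_utc(text: str) -> int:
    """Seconds since 1970-01-01T00:00:00Z, reading `text` as a UTC wall-clock time."""
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%S")  # %H is the 0-23 hour
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())

noon = epoch_utc("2017-07-03T12:03:56")
midnight = epoch_utc("2017-07-03T00:03:56")

print(noon)             # 1499083436, the value expected for 12:03:56
print(midnight)         # 1499040236, the value Spark actually returned
print(noon - midnight)  # 43200 seconds, i.e. exactly 12 hours
```

The difference of exactly 43200 seconds shows that only the hour field is being misread, not the date or the minutes and seconds.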

So what's happening here? It certainly looks like a bug. Two last sanity checks -- select unix_timestamp("2017-07-03T00:03:56", "yyyy-MM-dd'T'hh:mm:ss") correctly returns 1499040236; and replacing the T in the middle with a space has no effect on the incorrect parse.


Since it appears to be fixed in the development version, I'll note that this is on Spark 2.1.1.


Answer 1:


It is just a format mistake:

  • Your data is in 0-23 hour format (denoted in SimpleDateFormat as HH).
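  • You use the hh format, which corresponds to the 1-12 hour (am/pm) format. With no am/pm marker in the input, 12 is read as 12 AM, i.e. hour 0.

Changing the pattern to yyyy-MM-dd'T'HH:mm:ss makes the hour parse as intended.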
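The same 12-hour versus 24-hour distinction can be reproduced outside Spark. The sketch below uses Python's `strptime` as an analogy only: `%H` plays the role of `HH` and `%I` the role of `hh`. It is not Spark's or SimpleDateFormat's code, `to_epoch` is a hypothetical helper, and UTC is assumed.

```python
from datetime import datetime, timezone

def to_epoch(text: str, pattern: str) -> int:
    """Parse `text` with `pattern` and return epoch seconds, treating it as UTC."""
    parsed = datetime.strptime(text, pattern)
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())

H24 = "%Y-%m-%dT%H:%M:%S"  # 0-23 hour, analogous to HH
H12 = "%Y-%m-%dT%I:%M:%S"  # 1-12 hour, analogous to hh

print(to_epoch("2017-07-03T12:03:56", H24))  # 1499083436 (12:03:56)
print(to_epoch("2017-07-03T12:03:56", H12))  # 1499040236 (12 read as 12 AM, hour 0)

# A 12-hour field has no hour "00", so Python rejects the input outright.
try:
    to_epoch("2017-07-03T00:18:31", H12)
except ValueError as err:
    print("rejected:", err)
```

With the 12-hour directive and no am/pm marker, 12 is taken as midnight, which reproduces the 12-hour shift from the question.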

In fact, in the latest Spark version (2.3.0 RC1) it wouldn't parse at all:

spark.version
String = 2.3.0
spark.sql("""
  select unix_timestamp("2017-07-03T00:18:31", "yyyy-MM-dd'T'hh:mm:ss")""").show
+----------------------------------------------------------+
|unix_timestamp(2017-07-03T00:18:31, yyyy-MM-dd'T'hh:mm:ss)|
+----------------------------------------------------------+
|                                                      null|
+----------------------------------------------------------+


Source: https://stackoverflow.com/questions/48357091/why-is-unix-timestamp-parsing-this-incorrectly-by-12-hours-off
