问题
This is an extension of a question I posed yesterday: How to handle potential data loss when performing comparisons across data types in different groups
In HIVE, is it possible to perform comparisons between two columns that are in different data type groups inline within the SELECT
clause? I need to first determine what the incoming meta data is for each column and then provide logic that picks what CAST
to use.
CASE
WHEN Column1 <=> Column2 THEN 0 -- Error occurs here if data types are in different data type groups (i.e., BIGINT <=> STRING)
ELSE 1
END
If this is not possible, is there a workaround to retrieve the meta data and perform CAST'ing based on certain rules? For example, if:
- Column1 is BIGINT
- Column2 is STRING
Then CAST Column2 as BIGINT and then perform the comparison. The simplest solution possible is desired. I'm hoping to not have to make substantial code changes to the underlying system that is generating the SELECT
statement.
Update 2:
There is a back-end system that generates the SQOOP queries and in that system there is a function that modifies the SELECT clause to compare two columns. The problem we are running into is that the columns used to be implicitly converted on an earlier version of HIVE/Cloudera that we were running. Now that we are upgrading to a version that does not support implicit data type conversion, we need to find the simplest solution to continue supporting the data type conversion. Unfortunately, there is no meta data available within the back-end, so I am trying to hack it out in the SELECT
clause
回答1:
Cast implicitly is good idea:
CASE
WHEN Column1 <=> cast(Column2 as bigint) THEN 0
ELSE 1
END
If not possible to cast, the result of cast is NULL.
来源:https://stackoverflow.com/questions/58240566/how-to-compare-two-columns-with-different-data-type-groups