Suppose my table looks something like:
Col1 Col2 Col3.....Col20 Col21
Now I want to select all but Col21. I want to change it to unix_timestamp
In case you want to drop multiple columns on which you are joining
select
tb1.*,
tb2.`(col1|col2)?+.+`
from
tb1 left join tb2 on
tb1.col1 = tb2.col1
and tb1.col2 = tb2.col2
Try to setup the below property
set hive.support.quoted.identifiers=none;
Then select all columns except col_21:
select `(col_21)?+.+` from <table_name>;
For more info refer to this link.
Then insert statement will be
insert into <tablename> partition (col21)
select `(col_21)?+.+` from ( --select all columns from subquery except col21
select *, unix_timestamp() AS alias_col21 from table_name --select *, create new col based on col21
)a;
By using this approach you are going to have alias_col21 as last column in your select statement so that you can partition based on that column.
In Case of joins:
We cannot refer individual columns((t1.id)?+.+
..etc) from each table, so drop the unnecessary columns in select statement.
hive>insert into <tablename> partition (col21)
select * from (
select t1.* from
(--drop col21 and create new alias_col21 by using col21
select `(col21)?+.+`, unix_timestamp() AS alias_col21 from table1
) t1
join table2 t2
on t1.<col-name>=t2.<col-name>)a;