I have a Hive table with 10 columns. The first 9 columns can hold identical values across several rows, while the 10th column, CREATE_DATE, does not, because it stores the date the row was created.
With this approach we don't need to write out every column name in the select list:
select * from ( select *, row_number() over (partition by col1, col2 order by col1) as tmp_row_number from table_name ) t where t.tmp_row_number = 1
The only side effect is that the result set contains an extra column, tmp_row_number; the underlying table itself is not changed.
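For the table described in the question, the same idea might look like the sketch below. It assumes the nine duplicate-prone columns are named col1 through col9 and the tenth is create_date (placeholder names, not taken from a real schema). It also relies on Hive's regex column specification to leave the helper column out of the output, which on Hive 0.13 and later requires setting hive.support.quoted.identifiers to none. Treat it as a starting point and check it against your Hive version.

```sql
-- Assumed names: table_name, col1..col9, create_date (placeholders).
-- Let backquoted names in the select list be read as regular expressions.
SET hive.support.quoted.identifiers=none;

SELECT `(tmp_row_number)?+.+`      -- every column except the helper
FROM (
  SELECT *,
         ROW_NUMBER() OVER (
           PARTITION BY col1, col2, col3, col4, col5, col6, col7, col8, col9
           ORDER BY create_date    -- keeps the earliest row; add DESC for the latest
         ) AS tmp_row_number
  FROM table_name
) t
WHERE t.tmp_row_number = 1;
```

The partition list still has to name the columns that define a duplicate, but the select list does not, and ordering by create_date makes it predictable which copy of each duplicate group survives.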