@innocent-bystander figured it out (I just had to slightly modify their suggested solution):
$ hdfs dfs -du -s /foo/bar/* | sort -r -k 1 -g | head -5 | awk '{ suffix="KMGT"; for(i=0; $1>1024 && i < length(suffix); i++) $1/=1024; print int($1) substr(suffix, i, 1), $3; }'
28T /foo/bar/card_dim_h_tobedeleted
20T /foo/bar/transaction_item_fct_tobedeleted
2T /foo/bar/card_dim_h_new_tobedeleted
2T /foo/bar/hshd_loyalty_seg_tobedeleted
1T /foo/bar/prod_dim_h_tobedeleted
(I also piped the output through head, just to save some space on this page.)
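For anyone reading along, here is how I understand the awk part, unrolled into a small shell function. Treat it as a sketch rather than the definitive version: the name humanize_du and the sample numbers are made up for illustration, and I use $NF instead of $3 on the assumption that the path is always the last column and contains no spaces. It also tests for >= 1024 rather than > 1024, so exactly 1024 bytes prints as 1K.

```shell
# Read du-style lines (byte count first, path last) on stdin and print
# "<size><unit> <path>". humanize_du is a hypothetical helper name.
humanize_du() {
  awk '{
    suffix = "KMGT"
    size = $1
    # Divide by 1024 until the value drops below 1024 or the units run out.
    for (i = 0; size >= 1024 && i < length(suffix); i++) size /= 1024
    # i counts the divisions: 0 means plain bytes (no unit), 1 is K, 2 is M, ...
    unit = (i > 0) ? substr(suffix, i, 1) : ""
    # int() truncates, so 1.5K is shown as 1K, as in the one-liner above.
    print int(size) unit, $NF
  }'
}

# Made-up sample lines standing in for real `hdfs dfs -du -s` output.
printf '%s\n' \
  '512 1536 /foo/bar/a' \
  '1536 4608 /foo/bar/b' \
  '3221225472 9663676416 /foo/bar/c' | humanize_du
```

With the real data it would be used as hdfs dfs -du -s /foo/bar/* | sort -r -k 1 -g | head -5 | humanize_du. Sorting has to happen before the conversion, while column 1 is still a plain byte count.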
Thank you so much, not only for solving this but also for teaching me things I didn't know about awk. Very powerful, isn't it?