Replacing an SQL query with Unix sort, uniq and awk
Question

We currently have some data on an HDFS cluster, on which we generate reports using Hive. The infrastructure is being decommissioned, and we are left with the task of finding an alternative way to generate the report from the data, which we imported as tab-separated files into our new environment.

Assume we have a table with the following fields:

Query
IPAddress
LocationCode

The original SQL query we used to run on Hive was (well, not exactly, but something similar):

select