How to enable Snappy/Snappy Codec over hadoop cluster for Google Compute Engine
问题 I am trying to run Hadoop Job on Google Compute engine against our compressed data, which is sitting on Google Cloud Storage. While trying to read the data through SequenceFileInputFormat, I get the following exception: hadoop@hadoop-m:/home/salikeeno$ hadoop jar ${JAR} ${PROJECT} ${OUTPUT_TABLE} 14/08/21 19:56:00 INFO jaws.JawsApp: Using export bucket 'askbuckerthroughhadoop' as specified in 'mapred.bq.gcs.bucket' 14/08/21 19:56:00 INFO bigquery.BigQueryConfiguration: Using specified project