qubole | 易学教程

qubole

How to query data from gz file of Amazon S3 using Qubole Hive query?


Question: I need to get specific data from a gz file. How do I write the SQL? Can I just query the file as if it were a database table, e.g. Select * from gz_File_Name where key = 'keyname' limit 10? It always comes back with an error.

Answer 1: You need to create a Hive external table over the file's location (the folder) before you can query it with Hive. Hive will recognize the gzip format. Like this:

    create external table hive_schema.your_table (
      col_one string,
      col_two string
    )
    stored as textfile --specify your file type, or use serde
    LOCATION 's3:/
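To make the answer's approach concrete, here is a minimal HiveQL sketch of the two steps: define an external table over the S3 folder, then run the filter from the question against it. Everything specific in it is an assumption for illustration and not from the original post: the table name `hive_schema.gz_events`, the two columns, the tab delimiter, and the bucket path are all hypothetical placeholders that would need to match the real data.

```sql
-- Hypothetical table, columns, delimiter and S3 path: adjust all of them to your data.
-- LOCATION must point at the folder that holds the .gz files, not at a single file.
CREATE EXTERNAL TABLE IF NOT EXISTS hive_schema.gz_events (
  key   STRING,
  value STRING
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'   -- assumed delimiter; use a SerDe for JSON, CSV with quoting, etc.
STORED AS TEXTFILE            -- gzip-compressed text files are decompressed on read
LOCATION 's3://your-bucket/path/to/folder/';

-- The query from the question, now run against the table instead of the file name.
SELECT *
FROM hive_schema.gz_events
WHERE key = 'keyname'
LIMIT 10;
```

Because the table is external, dropping it removes only the metadata and leaves the files in S3 untouched. Note that a gzip file is not splittable, so each file is read by a single task; many moderately sized files tend to parallelize better than one very large one.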

Autoscaling EMR: is it required? Should I just use EC2? Should I just use Qubole?


Question: To reduce provisioning time, we've decided to keep a dedicated EMR cluster of 5 instances running (we expect to need about 5). If we need more, we think we'll have to implement some sort of autoscaling. I'm not at all familiar with EMR; does it support autoscaling? I found this in the docs: http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-manage-resize.html Is that the correct place to look for autoscaling, or am I misunderstanding what they mean by
