qubole

How to query data from a gz file on Amazon S3 using a Qubole Hive query?

旧时模样 submitted on 2021-02-16 15:35:34
Question: I need to get specific data from a gz file. How do I write the SQL? Can I query the file as if it were a database table, e.g. Select * from gz_File_Name where key = 'keyname' limit 10? It always returns an error.
Answer 1: You need to create a Hive external table over this file's location (the folder) to be able to query it with Hive. Hive will recognize the gzip format. Like this: create external table hive_schema.your_table ( col_one string, col_two string ) stored as textfile --specify your file type, or use serde LOCATION 's3:/
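The excerpt above is cut off mid-statement, so here is a minimal HiveQL sketch of the approach the answer describes. Everything specific in it is an assumption for illustration: the bucket and folder in `LOCATION` are hypothetical placeholders (the original path is truncated and is not reconstructed here), the tab delimiter is a guess at the file layout, and the URI scheme (`s3://`, `s3a://`, `s3n://`) may differ by platform. The column names are the ones used in the answer.

```sql
-- Hypothetical placeholders: bucket, folder, and delimiter are assumptions,
-- not values from the original question or answer.
CREATE EXTERNAL TABLE IF NOT EXISTS hive_schema.your_table (
  col_one STRING,
  col_two STRING
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'      -- assumes tab-separated text; adjust to the real layout
STORED AS TEXTFILE               -- gzip-compressed text files are decompressed on read
LOCATION 's3://your-bucket/your/folder/';  -- points at the folder holding the .gz file(s)

-- Once the table exists, the gzipped data can be queried like any other table.
SELECT *
FROM hive_schema.your_table
WHERE col_one = 'keyname'
LIMIT 10;
```

Note that `LOCATION` names a folder rather than a single file, so every file under that prefix becomes part of the table. A gzip file is not splittable, so each one is read by a single task.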

Autoscaling EMR: is it required? Should I just use EC2? Should I just use Qubole?

半腔热情 submitted on 2019-12-21 03:20:49
Question: To reduce provisioning time, we've decided to keep a dedicated EMR cluster running with 5 instances (we expect to need about 5). In case we need more, we think we'll need to implement some sort of autoscaling. I'm not familiar with EMR at all. Does it support autoscaling? I found this in the docs: http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-manage-resize.html Is that the correct place to look for autoscaling or am I misunderstanding what they mean by