Create EMR 5.3.0 with EMRFS (s3 bucket) as storage

Posted by 假如想象 on 2020-02-25 05:28:12

Question


I'm trying to create an EMR 5.3.0 cluster with EMRFS (an S3 bucket) as storage. Please provide general guidance on this.

Currently I'm using the command below to create an EMR 5.3.0 cluster with InstanceType=m4.2xlarge. It works fine, but I can't get it to use EMRFS as storage.

aws emr create-cluster --name "DEMAPAUR001" --release-label emr-5.3.0 \
  --service-role EMR_DefaultRole_Private --enable-debug --log-uri 's3n://xyz/trn' \
  --ec2-attributes SubnetId=subnet-545e8823,KeyName=XXX \
  --applications Name=Hbase Name=Hive Name=Pig Name=Ganglia \
  --configurations '[{"Classification":"hdfs-site","Properties":{"dfs.replication":"2"},"Configurations":[]}]' \
  --instance-groups \
    'InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.2xlarge,EbsConfiguration={EbsOptimized=true,EbsBlockDeviceConfigs=[{VolumeSpecification={VolumeType=io1,SizeInGB=500,Iops=200},VolumesPerInstance=1}]}' \
    'InstanceGroupType=CORE,InstanceCount=1,InstanceType=m4.2xlarge,EbsConfiguration={EbsOptimized=true,EbsBlockDeviceConfigs=[{VolumeSpecification={VolumeType=io1,SizeInGB=500,Iops=200},VolumesPerInstance=1}]}' \
  --tags Name=DEMAPAUR001 Owner="XXX" Division=Corporate Application=DEM-EMR Environment=TRN CostCenter=XXX123 CreatedBy=XXX ManagedBy=XXX Availability=24x7_Mon-Fri Backup=NA

Kindly help me regarding this.


Answer 1:


You can use the following classification in the configuration while launching the cluster.

To enable EMRFS consistent view:

{ "Classification": "emrfs-site", "Properties": { "fs.s3.consistent": "true" } }
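Because the command in the question already passes an hdfs-site classification, the emrfs-site entry would presumably be added to that same JSON array rather than replacing it. A minimal sketch, assuming the combined list is kept in a local file (the name emr-configurations.json is only an illustrative choice):

```shell
# Write both classifications into one JSON array.
# The file name is a hypothetical choice; any local path works.
cat > emr-configurations.json <<'EOF'
[
  {
    "Classification": "hdfs-site",
    "Properties": { "dfs.replication": "2" }
  },
  {
    "Classification": "emrfs-site",
    "Properties": { "fs.s3.consistent": "true" }
  }
]
EOF

# The file can then replace the inline JSON in the original command,
# leaving every other option unchanged:
#   aws emr create-cluster ... --configurations file://emr-configurations.json ...
```

Keeping the list in a file avoids shell-quoting problems as more classifications are added.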

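Also, if you want Hive to use S3 as its warehouse location and store all new files there, add the following hive-site classification, which sets the property in hive-site.xml. Here self.hive_warehouse_dir is a variable from the answerer's own code, presumably holding the S3 path of the warehouse directory; substitute your own S3 URI as a quoted string.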

{ "Classification": "hive-site", "Properties": { "hive.metastore.warehouse.dir": self.hive_warehouse_dir } }
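As a rough sketch of how the hive-site classification might look with a concrete value, the snippet below substitutes a shell variable for the warehouse path. The bucket and prefix (s3://example-bucket/hive/warehouse) and the file name are placeholders, not values from the original post:

```shell
# Assumed placeholder: replace with your own bucket and prefix.
HIVE_WAREHOUSE_DIR="s3://example-bucket/hive/warehouse"

# Unquoted heredoc delimiter so that ${HIVE_WAREHOUSE_DIR} is expanded.
cat > hive-configurations.json <<EOF
[
  {
    "Classification": "hive-site",
    "Properties": { "hive.metastore.warehouse.dir": "${HIVE_WAREHOUSE_DIR}" }
  }
]
EOF

# Pass the file when launching the cluster, for example:
#   aws emr create-cluster ... --configurations file://hive-configurations.json ...
```

If other classifications are also needed, they would go into the same JSON array, since the option takes a single list.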



Source: https://stackoverflow.com/questions/42154896/create-emr-5-3-0-with-emrfs-s3-bucket-as-storage
