Question
I'm trying to create an EMR 5.3.0 cluster with EMRFS (an S3 bucket) as storage. Please provide general guidance on this.
I currently use the command below to create an EMR 5.3.0 cluster with InstanceType=m4.2xlarge. It works fine, but I have not been able to set it up with EMRFS as storage.
aws emr create-cluster --name "DEMAPAUR001" \
  --release-label emr-5.3.0 \
  --service-role EMR_DefaultRole_Private \
  --enable-debug \
  --log-uri 's3n://xyz/trn' \
  --ec2-attributes SubnetId=subnet-545e8823,KeyName=XXX \
  --applications Name=Hbase Name=Hive Name=Pig Name=Ganglia \
  --configurations '[{"Classification":"hdfs-site","Properties":{"dfs.replication":"2"},"Configurations":[]}]' \
  --instance-groups \
    'InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.2xlarge,EbsConfiguration={EbsOptimized=true,EbsBlockDeviceConfigs=[{VolumeSpecification={VolumeType=io1,SizeInGB=500,Iops=200},VolumesPerInstance=1}]}' \
    'InstanceGroupType=CORE,InstanceCount=1,InstanceType=m4.2xlarge,EbsConfiguration={EbsOptimized=true,EbsBlockDeviceConfigs=[{VolumeSpecification={VolumeType=io1,SizeInGB=500,Iops=200},VolumesPerInstance=1}]}' \
  --tags Name=DEMAPAUR001 Owner="XXX" Division=Corporate Application=DEM-EMR Environment=TRN CostCenter=XXX123 CreatedBy=XXX ManagedBy=XXX Availability=24x7_Mon-Fri Backup=NA
Kindly help me regarding this.
Answer 1:
You can add the following classification to the cluster configuration (the --configurations option) when launching the cluster.
To enable EMRFS consistent view:
{ "Classification": "emrfs-site", "Properties": { "fs.s3.consistent": "true" } }
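Also, if you want Hive to point to S3 and store all new files there, add the hive-site classification as well, which sets the warehouse directory property in hive-site.xml. The value self.hive_warehouse_dir is a placeholder taken from code; replace it with the S3 path of your warehouse directory.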
{ "Classification": "hive-site", "Properties": { "hive.metastore.warehouse.dir": self.hive_warehouse_dir } }
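The classifications above can be combined with the hdfs-site entry from the question into a single configurations list. The following is a minimal sketch, not a definitive setup: the function name build_configurations and the bucket path s3://my-example-bucket/hive/warehouse are hypothetical and chosen only for illustration, so substitute your own values.

```python
import json


def build_configurations(warehouse_dir, consistent_view=True):
    """Return EMR configuration classifications as a list of dicts.

    warehouse_dir is the S3 path Hive should use as its warehouse
    (hypothetical value supplied by the caller).
    """
    return [
        # Kept from the original command in the question.
        {"Classification": "hdfs-site",
         "Properties": {"dfs.replication": "2"}},
        # Turns EMRFS consistent view on or off.
        {"Classification": "emrfs-site",
         "Properties": {"fs.s3.consistent": "true" if consistent_view else "false"}},
        # Points the Hive warehouse directory at S3.
        {"Classification": "hive-site",
         "Properties": {"hive.metastore.warehouse.dir": warehouse_dir}},
    ]


# Assumed example bucket; replace with a bucket you own.
configurations_json = json.dumps(
    build_configurations("s3://my-example-bucket/hive/warehouse"), indent=2
)
print(configurations_json)
```

One way to use the result is to save the printed JSON to a file, for example configurations.json, and pass it to the CLI as --configurations file://configurations.json, which avoids shell quoting problems with inline JSON.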
Source: https://stackoverflow.com/questions/42154896/create-emr-5-3-0-with-emrfs-s3-bucket-as-storage