Security-Configuration Field For AWS Data Pipeline EmrCluster

Submitted by 不羁岁月 on 2019-12-11 06:24:28

Question


I created an AWS EMR cluster through the regular EMR cluster wizard in the AWS Management Console, where I was able to select a security configuration. When you export the equivalent CLI command, it shows up as --security-configuration 'mySecurityConfigurationValue'.

I now need to create a similar EMR cluster through AWS Data Pipeline, but I don't see any option where I can specify this security-configuration field.

The only similar fields I see are EmrManagedSlaveSecurityGroup, EmrManagedMasterSecurityGroup, AdditionalSlaveSecurityGroups, AdditionalMasterSecurityGroups, and SubnetId. I already have all of those filled out in my pipeline configuration, but I also need to specify the security configuration. Any thoughts?


Answer 1:


Unfortunately, Data Pipeline does not support the Security Configurations feature, nor other features that were introduced in the EMR 5.x releases, such as using a custom AMI.

One workaround is to:

  1. Replace the EmrCluster in your pipeline with an EC2 resource (Ec2Resource)
  2. Use a ShellCommandActivity on the EC2 resource to run the aws emr create-cluster CLI command
  3. Use a bootstrap action to install Task Runner on the cluster
  4. Replace all the runsOn properties in your pipeline with workerGroup so the tasks run on the EMR cluster you created in step 2
  5. Add a final ShellCommandActivity at the end of the pipeline to terminate the cluster using the CLI
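
As a rough illustration of the CLI calls in steps 2 and 5, here is a minimal Python sketch that only builds the command lines and prints them, without running anything. The helper names, the cluster name, the release label, the instance type and count, and the cluster ID are all placeholder assumptions, not values from the question. The flags themselves (--security-configuration, --release-label, --use-default-roles, --cluster-ids, and so on) are standard aws emr options, but check them against your own CLI version before relying on this.

```python
import shlex


def build_create_cluster_command(name, release_label, security_configuration,
                                 instance_type="m5.xlarge", instance_count=3):
    """Return the argv list for `aws emr create-cluster` (nothing is executed).

    All argument values are placeholders chosen for illustration.
    """
    return [
        "aws", "emr", "create-cluster",
        "--name", name,
        "--release-label", release_label,
        # The option that Data Pipeline's EmrCluster object does not expose.
        "--security-configuration", security_configuration,
        "--instance-type", instance_type,
        "--instance-count", str(instance_count),
        "--use-default-roles",
        # Print only the new cluster ID so a later step can reuse it.
        "--query", "ClusterId",
        "--output", "text",
    ]


def build_terminate_command(cluster_id):
    """Return the argv list for `aws emr terminate-clusters`."""
    return ["aws", "emr", "terminate-clusters", "--cluster-ids", cluster_id]


def to_shell(argv):
    """Quote an argv list so it can be pasted into a ShellCommandActivity."""
    return " ".join(shlex.quote(part) for part in argv)


create_cmd = build_create_cluster_command(
    name="pipeline-emr",                       # hypothetical cluster name
    release_label="emr-5.13.0",                # placeholder release label
    security_configuration="mySecurityConfigurationValue",
)
terminate_cmd = build_terminate_command("j-EXAMPLECLUSTERID")  # placeholder ID

print(to_shell(create_cmd))
print(to_shell(terminate_cmd))
```

The resulting strings would go into the command field of the two ShellCommandActivity objects. How the cluster ID gets from the create step to the terminate step (for example, by writing it to a file in S3) is left open here.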

Because you are now spinning up the cluster with the CLI, you have access to features such as security configurations, custom AMIs, and instance fleets, and you can still orchestrate the tasks with Data Pipeline.
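
To make the orchestration side more concrete, the following Python sketch assembles a fragment of a pipeline definition with an Ec2Resource, a ShellCommandActivity that launches the cluster, a task routed by workerGroup, and a final ShellCommandActivity that terminates the cluster. Every object ID, the worker group name, the terminateAfter value, and the command strings are hypothetical. The fragment is also incomplete on purpose: a real definition additionally needs a Default object with roles, a schedule, and so on, and Task Runner on the cluster must be started with the same worker group name.

```python
import json

WORKER_GROUP = "emr-worker-group"  # hypothetical; must match Task Runner's setting


def build_pipeline_objects(create_command, work_command, terminate_command):
    """Return a partial pipeline definition as a dict (illustrative only)."""
    return {
        "objects": [
            {
                # Small launcher instance that replaces the EmrCluster resource.
                "id": "LauncherInstance",
                "name": "LauncherInstance",
                "type": "Ec2Resource",
                "terminateAfter": "2 Hours",
            },
            {
                # Runs `aws emr create-cluster ...` on the launcher instance.
                "id": "CreateCluster",
                "name": "CreateCluster",
                "type": "ShellCommandActivity",
                "command": create_command,
                "runsOn": {"ref": "LauncherInstance"},
            },
            {
                # Routed by workerGroup instead of runsOn, so Task Runner on
                # the CLI-created EMR cluster picks it up.
                "id": "ProcessData",
                "name": "ProcessData",
                "type": "ShellCommandActivity",
                "command": work_command,
                "workerGroup": WORKER_GROUP,
                "dependsOn": {"ref": "CreateCluster"},
            },
            {
                # Runs `aws emr terminate-clusters ...` once the work is done.
                "id": "TerminateCluster",
                "name": "TerminateCluster",
                "type": "ShellCommandActivity",
                "command": terminate_command,
                "runsOn": {"ref": "LauncherInstance"},
                "dependsOn": {"ref": "ProcessData"},
            },
        ]
    }


definition = build_pipeline_objects(
    create_command="aws emr create-cluster ...",        # placeholder command
    work_command="echo 'runs on the EMR cluster'",       # placeholder task
    terminate_command="aws emr terminate-clusters ...",  # placeholder command
)
print(json.dumps(definition, indent=2))
```

An activity should carry either runsOn or workerGroup, not both, which is why ProcessData omits runsOn in this sketch.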



Source: https://stackoverflow.com/questions/50353136/security-configuration-field-for-aws-data-pipeline-emrcluster
