Question
Following the step-by-step instructions on this page precisely, I am trying to export the contents of one of my DynamoDB tables to an S3 bucket. I created a pipeline exactly as instructed, but it fails to run. It seems to have trouble identifying or running an EC2 resource to do the export. When I open EMR in the AWS Console, I see entries like this:
Cluster: df-0..._@EmrClusterForBackup_2015-03-06T00:33:04
Terminated with errors
EMR service role arn:aws:iam::...:role/DataPipelineDefaultRole is invalid
Why am I getting this message? Do I need to set up/configure something else for the pipeline to run?
UPDATE: Under IAM->Roles in the AWS console I am seeing this for DataPipelineDefaultResourceRole:
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": [
      "s3:List*",
      "s3:Put*",
      "s3:Get*",
      "s3:DeleteObject",
      "dynamodb:DescribeTable",
      "dynamodb:Scan",
      "dynamodb:Query",
      "dynamodb:GetItem",
      "dynamodb:BatchGetItem",
      "dynamodb:UpdateTable",
      "rds:DescribeDBInstances",
      "rds:DescribeDBSecurityGroups",
      "redshift:DescribeClusters",
      "redshift:DescribeClusterSecurityGroups",
      "cloudwatch:PutMetricData",
      "datapipeline:PollForTask",
      "datapipeline:ReportTaskProgress",
      "datapipeline:SetTaskStatus",
      "datapipeline:PollForTask",
      "datapipeline:ReportTaskRunnerHeartbeat"
    ],
    "Resource": ["*"]
  }]
}
And this for DataPipelineDefaultRole:
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": [
      "s3:List*",
      "s3:Put*",
      "s3:Get*",
      "s3:DeleteObject",
      "dynamodb:DescribeTable",
      "dynamodb:Scan",
      "dynamodb:Query",
      "dynamodb:GetItem",
      "dynamodb:BatchGetItem",
      "dynamodb:UpdateTable",
      "ec2:DescribeInstances",
      "ec2:DescribeSecurityGroups",
      "ec2:RunInstances",
      "ec2:CreateTags",
      "ec2:StartInstances",
      "ec2:StopInstances",
      "ec2:TerminateInstances",
      "elasticmapreduce:*",
      "rds:DescribeDBInstances",
      "rds:DescribeDBSecurityGroups",
      "redshift:DescribeClusters",
      "redshift:DescribeClusterSecurityGroups",
      "sns:GetTopicAttributes",
      "sns:ListTopics",
      "sns:Publish",
      "sns:Subscribe",
      "sns:Unsubscribe",
      "iam:PassRole",
      "iam:ListRolePolicies",
      "iam:GetRole",
      "iam:GetRolePolicy",
      "iam:ListInstanceProfiles",
      "cloudwatch:*",
      "datapipeline:DescribeObjects",
      "datapipeline:EvaluateExpression"
    ],
    "Resource": ["*"]
  }]
}
Do these need to be modified somehow?
Answer 1:
I ran into the same error.
In IAM, attach the AWSDataPipelineRole managed policy to DataPipelineDefaultRole.
I also had to update the Trust Relationship to the following (it needed ec2, which is not in the documentation):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": [
          "ec2.amazonaws.com",
          "elasticmapreduce.amazonaws.com",
          "datapipeline.amazonaws.com"
        ]
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
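For anyone who prefers scripting the two changes in this answer rather than clicking through the console, here is a minimal Python sketch. It assumes an `iam` object that behaves like `boto3.client("iam")` (using its `attach_role_policy` and `update_assume_role_policy` calls). The helper names `build_trust_policy` and `fix_role` are made up for illustration, and the managed policy ARN is my assumption of where AWSDataPipelineRole lives, so check it in the IAM console before relying on it.

```python
import json

# Service principals copied from the trust relationship shown above.
TRUSTED_SERVICES = [
    "ec2.amazonaws.com",
    "elasticmapreduce.amazonaws.com",
    "datapipeline.amazonaws.com",
]

# Assumed ARN of the AWSDataPipelineRole managed policy; verify before use.
MANAGED_POLICY_ARN = "arn:aws:iam::aws:policy/service-role/AWSDataPipelineRole"


def build_trust_policy(services):
    """Return a trust policy dict letting the given services assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": list(services)},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def fix_role(iam, role_name="DataPipelineDefaultRole"):
    """Attach the managed policy, then replace the role's trust relationship.

    `iam` is expected to behave like boto3.client("iam").
    """
    iam.attach_role_policy(RoleName=role_name, PolicyArn=MANAGED_POLICY_ARN)
    iam.update_assume_role_policy(
        RoleName=role_name,
        PolicyDocument=json.dumps(build_trust_policy(TRUSTED_SERVICES)),
    )
```

With boto3 installed and credentials configured, calling `fix_role(boto3.client("iam"))` should apply both changes. Note that `update_assume_role_policy` replaces the whole trust document, so any principals the role trusted before would need to be added to the list.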
Answer 2:
There is a similar question in the AWS forum, and it seems to be related to an issue with managed policies: https://forums.aws.amazon.com/message.jspa?messageID=606756
In that thread, they recommend defining those roles with specific inline policies for both the access and trust policies, changing some permissions. Oddly enough, the specific inline policies can be found at http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-iam-roles.html
Answer 3:
I had the same issue. The managed policies were correct in my case, but the trust relationships for both the DataPipelineDefaultRole and DataPipelineDefaultResourceRole roles were out of date, so I had to update them using the documentation Gonfva linked to above.
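To tell whether a trust relationship is out of date before editing it, one option is to compare the services it trusts against the list you expect. The sketch below is plain Python working on the trust policy as a dict. The function names and the sample "outdated" policy are hypothetical, and the expected list is the one from the answer above rather than an authoritative requirement for either role.

```python
def trusted_services(trust_policy):
    """Collect service principals that are allowed to call sts:AssumeRole."""
    found = set()
    for stmt in trust_policy.get("Statement", []):
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        if "sts:AssumeRole" not in actions:
            continue
        principal = stmt.get("Principal", {})
        if not isinstance(principal, dict):
            continue  # e.g. "*", which names no specific service
        services = principal.get("Service", [])
        if isinstance(services, str):
            services = [services]
        found.update(services)
    return found


def missing_services(trust_policy, required):
    """Return the required service principals the policy does not trust."""
    return sorted(set(required) - trusted_services(trust_policy))


# Hypothetical outdated trust policy that only trusts Data Pipeline.
outdated = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "datapipeline.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

required = [
    "ec2.amazonaws.com",
    "elasticmapreduce.amazonaws.com",
    "datapipeline.amazonaws.com",
]
print(missing_services(outdated, required))
```

An empty result would suggest the trust relationship already covers the expected services. A non-empty one lists what to add.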
Source: https://stackoverflow.com/questions/28906981/automatic-aws-dynamodb-to-s3-export-failing-with-role-datapipelinedefaultrole-i