Question
I have set up a CloudWatch Events rule that starts an ECS task definition when a previous task definition completes.
I can see that the event triggers the task definition, but the invocation fails.
The only visibility of this failure is in the rule metrics, where I see the FailedInvocations metric.
Are there any logs that show why the trigger failed?
If I set up the rule manually via the management console, everything works fine.
The error occurs only when I set up the rule via a CloudFormation template.
I have compared the two rules and they are identical except for the role. However, both roles have the same permissions.
Answer 1:
This stumped us for ages. The main issue is the role problem Nathan B mentions, but something else that tripped us up is that Scheduled Containers won't work in awsvpc mode (and, by extension, Fargate). Here's a sample CloudFormation template:
---
AWSTemplateFormatVersion: 2010-09-09
Description: Fee Recon infrastructure
Parameters:
  ClusterArn:
    Type: String
    Description: The Arn of the ECS Cluster to run the scheduled container on
Resources:
  TaskRole:
    Type: AWS::IAM::Role
    Properties:
      Path: /
      AssumeRolePolicyDocument:
        Statement:
          - Action:
              - sts:AssumeRole
            Effect: Allow
            Principal:
              Service:
                - ecs-tasks.amazonaws.com
        Version: 2012-10-17
      Policies:
        - PolicyName: TaskPolicy
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - 'ses:SendEmail'
                  - 'ses:SendRawEmail'
                Resource: '*'
  TaskDefinition:
    Type: AWS::ECS::TaskDefinition
    Properties:
      TaskRoleArn: !Ref TaskRole
      ContainerDefinitions:
        - Name: !Sub my-container
          Essential: true
          Image: !Sub <aws-account-no>.dkr.ecr.eu-west-1.amazonaws.com/mycontainer
          Memory: 2048
          Cpu: 1024
  CloudWatchEventECSRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - events.amazonaws.com
            Action:
              - sts:AssumeRole
      Path: /
      Policies:
        - PolicyName: CloudwatchEventsInvokeECSRunTask
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action: 'ecs:RunTask'
                Resource: !Ref TaskDefinition
  TaskSchedule:
    Type: AWS::Events::Rule
    Properties:
      Description: Runs every 10 minutes
      Name: ScheduledTask
      ScheduleExpression: cron(0/10 * * * ? *)
      State: ENABLED
      Targets:
        - Id: ScheduledEcsTask
          RoleArn: !GetAtt CloudWatchEventECSRole.Arn
          EcsParameters:
            TaskDefinitionArn: !Ref TaskDefinition
            TaskCount: 1
          Arn: !Ref ClusterArn
Note: I've added the ClusterArn as a parameter to the script, but of course it's better to do this with a CloudFormation ImportValue statement.
There are two roles you need to care about. The first is the role (TaskRole) for the task itself: in this example the container just sends an email using SES, so it has the necessary permissions. The second role (CloudWatchEventECSRole) is the one that makes it all work. Note that in its AssumeRolePolicyDocument the principal is events.amazonaws.com, and in its Policies array the resource is the ECS task definition defined in the template.
Answer 2:
This problem was due to the role's trust policy not including events.amazonaws.com as a principal service, so CloudWatch Events couldn't assume the role.
It's a shame AWS doesn't have better logging for FailedInvocations.
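Since nothing logs this failure, it can help to check the trust policy before deploying. Below is a minimal sketch, assuming the role's trust policy is available as a parsed JSON document (a Python dict). The function name `service_can_assume` and the two sample policies are hypothetical and written only for illustration. The check looks for exact `sts:AssumeRole` matches and does not expand wildcards or evaluate conditions.

```python
def service_can_assume(trust_policy, service="events.amazonaws.com"):
    """Return True if an Allow statement lets `service` call sts:AssumeRole.

    Hypothetical helper: exact string matching only, no wildcard or
    Condition handling.
    """
    statements = trust_policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        if "sts:AssumeRole" not in actions:
            continue
        principal = stmt.get("Principal", {})
        services = principal.get("Service", []) if isinstance(principal, dict) else []
        if isinstance(services, str):
            services = [services]
        if service in services:
            return True
    return False


# Illustrative trust policy that trusts only ECS tasks (the failing case).
broken = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": ["ecs-tasks.amazonaws.com"]},
            "Action": ["sts:AssumeRole"],
        }
    ],
}

# Illustrative trust policy that also trusts CloudWatch Events.
fixed = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": ["events.amazonaws.com"]},
            "Action": "sts:AssumeRole",
        }
    ],
}

print("broken policy trusts events.amazonaws.com:", service_can_assume(broken))
print("fixed policy trusts events.amazonaws.com:", service_can_assume(fixed))
```

Running a check like this against the role created by CloudFormation and the role created by the console is one way to spot the difference the question describes.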
Answer 3:
If the rule was triggered successfully but the invocation of the target failed, you should see a trace of the API call in the Event History in AWS CloudTrail. Look at the errorCode and errorMessage properties:
{
  [..]
  "errorCode": "InvalidInputException",
  "errorMessage": "Artifacts type is required",
  [..]
}
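When there are many events to sift through, the same lookup can be scripted. The following is a rough sketch, assuming you already have the CloudTrail records as parsed JSON objects (for example from a console download, or from the `CloudTrailEvent` string that boto3's `lookup_events` returns for each event). The function name `failed_run_task_calls` and the sample records are made up for illustration and trimmed to the fields used here.

```python
import json


def failed_run_task_calls(records):
    """Yield (errorCode, errorMessage) for RunTask records that carry an error."""
    for record in records:
        if record.get("eventName") != "RunTask":
            continue  # only interested in ECS RunTask invocations
        if "errorCode" in record:
            yield record["errorCode"], record.get("errorMessage", "")


# Hypothetical, heavily trimmed CloudTrail records for illustration only.
sample = json.loads("""
[
  {"eventSource": "ecs.amazonaws.com", "eventName": "RunTask"},
  {"eventSource": "ecs.amazonaws.com", "eventName": "RunTask",
   "errorCode": "InvalidParameterException",
   "errorMessage": "No Container Instances were found in your cluster."},
  {"eventSource": "ecs.amazonaws.com", "eventName": "StopTask",
   "errorCode": "ClientException", "errorMessage": "unrelated"}
]
""")

failures = list(failed_run_task_calls(sample))
for code, message in failures:
    print(f"{code}: {message}")
```

Successful RunTask calls have no errorCode field, so they are skipped; only the failed invocation is reported.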
Answer 4:
The CloudTrail logs helped. The event name is RunTask. The issue was: "errorCode": "InvalidParameterException", "errorMessage": "Override for container named rds-task is not a container in the TaskDefinition."
The AWS documentation for debugging CloudWatch events is here:
https://docs.aws.amazon.com/AmazonCloudWatch/latest/events/CWE_Troubleshooting.html
I opened a PR to add documentation for debugging failed ECS Task Invocations from CloudWatch Events:
https://github.com/awsdocs/amazon-cloudwatch-events-user-guide/pull/12/files
Answer 5:
In case other people come here looking for the setup needed to make this work for a task in Fargate: some extra configuration is required in addition to Stefano's answer. Running tasks in Fargate requires an execution role, so you need to allow the CloudWatchEventECSRole to pass it. Add this statement to that role:
{
  "Effect": "Allow",
  "Action": "iam:PassRole",
  "Resource": [
    "arn:aws:iam::<account>:role/<executionRole>"
  ]
}
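To see how this statement sits next to the ecs:RunTask permission, here is a small sketch that assembles the whole policy document for the events role. The function name `events_role_policy`, the account ID 123456789012, and the ARNs are placeholders chosen for illustration, not values from any real setup.

```python
import json


def events_role_policy(task_definition_arn, passed_role_arns):
    """Build a policy document for the role that CloudWatch Events assumes.

    Hypothetical helper: grants ecs:RunTask on one task definition and
    iam:PassRole on the roles the task needs (e.g. the execution role).
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "ecs:RunTask",
                "Resource": task_definition_arn,
            },
            {
                "Effect": "Allow",
                "Action": "iam:PassRole",
                "Resource": list(passed_role_arns),
            },
        ],
    }


# Placeholder ARNs for illustration only.
policy = events_role_policy(
    "arn:aws:ecs:eu-west-1:123456789012:task-definition/my-task:1",
    ["arn:aws:iam::123456789012:role/my-execution-role"],
)
print(json.dumps(policy, indent=2))
```

The printed JSON can be compared against the policy attached to the role that CloudFormation created.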
Answer 6:
For anyone who is struggling to set up scheduled tasks on Fargate and is using Terraform to set up their cloud, take a look at this module: https://github.com/dxw/terraform-aws-ecs-scheduled-task
It helps in setting up the scheduled tasks through CloudWatch Events and sets the correct IAM roles.
Answer 7:
I spent ages trying to troubleshoot this. When I created an ECS scheduled task via the command line, the task was created but never started. Thanks to this post, I discovered by looking at the Event History in CloudTrail that the ECS instances had all died and there were no EC2 instances running!
{
  [..]
  "errorCode": "InvalidParameterException",
  "errorMessage": "No Container Instances were found in your cluster.",
  [..]
}
Answer 8:
I too was not seeing my Lambda executing, but I did find evidence of FailedInvocations in CloudWatch Events (only via the Event Rule Metrics link, which took me to https://console.aws.amazon.com/cloudwatch/home?region={your_aws_region}#metricsV2:graph=~();query=~'*7bAWS*2fEvents*2cRuleName*7d*2{Lambda_Physical_ID}).
I was not seeing the "trigger" in the console either, so I took a step back and did a simpler SAM deploy with the Events property set, then looked at the processed template to see how it was done in that case. Below is what I ended up using to have an "EventBridge" ScheduledEvent fire my Lambda (an alias in my case, which is how I discovered this).
Simple SAM approach to scheduled invocations
(Add this property to your AWS::Serverless::Function)
Events:
  InvokeMyLambda:
    Type: Schedule
    Properties:
      Schedule: rate(1 minute)
      Description: Run SampleLambdaFunction once every minute.
      Enabled: True
By looking at the converted template in CloudFormation and comparing it to the version without Events, I found not only the AWS::Events::Rule (which is what I expected to see invoking the Lambda) but also an AWS::Lambda::Permission.
Hopefully this is what you all are needing as well to get invocations working (and not needing the missing logs to see why) :P
Working approach
MyLambdaScheduledEvent:
  Type: AWS::Events::Rule
  Properties:
    Name: MyLambdaScheduledEvent
    EventBusName: "default"
    State: ENABLED
    ScheduleExpression: rate(5 minutes) # same as cron(0/5 * * * ? *)
    Description: Run MyLambda once every 5 minutes.
    Targets:
      - Id: EventMyLambdaScheduled
        Arn: !Ref MyLambda
MyLambdaScheduledEventPermission:
  Type: AWS::Lambda::Permission
  Properties:
    Action: lambda:InvokeFunction
    Principal: events.amazonaws.com
    FunctionName: !Ref MyLambda
    SourceArn: !GetAtt MyLambdaScheduledEvent.Arn
Source: https://stackoverflow.com/questions/48602856/cloudwatch-failedinvocation-error-no-logs-available