How to process more than 10 concurrent messages from an AWS SQS FIFO queue using Spring Integration

Question

I want to be able to process more than 10 SQS messages at a time using a Spring Integration Workflow.

A related question, "How execute Spring integration flow in multiple threads to consume more Amazon SQS queue messages in parallel?", recommended using an ExecutorChannel. I updated my code accordingly but still see the same symptoms.

After making this update, my application requests 10 messages, processes those, and only after I make the call to amazonSQSClient.deleteMessage near the end of the flow will it accept another 10 messages from the SQS queue.

The application uses an SQS FIFO queue.

Is there something else I'm missing, or is this an unavoidable symptom of using SqsMessageDeletionPolicy.NEVER and then deleting the messages at the end of the flow? Accepting the messages at the beginning of the flow isn't really an option due to other constraints.

Here are the relevant snippets of code. They are somewhat simplified, but I hope they still illustrate the problem.

Queue configuration

@Bean
public AsyncTaskExecutor inputChannelTaskExecutor() {
    SimpleAsyncTaskExecutor executor = new SimpleAsyncTaskExecutor();
    executor.setConcurrencyLimit(50);
    return executor;
}

@Bean
@Qualifier("inputChannel")
public ExecutorChannel inputChannel() {
    return new ExecutorChannel(inputChannelTaskExecutor());
}

I also tried a ThreadPoolTaskExecutor instead of the SimpleAsyncTaskExecutor, with the same result, but I'll include that too in case it offers other insight.

@Bean
public AsyncTaskExecutor inputChannelTaskExecutor() {
    ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
    executor.setMaxPoolSize(50);
    executor.setQueueCapacity(50);
    executor.setThreadNamePrefix("spring-async-");
    executor.setRejectedExecutionHandler(new ThreadPoolExecutor.CallerRunsPolicy());
    // No explicit afterPropertiesSet()/initialize() here: Spring invokes
    // afterPropertiesSet() on the returned bean, and that method already calls initialize().
    return executor;
}
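A side note on the ThreadPoolTaskExecutor variant, independent of the SQS behaviour: ThreadPoolTaskExecutor defaults to a core pool size of 1, and the underlying java.util.concurrent.ThreadPoolExecutor only starts threads beyond the core size once its queue is full. The sketch below uses a plain ThreadPoolExecutor to illustrate that rule; the class and method names are made up for the illustration and are not part of the original application.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class: shows how corePoolSize and queueCapacity interact.
final class PoolSizingDemo {

    /** Submits `tasks` blocking jobs and returns how many worker threads the pool started. */
    static int workersStarted(int corePoolSize, int maxPoolSize, int queueCapacity, int tasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                corePoolSize, maxPoolSize, 60, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>(queueCapacity));
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> {
                    try {
                        release.await(); // hold the thread, as a slow handler would
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            // Workers are created synchronously inside execute(), so this is stable.
            return pool.getPoolSize();
        } finally {
            release.countDown(); // let the blocked tasks finish
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Core size 1, queue of 50: tasks 2..10 are queued, so only one thread runs.
        System.out.println(workersStarted(1, 50, 50, 10));
        // Core size 50: every submitted task gets its own thread.
        System.out.println(workersStarted(50, 50, 50, 10));
    }
}
```

If the same rule applies to the configuration above, calling setCorePoolSize with a value close to the intended concurrency would be the thing to try, although in this case it would not have changed the symptoms on its own.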

SQS Channel Adapter

@Bean
public SqsMessageDrivenChannelAdapter changeQueueMessageAdapter() {
    SqsMessageDrivenChannelAdapter adapter = new SqsMessageDrivenChannelAdapter(this.amazonSQSClient, changeQueue);
    adapter.setOutputChannel(inputChannel);
    adapter.setMessageDeletionPolicy(SqsMessageDeletionPolicy.NEVER);
    return adapter;
}


@Bean(name = PollerMetadata.DEFAULT_POLLER)
public PollerSpec poller() {
    return Pollers.fixedRate(500, TimeUnit.MILLISECONDS).maxMessagesPerPoll(10);
}

Simplified main flow

A common scenario for us is to get a number of Branch edits in a short period of time. This flow only 'cares' that at least one edit has happened. The messageTransformer extracts an id from the payload document and puts it in the header dsp_docId which we then use to aggregate on (we use this id in a few other places, so we felt a header made sense rather than doing all the work in a custom aggregator).

The provisioningServiceActivator retrieves the latest version of the Branch, then the router decides whether it needs further transforms (in which case it sends it to the transformBranchChannel) or it can be sent onto our PI instance (via the sendToPiChannel).

The transform flow (not shown, as I don't think you need it) eventually leads to the sendToPi flow; it just does more work first.

The listingGroupProcessor captures all the aws_receiptHandle headers and adds them to a new header as a | separated list.

The sendToPi flow (and the errorFlow) ends with a call to a custom handler that takes care of deleting all the SQS messages referred to by that list of aws_receiptHandle strings.
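To make the receipt-handle bookkeeping concrete, here is a minimal sketch of joining the handles into a | separated string and later splitting that string to delete each message. The class name and the Consumer-based delete callback are assumptions made for the example; in the real handler the callback would presumably wrap amazonSQSClient.deleteMessage with the queue URL and one receipt handle.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Consumer;
import java.util.stream.Collectors;

// Hypothetical helper: models the "|"-separated aws_receiptHandle list described above.
final class ReceiptHandles {

    /** Joins the handles collected by the aggregator into one header value. */
    static String join(List<String> handles) {
        return String.join("|", handles);
    }

    /** Splits the header value back into individual receipt handles. */
    static List<String> split(String joined) {
        if (joined == null || joined.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.stream(joined.split("\\|")) // "|" is a regex metacharacter, so escape it
                     .filter(h -> !h.isEmpty())
                     .collect(Collectors.toList());
    }

    /** Invokes deleteOne for every handle and returns how many deletes were issued. */
    static int deleteAll(String joined, Consumer<String> deleteOne) {
        List<String> handles = split(joined);
        handles.forEach(deleteOne);
        return handles.size();
    }

    public static void main(String[] args) {
        String header = join(Arrays.asList("handle-1", "handle-2", "handle-3"));
        // Stand-in for: handle -> amazonSQSClient.deleteMessage(queueUrl, handle)
        int deleted = deleteAll(header, handle -> System.out.println("delete " + handle));
        System.out.println(deleted + " messages deleted");
    }
}
```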

@Bean
IntegrationFlow sqsListener() {
    return IntegrationFlows.from(inputChannel)
                           .transform(messageTransformer)
                           .aggregate(a -> a.correlationExpression("1")
                                            .outputProcessor(listingGroupProcessor)
                                            .autoStartup(true)
                                            .correlationStrategy(message -> message.getHeaders().get("dsp_docId"))
                                            .groupTimeout(messageAggregateTimeout)  // currently 25s
                                            .expireGroupsUponCompletion(true)
                                            .sendPartialResultOnExpiry(true)
                                            .get())

                           .handle(provisioningServiceActivator, "handleStandard")
                           .route(Branch.class, branch -> (branch.isSuppressed() == null || !branch.isSuppressed()),
                                  routerSpec -> routerSpec.channelMapping(true, "transformBranchChannel")
                                                          .resolutionRequired(false)
                                                          .defaultOutputToParentFlow())

                           .channel(sendToPiChannel)
                           .get();
}

Answer 1:

I thought I'd post this as an answer, as this solves my issue, and may help others. As an answer it's more likely to get spotted rather than an edit to the original question that might get overlooked.

Firstly, I should have noted that we're using a FIFO queue.

The issue was actually further up the chain, where we were setting the MessageGroupId to a simple value that described the source of the data. This meant we had very large message groups.

The ReceiveMessage documentation explains that, quite sensibly, SQS stops returning more messages from a group while earlier messages from that group are still in flight, as it would otherwise be impossible to guarantee the order should a message need to be put back on the queue.

Updating the code that posts the message to set an appropriate MessageGroupId then meant that the ExecutorChannel worked as expected.
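What counts as an appropriate MessageGroupId depends on where ordering actually matters. As a sketch only, assuming ordering is needed per document rather than per data source, the group id could combine the source with the document id. The class, the method name and the "source:docId" format below are hypothetical; the resulting string would be passed to the producer's send request (for example via SendMessageRequest.withMessageGroupId in the AWS SDK for Java v1).

```java
// Hypothetical helper: derives a fine-grained MessageGroupId per document.
final class MessageGroupIds {

    // SQS limits MessageGroupId to 128 characters.
    static final int MAX_LENGTH = 128;

    /** Builds a group id such as "branch-feed:42" so each document forms its own group. */
    static String forDocument(String source, String docId) {
        String id = source + ":" + docId;
        if (id.length() > MAX_LENGTH) {
            throw new IllegalArgumentException("MessageGroupId longer than " + MAX_LENGTH + " characters");
        }
        return id;
    }

    public static void main(String[] args) {
        // Edits to the same document share a group (and stay ordered);
        // edits to different documents can be received in parallel.
        System.out.println(forDocument("branch-feed", "42"));
        System.out.println(forDocument("branch-feed", "43"));
    }
}
```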

For reference, the relevant passage from the ReceiveMessage documentation reads: "While messages with a particular MessageGroupId are invisible, no more messages belonging to the same MessageGroupId are returned until the visibility timeout expires. You can still receive messages with another MessageGroupId as long as it is also visible."
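The effect of that rule on throughput can be illustrated with a small in-memory model. This is a deliberately simplified simulation, not the SQS implementation: it ignores visibility timeouts and only tracks which groups have messages that were received but not yet deleted. All names are invented for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Simplified model of FIFO group blocking; messages are represented by their group id only.
final class FifoQueueModel {
    private final List<String> visible = new ArrayList<>();        // waiting messages, in arrival order
    private final Map<String, Integer> inFlight = new HashMap<>(); // group id -> received, not yet deleted

    void send(String groupId) {
        visible.add(groupId);
    }

    /** Returns up to max messages, skipping groups that already had in-flight messages. */
    List<String> receive(int max) {
        Set<String> blocked = new HashSet<>(inFlight.keySet());
        List<String> batch = new ArrayList<>();
        for (Iterator<String> it = visible.iterator(); it.hasNext() && batch.size() < max; ) {
            String groupId = it.next();
            if (blocked.contains(groupId)) {
                continue; // earlier messages of this group are still being processed
            }
            it.remove();
            batch.add(groupId);
        }
        for (String groupId : batch) {
            inFlight.merge(groupId, 1, Integer::sum);
        }
        return batch;
    }

    /** Marks one in-flight message of the group as deleted; the group unblocks at zero. */
    void delete(String groupId) {
        inFlight.computeIfPresent(groupId, (g, n) -> n > 1 ? n - 1 : null);
    }

    /** Polls repeatedly without deleting anything and records each batch size. */
    static List<Integer> batchSizesWithoutDeleting(int messages, int groups, int polls) {
        FifoQueueModel queue = new FifoQueueModel();
        for (int i = 0; i < messages; i++) {
            queue.send("group-" + (i % groups));
        }
        List<Integer> sizes = new ArrayList<>();
        for (int p = 0; p < polls; p++) {
            sizes.add(queue.receive(10).size());
        }
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(batchSizesWithoutDeleting(30, 1, 3));  // prints [10, 0, 0]
        System.out.println(batchSizesWithoutDeleting(30, 30, 3)); // prints [10, 10, 10]
    }
}
```

With a single large group, the second and third polls come back empty until the first batch is deleted, which matches the symptom described in the question. With one group per message, every poll returns a full batch, so the ExecutorChannel has more than 10 messages to work on.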

Source: https://stackoverflow.com/questions/52336006/how-to-process-more-than-10-concurrent-messages-from-an-aws-sqs-fifo-queue-using

Tags

java

spring-integration

amazon-sqs

spring-integration-dsl

spring-cloud-aws