Where is my AWS EMR reducer output for my completed job (should be on S3, but nothing there)?


I'm having an issue where the output of my Hadoop job on AWS EMR is not being saved to S3. When I run the job on a smaller sample, the job stores the output just fine. When I run th

1 Answer

粉色の甜心

2021-01-15 19:33

This turned out to be a bug on AWS's part, and they have fixed it in AMI version 2.2.1. The fix is briefly described in the release notes for that version.

The long explanation I got from AWS is that when a reducer output file is larger than the single-upload size limit for S3 (5 GB, I believe), multipart upload is used instead. There was no proper error checking on that path, which is why the job would sometimes work and other times not.

In case this continues for anyone else, refer to my case number, 62849531.
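
If you want to check whether a finished job is affected, one rough approach is to compare the keys actually present under the output prefix with the part files the reducers should have written. The sketch below is only an illustration, not anything provided by AWS: the `check_reducer_output` helper, the `listing` dictionary, and the `output/` prefix are all hypothetical, and the 5 GiB threshold is my assumption about where multipart upload kicks in. It assumes the usual Hadoop part-file naming (`part-00000` or `part-r-00000`, optionally with a compression suffix). In practice the key-to-size mapping would come from listing your own bucket.

```python
import re

# Assumption: S3 accepts at most 5 GiB in a single PUT; larger objects
# have to go through multipart upload.
SINGLE_PUT_LIMIT_BYTES = 5 * 1024**3

# Matches the usual Hadoop reducer output names, e.g. part-00000,
# part-r-00000, or part-r-00000.gz.
PART_NAME = re.compile(r"part-(?:r-)?(\d{5})(?:\..+)?$")


def check_reducer_output(objects, num_reducers):
    """Compare an S3 listing against the expected reducer part files.

    objects: dict mapping S3 key -> size in bytes (hypothetical listing).
    Returns (missing, needs_multipart): reducer indices with no part file,
    and reducer indices whose part file is above the single-PUT limit.
    """
    found = {}
    for key, size in objects.items():
        match = PART_NAME.search(key)
        if match:
            found[int(match.group(1))] = size
    missing = [i for i in range(num_reducers) if i not in found]
    needs_multipart = sorted(
        i for i, size in found.items() if size > SINGLE_PUT_LIMIT_BYTES
    )
    return missing, needs_multipart


# Hypothetical listing: reducer 1 wrote nothing, reducer 2 wrote 6 GiB.
listing = {
    "output/part-00000": 1 * 1024**3,
    "output/part-00002": 6 * 1024**3,
    "output/_SUCCESS": 0,
}
missing, needs_multipart = check_reducer_output(listing, num_reducers=3)
print("missing reducers:", missing)
print("parts above the single-PUT limit:", needs_multipart)
```

A missing index on a job that reported success, especially alongside other parts near or above the limit, would be consistent with the bug described above, though it does not prove it.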
