ShellCommandActivity in AWS Data Pipeline

Submitted by 冷眼眸甩不掉的悲伤 on 2019-12-23 04:46:02

Question


I am transferring DynamoDB data to S3 using Data Pipeline. I get the backup in the S3 bucket, but it is split into multiple files. To get the data in a single file, I used a ShellCommandActivity which runs the following command:

aws s3 cat #{myOutputS3Loc}/#{format(@scheduledStartTime,'YYYY-MM-dd')}/* > #{myRenamedFile}

This should concatenate all the files present in the S3 folder to a single file named #{myRenamedFile}. But I get the following error in data pipeline:

usage: aws [options] <command> <subcommand> [<subcommand> ...] [parameters]
To see help text, you can run:
  aws help
  aws <command> help
  aws <command> <subcommand> help
aws: error: argument subcommand: Invalid choice, valid choices are: ls | website | cp | mv | rm | sync | mb | rb

Does this mean cat is not supported in ShellCommandActivity, or is something else wrong here? Is there any other method to combine the different files into a single file in S3 itself?


Answer 1:


There is no cat subcommand in aws s3, which is why the CLI rejects the command. Other options:

  • cp or sync the files to the local machine, then concatenate them all with the shell's cat command
  • List the object keys and loop through them, calling aws s3 cp s3://<file> - for each and appending the output to a new file. You could do this in a single command with cp's --recursive option, but --recursive is not supported when copying to stdout.
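The second option above could be sketched as a small shell function. This is a sketch, not the answer's exact script: the function name concat_s3_prefix is hypothetical, it assumes the aws CLI is installed and configured, and it relies on the key name being the fourth column of aws s3 ls output.

```shell
# Sketch: concatenate every object under an S3 prefix into one local file.
# concat_s3_prefix is a hypothetical helper name; bucket/prefix/out are
# supplied by the caller. Assumes a configured aws CLI.
concat_s3_prefix() {
  bucket=$1
  prefix=$2
  out=$3
  : > "$out"   # truncate/create the combined output file
  # "aws s3 ls" prints: date, time, size, key — awk keeps the key column
  aws s3 ls "s3://$bucket/$prefix/" \
    | awk '{print $4}' \
    | while read -r key; do
        # stream each object to stdout and append it to the combined file
        aws s3 cp "s3://$bucket/$prefix/$key" - >> "$out"
      done
}
```

A call such as `concat_s3_prefix my-bucket backup/2016-03-25 combined.json` would then produce the single file locally; a final `aws s3 cp combined.json s3://my-bucket/...` uploads it back if the merged file is needed in S3 itself.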


Source: https://stackoverflow.com/questions/36217110/shellcommandactivity-in-aws-data-pipeline
