Import Postgres data into RDS using S3 and aws_s3

Submitted by 孤街醉人 on 2020-12-29 10:53:26

Question


I'm having a hard time importing data from S3 into an RDS postgres instance. According to the docs, you can use this syntax:

aws_s3.table_import_from_s3 (
   table_name text, 
   column_list text, 
   options text, 
   bucket text, 
   file_path text, 
   region text, 
   access_key text, 
   secret_key text, 
   session_token text 
) 

So, in pgAdmin, I did this:

SELECT aws_s3.table_import_from_s3(
  'contacts_1', 
  'firstname,lastname,imported', 
  '(format csv)',
  'com.foo.mybucket', 
  'mydir/subdir/myfile.csv', 
  'us-east-2',
  'AKIAYYXUMxxxxxxxxxxx',
  '3zB4S5jb1xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
);

I also tried it with an explicit NULL for the last parameter.

The error message I get is:

NOTICE:  CURL error code: 51 when attempting to validate pre-signed URL, 1 attempt(s) remaining
NOTICE:  CURL error code: 51 when attempting to validate pre-signed URL, 0 attempt(s) remaining

ERROR:  Unable to generate pre-signed url, look at engine log for details.
SQL state: XX000

I checked the server logs and there was no further information.

I have triple-checked the correctness of all the parameters. How do I make this work?

UPDATE:

I can confirm that I can do an s3.getObject() in the Java aws sdk using these same credentials.


Answer 1:


The main issue here is that you need to 1) add an IAM role to the RDS instance so that it can access the S3 bucket, and 2) add an S3 endpoint to the VPC where the RDS instance runs, so that the two can communicate.

This is the procedure I followed to make it work, using AWS CLI commands in a shell (make sure the environment variables involved are set to the right values). I hope it helps:

  1. Create the IAM role:
$ aws iam create-role \
    --role-name $ROLE_NAME \
    --assume-role-policy-document '{"Version": "2012-10-17", "Statement": [{"Effect": "Allow", "Principal": {"Service": "rds.amazonaws.com"}, "Action": "sts:AssumeRole"}]}'
  2. Create the IAM policy that will be attached to the IAM role (the single quotes are closed around ${BUCKET_NAME} so that the shell expands it):
$ aws iam create-policy \
    --policy-name $POLICY_NAME \
    --policy-document '{"Version": "2012-10-17", "Statement": [{"Sid": "s3import", "Action": ["s3:GetObject", "s3:ListBucket"], "Effect": "Allow", "Resource": ["arn:aws:s3:::'"${BUCKET_NAME}"'", "arn:aws:s3:::'"${BUCKET_NAME}"'/*"]}]}'
  3. Attach the policy:
$ aws iam attach-role-policy \
    --policy-arn arn:aws:iam::$AWS_ACCOUNT_ID:policy/$POLICY_NAME \
    --role-name $ROLE_NAME
  4. Add the role to a specific instance (this step needs to be repeated for every new instance):
$ aws rds add-role-to-db-instance \
    --db-instance-identifier $RDS_INSTANCE_NAME \
    --feature-name s3Import \
    --role-arn arn:aws:iam::$AWS_ACCOUNT_ID:role/$ROLE_NAME \
    --region $REGION
  5. Create the VPC endpoint for the S3 service:
$ aws ec2 create-vpc-endpoint \
    --vpc-id $VPC_ID \
    --service-name com.amazonaws.$REGION.s3 \
    --route-table-ids $ROUTE_TABLE_ID

The ID of the route table for the VPC where the endpoint is created can be retrieved with this command:

$ aws ec2 describe-route-tables | jq -r '.RouteTables[] | "\(.VpcId) \(.RouteTableId)"'
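Once the role is attached to the instance, the import can be called without access keys, using the form of aws_s3.table_import_from_s3 that takes an aws_commons.create_s3_uri(bucket, file_path, region) value. Below is a minimal sketch of that next step. The helper name build_import_sql and the DATABASE_URL variable are made up for illustration, the values are the ones from the question, and the helper does no SQL escaping, so it is only suitable for trusted input.

```shell
# Hypothetical helper: prints the role-based import statement.
# $1 = table, $2 = column list, $3 = bucket, $4 = object key, $5 = region
# Note: arguments are pasted in as-is (no SQL escaping); trusted input only.
build_import_sql() {
  cat <<SQL
SELECT aws_s3.table_import_from_s3(
  '$1',
  '$2',
  '(format csv)',
  aws_commons.create_s3_uri('$3', '$4', '$5')
);
SQL
}

# Print the statement for the table and file from the question.
build_import_sql contacts_1 'firstname,lastname,imported' \
  "${BUCKET_NAME:-com.foo.mybucket}" 'mydir/subdir/myfile.csv' "${REGION:-us-east-2}"

# To run it, pipe the output to psql (DATABASE_URL is an assumed variable
# holding the connection string of the RDS instance):
#   build_import_sql ... | psql "$DATABASE_URL"
```

Printing the statement first makes it easy to review before anything is sent to the database.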



Answer 2:


Currently (2020-10-16), at least with RDS Postgres 12.4, importing a file from S3 does not work unless the file is in the root of the S3 bucket.

Importing the file myfile.csv works; importing the file mydir/subdir/myfile.csv doesn't. The latter gives errors like this one, even when permissions and everything else are set up correctly:

[XX000] ERROR: HTTP 403. Permission denied. Check bucket or provided credentials as they may no longer be valid.

This is a known issue at AWS, and according to AWS Support they are working on it.

It also seems that you need to reboot the RDS instance after adding the roles described in the documentation. Otherwise the roles have no effect. The documentation does not mention the need for a reboot.

In general, RDS s3Import gives very confusing error messages. For example, importing a zero-length file gives this error:

[Amazon](500310) Invalid operation: HTTP 416. Check your arguments and try again.;
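Both failure cases described in this answer can be caught before the file is uploaded. The following is only a rough sketch: the function name check_import_candidate and its return codes are made up for illustration, and the bucket-root restriction is taken from this answer's report for 2020-10-16, so it may not apply to other versions.

```shell
# Hypothetical pre-upload check.
# $1 = local file path, $2 = intended S3 object key
# Returns 0 if the file looks importable, 1 if it is empty or missing,
# 2 if the key is not at the bucket root.
check_import_candidate() {
  if [ ! -s "$1" ]; then
    echo "empty or missing file: $1" >&2
    return 1
  fi
  case "$2" in
    */*)
      echo "key is not at the bucket root: $2" >&2
      return 2
      ;;
  esac
  return 0
}

# Usage sketch:
#   check_import_candidate ./myfile.csv myfile.csv && echo "ok to upload"
```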



Answer 3:


To reproduce your situation, I did the following:

  • Launched an Amazon RDS PostgreSQL instance in a public subnet
  • Under Manage IAM Roles, I assigned AmazonRDSServiceRolePolicy for s3import
  • Created a table
  • Put a CSV file in S3
  • Used the SELECT aws_s3.table_import_from_s3() command (as above) to load the data

It worked fine for me.

Given that your error message relates to a pre-signed URL, it suggests that the credentials you provided did not have permission to access the CSV file in S3. However, you then say that you used those credentials successfully to retrieve the object. So, this is unlikely to be the cause.

Based on the Reddit thread "Having issue with AWS RDS Postgres 11+ import from S3 using RDS s3Import feature" (r/aws), the issue might be that the Amazon RDS instance is unable to reach Amazon S3. This could be because it is in a private subnet with no NAT Gateway in the VPC. If this is the case, then you could either add a NAT Gateway to provide Internet connectivity or, as mentioned in that thread, add a VPC Endpoint for S3.

Another comment in that post reported the same problem with a missing Outbound rule in the Security Group, which stopped the RDS instance from accessing Amazon S3.




Answer 4:


Check that your RDS instance and your S3 bucket are in the same region. I had the same problem and fixed it by using a bucket in the same region as my Aurora RDS cluster.
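A quick way to compare the two regions from a shell is sketched below. The function names are made up for illustration, and BUCKET_NAME and REGION are placeholders. One detail to be aware of: for buckets in us-east-1, `aws s3api get-bucket-location` reports an empty location constraint (printed as "None" with --output text), and some old buckets report the legacy value "EU" for eu-west-1, so the value is normalized first.

```shell
# Map the LocationConstraint value returned by S3 to a region name.
normalize_bucket_region() {
  case "$1" in
    ""|None|null) echo "us-east-1" ;;  # us-east-1 has no location constraint
    EU)           echo "eu-west-1" ;;  # legacy value
    *)            echo "$1" ;;
  esac
}

# $1 = raw LocationConstraint of the bucket, $2 = region of the RDS instance
same_region() {
  [ "$(normalize_bucket_region "$1")" = "$2" ]
}

# Usage sketch (needs the AWS CLI and credentials):
#   loc=$(aws s3api get-bucket-location --bucket "$BUCKET_NAME" \
#           --query LocationConstraint --output text)
#   same_region "$loc" "$REGION" || echo "bucket and RDS instance are in different regions"
```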




Answer 5:


I solved the same problem when deploying production clusters in private subnets.

Check the outbound rules of your cluster's security group (that was the cause in my case).

Also add rds-import-role under "Manage IAM roles", selecting s3Import as the feature.

I hope it helps.




Answer 6:


I had the same issue.

ERROR:  Unable to generate pre-signed url, look at engine log for details

That issue was related to this log entry:

:LOG: S3 bucket names with a period (.) are not supported

The root cause, in my case, was a period (.) in the bucket's name.
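If you want to rule this out before running the import, a small shell check on the bucket name is enough. This is just a sketch: the function name bucket_has_period is made up, and the example uses the bucket name from the question, which does contain periods.

```shell
# Returns 0 (true) when the bucket name contains a period.
bucket_has_period() {
  case "$1" in
    *.*) return 0 ;;
    *)   return 1 ;;
  esac
}

# Example with the bucket name from the question:
if bucket_has_period "com.foo.mybucket"; then
  echo "bucket name contains a period; the aws_s3 import may reject it"
fi
```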



Source: https://stackoverflow.com/questions/58616005/import-postgres-data-into-rds-using-s3-and-aws-s3
