amazon-redshift

Number of slices in a Redshift cluster

Submitted by 泪湿孤枕 on 2021-01-29 09:13:24
Question: I have a dc2.large Redshift cluster with 4 nodes. According to the AWS documentation, the number of slices per node in a dc2.large cluster is 2. Then why do I see the number of slices as 4 when I run select * from stv_slices? I am running this as the admin user. Why is this the case, and how can I increase the number of slices in my nodes?

Answer 1: When you use Elastic Resize to change the size of your cluster, Redshift moves the existing slices onto the new nodes instead of creating the default number per node, so the total slice count stays what it was before the resize. A classic resize, by contrast, rebuilds the cluster with the default number of slices per node.
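A quick way to see the current layout is to group stv_slices by node; a minimal sketch against the system view named in the question:

-- Count slices per node and compare with the documented default
-- of 2 slices per dc2.large node.
select node, count(*) as slices
from stv_slices
group by node
order by node;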

Subquery has too many columns error while getting data for past X weeks?

Submitted by ﹥>﹥吖頭↗ on 2021-01-29 04:46:38
Question: I have the query below, which gives me data for the previous week. It returns these columns: type, amount, and total for the previous week, using a week_number column in the inner subquery.

select type,
       case WHEN (type = 'PROC1' AND code = 'UIT') THEN 450
            WHEN (type = 'PROC1' AND code = 'KJH') THEN 900
            WHEN (type = 'PROC2' AND code = 'LOP') THEN 8840
            WHEN (type = 'PROC2' AND code = 'AWE') THEN 1490
            WHEN (type = 'PROC3' AND code = 'MNH') THEN 1600
            WHEN (type = 'PROC3' AND
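The query is cut off above, but the error in the title has a standard cause: a subquery used with IN (or a comparison) returns more columns than the expression on the left expects. A minimal sketch against a hypothetical sales table with type, amount, and week_number columns:

-- Fails with "subquery has too many columns": two columns feed a
-- single-column IN predicate.
select type, amount
from sales
where week_number in (select week_number, type from sales);

-- Works: the subquery returns exactly one column (here, the past 4 weeks,
-- ignoring year boundaries for brevity).
select type, amount
from sales
where week_number in (
    select distinct week_number
    from sales
    where week_number >= date_part(week, current_date) - 4
);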

Error while loading parquet format file into Amazon Redshift using copy command and manifest file

Submitted by …衆ロ難τιáo~ on 2021-01-28 19:57:01
Question: I'm trying to load a Parquet file using a manifest file and getting the error below.

query: 124138 failed due to an internal error. File 'https://s3.amazonaws.com/sbredshift-east/data/000002_0 has an invalid version number: )

Here is my copy command:

copy testtable
from 's3://sbredshift-east/manifest/supplier.manifest'
IAM_ROLE 'arn:aws:iam::123456789:role/MyRedshiftRole123'
FORMAT AS PARQUET
manifest;

And here is my manifest file:

{
  "entries": [
    { "url": "s3://sbredshift-east/data/000002_0", "mandatory"
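Two things are worth checking here. The "invalid version number" message generally means Redshift did not find a valid Parquet footer at that URL, so the object may not actually be a Parquet file. Separately, when COPY reads columnar formats (Parquet, ORC) through a manifest, each entry must also carry a meta block giving the file's content_length in bytes. A sketch of that shape, with a placeholder length that would have to match the real object size:

{
  "entries": [
    {
      "url": "s3://sbredshift-east/data/000002_0",
      "mandatory": true,
      "meta": { "content_length": 5956875 }
    }
  ]
}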

How to write to a dynamically created table in a Redshift procedure

Submitted by 人走茶凉 on 2021-01-28 19:12:43
Question: I need to write a procedure in Redshift that will write to a table, where the table name comes from an input string. I declare a variable that puts together the table name:

CREATE OR REPLACE PROCEDURE my_schema.data_test(current "varchar")
LANGUAGE plpgsql
AS $$
declare
    new_table varchar(50) = 'new_tab' || '_' || current;
BEGIN
    select 'somestring' as colname into new_table;
    commit;
END;
$$

This code runs but it doesn't create a new table, and there are no errors. If I remove the declare statement then
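The likely explanation for the silent no-op is that select ... into new_table assigns the result to the plpgsql variable new_table rather than creating a table with that name. A minimal sketch of the usual workaround, assembling the statement as a string and running it with EXECUTE (quote_ident guards the generated identifier):

CREATE OR REPLACE PROCEDURE my_schema.data_test(current "varchar")
LANGUAGE plpgsql
AS $$
declare
    new_table varchar(50) := 'new_tab' || '_' || current;
BEGIN
    -- Dynamic DDL must go through EXECUTE; a plain SELECT ... INTO <name>
    -- targets a variable of that name, not a table.
    EXECUTE 'CREATE TABLE my_schema.' || quote_ident(new_table)
         || ' AS SELECT ''somestring'' AS colname';
    commit;
END;
$$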

How to get all the procedure name and definition in a given schema in Redshift?

Submitted by こ雲淡風輕ζ on 2021-01-28 18:28:56
Question: When using Redshift, I would like to get the names of all the procedures that were created in a schema, along with their definitions. I know you can use the SHOW PROCEDURE command to get the definition, but that requires the procedure name. SVV_TABLE_INFO only holds information about tables and views, not procedures. Does anyone know how to get that?

Answer 1: Redshift doesn't have a system view for that yet, but you can use the PG_PROC table and join it with pg_namespace to filter on
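A sketch of the join the answer describes, assuming the schema is named my_schema (pg_proc stores the body in prosrc; note that this also lists plain functions, and Redshift's PG_PROC_INFO view adds a prokind column, 'p' for stored procedures, if you need to tell them apart):

select n.nspname as schema_name,
       p.proname as procedure_name,
       p.prosrc  as definition
from pg_proc p
join pg_namespace n on n.oid = p.pronamespace
where n.nspname = 'my_schema';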

Iterate over rows using SQL

Submitted by 蹲街弑〆低调 on 2021-01-28 06:13:26
Question: I have a table in a Redshift database containing event data. Each row is one event. Every event has an eventid, but not the sessionid that I now need. I have extracted a sample of the table (a subset of columns, and only events from one userid):

time        userid          eventid      sessionstart  sessionstop
1498639773  101xnmnd1ohi62  504747459    t             f
1498639777  101xnmnd1ohi62  1479311450   f             f
1498639803  101xnmnd1ohi62  808610184    f             f
1498639816  101xnmnd1ohi62  335000637    f             f
1498639903  101xnmnd1ohi62  238269920    f             f
1498639906
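No row-by-row iteration is needed for this: a running sum of the sessionstart flags per user yields a session number that stays constant until the next session start. A minimal sketch, assuming the table is named events and sessionstart is boolean (Redshift requires the explicit frame clause when an aggregate window function has an order by):

select time,
       userid,
       eventid,
       sum(case when sessionstart then 1 else 0 end)
           over (partition by userid
                 order by time
                 rows unbounded preceding) as sessionid
from events;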

AWS DMS endpoint connection to Redshift not working

Submitted by 喜欢而已 on 2021-01-27 22:00:16
Question: I'm currently trying to set up a replication from RDS (MySQL) to Redshift via DMS. The endpoint to RDS is working, but the one to Redshift is not. Here is my setup:

VPC: RDS, DMS, and Redshift are running in the same VPC and share the same subnets.

Roles: I implemented the required roles for DMS (dms-vpc-role, dms-cloudwatch-logs-role) and the specific one for Redshift (dms-access-for-endpoint) according to the AWS documentation.

Security groups: The security group setup is the same as well.

AWS Redshift - Failed to incorporate external table into local catalog

Submitted by 萝らか妹 on 2021-01-27 17:10:27
Question: We are having a problem with one of our external tables in Redshift. We have over 300 tables in AWS Glue which have been added to our Redshift cluster as an external schema called events. Most of the tables in events can be queried fine, but when querying one of the tables, called item_loaded, we get the following error:

select * from events.item_loaded limit 1;

ERROR: XX000: Failed to incorporate external table "events"."item_loaded" into local catalog.
LOCATION: localize_external_table, /home/ec2
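When a single Glue table fails to localize while its siblings work, that table's definition is the first suspect; unsupported or mismatched column types in the Glue catalog are a frequent culprit. A diagnostic sketch that shows how Redshift Spectrum sees the columns of the failing table:

select columnname, external_type, part_key
from svv_external_columns
where schemaname = 'events'
  and tablename = 'item_loaded';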

How to make the copy command continue running in Redshift even after the Lambda function that initiated it has timed out?

Submitted by 自闭症网瘾萝莉.ら on 2021-01-27 13:25:06
Question: I am trying to run a copy command which loads around 100 GB of data from S3 to Redshift. I am using a Lambda function to initiate this copy command every day. This is my current code:

from datetime import datetime, timedelta
import dateutil.tz
import psycopg2
from config import *

def lambda_handler(event, context):
    con = psycopg2.connect(dbname=dbname, user=user, password=password, host=host, port=port)
    cur = con.cursor()
    try:
        query = """BEGIN TRANSACTION; COPY """ + table_name + """ FROM '"
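With psycopg2 the COPY runs over the function's own connection, so it is cancelled when the Lambda times out. A commonly suggested alternative is the Redshift Data API, whose execute_statement call is asynchronous: it submits the SQL and returns immediately, and the COPY keeps running on the cluster after the function exits. A minimal sketch in the same language as the question's code, with placeholder cluster, database, user, and table names:

import boto3

def lambda_handler(event, context):
    client = boto3.client('redshift-data')
    # execute_statement is fire-and-forget: it returns a statement Id right
    # away, and the COPY continues on the cluster after this Lambda exits.
    response = client.execute_statement(
        ClusterIdentifier='my-cluster',   # placeholder
        Database='mydb',                  # placeholder
        DbUser='awsuser',                 # placeholder
        Sql="COPY my_table FROM 's3://my-bucket/prefix/' "
            "IAM_ROLE 'arn:aws:iam::123456789012:role/MyRole' "
            "FORMAT AS PARQUET;",
    )
    return response['Id']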