Copying specific Columns in Amazon Redshift from S3 databucket
Solution 1
This is for the case where the file has more columns than the target load table.
Assuming that CustomerName and ProductName can be NULL fields you have two options.
Load the data in a staging table. Then join the staging table with the reference data to insert data into
COPY staging-tablename
FROM 's3://<<YOUR-BUCKET>>/<<YOUR-FILE>>'
credentials 'aws_access_key_id=<access-key-id>;aws_secret_access_key=<secret- access-key>';
INSERT INTO
main_tablename
SELECT Date
,CustomerID
,ProductID
,Price
FROM staging-tablename st;
TRUNCATE TABLE staging-tablename;
ANALYZE main_tablename;
Solution 2
This is for the case where the file has fewer columns than the target load table.
Assuming that CustomerName and ProductName can be NULL fields you have two options.
Option #1 - Load Directly on the table
COPY main_tablename
(Date
,CustomerID
,ProductID
,Price)
FROM 's3://<<YOUR-BUCKET>>/<<YOUR-FILE>>'
credentials 'aws_access_key_id=<access-key-id>;aws_secret_access_key=<secret- access-key>';
ANALYZE main_tablename;
Option #2 -- Load the data in a staging table. Then join the staging table with the reference data to insert data into
COPY staging-tablename
(Date
,CustomerID
,ProductID
,Price)
FROM 's3://<<YOUR-BUCKET>>/<<YOUR-FILE>>'
credentials 'aws_access_key_id=<access-key-id>;aws_secret_access_key=<secret- access-key>';
INSERT INTO
main_tablename
SELECT st.CustomerID
,cust.CustomerName
,st.ProductID
,prod.ProductName
,st.Price
,st.Date
FROM staging-tablename st
INNER JOIN customer-tablename cust ON ( cust.CustomerID = st.CustomerID)
INNER JOIN product-tablename prod ON ( prod.ProductID = st.ProductID );
TRUNCATE TABLE staging-tablename;
ANALYZE main_tablename;
Bitanshu Das
Updated on June 14, 2022Comments
-
Bitanshu Das almost 2 years
I have a file in S3 with columns like
CustomerID CustomerName ProductID ProductName Price Date
Now the existing SQL table structure in Redshift is like
Date CustomerID ProductID Price
Is there a way to copy the selected data into the existing table structure? The S3 database doesn't have any headers, just the data in this order.
-
moertel over 7 yearsIt's the other way round: The target table has fewer columns than the source data at S3. The
copy
command will not work in this case because it encounters more columns in the source data than are available in the target table. -
BigDataKid over 7 yearsI am sorry. You would have to use a staging table. I posted 2 different answers. one for each case. I am sorry I misread your question initially
-
moertel over 7 yearsIt wasn't my question; I just read through the thread and noticed it. :)
-
Admin over 2 yearsAs it’s currently written, your answer is unclear. Please edit to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers in the help center.
-
HIMANSHU JAIN almost 2 years@Community - thanks for highlighting that. I have edited my answer and provide more details for the better understanding of all.
-
user433342 almost 2 yearsalter table append instead of insert/truncate