Data Destination - Amazon Redshift


Summary

This help page describes the Data Destination settings for ETL Configurations that transfer data to Amazon Redshift.

Case studies of Amazon Redshift and TROCCO

TROCCO's product website provides examples of customer implementations.
Please refer to them together with this help page.

Setting items

STEP1 Basic settings

Item name | Required | Default value | Description
Amazon Redshift Connection Configuration | Yes | - | From the pre-registered Amazon Redshift Connection Configurations, select one that has the permissions required for this ETL Configuration.
Database | Yes | - | Select the database name of the Data Destination.
Schema | Yes | - | Select the schema name of the Data Destination.
Table | Yes | - | Enter the table name of the Data Destination. If the table does not exist in the specified database and schema, it is created automatically.
Amazon S3 bucket | Yes | - | Specify the name of the S3 bucket used to store data temporarily during the transfer. You can also press Load Bucket List to load the list of buckets.
Amazon S3 key prefix | Yes | - | Specify the S3 path used to store data temporarily during the transfer.
Transfer mode | Yes | Append (INSERT) | Select the transfer mode. For details on each mode, see About transfer modes below.

STEP1 Detailed settings

Click on Advanced Settings to display the configuration items.

Item name | Required | Default value | Description
Batch size (KB) | Yes | 16,384 | Enter the batch size. If an OutOfMemoryError occurs during the transfer, reduce the batch size.
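To illustrate what the batch size setting controls, the sketch below splits rows into batches whose serialized size stays under a configured limit. This is a hypothetical sketch, not TROCCO's actual implementation; the function name and the sizing logic are assumptions.

```python
# Hypothetical sketch of batching rows by approximate serialized size.
# The 16,384 KB default mirrors the "Batch size (KB)" setting above.
def batch_rows(rows, batch_size_kb=16_384):
    """Yield lists of rows whose combined serialized size stays under batch_size_kb."""
    limit = batch_size_kb * 1024
    batch, size = [], 0
    for row in rows:
        row_bytes = len(str(row).encode("utf-8"))
        # Start a new batch once adding this row would exceed the limit
        if batch and size + row_bytes > limit:
            yield batch
            batch, size = [], 0
        batch.append(row)
        size += row_bytes
    if batch:
        yield batch
```

A smaller `batch_size_kb` produces more, smaller batches, which is why lowering it helps when an OutOfMemoryError occurs.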

STEP2 Output Option

Item name | Default value | Description
Table constraints | - | Table constraints to add to the CREATE TABLE statement: CREATE TABLE <table_name> (<column1> <type1>, <column2> <type2>, ..., <table constraints>) <table options>
Table options | - | Table options to add to the CREATE TABLE statement (same template as above).
Delete files temporarily stored in Amazon S3 | Enabled | -
IAM role name | - | An IAM role can be used in place of an AWS access key and AWS secret access key.
Maximum number of retries | 12 | -
Retry wait time (ms) | 1,000 | -
Maximum retry wait time (ms) | 1,800,000 | -
Merge Key Settings | - | Can be set when UPSERT (MERGE) is selected as the transfer mode. Enter the names of the columns to be treated as merge keys (primary keys); these columns must contain no duplicate and no null values.
Default time zone | UTC | If a column is in timestamp format, its time zone can be specified.
Column Settings | - | -
SQL to execute before loading | - | -
SQL to execute after loading | - | -
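The three retry settings suggest an exponential backoff schedule: the wait doubles after each failed attempt, starting at the retry wait time and capped at the maximum retry wait time. The sketch below is a minimal illustration under that assumption; the help page does not state the exact backoff policy.

```python
# Hypothetical sketch: exponential backoff consistent with the STEP2 defaults.
# The doubling policy is an assumption; the help page only lists the three values.
def retry_waits(max_retries=12, initial_ms=1_000, max_wait_ms=1_800_000):
    """Return the wait (ms) before each retry, doubling up to the cap."""
    waits = []
    wait = initial_ms
    for _ in range(max_retries):
        waits.append(min(wait, max_wait_ms))
        wait *= 2
    return waits

print(retry_waits()[:4])  # first four waits: [1000, 2000, 4000, 8000]
```
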
Conditions under which schema-related settings apply

The Column Settings in the STEP2 Output Option apply only when a new table is created.
Specifically, they are applied when a job runs under the following conditions:

  • The target table does not exist in the Data Destination
  • Full Data Transfer (REPLACE) is selected as the transfer mode
    • In this mode, the schema of the Data Destination table is recreated on every transfer, so the Column Settings are applied each time.
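When a new table is created, the table constraints and table options from the STEP2 Output Option are slotted into the CREATE TABLE template shown there. The sketch below assembles such a statement; the function, the column definitions, and the example constraint and option are illustrative assumptions, not TROCCO's actual code.

```python
# Hypothetical sketch of assembling the CREATE TABLE statement from the template:
# CREATE TABLE <table_name> (<columns>, <table constraints>) <table options>
def build_create_table(table, columns, table_constraints="", table_options=""):
    """Assemble a CREATE TABLE statement from column definitions,
    optional table constraints, and optional table options."""
    cols = ", ".join(f"{name} {ctype}" for name, ctype in columns)
    if table_constraints:
        cols = f"{cols}, {table_constraints}"
    sql = f"CREATE TABLE {table} ({cols})"
    if table_options:
        sql = f"{sql} {table_options}"
    return sql

# Example: a primary-key constraint and a DISTSTYLE table option
print(build_create_table(
    "public.users",
    [("id", "BIGINT"), ("name", "VARCHAR(256)")],
    table_constraints="PRIMARY KEY (id)",
    table_options="DISTSTYLE EVEN",
))
# → CREATE TABLE public.users (id BIGINT, name VARCHAR(256), PRIMARY KEY (id)) DISTSTYLE EVEN
```
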

Supplementary information

About transfer modes

All of the following explanations assume that the target table already exists in the Data Destination.
If the target table does not exist, a new table is created regardless of the mode.

Transfer mode | Schema change | Transaction | Description
Append (INSERT) | No | Yes | Data is appended to the table as follows:
  1. Create a temporary table and populate it with data
  2. INSERT the data from the temporary table into the target table
Append (INSERT DIRECT) | No | No | Data is appended to the table as follows:
  1. INSERT the data directly into the target table
  • Compared with Append (INSERT), processing time is shorter, but the transfer is not protected by a transaction.
  • Therefore, if a transfer fails midway, the rows added up to that point may remain in the target table.
TRUNCATE INSERT | No | Yes | The table is rewritten as follows:
  1. Create a temporary table and populate it with data
  2. DELETE the existing data in the target table and INSERT the data from the temporary table into the target table
Full Data Transfer (REPLACE) | Yes | Yes | The table is rewritten as follows:
  1. Create a temporary table and populate it with data
  2. DROP the target table and rename the temporary table to the target table name
  • The existing table's schema is dropped and replaced with that of the new table.
UPSERT (MERGE) | No | Yes | Data is upserted into the table as follows:
  1. Create a temporary table and populate it with data
  2. MERGE the data from the temporary table into the target table
  • For each row of transferred data, if the merge key value already exists in the corresponding column of the target table, that row is updated; otherwise the row is appended.
  • If the merge key contains duplicate values or null values, the UPSERT will not work correctly.
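The two-step UPSERT flow above can be sketched as the SQL statements a transfer might emit. This is a hypothetical sketch: the table names, the COPY placeholders, and the use of Redshift's MERGE ... REMOVE DUPLICATES form are assumptions, not TROCCO's actual SQL.

```python
# Hypothetical sketch of the UPSERT (MERGE) flow; all names are placeholders.
def upsert_statements(target, temp, merge_key):
    """Emit the two-step flow: load a temporary table, then merge it in."""
    return [
        # 1. Create a temporary table and populate it with data
        f"CREATE TEMP TABLE {temp} (LIKE {target})",
        f"COPY {temp} FROM 's3://<bucket>/<key prefix>' IAM_ROLE '<role>'",
        # 2. MERGE the data from the temporary table into the target table
        f"MERGE INTO {target} USING {temp} "
        f"ON {target}.{merge_key} = {temp}.{merge_key} REMOVE DUPLICATES",
    ]

for stmt in upsert_statements("public.users", "users_tmp", "id"):
    print(stmt)
```

The merge condition matches rows on the merge key, which is why the key must contain no duplicate or null values for the UPSERT to behave correctly.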