Data Destination - Amazon Redshift
This page describes the data settings for ETL Configurations that transfer data to Amazon Redshift.

    Case study on the use of Amazon Redshift and TROCCO

TROCCO's product website provides customer case studies.
Please refer to them together with this help page.

    Setting items

    STEP1 Basic settings

    • Amazon Redshift Connection Configuration (required): From the pre-registered Amazon Redshift Connection Configurations, select one that has the permissions required for this ETL Configuration.
    • Database (required): Select the database name for the Data Destination.
    • Schema (required): Select the schema name for the Data Destination.
    • Table (required): Enter the table name for the Data Destination.
      If the target table does not exist in the specified database and schema, it is created automatically.
    • Amazon S3 bucket (required): Specify the name of the S3 bucket used to store data temporarily during the transfer.
      You can also load the list of buckets by pressing Load Bucket List.
    • Amazon S3 key prefix (required): Specify the S3 path used to store data temporarily during the transfer.
    • Transfer mode (required; default: Append (INSERT)): Select the transfer mode.
      For details about each mode, see About transfer modes below.

    STEP1 Detailed settings

    Click on Advanced Settings to display the configuration items.

    • Batch size (KB) (required; default: 16,384): Enter the batch size.
      If an OutOfMemoryError occurs during the transfer, reduce the batch size.

    STEP2 Output Option

    • Table constraints (default: none): Table constraints can be added to the CREATE TABLE statement:
      CREATE TABLE <table_name> (<column1> <type1>, <column2> <type2>, ... , <table constraints>) <table options>
    • Table options (default: none): Table options can be added to the same CREATE TABLE statement.
    • Delete files temporarily stored in Amazon S3 (default: enabled): Whether to delete the temporary files stored in Amazon S3 after the transfer.
    • IAM role name (default: none): An IAM role can be used in place of an AWS access key and AWS secret access key.
    • Maximum number of retries (default: 12)
    • Retry wait time (ms) (default: 1,000)
    • Maximum retry wait time (ms) (default: 1,800,000)
    • Merge key settings (default: none): Can be set when UPSERT (MERGE) is selected as the transfer mode.
      Enter the name of the column to be treated as the merge key (primary key), which must contain no duplicate values and no null values.
    • Default time zone (default: UTC): If a column is in timestamp format, its time zone can be specified.
    • Column settings (default: none)
    • SQL to execute before loading (default: none)
    • SQL to execute after loading (default: none)
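    As an illustration of the table constraints and table options fields, suppose a hypothetical table named sales is created with PRIMARY KEY (id) entered as the table constraint and DISTSTYLE KEY DISTKEY (id) SORTKEY (sold_at) as the table options (the table and column names here are assumptions, not values from this help page). The generated statement would then take roughly this form:

    ```sql
    -- Sketch of the CREATE TABLE statement TROCCO would generate,
    -- assuming a hypothetical "sales" table.
    CREATE TABLE sales (
        id      BIGINT,
        amount  DECIMAL(12,2),
        sold_at TIMESTAMP,
        PRIMARY KEY (id)            -- <table constraints>
    )
    DISTSTYLE KEY DISTKEY (id)      -- <table options>
    SORTKEY (sold_at);
    ```

    Note that Amazon Redshift does not enforce PRIMARY KEY constraints; they are informational, so deduplication must still be handled by the transfer mode (for example, UPSERT with a merge key).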
    Conditions under which schema-related settings apply

    The Column Settings in the STEP2 Output Option apply only when a new table is created.
    Specifically, they are applied when a job runs under either of the following conditions:

    • The target table does not exist in the Data Destination
    • Full Data Transfer (REPLACE) is selected as the transfer mode
      • In this case, the schema of the Data Destination table is recreated on every transfer, so the Column Settings are applied each time.

    Supplementary information

    About transfer mode

    All of the following explanations assume that the target table already exists in the Data Destination.
    If the target table does not exist, a new table is created in every mode.

    • Append (INSERT)
      Schema change: No / Transaction: Yes
      Data is appended to the table as follows:
      1. Create a temporary table and load the data into it
      2. INSERT the data from the temporary table into the target table

    • Append (INSERT DIRECT)
      Schema change: No / Transaction: No
      Data is appended to the table as follows:
      1. INSERT the data directly into the target table
      • Compared to Append (INSERT), processing time is shorter, but the transfer is not wrapped in a transaction.
      • Therefore, if a transfer fails midway, rows added up to that point may remain in the target table.

    • Full Data Transfer (TRUNCATE INSERT)
      Schema change: No / Transaction: Yes
      The table is fully rewritten as follows:
      1. Create a temporary table and load the data into it
      2. DELETE the existing data in the target table and INSERT the data from the temporary table

    • Full Data Transfer (REPLACE)
      Schema change: Yes / Transaction: Yes
      The table is fully rewritten as follows:
      1. Create a temporary table and load the data into it
      2. DROP the target table and rename the temporary table to the target table name
      • The schema of the existing table is discarded and replaced with that of the new table.

    • UPSERT (MERGE)
      Schema change: No / Transaction: Yes
      Data is upserted into the table as follows:
      1. Create a temporary table and load the data into it
      2. MERGE the data from the temporary table into the target table
      • For each row of transferred data, if the merge key value already exists in the corresponding column of the target table, that row is updated; otherwise the row is appended.
      • If the merge key contains duplicate or null values, the UPSERT will not work correctly.
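    The UPSERT (MERGE) steps above can be sketched in SQL. This is a hypothetical illustration: the table name sales, the merge key id, and the delete-then-insert pattern (a common way to implement an upsert on Redshift) are assumptions, not necessarily the exact statements TROCCO issues:

    ```sql
    -- Sketch of an UPSERT via a temporary table, assuming "id" is the merge key.
    BEGIN;

    -- 1. Create a temporary table with the same layout as the target
    --    (data would be loaded into it with COPY from the S3 key prefix).
    CREATE TEMP TABLE sales_tmp (LIKE sales);

    -- 2. Merge: remove target rows whose merge key appears in the new data,
    --    then append every transferred row.
    DELETE FROM sales USING sales_tmp WHERE sales.id = sales_tmp.id;
    INSERT INTO sales SELECT * FROM sales_tmp;

    COMMIT;
    ```

    Because both statements run inside one transaction, a failure midway leaves the target table unchanged, which is what the "Transaction: Yes" column above guarantees. The pattern also shows why duplicate or null merge key values break the upsert: the DELETE...USING join cannot match rows one-to-one.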
