Data Destination - Redshift
  • 06 Dec 2022
  • Dark
    Light
  • PDF

Data Destination - Redshift

  • Dark
    Light
  • PDF

Article Summary

Overview

Help page for setting up a data transfer to Amazon’s Redshift.

Supported Protocol

Dataloader: Embulk
Using the embulk-output-redshift plugin.

Limitations

None.

Settings

Step 1: General Settings

Field Name
Required
Default Value
Description
Redshift Connection ConfigurationYes-From the Redshift Connection Configuration created in advance, select the one with the required permissions for this transfer.
If you have not previously created any Connection Configurations, click the Add Connection Configuration button to set up a new one.
DatabaseYes-Enter the database name at the destination that you would like to transfer to.
SchemaYes-Enter the schema name at the destination that you would like to transfer to.
Click the Load Schema List button to load a list of schemas you can use to auto-complete the field.
TableYes-Enter the table name at the destination you wish to transfer to.
Click the Load Table List button to load a list of tables you can use to auto-complete the field.
S3 BucketYes-Enter the S3 bucket name used to temporarily store the data during transfer.
Click the Load Bucket List button to load a list of buckets you can use to auto-complete the field.
The bucket specified here must exist before creating this ETL configuration.
S3 Path PrefixYes-Specify the path prefix in S3, used to temporarily store the data.
Transfer ModeYes-Select a transfer mode.
See Transfer Modes below for information regarding each mode.

Step 2: Advanced Settings

Field Name
Default Value
Details
Table Constraints-In this field you can add table constraints to the CREATE TABLE statements.
e.g. CREATE TABLE <table_name> (<column1> <type1>, <column2> <type2>, ... , <table constraint>) <table option>
Table Options-In this field you can add table options to the CREATE TABLE statements.
e.g. CREATE TABLE <table_name> (<column1> <type1>, <column2> <type2>, ... , <table constraint>) <table option>
Delete Temporary S3 FileYes-
IAM Role Name-An IAM role can be used instead of a AWS access key or AWS secret key.
Maximum Retries12-
Retry Wait Time (milliseconds)1800000-
Default Time ZoneUTCWhen loading a column with a timestamp data type, specify the corresponding time zone in thie field.
Column Settings--
Execute SQL Before Loading--
Execute SQL After Loading--

Transfer Modes

Mode
Details
INSERTThis transfer mode will append the table at the destination.
First, the data is inserted into temporary tables that are created by trocco. Next, using the INSERT command, the data will be transferred from the temporary tables to the table at the destination.
This ensures that if the transfer fails during execution, there will not be any incomplete data left in the table at the destination.
If a table does not exist at the destination, it will automatically be created by trocco.
INSERT_DIRECTThis transfer mode will append the table at the destination.
Data will be inserted directly into the table at the destination, without using a temporary table.
If the transfer happens to fail during execution, there is a chance that the data at the destination table can end up incomplete.
If a table does not exist at the destination, it will be automatically created by trocco.
TRUNCATE_INSERTThis mode will overwrite the data in the table at the destination.
First, the data is inserted into temporary tables that are created by trocco. Next, the TRUNCATE command will be executed on the table at the destination, followed by the INSERT command.
REPLACEThis mode will recreate the table at the destination.
First, the data is inserted into a temporary table created by trocco. Next, if all the data is transferred successfully, trocco will drop the existing table in the destination, then replace the name of the temporary table with the name of the previously existing table in the destination.
UPSERTThis transfer mode will update and/or insert data into the table.
In this mode, without creating a temporary table, the data will be inserted directly into the destination table.
If a primary key of a record in the intermediate table already exists in the destination table, the destination table record will be updated.
If a primary key of a record does not exist, the record will be inserted into the destination table.

For further information, please refer to the embulk plugin documentation.


Was this article helpful?