Data Source - PostgreSQL
- Updated on 03 Mar 2022
Overview
This page describes how to set up a data transfer that uses PostgreSQL as the data source.
Supported Protocol
Data Loader: Embulk (using the embulk-input-postgresql plugin).
Limitations
None.
Settings
Step 1: General Settings
| Item | Required | Default Value | Description |
| --- | --- | --- | --- |
| PostgreSQL Connection Configurations | Yes | - | Select a PostgreSQL Connection Configuration, created in advance, that has the permissions necessary for this transfer. If you have not yet created any Connection Configurations, click the Add Connection Configuration button to set up a new one. |
| Database Name | Yes | - | Enter the name of the PostgreSQL database containing the data you would like to transfer. |
| Schema | Yes | - | Enter the name of the schema containing the data you wish to extract. Click the Load Schema List button to refresh the list of schema names used to auto-complete this field. |
| Transfer Method | Yes | - | Choose either the Query or the Incremental Load transfer method. |
| Query | Yes | - | Complete this field when using the Query transfer method. Write a query that extracts the transfer data from PostgreSQL. You can use custom variables to embed values that are set dynamically when the job executes. |
| Table | Yes | - | Complete this field when the transfer method is set to Incremental Load. Enter the name of the table containing the data you would like to transfer. Click the Load Table List button to refresh the list of table names used to auto-complete this field. |
| Incremental Load Column Name | Yes | - | Complete this field when the transfer method is set to Incremental Load. For each transfer, the last value loaded from the column specified here is saved as the Last Record Loaded. On subsequent transfers, only rows with values higher than the Last Record Loaded are transferred. Specify a unique, incrementing column (e.g., id) or a monotonically increasing column (e.g., created_at). You can specify multiple columns, separated by commas. If left unspecified, the primary key is used. |
| Last Record Loaded | No | - | Available when the transfer method is set to Incremental Load. Only rows with values higher than the value specified here are transferred. If unspecified, a full load occurs. Do not change this value without good reason (such as needing to re-load data from a specific date range), as doing so risks data duplication. |
| Default Time Zone | Yes | UTC | When the column data type is date, timestamp, or datetime, specify the corresponding time zone. |
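For reference, a query for the Query transfer method might look like the following minimal sketch. The table and column names (`sales`, `sold_at`, `id`) and the custom variable `$target_date$` are hypothetical; substitute your own.

```sql
-- Hypothetical example for the Query transfer method.
-- $target_date$ is a custom variable whose value is set dynamically
-- when the job executes.
SELECT
    id,
    amount,
    sold_at
FROM sales
WHERE sold_at >= '$target_date$';

-- Conceptually, the Incremental Load method behaves like a filter on the
-- Incremental Load Column using the saved Last Record Loaded value, e.g.:
-- SELECT * FROM sales WHERE id > <Last Record Loaded>;
```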
Step 2: Advanced Settings
| Item | Default Value | Description |
| --- | --- | --- |
| Number of records the cursor can fetch at a time | 10000 | Specify the number of rows the cursor fetches at a time. |
| Connection Timeout (seconds) | 300 | Set the amount of time (in seconds) until the connection times out. |
| Socket Timeout (seconds) | 1800 | Set the socket timeout (in seconds). |
Column types unable to be extracted by trocco
The column types listed below are not supported by Embulk, meaning that trocco is unable to extract them.
If any of these columns are included in the extracted data, a Preview and Transfer Error will occur.
Error example: Unsupported type interval (sqlType=1111) of 'interval_data' column.
However, these columns can be extracted as strings by specifying the column name and converting the data type to a string value under Extraction Data Type in Step 1: Advanced Settings.
List of unsupported column types
- box
- bytea
- cidr
- circle
- inet
- interval
- line
- lseg
- macaddr
- macaddr8
- money
- path
- pg_lsn
- point
- polygon
- tsquery
- tsvector
- xml
Example: When importing the column interval_data, which has the interval data type, set the column name to interval_data and set the data type to string.
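Alternatively, when using the Query transfer method, a similar result can be achieved by casting the unsupported column to text directly in SQL. This is a minimal sketch; the table name `example_table` is hypothetical.

```sql
-- Hypothetical example: cast an unsupported interval column to text
-- so it can be extracted as a string.
SELECT
    id,
    interval_data::text AS interval_data
FROM example_table;
```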