Data Source - PostgreSQL
  • 03 Mar 2022
  • 3 Minutes to read
  • Dark
    Light

Data Source - PostgreSQL

  • Dark
    Light

Overview

Help page for setting up a data transfer from PostgreSQL.

Supported Protocol

Data Loader: Embulk
Using the embulk-input-postgresql plugin.

Limitations

None.

Settings

Step 1: General Settings

Field Name
Required
Default Value
Description
PostgreSQL Connection Configurations Yes - From the PostgreSQL Connection Configurations created in advance, select the one with the necessary permissions for this transfer.
If you have not previously created any Connection Configurations, click the Add Connection Configuration button to set up a new one.
Database Name Yes - Enter the PostgreSQL database name containing the data you would like to transfer.
Schema Yes - Enter the Schema Name containing the data that you wish to extract.
Click the Load Schema List button to refresh the list of schema names you can use to auto-complete the field.
Transfer Method Yes - Choose between either the Query or Incremental Load transfer methods.
Query Yes - Complete this field when using the Query transfer method.
Write a query to extract the transfer data from PostgreSQL.
You can use custom variables to embed a value that will be dynamically set when the job executes.
Table Yes - Complete this field when the transfer method is set to Incremental Load.
Enter the table name that contains the data you would like to transfer.
Click the Load Table List button to refresh the list of table names you can use to auto-complete the field.
Incremental Load Column Name Yes - You can enter into this field when the transfer method is set to Incremental Load.
For each transfer, the last value loaded from the column specified here will be saved as the Last Record Loaded.
From the next transfer, only values higher than the Last Record Loaded, will be transferred.
When using a unique and auto-increment column (e.g. id, created_at, etc.) enter it in this field.
You can use comma separation to specify multiple columns.
If left unspecified, a primary key is used.
Last Record Loaded No - You can enter into this field when the transfer method is set to Incremental Load.
When using Incremental Load, only data with a value higher than the value specified here will be transferred.
If unspecified, a full load will occur.
Do not change this value without a reason (such as if you need to re-load data from a specific date range), as there is a chance of data duplication.
Default Time Zone Yes UTC When the column data type is date, timestamp or datetime, specify the corresponding time zone.

Step 2: Advanced Settings

Field Name
Default Value
Description
Number of records the cursor can fetch at a time 10000 In this field you can specify the number of rows the cursor will fetch at a time.
Connection Timeout (seconds) 300 In this field you can set the amount of time (in seconds) until connection timeout.
Socket Timeout (seconds) 1800 In this field you can set the socket timeout (in seconds).

Column types unable to be extracted by trocco

The column types listed below are unsupported by embulk, meaning that trocco is unable to extract them.
If included in the extraction data, a Preview and Transfer Error will occur.
Error example: Unsupported type interval (sqlType=1111) of 'interval_data' column.
However, these columns can be extracted as a string by specifying the column name and converting the data type to a string value, under Extraction Data Type, in Step 1: Advanced Settings.

List of unsupported column types

  • box
  • bytea
  • cidr
  • circle
  • inet
  • interval
  • line
  • lseg
  • macaddr
  • macaddr8
  • money
  • path
  • pg_lsn
  • point
  • polygon
  • tsquery
  • tsvector
  • xml

Example: When importing the column interval_data, with the interval data type, set the column name as interval_data, and set the data type to string, as shown in the image below.

extraction data type.png


Was this article helpful?