Incremental Data Transfer
Summary
This help page describes the Incremental Data Transfer function, which can be specified in the Data Source settings of an ETL Configuration.
What is Incremental Data Transfer?
This mode transfers only the data added since the previous transfer.
When Incremental Data Transfer is enabled, TROCCO records how far the data had been transferred at the time of each transfer.
On the next run, it can therefore identify the records or files added since the last transfer and transfer only those.
Supported Connectors
Database system
Data Source - MongoDB
Data Source - MySQL
Data Source - Oracle Database
Data Source - PostgreSQL
Data Source - Microsoft SQL Server
File and storage systems
Data Source - Amazon S3
Data Source - Azure Blob Storage
Data Source - FTP/FTPS
Data Source - Google Cloud Storage
Data Source - TROCCO Web Activity Log
Data Source - SFTP
Cloud application system
Data Source - Google Analytics 4
Data Source - HubSpot
Data Source - KARTE Datahub
Data Source - Repro
Behavior at initial transfer
Even if Incremental Data Transfer is specified, a Full Data Transfer is performed on the first transfer.
However, even on the first transfer, you can transfer only the data that comes after an arbitrary point by entering a value such as a file path in the last transferred record or last transferred path field.
For details, see the Setting Values section.
Setting Values
Database-based Connector
Incremental Data Transfer is performed using a column.
Only records whose value in the "Columns to determine incremental data" is greater than the value of the "Last record transferred" are retrieved.
Item name | Description |
---|---|
Columns to determine incremental data | Specify the columns used to determine incremental data. If the records have a unique, auto-incrementing ID column or similar, specify that column name. Multiple column names may be specified, separated by commas. |
Last record transferred | Normally, you do not edit this field (TROCCO updates it automatically). Edit it only if an ETL Job run failed or if you want the initial transfer to start from an arbitrary point. This field holds information about how far the data was transferred in the last transfer. |
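The "greater than the last record" rule can be sketched as follows. This is only an illustration under stated assumptions, not TROCCO's actual implementation: the `orders` table, the `id` column, and the `fetch_incremental` helper are hypothetical, and an in-memory SQLite database stands in for the real Data Source.

```python
import sqlite3


def fetch_incremental(conn, table, column, last_record):
    """Return rows whose `column` value is greater than `last_record`, in ascending order."""
    # Identifiers cannot be bound as SQL parameters, so pass only trusted names here.
    query = f"SELECT * FROM {table} WHERE {column} > ? ORDER BY {column}"
    return conn.execute(query, (last_record,)).fetchall()


# Hypothetical source table with an auto-incrementing ID column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
conn.executemany(
    "INSERT INTO orders (id, item) VALUES (?, ?)",
    [(1, "a"), (2, "b"), (3, "c"), (4, "d")],
)

last_record = 2  # value remembered from the previous transfer
rows = fetch_incremental(conn, "orders", "id", last_record)
if rows:
    last_record = rows[-1][0]  # remember the largest id for the next run
```

In this sketch only the rows with `id` 3 and 4 are selected, and the remembered value advances to 4. A unique, monotonically increasing column works well here because rows added later always compare greater than the stored value.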
File/Storage Connector
Incremental Data Transfer is performed using a path prefix.
When the files are sorted by file name in ascending order, those that come after the "Last transferred path" are identified as incremental and their data is retrieved.
Note that incremental files are not determined by file modification date.
Item name | Description |
---|---|
Last transferred path | Normally, you do not edit this field (TROCCO updates it automatically). Edit it only if an ETL Job run failed or if you want the initial transfer to start from an arbitrary point. This field holds information about how far the data was transferred in the last transfer. |
These values appear in STEP 3 of ETL Configuration, Confirm and Apply (and in the Latest Revision of the Change History), as the values of the keys last_record and last_path, respectively.
However, they are not included in the YAML configuration file that can be viewed on the ETL Configuration Details screen.
Therefore, when Git Integration is used, these values are not managed in Git.
Incremental Data Transfer Example for File/Storage Systems
For example, suppose a transfer is performed with the following files in an S3 bucket:
001.csv
002.csv
003.csv
At this point, 003.csv is stored as the last transferred path.
Suppose that 000.csv and 004.csv are then added to the bucket and the transfer is run again.
000.csv is not transferred; only 004.csv is transferred.
004.csv is then saved as the new last transferred path.
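The example above can be reproduced with a few lines of Python. This is a minimal sketch that assumes file names are compared as plain strings; `select_incremental` is a hypothetical helper, not part of TROCCO.

```python
def select_incremental(file_names, last_path):
    """Return the names that sort after `last_path`, in ascending order."""
    return sorted(name for name in file_names if name > last_path)


# Bucket contents after 000.csv and 004.csv have been added.
bucket = ["000.csv", "001.csv", "002.csv", "003.csv", "004.csv"]
last_path = "003.csv"

to_transfer = select_incremental(bucket, last_path)
# The last file transferred becomes the new last transferred path.
new_last_path = to_transfer[-1] if to_transfer else last_path
```

Here `to_transfer` contains only 004.csv, and 000.csv is skipped because it sorts before 003.csv. Under plain string comparison, a name such as 10.csv would sort before 9.csv, so zero-padded or date-based file names are the safer choice if this assumption holds.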
Google Analytics 4・HubSpot
Incremental Data Transfer is performed using the date and time of the latest record update.
Records updated after the update date and time of the last retrieved record are transferred.
Item name | Description |
---|---|
Date and time of the latest record update | Normally, you do not edit this field (TROCCO updates it automatically). Edit it only if an ETL Job run failed or if you want the initial transfer to start from data after a given time. This field holds information about how far the data was transferred in the last transfer. If you do edit it, enter the value in the format yyyy-mm-dd HH:MM:SS z. |
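The timestamp-based rule can be sketched in the same way. The sketch assumes that the trailing `z` in the format denotes a UTC offset such as `+0900`, which this page does not state explicitly; the record layout and the `updated_since` helper are likewise hypothetical.

```python
from datetime import datetime

# Assumed reading of "yyyy-mm-dd HH:MM:SS z", with z as a UTC offset.
FORMAT = "%Y-%m-%d %H:%M:%S %z"


def updated_since(records, last_updated_text):
    """Return records whose update time is strictly later than the stored value."""
    threshold = datetime.strptime(last_updated_text, FORMAT)
    return [
        r for r in records
        if datetime.strptime(r["updated_at"], FORMAT) > threshold
    ]


records = [
    {"id": 1, "updated_at": "2024-01-01 09:00:00 +0900"},
    {"id": 2, "updated_at": "2024-01-02 09:00:00 +0900"},
    {"id": 3, "updated_at": "2024-01-02 01:00:00 +0000"},
]
newer = updated_since(records, "2024-01-02 00:00:00 +0000")
```

Because the parsed values carry their offsets, timestamps in different time zones compare correctly: the record with `id` 2 falls exactly on the stored time and is not selected, while the record with `id` 3 is.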
Resuming a transfer from an arbitrary point
A transfer can be resumed from an arbitrary point by editing the last transferred record, the last transferred path, or the date and time of the latest record update.
However, re-transferring data that has already been transferred may produce duplicate data at the Data Destination. Delete the affected data as appropriate before rerunning.