Incremental Data Transfer

    Article summary

    This page describes the Incremental Data Transfer function, which can be configured for a Data Source in ETL Configuration.

    What is Incremental Data Transfer?

    This mode transfers only the data that has been added since the previous transfer.
    When Incremental Data Transfer is enabled, TROCCO records how far the data has been transferred each time a transfer runs.
    This makes it possible to identify the records or files added since the last transfer, so that only that incremental data is transferred.

    Supported Connectors

    Database systems

    Data Source - MongoDB
    Data Source - MySQL
    Data Source - Oracle Database
    Data Source - PostgreSQL
    Data Source - Microsoft SQL Server

    File and storage systems

    Data Source - Amazon S3
    Data Source - Azure Blob Storage
    Data Source - FTP/FTPS
    Data Source - Google Cloud Storage
    Data Source - TROCCO Web Activity Log
    Data Source - SFTP

    Application systems

    Data Source - Google Play

    Cloud application systems

    Data Source - Google Analytics 4
    Data Source - HubSpot
    Data Source - KARTE Datahub
    Data Source - Repro

    Behavior at initial transfer

    Even if Incremental Data Transfer is specified, a Full Data Transfer is performed on the first transfer.
    However, even on the first transfer, you can start from an arbitrary point by entering a value (a file path, for example) in the last transferred record or last transferred path field.
    For details, refer to the Setting Values section.

    Setting Values

    Database Connectors

    Incremental Data Transfer is performed using a column that determines incremental data.
    Only records whose value in the "Columns to determine incremental data" is greater than the value stored in "Last record transferred" are retrieved.

    Item name: Columns to determine incremental data
    Description: Specify the column or columns used to determine incremental data.
    If the records have a unique, auto-incrementing ID column or similar, specify that column name.
    Multiple column names may be specified, separated by commas.

    Item name: Last record transferred
    Description: Normally, this form is not edited (TROCCO updates it automatically).
    Edit this form only if an error occurred while running the ETL Job, or if you want the initial transfer to start from an arbitrary point.
    This form records how far the data had been transferred as of the last transfer.
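
    The comparison described above can be sketched as follows. This is a minimal illustration only, not TROCCO's actual implementation: the function name fetch_incremental, the orders table, and the use of an in-memory SQLite database are all assumptions made for the example, and only a single incremental column is handled.

```python
import sqlite3


def fetch_incremental(conn, table, column, last_record):
    """Return rows whose `column` value is greater than `last_record`,
    together with the value to store as the next "last record transferred".

    Hypothetical helper for illustration; TROCCO's internals may differ.
    """
    # Identifiers cannot be bound as SQL parameters, so this sketch assumes
    # `table` and `column` come from trusted configuration.
    query = f"SELECT * FROM {table}"
    params = ()
    if last_record is not None:
        # Incremental transfer: only rows past the stored value.
        query += f" WHERE {column} > ?"
        params = (last_record,)
    query += f" ORDER BY {column} ASC"

    cur = conn.execute(query, params)
    names = [d[0] for d in cur.description]
    rows = [dict(zip(names, r)) for r in cur.fetchall()]

    # Keep the old value when nothing new was found.
    new_last = rows[-1][column] if rows else last_record
    return rows, new_last


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
conn.executemany(
    "INSERT INTO orders (id, item) VALUES (?, ?)",
    [(1, "a"), (2, "b"), (3, "c")],
)

# First transfer: no stored value yet, so every row is fetched.
first_rows, last_record = fetch_incremental(conn, "orders", "id", None)

# A new row arrives, then the second transfer fetches only that row.
conn.execute("INSERT INTO orders (id, item) VALUES (4, 'd')")
second_rows, last_record = fetch_incremental(conn, "orders", "id", last_record)

print([r["id"] for r in first_rows])   # [1, 2, 3]
print([r["id"] for r in second_rows])  # [4]
```

    Because the comparison is "greater than", a column whose values never decrease (such as an auto-incrementing ID) is what keeps rows from being skipped in this sketch.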

    File/Storage Connector

    Incremental Data Transfer is performed using a path prefix.
    Files are sorted in ascending order by file name, and the files that come after the "last transferred path" are treated as incremental and retrieved.
    Note that incremental files are not determined by file modification date.

    Item name: Last transferred path
    Description: Normally, this form is not edited (TROCCO updates it automatically).
    Edit this form only if an error occurred while running the ETL Job, or if you want the initial transfer to start from an arbitrary point.
    This form records how far the data had been transferred as of the last transfer.

    Values of the last record transferred and the last transferred path

    These values appear in STEP 3 of ETL Configuration, Confirm and Apply (and in the Latest Revision of the Change History), as the values of the keys last_record and last_path, respectively.
    However, these values are not included in the YAML configuration file that can be viewed on the ETL Configuration Details screen.
    Therefore, these values are not managed through Git Integration.

    Incremental Data Transfer Example for File/Storage Systems

    For example, suppose a transfer is performed while the S3 bucket contains the following files:

    • 001.csv
    • 002.csv
    • 003.csv

    At this point, 003.csv is stored as the last transferred path.
    Suppose that 000.csv and 004.csv are then added to the bucket and the transfer is performed again.
    000.csv is not transferred; only 004.csv is transferred.
    004.csv is then saved as the new last transferred path.
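
    The example above can be reproduced with a short sketch. It assumes plain string comparison of file names, which may not match the exact ordering TROCCO uses, and select_incremental_files is a hypothetical helper name.

```python
def select_incremental_files(paths, last_path):
    """Return the paths that sort after `last_path`, plus the new last path.

    Assumption: names are compared as plain strings in ascending order.
    """
    ordered = sorted(paths)
    if last_path is None:
        # First transfer with no stored path: everything is transferred.
        new_files = ordered
    else:
        new_files = [p for p in ordered if p > last_path]
    # Keep the previous value when no new file was found.
    new_last = new_files[-1] if new_files else last_path
    return new_files, new_last


bucket = ["001.csv", "002.csv", "003.csv"]
first_batch, last_path = select_incremental_files(bucket, None)

# 000.csv sorts before 003.csv, so it is skipped on the next run.
bucket += ["000.csv", "004.csv"]
second_batch, last_path = select_incremental_files(bucket, last_path)

print(first_batch)   # ['001.csv', '002.csv', '003.csv']
print(second_batch)  # ['004.csv']
print(last_path)     # 004.csv
```

    The sketch also shows why file names should sort in the order the files are created: a file whose name sorts before the stored path is never picked up.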

    Google Analytics 4 and HubSpot

    Incremental Data Transfer is performed using the date and time of the latest record update.
    Records that were updated after the update date and time of the most recently retrieved record are transferred.

    Item name: Date and time of the latest record update
    Description: Normally, this form is not edited (TROCCO updates it automatically).
    Edit this form only if an error occurred while running the ETL Job, or if you want the initial transfer to cover only data after a given time.
    This form records how far the data had been transferred as of the last transfer.
    If you do need to edit it, enter the value in the format yyyy-mm-dd HH:MM:SS z.
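
    As a rough sketch of this behavior, the following filters records by update time. Several things here are assumptions rather than documented behavior: the "z" part of the format is treated as a numeric UTC offset such as +0000, the comparison is strict (a record updated exactly at the stored time is not re-transferred), and the updated_at field and the records_updated_after function are hypothetical names.

```python
from datetime import datetime, timezone

# Assumption: "yyyy-mm-dd HH:MM:SS z" with z as a numeric UTC offset.
FORMAT = "%Y-%m-%d %H:%M:%S %z"


def records_updated_after(records, last_updated_text):
    """Return records updated after the stored time, plus the new stored time."""
    last_updated = datetime.strptime(last_updated_text, FORMAT)
    newer = [r for r in records if r["updated_at"] > last_updated]
    # Advance the stored time to the newest record seen, if any.
    latest = max((r["updated_at"] for r in newer), default=last_updated)
    return newer, latest.strftime(FORMAT)


records = [
    {"id": 1, "updated_at": datetime(2024, 1, 1, 9, 0, 0, tzinfo=timezone.utc)},
    {"id": 2, "updated_at": datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)},
    {"id": 3, "updated_at": datetime(2024, 1, 1, 15, 30, 0, tzinfo=timezone.utc)},
]

newer, new_cursor = records_updated_after(records, "2024-01-01 12:00:00 +0000")
print([r["id"] for r in newer])  # [3]
print(new_cursor)                # 2024-01-01 15:30:00 +0000
```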

    Resuming a transfer from an arbitrary point

    A transfer can be resumed from an arbitrary point by editing the last record transferred, the last transferred path, or the date and time of the latest record update.
    However, re-transferring data that has already been transferred may result in duplicate data at the Data Destination. Delete the affected data from the Data Destination as needed before rerunning.

