Destination - Databricks
  • 17 Jul 2024

Summary

This page explains how to configure a data transfer to Databricks.

Constraints

Unavailable data types

Setting items

STEP1 Basic settings

Item name | Required | Default value | Contents
Databricks connection information | Yes | - | Select a previously registered Databricks connection that has the permissions required for this transfer setting.
Catalog name | Yes | - | Select the destination catalog name.
Schema name | Yes | - | Select the destination schema name.
Table | Yes | - | Select the destination table name. If the target table does not exist in the destination schema, it is created automatically.
Transfer mode | Yes | Append (INSERT) | Select the transfer mode. For details, see "About transfer mode" below.
Merge key | No | - | Can be entered when UPSERT (MERGE) is selected as the transfer mode. If the destination table has no primary key, enter the name of the column to be treated as the merge key (primary key). Specify columns that contain no duplicate values and no null values.
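
The merge key requirement above (no duplicate values, no null values) can be checked on a sample of the source data before the transfer is configured. The following is a minimal sketch, assuming the sample is available as a list of Python dictionaries; the function name `check_merge_key` and the row format are hypothetical and are not part of the product.

```python
from collections import Counter


def check_merge_key(rows, key_columns):
    """Return a list of problems that make key_columns unsuitable as a merge key.

    rows is a list of dicts (hypothetical sample of the source data);
    key_columns is the list of column names intended as the merge key.
    """
    problems = []
    seen = Counter()
    for index, row in enumerate(rows):
        key = tuple(row.get(column) for column in key_columns)
        # A merge key must not contain null values.
        if any(value is None for value in key):
            problems.append(f"row {index}: null value in merge key")
            continue
        seen[key] += 1
    # A merge key must not contain duplicate values.
    for key, count in seen.items():
        if count > 1:
            problems.append(f"merge key {key} appears {count} times")
    return problems
```

An empty result means the sample satisfies both conditions; it does not guarantee that the full data set does.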

STEP1 Detailed settings

Item name | Default value | Contents
Batch size (MB) | 50 | Specify the batch size.
Default time zone | Etc/UTC | Specify the default time zone.

STEP2 Output Options

Item name | Default value | Details
Column setting | - | Specify the column definitions used when creating temporary tables.

The default Databricks type for each source type is as follows:
  • boolean: BOOLEAN
  • string: STRING
  • long: BIGINT
  • double: DOUBLE
  • timestamp: TIMESTAMP
  • json: STRING

A column setting is required only if you want to use a type other than the defaults above.
For the types that can be specified, refer to the official Databricks documentation on data types (excluding the data types listed under "Unavailable data types").
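
The default mapping and the per-column override described above can be sketched as follows. This is an illustration under stated assumptions, not the service's implementation: the names `DEFAULT_TYPES`, `resolve_column_types`, and `create_table_sql` are hypothetical, and the override type in the usage note is only an example that must be checked against the "Unavailable data types" list.

```python
# Default Databricks type for each source type, as listed in this article.
DEFAULT_TYPES = {
    "boolean": "BOOLEAN",
    "string": "STRING",
    "long": "BIGINT",
    "double": "DOUBLE",
    "timestamp": "TIMESTAMP",
    "json": "STRING",
}


def resolve_column_types(source_columns, overrides=None):
    """Map (name, source_type) pairs to (name, databricks_type) pairs.

    overrides is a hypothetical dict of column name -> explicit type,
    standing in for the column setting in the STEP2 output options.
    """
    overrides = overrides or {}
    return [
        (name, overrides.get(name, DEFAULT_TYPES[source_type]))
        for name, source_type in source_columns
    ]


def create_table_sql(table, columns):
    """Build an illustrative CREATE TABLE statement from resolved columns."""
    column_list = ", ".join(f"`{name}` {dtype}" for name, dtype in columns)
    return f"CREATE TABLE {table} ({column_list})"
```

For example, `resolve_column_types([("id", "long"), ("payload", "json")], {"id": "INT"})` keeps the default STRING for `payload` and uses the explicitly configured type for `id`.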

Conditions under which schema-related settings apply

The column settings in the STEP2 output options apply only when a new table is created.
Specifically, they apply when a job runs under either of the following conditions:

  • The target table does not exist at the destination.
  • REPLACE is selected as the transfer mode.
    • In this case, the schema of the destination table is recreated on every transfer, so the column settings are applied each time.
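
The two conditions above reduce to a simple predicate. The sketch below assumes the transfer mode is represented by the string "REPLACE"; that identifier and the function name `column_settings_apply` are hypothetical.

```python
def column_settings_apply(table_exists: bool, transfer_mode: str) -> bool:
    """Return True when the STEP2 column settings take effect for a job run.

    They apply when the target table is newly created: either it does not
    exist yet, or the mode is REPLACE (the table is recreated every run).
    """
    return (not table_exists) or transfer_mode == "REPLACE"
```

In every other case, for example an existing table with Append (INSERT), the existing table schema is used and the column settings have no effect.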

Supplementary information

About transfer mode

Transfer mode | Contents
Append (INSERT) | First, temporary tables are created and the data is transferred to them. After all temporary tables have been created, the data is inserted into the target table.
Append (INSERT DIRECT) | Rows are inserted directly into the target table. If the transfer fails midway, some data may already have been inserted into the target table.
TRUNCATE INSERT | First, temporary tables are created and the data is transferred to them. After all temporary tables have been created, the contents of the target table are deleted and the data is then inserted into the target table.
Full refresh (REPLACE) | First, a temporary table is created and the data is transferred to it. Once the temporary table has been created, the target table is deleted and the temporary table is renamed to the target table name. If the transfer fails during this process, the target table may be left deleted.
UPSERT (MERGE) | First, temporary tables are created and the data is transferred to them. Once all temporary tables have been created, rows in the target table whose merge key values match rows in the temporary table are updated, and rows that do not match are inserted.
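
The effect of each transfer mode on the contents of the target table can be illustrated with a simplified in-memory model. This is only a sketch: tables are modeled as lists of dictionaries, the mode identifiers and the function name `apply_transfer_mode` are hypothetical, and temporary tables, schema changes, and failure behavior are not modeled.

```python
def apply_transfer_mode(mode, target_rows, staged_rows, merge_key=None):
    """Return the target table contents after a successful transfer.

    target_rows: rows already in the target table (list of dicts).
    staged_rows: rows delivered by this transfer (list of dicts).
    merge_key:   column name used to match rows in MERGE mode.
    """
    if mode in ("INSERT", "INSERT_DIRECT"):
        # Both append modes end with the same rows; they differ only in
        # what may remain in the target if the transfer fails midway.
        return list(target_rows) + list(staged_rows)
    if mode in ("TRUNCATE_INSERT", "REPLACE"):
        # Existing rows are discarded; REPLACE also recreates the table.
        return list(staged_rows)
    if mode == "MERGE":
        if merge_key is None:
            raise ValueError("MERGE requires a merge key")
        staged_by_key = {row[merge_key]: row for row in staged_rows}
        matched = set()
        result = []
        for row in target_rows:
            key = row[merge_key]
            if key in staged_by_key:
                # Matching row: update it with the staged values.
                result.append({**row, **staged_by_key[key]})
                matched.add(key)
            else:
                result.append(row)
        # Non-matching staged rows are inserted.
        result.extend(r for k, r in staged_by_key.items() if k not in matched)
        return result
    raise ValueError(f"unknown transfer mode: {mode}")
```

With a target of `[{"id": 1, "v": "a"}, {"id": 2, "v": "b"}]` and staged rows `[{"id": 2, "v": "B"}, {"id": 3, "v": "c"}]`, MERGE on `id` updates row 2 and inserts row 3, while the append modes simply add both staged rows.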
