Programming ETL
  • 07 Dec 2022

Note

This is a machine-translated version of the original Japanese article.
Please understand that some of the information contained on this page may be inaccurate.

Summary

This help page describes Programming ETL, a feature that lets you write a program in STEP 2 of Transfer Settings to perform flexible processing that template ETL cannot handle.

How to activate

Programming ETL is a paid option.
Please contact Customer Success to use it.

If you have subscribed to the option, select the Use Programming ETL check box in ETL Settings at the bottom of STEP 2 of Transfer Settings to display the Programming ETL screen.

Test data generation and test execution

  • Test data is generated the first time you select the check box. Generation can take about as long as a preview, so please wait for it to finish.
  • The test data reflects the template ETL settings from column definition through column encryption. Programming ETL is applied last, after the template ETL conversions.
    Note that unsaved conversion settings are not reflected. If you have changed the settings, click Regenerate in the preview to reflect them.

programingETL.png

  • Because the test data is a sample, you can edit it and insert your own values for testing. For example, add a null value to the data generated by the first preview and run the test again.
  • Click Save Test Data to save the current test data. This lets you leave the screen after editing the test data and continue editing with the same data later.
  • Click Test Run and check that the result of running the source code appears in the test execution results below the test data. The converted result is shown in JSON format, like the test data.
    Anything the source code writes to standard output, for example with print, appears under standard output / error output. This is useful for debugging.
  • The schema after ETL execution initially shows the schema from the first preview. After a test run, it is regenerated to reflect the results.
  • A date string column added in the source code is recognized as String type by default. To give it a different data type and a date format, enter them manually.
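
The points above about null test values, standard output, and date string columns can be sketched in Python as follows. This is only an illustration under assumptions: the column names amount, created_at, and created_date are hypothetical, and the input timestamp format is assumed to be "YYYY-MM-DD HH:MM:SS".

```python
from datetime import datetime


def transform_row(row):
    # Output from print is expected to appear under
    # "standard output / error output" after a test run.
    print("input row:", row)

    # Hypothetical column "amount": treat a null (None) value as 0
    # so that a null inserted into the test data does not break the run.
    amount = row.get("amount")
    row["amount"] = 0 if amount is None else amount

    # Hypothetical column "created_at", assumed to look like
    # "2022-12-07 10:30:00". Derive a new date string column from it.
    created_at = row.get("created_at")
    if created_at is None:
        row["created_date"] = None
    else:
        parsed = datetime.strptime(created_at, "%Y-%m-%d %H:%M:%S")
        row["created_date"] = parsed.strftime("%Y-%m-%d")

    return row
```

In this sketch created_date is a plain string, so it would show up as String type in the schema after ETL execution until the data type and date format are set manually.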

outputs_schema.png

Supported languages

  • Ruby 2.7
  • Python 3.9

No additional libraries are installed by default. If you need a particular library, please contact Customer Success to request it.

How to write a program

  • Only row-by-row conversions can be written. Aggregation that spans multiple rows cannot be written.
  • As in the sample code, transform_row receives one element of the rows array in the test data JSON as its argument row.
  • Rewriting the value of an existing key converts that column's value.
  • Assigning a new key adds a column.
  • The source code must ultimately return row.
    Return a null value (Python: None, Ruby: nil) to skip the corresponding row.
  • before_action lets you run processing, such as preparing a calculation, before the rows are processed.
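
As a minimal Python sketch of the rules above: transform_row and before_action are the names used on this page, but the exact signature of before_action (assumed here to take no arguments) and the column names status, name, price, and price_with_tax are assumptions made for illustration.

```python
# Value prepared once in before_action and reused for every row.
TAX_RATE = None


def before_action():
    # Assumed hook: runs before any row is processed.
    global TAX_RATE
    TAX_RATE = 0.1


def transform_row(row):
    # "row" is one element of the "rows" array in the test data JSON.

    # Returning None skips this row.
    if row.get("status") == "deleted":
        return None

    # Rewriting an existing key converts that column's value.
    if row.get("name") is not None:
        row["name"] = row["name"].strip().upper()

    # Assigning a new key adds a column.
    row["price_with_tax"] = round(row["price"] * (1 + TAX_RATE), 2)

    # The row must be returned at the end.
    return row
```

Each call handles exactly one row, so nothing in this sketch depends on other rows, in line with the restriction on multi-row aggregation.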
