Workflow data check
  • 17 Jul 2024

What is Data Check?

Data Check is one of the tasks that can be added to a workflow.
It runs a query against the DWH and compares the result with an error condition. If the condition is met, the corresponding task is marked as an error.

For example, if a query counts the occurrences of a particular string in a column and the result is 2 or more, the data is treated as duplicated and the task ends in an error.
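
As a rough illustration of the duplicate check described above, the sketch below counts one string in a column and flags an error when the count is 2 or more. An in-memory SQLite database stands in for the DWH, and the table `users`, the column `email`, and the function `count_value` are hypothetical names chosen for this example; none of them come from the product.

```python
import sqlite3


def count_value(conn, value):
    """Return how many rows hold `value` in the hypothetical users.email column."""
    row = conn.execute(
        "SELECT COUNT(*) FROM users WHERE email = ?", (value,)
    ).fetchone()
    return row[0]  # COUNT(*) always yields one row with one column


# In-memory SQLite stands in for the DWH (assumption for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?)",
    [("a@example.com",), ("b@example.com",), ("a@example.com",)],
)

duplicates = count_value(conn, "a@example.com")
is_error = duplicates >= 2  # error condition: the result is 2 or more
```

Because `COUNT(*)` returns exactly one row and one column, a query of this shape fits the form that Data Check expects.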

Supported DWHs

  • Google BigQuery
  • Snowflake
  • Amazon Redshift

Setup Method

This section explains the setup procedure, using Google BigQuery as an example.

  1. Specify BigQuery connection information that has been registered in advance.
    The connection must have permission to execute the query you enter.
Query execution environment

For Snowflake and Redshift, the query execution environment must also be specified.
Specify the warehouse for Snowflake and the database for Redshift.

  2. Enter a query to check data.
    Write a SELECT statement whose result is a single row and a single column containing a number.
    Custom variables can be embedded in the query.
    Click Preview Run to check the result of the query on the spot.

  3. Specify error conditions.
    Specify the reference value and the condition to compare it with. You can choose from six conditions.
    You can also choose whether the task is treated as a success when the query result is null.
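
The requirement that the query return a single row and a single column of numbers can be pictured with the sketch below. It is only an illustration of how such a result might be validated, not a description of what the product does internally: `extract_scalar` is a hypothetical helper, and the rows are assumed to arrive as a list of tuples.

```python
from numbers import Number


def extract_scalar(rows):
    """Return the single value of a 1x1 result set, or None if it is NULL."""
    if len(rows) != 1 or len(rows[0]) != 1:
        raise ValueError("query must return exactly one row and one column")
    value = rows[0][0]
    if value is None:
        # A NULL result is passed through so the caller can decide how to treat it.
        return None
    if isinstance(value, bool) or not isinstance(value, Number):
        raise ValueError("query result must be numeric")
    return value
```

A result such as `[(3,)]` yields the number to compare, while `[(1, 2)]` or `[(1,), (2,)]` is rejected because it has more than one column or row.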

Selectable error conditions

You can choose from the following six conditions, each of which compares the query result with the reference value:

  • greater than or equal to
  • less than or equal to
  • greater than
  • less than
  • equal to
  • not equal to
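
The way an error condition might be evaluated can be sketched as follows, reading the six conditions as ≥, ≤, >, <, = and ≠. The dictionary keys, the function `is_error`, and the `null_is_success` flag are hypothetical names introduced for this example; the article does not describe how the product implements the comparison.

```python
import operator

# Hypothetical labels for the six selectable conditions.
CONDITIONS = {
    "greater_than_or_equal": operator.ge,
    "less_than_or_equal": operator.le,
    "greater_than": operator.gt,
    "less_than": operator.lt,
    "equal": operator.eq,
    "not_equal": operator.ne,
}


def is_error(result, condition, reference, null_is_success=True):
    """Return True when the task should be marked as an error."""
    if result is None:
        # NULL result: the outcome follows the user's null-handling choice.
        return not null_is_success
    return CONDITIONS[condition](result, reference)
```

With the duplicate example, `is_error(2, "greater_than_or_equal", 2)` returns `True`, so the task would be marked as an error, whereas a result of 1 would pass.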
