Data Mart - Google BigQuery


Summary

This page describes how to set up a Data Mart Configuration with Google BigQuery on Google Cloud Platform.

Configuration items

Basic settings

• Google BigQuery Connection Configuration (required): Select a preregistered Google BigQuery Connection Configuration that has the permissions this Data Mart Configuration needs.
• Custom Variables (optional): Custom variables set here can be embedded in queries, dataset names, table names, and so on.
  See About Custom Variables for more information.

Query settings

• Query execution mode (required): Select one of the following modes.
  • Data transfer mode: Specify only a query and a destination table to easily replace or append to the table.
  • Free description mode: Write and execute arbitrary DML/DDL statements (INSERT, DELETE, CREATE, DROP, etc.) against the connected DWH.
• Query (required): Enter a query. When specifying table names in a query, use the following formats.
  • In data transfer mode: dataset_name.table_name
  • In free description mode: project_name.dataset_name.table_name

  If you have selected data transfer mode, you can also check the execution result by clicking Preview Execution.
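As a minimal illustration of the data transfer mode format (the dataset, table, and column names below are hypothetical), a query references tables without a project prefix:

```sql
-- Data transfer mode: tables are referenced as dataset_name.table_name
-- (my_dataset, sales, order_id, and amount are hypothetical names)
SELECT
  order_id,
  SUM(amount) AS total_amount
FROM my_dataset.sales
GROUP BY order_id
```

In free description mode, the same table would instead be written with the fully qualified form, e.g. my_project.my_dataset.sales.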

    When data transfer mode is selected

• Output dataset (required): Enter the name of the dataset to which the data will be output.
  For more information on dataset naming conventions, please refer to BigQuery Official Documentation - Naming Data Sets.
• Output table (required): Enter the name of the table to which the data will be output.
  For more information on table naming conventions, see BigQuery Official Documentation - Table Naming.
• Write mode (required): Select one of the following modes.
  • Append: The query results are appended after the existing records in the table.
  • Replace all: The existing records in the table are deleted and replaced with the query results.
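Conceptually, the two write modes behave like the following statements (this is an illustration, not TROCCO's actual implementation; my_dataset.sales and my_dataset.staging_sales are hypothetical names):

```sql
-- Append: query results are added after the existing rows
INSERT INTO my_dataset.sales
SELECT * FROM my_dataset.staging_sales;

-- Replace all: existing rows are discarded and replaced by the query results
CREATE OR REPLACE TABLE my_dataset.sales AS
SELECT * FROM my_dataset.staging_sales;
```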

    Partitioning and clustering settings

    These settings are available when data transfer mode is selected.
    For more information on partitioning and clustering, see Overview of Partitioned Tables and Overview of Clustered Tables, respectively.

    Conditions for applying partitioning and clustering settings

    Partitioning and clustering settings take effect only when a new table is created.
    If a table already exists at the output destination, the job runs with the existing table's settings, not the contents of this setting.

    Partition boundaries with "Partitioned by acquisition time"

    Note that, due to Google BigQuery specifications, partition boundaries are based on UTC.

    Integer range partitioning

    TROCCO does not support integer range partitioning, which partitions a table based on the value of a particular INTEGER column.
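For reference, integer range partitioning is the BigQuery feature expressed by DDL like the following (the table and column names are hypothetical); tables of this kind cannot be created through this setting:

```sql
-- Integer range partitioning (not supported by TROCCO):
-- rows are partitioned by the value of an INTEGER column
-- (my_dataset, customers, and customer_id are hypothetical names)
CREATE TABLE my_dataset.customers
PARTITION BY RANGE_BUCKET(customer_id, GENERATE_ARRAY(0, 1000, 100))
AS SELECT 1 AS customer_id
```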

• Partitioning (optional): Choose one of the following.
  • Partitioned by acquisition time: Partitioning is based on the job execution time in TROCCO.
  • Partitioned by field: Partitioning is based on a reference column.
• Partition field: Enter this when "Partitioned by field" is selected.
  Enter the name of a column of type DATE, TIMESTAMP, or DATETIME.
• Partition type: Select this when either partitioning method is selected.
  Choose the granularity of table partitioning from the following:
  • Hourly
  • Daily
  • Monthly
  • Yearly
• Clustering (optional): Set this if you wish to create a clustered table.
  Entering a column name in the clustering column clusters the table based on that column.
  Up to four clustering columns can be specified.
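When a new table is created with these settings, the result corresponds roughly to a table defined with DDL like the following (a sketch with hypothetical names, assuming daily partitioning on a TIMESTAMP field and one clustering column):

```sql
-- Roughly equivalent table definition: partitioned daily by the
-- created_at field and clustered by customer_id (hypothetical names)
CREATE TABLE my_dataset.sales
PARTITION BY DATE(created_at)
CLUSTER BY customer_id
AS
SELECT
  TIMESTAMP '2024-01-01 00:00:00+00' AS created_at,
  1 AS customer_id
```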

    When free description mode is selected

• Data processing location (optional): Specify the Google BigQuery location where the query will be executed.
  Specify this when the query references a resource that is not tied to a location.
  If not specified, Google BigQuery determines the location automatically.
  For more information, please refer to the official BigQuery documentation - Specifying Locations.
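In free description mode, any DML/DDL statement can be issued. A minimal sketch using the fully qualified name format (the project, dataset, table, and column names are hypothetical):

```sql
-- Free description mode: tables are referenced as
-- project_name.dataset_name.table_name (hypothetical names throughout)
DELETE FROM my_project.my_dataset.sales
WHERE created_at < TIMESTAMP '2023-01-01 00:00:00+00';

INSERT INTO my_project.my_dataset.sales (order_id, amount, created_at)
SELECT order_id, amount, created_at
FROM my_project.my_dataset.staging_sales;
```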

    Job settings

    • Parallel job execution (default: Do not run jobs in parallel): Select whether to run a job when another job with the same Data Mart Configuration is already running at execution time.
      • Do not run jobs in parallel: The job is skipped and not executed.
      • Allow parallel job execution: The job runs.
    Required permissions

    The following permissions are required to use this service.

    • bigquery.datasets.get
    • bigquery.jobs.create
    • bigquery.tables.create
    • bigquery.tables.delete
    • bigquery.tables.get
    • bigquery.tables.getData
    • bigquery.tables.list
    • bigquery.tables.update
    • bigquery.tables.updateData