Data Destination - Amazon S3 Parquet


Summary

This help page describes the ETL Configuration settings for transferring data to Amazon S3 in Apache Parquet (.parquet) format.

Constraints

  • None in particular

Setting items

STEP1 Basic settings

Item name | Required | Default value | Contents
S3 Connection Configuration | Yes | - | Select a previously registered Connection Configuration that has the permissions this ETL Configuration requires. See the separate page on Connection Configuration for details.
Region | Yes | ap-northeast-1 | Enter the region you specified when creating the bucket. See the official AWS page for an explanation of regions.
Bucket | Yes | - | Specify the name of the Data Destination bucket.
Path prefix | Yes | - | Specify the Data Destination path prefix. TROCCO outputs multiple files to the Data Destination bucket, each with a name that begins with the path prefix. Custom Variables can also be used to set this value dynamically when the ETL Configuration runs.
Compression format | Yes | uncompressed | Specifies the file compression method. You can choose from: uncompressed, snappy, gzip, lzo, brotli, lz4, zstd.
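
The required items and defaults listed above can be checked before a configuration is saved. The following is a minimal sketch only: the dictionary keys and the `check_basic_settings` helper are hypothetical names chosen for illustration and are not part of any TROCCO API.

```python
# Hypothetical helper: the keys mirror the STEP1 items, not a real TROCCO API.
ALLOWED_COMPRESSION = {"uncompressed", "snappy", "gzip", "lzo", "brotli", "lz4", "zstd"}
REQUIRED = ("connection", "region", "bucket", "path_prefix", "compression")
DEFAULTS = {"region": "ap-northeast-1", "compression": "uncompressed"}


def check_basic_settings(settings):
    """Apply the documented defaults, then return (merged settings, problems)."""
    merged = {**DEFAULTS, **settings}
    # Every STEP1 item is required, so an empty or absent value is a problem.
    problems = [f"missing required item: {name}" for name in REQUIRED if not merged.get(name)]
    # The compression format must be one of the seven listed choices.
    if merged.get("compression") and merged["compression"] not in ALLOWED_COMPRESSION:
        problems.append(f"unsupported compression format: {merged['compression']}")
    return merged, problems
```

Calling `check_basic_settings({"connection": "my-s3", "bucket": "my-bucket", "path_prefix": "exports/users"})` returns an empty problem list, because the region and compression format fall back to their defaults.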

STEP2 Detailed settings

Item name | Default value | Contents
Naming Conventions for Multiple File Output | .%03d.%02d | Sets the naming rule applied to files when multiple files are output.
Output file extension | parquet | -
Default timestamp format | %Y-%m-%d %H:%M:%S.%6N %z | -
Default time zone | UTC | -
Block size (bytes) | 134217728 (128 MiB) | -
Page size (bytes) | 1048576 (1 MiB) | -
Maximum padding size (bytes) | 8388608 (8 MiB) | -
Data Catalog Setting | local | -
Column Setting | - | -
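
To make the naming and timestamp defaults concrete, here is a rough sketch of how output object names and timestamp strings might look. It rests on several assumptions not stated on this page: that the two integers in the naming convention are a task index and a per-task sequence number, that the extension is appended after a dot, and that `%6N` means six fractional-second digits (as in Ruby-style strftime), for which Python's `%f` is used as a stand-in. The `object_key` and `format_timestamp` helpers are hypothetical.

```python
from datetime import datetime, timezone

# Defaults copied from the STEP2 table.
SEQUENCE_FORMAT = ".%03d.%02d"
FILE_EXT = "parquet"


def object_key(path_prefix, task_index, sequence):
    """Build an object key (assumed layout: prefix + numbering + '.' + extension)."""
    return path_prefix + SEQUENCE_FORMAT % (task_index, sequence) + "." + FILE_EXT


def format_timestamp(value):
    """Render a datetime in the default format; %f stands in for %6N."""
    return value.strftime("%Y-%m-%d %H:%M:%S.%f %z")


# Three output files for the path prefix "exports/users".
keys = [object_key("exports/users", task, 0) for task in range(3)]
# keys[0] is "exports/users.000.00.parquet"

stamp = format_timestamp(datetime(2024, 1, 2, 3, 4, 5, 678901, tzinfo=timezone.utc))
# stamp is "2024-01-02 03:04:05.678901 +0000"
```

Under these assumptions, every generated key begins with the path prefix, which matches the description of the path prefix setting in STEP1.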