Data Destination - S3 Parquet
- 07 Dec 2022
This is a machine-translated version of the original Japanese article.
Please understand that some of the information contained on this page may be inaccurate.
Summary
This help page explains how to configure a data transfer to Amazon Web Services S3 in Apache Parquet (.parquet) format.
Supported Protocols
- Data Transfer (Embulk)
Uses embulk-output-s3_parquet.
Constraints
- None in particular
Settings
STEP1 Basic settings
Item name | Required | Default value | Content |
---|---|---|---|
S3 connection information | Yes | - | From the connection information registered in advance, select the one that has the necessary permissions for this transfer setting. Please refer to the separate page for how to set the connection information. |
Region | Yes | ap-northeast-1 | Enter the region you specified when creating the bucket to use. For an explanation of regions, please refer to the official AWS page. |
Bucket | Yes | - | Specify the name of the bucket to which you want to transfer data. |
Path prefix | Yes | - | Specify the path prefix to which the data is transferred. trocco outputs multiple files to the destination bucket, starting with the path prefix. Custom variables can also be used to dynamically determine the setting value during trocco data transfer. |
Compression format | Yes | uncompressed | Specifies the compression method for the file. Choose from: uncompressed, Snappy, gzip, LZO, brotli, LZ4, zstd. |
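As a rough illustration of the STEP1 items, the sketch below collects them into a dictionary and checks the required fields and the compression choice. The function name `build_basic_settings` and the dictionary keys are hypothetical, chosen for this example; they are not confirmed names from trocco or embulk-output-s3_parquet. The connection information is left out because it is selected from pre-registered settings.

```python
# Compression formats listed in the STEP1 table, compared in lowercase.
ALLOWED_CODECS = {"uncompressed", "snappy", "gzip", "lzo", "brotli", "lz4", "zstd"}


def build_basic_settings(bucket, path_prefix,
                         region="ap-northeast-1",
                         compression_codec="uncompressed"):
    """Return a dict of basic settings (hypothetical key names)."""
    # Bucket and path prefix are marked as required in the table.
    if not bucket:
        raise ValueError("bucket is required")
    if not path_prefix:
        raise ValueError("path_prefix is required")

    codec = compression_codec.lower()
    if codec not in ALLOWED_CODECS:
        raise ValueError(f"unsupported compression format: {compression_codec}")

    return {
        "region": region,
        "bucket": bucket,
        "path_prefix": path_prefix,
        "compression_codec": codec,
    }


settings = build_basic_settings("my-bucket", "exports/users")
print(settings)
```

Checking the compression format up front catches misspellings such as `broti` before a transfer is started.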
STEP2 Advanced settings
Item name | Default value | Content |
---|---|---|
Naming convention for multi-file output | .%03d.%02d | Sets the naming rule applied to file names when multiple files are output. |
Output file extension | parquet | - |
Default timestamp format | %Y-%m-%d %H:%M:%S.%6N %z | - |
Default time zone | UTC | - |
Block size (bytes) | 134217728 | - |
Page size (bytes) | 1048576 | - |
Maximum padding size (bytes) | 8388608 | - |
Configure data catalog | No | - |
Column settings | - | - |
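The defaults in the table above can be made more concrete with a short sketch. It rests on several assumptions that this page does not confirm: that the two placeholders in `.%03d.%02d` are filled with two integer counters (named `task_index` and `file_index` here for illustration), that the object key is the path prefix followed by that sequence, a dot, and the file extension, and that `%6N` means six fractional-second digits, for which Python's `%f` is used as a stand-in. The helper names `object_key` and `to_mib` are hypothetical.

```python
from datetime import datetime, timezone


def object_key(path_prefix, task_index, file_index,
               sequence_format=".%03d.%02d", file_ext="parquet"):
    """Build an output object key (assumed layout, see the note above)."""
    # printf-style formatting: zero-padded to 3 and 2 digits by default.
    sequence = sequence_format % (task_index, file_index)
    return f"{path_prefix}{sequence}.{file_ext}"


def to_mib(size_in_bytes):
    """Convert a size in bytes to mebibytes."""
    return size_in_bytes / (1024 * 1024)


print(object_key("exports/users", 0, 1))  # exports/users.000.01.parquet

# Default block, page, and maximum padding sizes from the table.
print(to_mib(134217728), to_mib(1048576), to_mib(8388608))  # 128.0 1.0 8.0

# Rough Python equivalent of the default format "%Y-%m-%d %H:%M:%S.%6N %z".
ts = datetime(2022, 12, 7, 9, 30, 0, 123456, tzinfo=timezone.utc)
print(ts.strftime("%Y-%m-%d %H:%M:%S.%f %z"))  # 2022-12-07 09:30:00.123456 +0000
```

Read this way, the size defaults correspond to a 128 MiB block size, a 1 MiB page size, and an 8 MiB maximum padding size.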