Data Source - HTTP/HTTPS
Summary
This help page describes the settings for retrieving data from web services over HTTP/HTTPS.
Constraints
The following restrictions apply when using OAuth 2.0:
- Only the authorization code grant type is supported. Other grant types are not supported.
- Only Bearer authentication is supported for connecting to the data source when an ETL Job runs. Other authentication methods are not supported.
- The parameters used to obtain authorization codes and to obtain and refresh tokens are assumed to comply with the standard OAuth 2.0 specification. For more information, see How to Obtain and Renew Tokens and Authorization Codes in TROCCO.
- If the access token expires while an ETL Job is running, the ETL Job fails. In that case, modify the ETL Configuration so that the ETL Job completes before the access token expires. For example, you can reduce the number of records retrieved by using the Filter Setting in ETL Configuration STEP 2.
Setting items
STEP1 Basic settings
Item name | Required | Default value | Description |
---|---|---|---|
Authorization | No | OFF | Select whether to use OAuth 2.0. |
HTTP/HTTPS Connection Configuration | Yes | - | Displayed when "Enable OAuth 2.0" is selected. Select a preregistered HTTP/HTTPS Connection Configuration that has the permissions required for this ETL Configuration. |
URL | Yes | - | Enter the URL from which the Data Source will be retrieved. |
HTTP Methods | Yes | GET | Select the HTTP method used to retrieve data: GET or POST. |
user agent | No | - | You can enter a user agent name to be specified in the request header. |
character encoding | No | UTF-8 | You can enter a character code to be specified in the request header. |
Input file format | Yes | CSV/TSV | Select the input file format. For more information, see About input file format settings. |
paging configuration | Yes | Disabled | Select a paging setting. When using paging requests, choose either offset-based or cursor-based, depending on the specification of the request destination. See Paging Settings for more information. |
parameter | No | - | You can add any key/value pairs to the query parameters. |
request body | No | - | Can be entered when POST is selected as the HTTP method. You can add any key/value pairs to the request body. However, when paging is enabled or parameters are specified, this input is not reflected in the request body. |
HTTP header | No | - | You can add any key/value pairs to the HTTP header. When OAuth 2.0 is used, there is no need to add an access token to the HTTP header. |
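As a rough illustration of how the STEP1 items relate to an HTTP request, the sketch below builds (but does not send) a request with Python's standard library. The URL, the `status` parameter, the header values, and the `build_request` helper are hypothetical, and this is not a description of how TROCCO issues requests internally.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_request(url, method="GET", params=None, headers=None, user_agent=None):
    """Assemble a request from settings resembling the STEP1 items (illustrative only)."""
    if params:
        # "parameter" items are appended to the URL as query parameters.
        url = f"{url}?{urlencode(params)}"
    all_headers = dict(headers or {})
    if user_agent:
        # The "user agent" item is sent as a request header.
        all_headers["User-Agent"] = user_agent
    return Request(url, headers=all_headers, method=method)


# Hypothetical values for illustration.
req = build_request(
    "https://api.example.com/v1/items",
    params={"status": "active"},
    headers={"Accept": "application/json"},
    user_agent="my-etl-client",
)
print(req.get_method(), req.full_url)
```

The request is only constructed here, so the sketch can be run without network access.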
STEP1 detailed settings
Clicking on Advanced Settings will display the following configuration items.
Item name | Required | Default value | Description |
---|---|---|---|
Status codes treated as successful when retrieving data | Yes | 200 | Only three-digit numbers in the 200 range can be entered. To set multiple status codes, separate them with commas, e.g., 200, 201, 202. |
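The input rule for this item (comma-separated three-digit numbers in the 200 range) can be sketched as a small validation function. The function name `parse_success_status_codes` is hypothetical, and the exact validation TROCCO applies may differ.

```python
import re


def parse_success_status_codes(text):
    """Parse a comma-separated list such as "200, 201, 202" into integers."""
    codes = []
    for part in text.split(","):
        token = part.strip()
        # Accept only three-digit numbers in the 200 range.
        if not re.fullmatch(r"2[0-9]{2}", token):
            raise ValueError(f"not a 2xx status code: {token!r}")
        codes.append(int(token))
    return codes


print(parse_success_status_codes("200, 201, 202"))  # [200, 201, 202]
```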
STEP2 Detailed settings
Item name | Default value | Minimum value | Maximum value |
---|---|---|---|
Connection timeout (milliseconds) | 2,000 | 1 | 300,000 |
Read timeout (milliseconds) | 10,000 | 1 | 1,800,000 |
Maximum number of retries | 5 | 0 | 10 |
Retry interval (milliseconds) | 10,000 | 0 | 600,000 |
Request interval (milliseconds) | 0 | 0 | 120,000 |
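To show how the retry settings could interact, here is a minimal sketch of a retry loop. It assumes that "Maximum number of retries" counts attempts after the first failed one, which the page does not state explicitly. The `fetch_with_retries` helper and the injected `sleep` argument are hypothetical and exist only so the example runs quickly.

```python
import time


def fetch_with_retries(fetch, max_retries=5, retry_interval_ms=10_000, sleep=time.sleep):
    """Call fetch() and retry on failure, waiting retry_interval_ms between attempts."""
    attempt = 0
    while True:
        try:
            return fetch()
        except Exception:
            if attempt >= max_retries:
                # Retries exhausted: surface the last error.
                raise
            attempt += 1
            sleep(retry_interval_ms / 1000)


calls = {"count": 0}


def flaky_fetch():
    """Hypothetical fetch that fails twice before succeeding."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


# A no-op sleep keeps the example fast.
result = fetch_with_retries(flaky_fetch, max_retries=5, retry_interval_ms=10, sleep=lambda s: None)
print(result, calls["count"])  # ok 3
```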
Paging configuration
Selecting Offset-Based or Cursor-Based in the Paging Configuration allows you to use paging requests when retrieving data with the ETL Configuration.
Each option has different settings.
When offset-based is selected
Item name | Required | Default value | Description |
---|---|---|---|
from/offset parameter name | Yes | - | Enter the from/offset parameter name for the paging request. |
to parameter name | No | - | You can enter the to parameter name of the paging request. |
Number of Requests | Yes | 1 | Enter the number of requests for paging requests. |
Initial value of from/offset parameter | Yes | 0 | Enter the initial value for the from/offset parameter of the paging request. |
Number of from/offset parameters to advance in one request | Yes | 1 | Enter the number of from/offset parameters to advance in one paging request. |
Example of offset-based input: using from and to for a paging request
Item name | Value |
---|---|
from/offset parameter name | from |
to parameter name | to |
Number of Requests | 4 |
Initial value of from/offset parameter | 1 |
Number of from/offset parameters to advance in one request | 10 |
In this case, the following request parameters are added:
?from=1&to=10
?from=11&to=20
?from=21&to=30
?from=31&to=40
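The parameters above can be reproduced with a short sketch. The relation `to = from + step - 1` is inferred from this example and is not stated as a rule on this page, and the `offset_paging_params` helper is hypothetical.

```python
def offset_paging_params(from_name, initial, step, num_requests, to_name=None):
    """Yield one dict of paging parameters per request."""
    for i in range(num_requests):
        start = initial + i * step
        params = {from_name: start}
        if to_name is not None:
            # Assumption inferred from the example: to = from + step - 1.
            params[to_name] = start + step - 1
        yield params


pages = list(offset_paging_params("from", initial=1, step=10, num_requests=4, to_name="to"))
for p in pages:
    print("?" + "&".join(f"{k}={v}" for k, v in p.items()))
```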
Example of offset-based input: using page and size for a paging request
Item name | Value |
---|---|
Parameter (key) | size |
Parameter (value) | 100 |
from/offset parameter name | page |
Number of Requests | 4 |
Initial value of from/offset parameter | 1 |
Number of from/offset parameters to advance in one request | 1 |
In this case, the following request parameters are added:
?page=1&size=100
?page=2&size=100
?page=3&size=100
?page=4&size=100
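A minimal sketch of this case combines a fixed query parameter with an advancing page number. The `paged_query_strings` helper is hypothetical and only mirrors the values in the table above.

```python
from urllib.parse import urlencode


def paged_query_strings(fixed_params, offset_name, initial, step, num_requests):
    """Build one query string per paging request."""
    queries = []
    for i in range(num_requests):
        # The paging parameter advances by `step`; fixed parameters stay constant.
        params = {offset_name: initial + i * step, **fixed_params}
        queries.append("?" + urlencode(params))
    return queries


queries = paged_query_strings({"size": 100}, "page", initial=1, step=1, num_requests=4)
for q in queries:
    print(q)
```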
When cursor-based is selected
When cursor-based paging is selected, requests are repeated until the cursor in the response data meets one of the following conditions:
- The cursor is not included
- The cursor value is null
Therefore, you can use cursor-based paging only when the API specification of the service you retrieve data from satisfies one of the following:
- If no subsequent page exists, the cursor is not included in the response data
- If no subsequent page exists, the value of the cursor in the response data is null
If a Job is executed with an ETL Configuration created for an API that does not meet these specifications, cancel the Job manually.
If the cursor in the response is the same as the cursor sent in the request, the response is considered invalid and the ETL Job fails.
In this case, the following is output to the error log:
The requested cursor parameters and the response cursor parameters are the same. Please check the request_parameter_cursor_name parameter.
Item name | Required | Description |
---|---|---|
Path to cursor in response data (JSONPath notation) | Yes | Used to retrieve the cursor value from the response data. Enter in JSONPath notation. |
Name of parameter to set cursor on request | Yes | Used at request time. Enter the name of the parameter that carries the cursor received in the previous page's response data. |
Parameter name to set the maximum number of records to be retrieved in one request | No | Used at request time. Enter the name of the parameter that specifies the maximum number of records to retrieve per request. This value is not used if the maximum number of records to be retrieved in one request is not entered. |
Maximum number of records to be retrieved in one request | No | Used at request time. Specify the maximum number of records to retrieve per request. This value is not used if the parameter name for the maximum number of records to be retrieved in one request is not entered. |
Cursor-based input example
The following is an input example for cursor-based response data with this structure:
{
"items": [
{ ... },
{ ... },
...
],
"responseMetaData": {
"nextCursor": "SAMPLE_CURSOR",
...
},
...
}
Item name | Value |
---|---|
Path to cursor in response data (JSONPath notation) | $.responseMetaData.nextCursor |
Name of parameter to set cursor on request | cursor |
Parameter name to set the maximum number of records to be retrieved in one request | limit |
Maximum number of records to be retrieved in one request | 100 |
In this case, request parameters such as ?cursor=SAMPLE_CURSOR&limit=100 are added.
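The following sketch shows how a cursor could be read from response data and turned into the next request's parameters. The `extract_by_simple_path` helper is hypothetical and handles only simple `$.key.key` paths, not the full JSONPath syntax. The response body is a trimmed stand-in for the structure shown above.

```python
import json
from urllib.parse import urlencode


def extract_by_simple_path(data, path):
    """Resolve a dotted path such as "$.a.b"; return None when a key is missing."""
    if not path.startswith("$."):
        raise ValueError("only simple '$.key.key' paths are handled in this sketch")
    current = data
    for key in path[2:].split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


# Trimmed stand-in for the response structure in the example.
response = json.loads('{"items": [], "responseMetaData": {"nextCursor": "SAMPLE_CURSOR"}}')
cursor = extract_by_simple_path(response, "$.responseMetaData.nextCursor")
query = "?" + urlencode({"cursor": cursor, "limit": 100})
print(query)  # ?cursor=SAMPLE_CURSOR&limit=100
```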
For example, when retrieving 550 records, the request is executed six times.
The first through fifth responses each contain 100 records, and the sixth response contains 50 records.
Because the sixth response, which has no subsequent data, does not contain a cursor, a seventh request is not executed and data retrieval is complete.
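The 550-record walkthrough can be simulated with a small sketch. `fake_api` is a hypothetical in-memory stand-in for a real service, and its cursor format (the index of the next record) is an assumption made for illustration. The loop stops when the cursor is missing or null and raises an error when the response cursor equals the request cursor.

```python
def fake_api(records, cursor=None, limit=100):
    """Hypothetical in-memory API: the cursor is the index of the next record."""
    start = int(cursor) if cursor is not None else 0
    body = {"items": records[start:start + limit], "responseMetaData": {}}
    if start + limit < len(records):
        # A cursor is included only when a subsequent page exists.
        body["responseMetaData"]["nextCursor"] = str(start + limit)
    return body


def fetch_all(records, limit=100):
    """Repeat requests until the response has no cursor (or a null cursor)."""
    items, request_count, cursor = [], 0, None
    while True:
        body = fake_api(records, cursor=cursor, limit=limit)
        request_count += 1
        items.extend(body["items"])
        next_cursor = body["responseMetaData"].get("nextCursor")
        if next_cursor is None:
            # Cursor missing or null: no subsequent page, so stop.
            break
        if next_cursor == cursor:
            raise RuntimeError("request cursor and response cursor are the same")
        cursor = next_cursor
    return items, request_count


items, request_count = fetch_all(list(range(550)))
print(len(items), request_count)  # 550 6
```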