Data Source - HTTP/HTTPS

Summary

This page describes the Data Setting used to retrieve data from web services over HTTP/HTTPS.

Constraints

Restrictions on connections using OAuth 2.0

The following restrictions apply when using OAuth 2.0:

  • The only supported grant type is the authorization code grant.
    • Other grant types are not supported.
  • The only supported authentication method for connecting to the data source when executing an ETL Job is Bearer authentication.
    • Other authentication methods are not supported.
  • The parameters used to obtain authorization codes and to obtain and refresh tokens are assumed to comply with the standard OAuth 2.0 specification.
  • If the access token expires while an ETL Job is running, the ETL Job fails.
    • In this case, modify the ETL Configuration so that the ETL Job completes before the access token expires.
    • For example, you can reduce the number of records retrieved by using the Filter Setting in ETL Configuration STEP 2.

Setting items

STEP1 Basic settings

Item name | Required | Default value | Description
Authorization | No | OFF | Select whether to use OAuth 2.0.
HTTP/HTTPS Connection Configuration | Yes | - | Displayed when "Enable OAuth 2.0" is turned on. Select the preregistered HTTP/HTTPS Connection Configuration that has the permissions required for this ETL Configuration.
URL | Yes | - | Enter the URL from which the data is retrieved.
HTTP Methods | Yes | GET | Select the HTTP method used to retrieve data: GET or POST.
User agent | No | - | You can enter a user agent name to be specified in the request header.
Character encoding | No | UTF-8 | You can enter a character encoding to be specified in the request header.
Input file format | Yes | CSV/TSV | Select the input file format. For more information, see About input file format settings.
Paging configuration | Yes | Disabled | Select a paging setting: Disabled, Offset-based, or Cursor-based. When using paging requests, choose offset-based or cursor-based according to the specification of the service being requested. See Paging Settings for more information.
Parameter | No | - | You can add any key/value pairs to the query parameters.
Request body | No | - | Can be entered when POST is selected as the HTTP method. You can add any key/value pairs to the request body. However, when paging is enabled or parameters are specified, this input value is not reflected in the request body.
HTTP header | No | - | You can add any key/value pairs to the HTTP header. When OAuth 2.0 is used, there is no need to add an access token to the HTTP header.
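As a rough illustration of how the STEP1 items relate to an HTTP request, the sketch below assembles (but does not send) a request from a URL, a method, a user agent, query parameters, and extra headers. The function name build_request, the example URL, and the parameter and header values are all hypothetical; this is not how the service itself is implemented.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_request(url, method="GET", user_agent=None, params=None, headers=None):
    """Assemble, without sending, a request from STEP1-style settings."""
    if method not in ("GET", "POST"):
        raise ValueError("only GET and POST can be selected")
    all_headers = dict(headers or {})
    if user_agent:
        all_headers["User-Agent"] = user_agent
    if params:
        # Append the key/value pairs to the URL as a query string.
        separator = "&" if "?" in url else "?"
        url = url + separator + urlencode(params)
    return Request(url, headers=all_headers, method=method)


req = build_request(
    "https://api.example.com/v1/items",  # hypothetical URL
    user_agent="my-etl-client",          # hypothetical user agent name
    params={"status": "active"},         # hypothetical query parameter
    headers={"Accept": "text/csv"},      # hypothetical extra header
)
print(req.get_method(), req.full_url)
```

The request body is left out on purpose, because this page does not state how its key/value pairs are encoded.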

STEP1 Detailed settings

Clicking Advanced Settings displays the following configuration items.

Item name | Required | Default value | Description
Status codes treated as success when retrieving data | Yes | 200 | Only three-digit numbers in the 200 range can be entered. To set multiple status codes, enter them separated by commas, e.g., 200, 201, 202.
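The input rule for status codes can be sketched as a small validator. This is only an illustration of the rule as described here (comma-separated, three digits, 200 range); the function name parse_success_codes is hypothetical and the product's actual validation may differ.

```python
def parse_success_codes(text):
    """Parse a comma-separated list of three-digit 2xx status codes."""
    codes = []
    for part in text.split(","):
        part = part.strip()
        is_three_digits = len(part) == 3 and part.isascii() and part.isdigit()
        if not (is_three_digits and part.startswith("2")):
            raise ValueError(f"not a three-digit status code in the 200 range: {part!r}")
        codes.append(int(part))
    return codes


print(parse_success_codes("200, 201, 202"))
```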

STEP2 Detailed settings

Item name | Default value | Minimum value | Maximum value
Connection timeout (milliseconds) | 2,000 | 1 | 300,000
Read timeout (milliseconds) | 10,000 | 1 | 1,800,000
Maximum number of retries | 5 | 0 | 10
Retry interval (milliseconds) | 10,000 | 0 | 600,000
Request interval (milliseconds) | 0 | 0 | 120,000
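To make the retry settings concrete, here is a minimal sketch of a retry loop. It assumes that a retry is an additional attempt after a failed one, that the retry interval is the wait between attempts, and that only connection-level errors (OSError in this sketch) are retried; none of these details are stated on this page. The names call_with_retries and flaky_send are hypothetical.

```python
import time


def call_with_retries(send, max_retries=5, retry_interval_ms=10_000, sleep=time.sleep):
    """Call send() until it succeeds or the allowed retries are used up."""
    retries_used = 0
    while True:
        try:
            return send()
        except OSError:
            if retries_used >= max_retries:
                raise  # no retries left: surface the last error
            retries_used += 1
            sleep(retry_interval_ms / 1000)  # the setting is in milliseconds


calls = []
waits = []


def flaky_send():
    """Simulated request that fails twice, then succeeds."""
    calls.append(1)
    if len(calls) < 3:
        raise OSError("simulated connection failure")
    return "ok"


# waits.append stands in for time.sleep so the demo finishes instantly.
result = call_with_retries(flaky_send, sleep=waits.append)
print(result, len(calls), waits)
```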

Paging configuration

Selecting Offset-based or Cursor-based in the paging configuration lets you send paging requests when the ETL Configuration retrieves data.
Each option has different settings.

When offset-based is selected

Item name | Required | Default value | Description
from/offset parameter name | Yes | - | Enter the name of the from/offset parameter for the paging request.
to parameter name | No | - | You can enter the name of the to parameter for the paging request.
Number of Requests | Yes | 1 | Enter the number of paging requests to send.
Initial value of from/offset parameter | Yes | 0 | Enter the initial value of the from/offset parameter for the paging request.
Number of from/offset parameters to advance in one request | Yes | 1 | Enter the amount by which the from/offset parameter advances with each paging request.

Example of offset-based input: using from and to for paging requests

Item name | Value
from/offset parameter name | from
to parameter name | to
Number of Requests | 4
Initial value of from/offset parameter | 1
Number of from/offset parameters to advance in one request | 10

In this case, the following request parameters are added:

  1. ?from=1&to=10
  2. ?from=11&to=20
  3. ?from=21&to=30
  4. ?from=31&to=40

Example of offset-based input: using page and size for paging requests

Item name | Value
Parameter (key) | size
Parameter (value) | 100
from/offset parameter name | page
Number of Requests | 4
Initial value of from/offset parameter | 1
Number of from/offset parameters to advance in one request | 1

In this case, the following request parameters are added:

  1. ?page=1&size=100
  2. ?page=2&size=100
  3. ?page=3&size=100
  4. ?page=4&size=100
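The two offset-based examples above can be reproduced with a short sketch. The rule for the to parameter (start + step - 1) is inferred from the first example rather than stated on this page, and the function name offset_queries is hypothetical.

```python
from urllib.parse import urlencode


def offset_queries(from_name, requests, initial, step, to_name=None, params=None):
    """Return the query string added to each paging request, in order."""
    queries = []
    for i in range(requests):
        start = initial + i * step
        query = {from_name: start}
        if to_name is not None:
            # Inferred from the from/to example: "to" closes the current window.
            query[to_name] = start + step - 1
        query.update(params or {})  # fixed key/value pairs from the Parameter item
        queries.append("?" + urlencode(query))
    return queries


# Example 1: from and to
print(offset_queries("from", requests=4, initial=1, step=10, to_name="to"))
# Example 2: page and size
print(offset_queries("page", requests=4, initial=1, step=1, params={"size": 100}))
```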

When cursor-based is selected

Condition for completion of paging requests

If the cursor-based paging setting is selected, requests are repeated until the cursor in the response data meets one of the following conditions:

  • The cursor is not included
  • The cursor value is null

Therefore, you can use cursor-based paging only when the API specification of the service you are retrieving data from matches one of the following:

  • When no subsequent page exists, the cursor is not included in the response data
  • When no subsequent page exists, the value of the cursor in the response data is null

If a Job is executed with an ETL Configuration that targets an API that does not meet the above specifications, cancel the Job manually.
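The completion condition can be sketched as a helper that returns None in both stopping cases. This sketch resolves only simple paths of the form $.a.b, which is a small subset of JSONPath, and the name extract_cursor is hypothetical. The response shape is borrowed from the cursor-based input example on this page.

```python
def extract_cursor(response, path):
    """Return the cursor at a simple '$.a.b' path, or None when paging should stop."""
    if not path.startswith("$."):
        raise ValueError("this sketch only handles paths of the form $.a.b")
    node = response
    for key in path[2:].split("."):
        if not isinstance(node, dict) or key not in node:
            return None  # case 1: the cursor is not included
        node = node[key]
    return node  # case 2: None here means the cursor value is null


path = "$.responseMetaData.nextCursor"
print(extract_cursor({"responseMetaData": {"nextCursor": "SAMPLE_CURSOR"}}, path))
print(extract_cursor({"responseMetaData": {}}, path))
print(extract_cursor({"responseMetaData": {"nextCursor": None}}, path))
```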

When the cursor used in the request and the cursor value in the response are the same

In this case, the response is considered invalid and the ETL Job fails.
The following message is output to the error log:

  The requested cursor parameters and the response cursor parameters are the same. Please check the request_parameter_cursor_name parameter.

Item name | Required | Description
Path to cursor in response data (JSONPath notation) | Yes | Used to retrieve the cursor value from the response data. Enter the path in JSONPath notation.
Name of parameter to set cursor on request | Yes | Used when sending a request. Enter the name of the parameter that carries the cursor received in the response data of the previous page.
Parameter name to set the maximum number of records to be retrieved in one request | No | Used when sending a request. Enter the name of the parameter that specifies the maximum number of records to retrieve per request. If the maximum number of records to be retrieved in one request is not entered, this input value is not used.
Maximum number of records to be retrieved in one request | No | Used when sending a request. Specify the maximum number of records to retrieve per request. If the parameter name that sets the maximum number of records to be retrieved in one request is not entered, this input value is not used.

Cursor-based input example

The following is an example of input when the cursor-based response data has this structure:

    {
      "items": [
        { ... },
        { ... },
        ...
      ],
      "responseMetaData": {
        "nextCursor": "SAMPLE_CURSOR",
        ...
      },
      ...
    }
    
Item name | Value
Path to cursor in response data (JSONPath notation) | $.responseMetaData.nextCursor
Name of parameter to set cursor on request | cursor
Parameter name to set the maximum number of records to be retrieved in one request | limit
Maximum number of records to be retrieved in one request | 100

In this case, request parameters such as ?cursor=SAMPLE_CURSOR&limit=100 are added.

For example, if the data to be retrieved contains 550 records, the request is executed 6 times.
The first through fifth responses each contain 100 records, and the sixth response contains 50 records.
Because no data follows the sixth response, it does not contain a cursor, so a seventh request is not executed and data retrieval is complete.
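The 550-record walkthrough can be checked with a small simulation. Everything here is an assumption made for illustration: fake_api is a hypothetical stand-in for the remote service (its cursor is simply the next start index as a string), the first request is assumed to be sent without a cursor, and fetch_all is a sketch of the paging loop rather than the product's implementation.

```python
def fake_api(cursor=None, limit=100, total=550):
    """Hypothetical service: returns one page and omits the cursor on the last page."""
    start = int(cursor) if cursor is not None else 0
    end = min(start + limit, total)
    body = {"items": list(range(start, end)), "responseMetaData": {}}
    if end < total:
        body["responseMetaData"]["nextCursor"] = str(end)
    return body


def fetch_all(api, limit=100):
    """Repeat requests until the response cursor is missing or null."""
    records, request_count, cursor = [], 0, None
    while True:
        body = api(cursor=cursor, limit=limit)
        request_count += 1
        records.extend(body["items"])
        next_cursor = body.get("responseMetaData", {}).get("nextCursor")
        if next_cursor is None:
            break  # cursor not included, or its value is null
        if next_cursor == cursor:
            # Mirrors the failure case described on this page.
            raise RuntimeError("request cursor and response cursor are the same")
        cursor = next_cursor
    return records, request_count


records, request_count = fetch_all(fake_api)
print(len(records), request_count)  # prints "550 6"
```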