Data Source - HTTP/HTTPS

    Summary

    This help page describes the settings for retrieving data from web services over HTTP/HTTPS.

    Constraints

    Restrictions on connections using OAuth 2.0

    The following restrictions apply when using OAuth 2.0:

    • Only the authorization code grant is supported as a grant type.
      • Other grant types are not supported.
    • Only Bearer authentication is supported as the authentication method for connecting to the data source when an ETL Job runs.
      • Other authentication methods are not supported.
    • The parameters used to obtain authorization codes and to obtain and refresh tokens are assumed to comply with the standard OAuth 2.0 specification.
    • If the access token expires while an ETL Job is running, the ETL Job fails.
      • In this case, modify the ETL Configuration so that the ETL Job completes before the access token expires.
      • For example, you can reduce the number of records retrieved by using the Filter Setting in STEP 2 of the ETL Configuration.

    Setting items

    STEP1 Basic settings

    Item name | Required | Default value | Description
    Authorization | No | OFF | Select whether to use OAuth 2.0.
    HTTP/HTTPS Connection Configuration | Yes | - | Displayed when "Enable OAuth 2.0" is enabled. Select the preregistered HTTP/HTTPS Connection Configuration that has the permissions required for this ETL Configuration.
    URL | Yes | - | Enter the URL from which the data will be retrieved.
    HTTP method | Yes | GET | Select the HTTP method used to retrieve data: GET or POST.
    User agent | No | - | You can enter a user agent name to be specified in the request header.
    Character encoding | No | UTF-8 | You can enter a character encoding to be specified in the request header.
    Input file format | Yes | CSV/TSV | Select the input file format. For more information, see About input file format settings.
    Paging configuration | Yes | Disabled | Select Disabled, Offset-based, or Cursor-based. When using paging requests, choose offset-based or cursor-based depending on the specification of the request destination. See Paging Settings for more information.
    Parameter | No | - | You can add any key/value pairs to the query parameters.
    Request body | No | - | Can be entered when POST is selected as the HTTP method. You can add any key/value pairs to the request body. However, when paging is enabled or parameters are specified, this input is not reflected in the request body.
    HTTP header | No | - | You can add any key/value pairs to the HTTP header. When OAuth 2.0 is used, there is no need to add an access token to the HTTP header.

    STEP1 detailed settings

    Clicking Advanced Settings displays the following configuration items.

    Item name | Required | Default value | Description
    Status codes treated as success when retrieving source data | Yes | 200 | Only three-digit numbers in the 200 range can be entered. To set multiple status codes, enter them separated by commas, e.g., 200, 201, 202.
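
The input rule above can be sketched as a small parser. This is only an illustration of the stated rule (comma-separated, three digits, 200 range); the function names `parse_success_codes` and `is_success` are hypothetical and not part of the product.

```python
def parse_success_codes(text: str) -> set[int]:
    """Parse input such as "200, 201, 202" into a set of status codes."""
    codes = set()
    for part in text.split(","):
        part = part.strip()
        # Accept only three-digit numbers in the 200 range (200-299).
        valid = (
            len(part) == 3
            and part.isascii()
            and part.isdigit()
            and part.startswith("2")
        )
        if not valid:
            raise ValueError(f"invalid status code: {part!r}")
        codes.add(int(part))
    return codes


def is_success(status: int, allowed: set[int]) -> bool:
    """Return True when the response status is one of the configured codes."""
    return status in allowed


allowed = parse_success_codes("200, 201, 202")
print(is_success(201, allowed))  # True
print(is_success(204, allowed))  # False: 204 was not listed
```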

    STEP2 Detailed settings

    Item name | Default value | Minimum value | Maximum value
    Connection timeout (milliseconds) | 2,000 | 1 | 300,000
    Read timeout (milliseconds) | 10,000 | 1 | 1,800,000
    Maximum number of retries | 5 | 0 | 10
    Retry interval (milliseconds) | 10,000 | 0 | 600,000
    Request interval (milliseconds) | 0 | 0 | 120,000
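
As a rough illustration of how a maximum retry count and a retry interval typically interact, here is a minimal sketch. It assumes that "Maximum number of retries" counts attempts after the first one and that the interval is a fixed wait between attempts; the page does not state these details, and `call_with_retries` is a hypothetical helper, not the service's implementation.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    request: Callable[[], T],
    max_retries: int = 5,
    retry_interval_ms: int = 10_000,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `request`, retrying up to `max_retries` times after a failure."""
    retries_done = 0
    while True:
        try:
            return request()
        except Exception:
            # Assumption: once the retry budget is used up, the error propagates.
            if retries_done >= max_retries:
                raise
            retries_done += 1
            sleep(retry_interval_ms / 1000)  # the setting is in milliseconds


# Usage with a fake request that fails twice and a no-op sleep,
# so the example finishes immediately.
attempts = []

def flaky_request() -> str:
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("temporary failure")
    return "ok"

print(call_with_retries(flaky_request, sleep=lambda seconds: None))  # ok
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting.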

    Paging configuration

    Selecting Offset-based or Cursor-based in the paging configuration allows you to include paging requests when retrieving data with the ETL Configuration.
    Each option has different settings.

    When offset-based is selected

    Item name | Required | Default value | Description
    from/offset parameter name | Yes | - | Enter the from/offset parameter name for the paging request.
    to parameter name | No | - | You can enter the to parameter name for the paging request.
    Number of Requests | Yes | 1 | Enter the number of paging requests to send.
    Initial value of from/offset parameter | Yes | 0 | Enter the initial value of the from/offset parameter for the paging request.
    Number of from/offset parameters to advance in one request | Yes | 1 | Enter the amount by which the from/offset parameter advances per paging request.

    Example of offset-based input: using from and to for a paging request

    Item name | Value
    from/offset parameter name | from
    to parameter name | to
    Number of Requests | 4
    Initial value of from/offset parameter | 1
    Number of from/offset parameters to advance in one request | 10

    In this case, the following request parameters are added:

    1. ?from=1&to=10
    2. ?from=11&to=20
    3. ?from=21&to=30
    4. ?from=31&to=40

    Example of offset-based input: using page and size for a paging request

    Item name | Value
    Parameter (key) | size
    Parameter (value) | 100
    from/offset parameter name | page
    Number of Requests | 4
    Initial value of from/offset parameter | 1
    Number of from/offset parameters to advance in one request | 1

    In this case, the following request parameters are added:

    1. ?page=1&size=100
    2. ?page=2&size=100
    3. ?page=3&size=100
    4. ?page=4&size=100
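
The two offset-based examples above can be reproduced with a short sketch. The rule that the to value equals from plus the step minus one is inferred from the first example rather than stated on this page, and `offset_paging_params` is a hypothetical helper name.

```python
from typing import Optional
from urllib.parse import urlencode


def offset_paging_params(
    from_name: str,
    num_requests: int,
    initial: int,
    step: int,
    to_name: Optional[str] = None,
    extra: Optional[dict] = None,
) -> list[dict]:
    """Build the query parameters for each paging request."""
    pages = []
    for i in range(num_requests):
        start = initial + i * step
        params = {from_name: start}
        if to_name is not None:
            # Assumption inferred from the from/to example: to = from + step - 1.
            params[to_name] = start + step - 1
        # Fixed key/value pairs from the "Parameter" setting, e.g. size=100.
        params.update(extra or {})
        pages.append(params)
    return pages


# from/to example: ?from=1&to=10 ... ?from=31&to=40
for p in offset_paging_params("from", 4, initial=1, step=10, to_name="to"):
    print("?" + urlencode(p))

# page/size example: ?page=1&size=100 ... ?page=4&size=100
for p in offset_paging_params("page", 4, initial=1, step=1, extra={"size": 100}):
    print("?" + urlencode(p))
```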

    When cursor-based is selected

    Condition for completion of paging request

    If the cursor-based paging setting is selected, the request is repeated until the cursor in the response data meets one of the following conditions:

    • Cursor not included
    • Cursor value is null

    Therefore, cursor-based paging can be used only when the API of the service from which you retrieve data behaves in one of the following ways:

    • If no more subsequent pages exist, the cursor is not included in the response data
    • If no more subsequent pages exist, the value of the cursor in the response data is null

    If a Job is executed with an ETL Configuration created for an API that does not meet the conditions above, cancel that Job manually.
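
The completion condition can be expressed as a small check. This sketch resolves only simple dotted paths of the form `$.a.b`, not full JSONPath, and the names `read_cursor` and `has_next_page` are hypothetical. The path used is the one from the cursor-based input example on this page.

```python
from typing import Any


def read_cursor(response: Any, json_path: str) -> Any:
    """Return the cursor at a simple dotted path, or None when it is absent or null."""
    if not json_path.startswith("$."):
        raise ValueError("this sketch supports only paths of the form $.a.b")
    node = response
    for key in json_path[2:].split("."):
        if not isinstance(node, dict) or key not in node:
            return None  # cursor not included in the response
        node = node[key]
    return node  # a JSON null has already been decoded to None


def has_next_page(response: Any, json_path: str) -> bool:
    """Paging continues only while the response carries a non-null cursor."""
    return read_cursor(response, json_path) is not None


path = "$.responseMetaData.nextCursor"
print(has_next_page({"responseMetaData": {"nextCursor": "SAMPLE_CURSOR"}}, path))  # True
print(has_next_page({"responseMetaData": {"nextCursor": None}}, path))             # False
print(has_next_page({"responseMetaData": {}}, path))                               # False
```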

    When the cursor used in the request and the cursor value in the response are the same

    In this case, the response is considered invalid and the ETL Job fails.
    The following message is output to the error log:
    The requested cursor parameters and the response cursor parameters are the same. Please check the request_parameter_cursor_name parameter.

    Item name | Required | Description
    Path to cursor in response data (JSONPath notation) | Yes | Used to retrieve the cursor value from the response data. Enter in JSONPath notation.
    Name of parameter to set cursor on request | Yes | Used at request time. Enter the name of the parameter that carries the cursor received in the previous page's response data.
    Parameter name to set the maximum number of records to be retrieved in one request | No | Used at request time. Enter the name of the parameter that specifies the maximum number of records to retrieve per request. If "Maximum number of records to be retrieved in one request" is not entered, this value is not used.
    Maximum number of records to be retrieved in one request | No | Used at request time. Specify the maximum number of records to retrieve per request. If "Parameter name to set the maximum number of records to be retrieved in one request" is not entered, this value is not used.

    Cursor-based input example

    Here is an input example for cursor-based response data with the following structure:

    {
      "items": [
        { ... },
        { ... },
        ...
      ],
      "responseMetaData": {
        "nextCursor": "SAMPLE_CURSOR",
        ...
      },
      ...
    }
    
    Item name | Value
    Path to cursor in response data (JSONPath notation) | $.responseMetaData.nextCursor
    Name of parameter to set cursor on request | cursor
    Parameter name to set the maximum number of records to be retrieved in one request | limit
    Maximum number of records to be retrieved in one request | 100

    In this case, request parameters such as ?cursor=SAMPLE_CURSOR&limit=100 are added.

    For example, if the source holds 550 records, the request above is executed 6 times.
    The first five responses each contain 100 records, and the sixth response contains 50 records.
    Because no subsequent data exists after the sixth response, it does not contain a cursor, so a seventh request is not executed and data retrieval is complete.
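
The 550-record walkthrough can be simulated with a small sketch. `fake_api` is a hypothetical stand-in for the source service (its cursor is simply the index of the next record), and `fetch_all` assumes the first request is sent without a cursor, which this page does not state explicitly.

```python
from typing import Callable, Optional


def fake_api(cursor: Optional[str], limit: int, total: int = 550) -> dict:
    """Hypothetical API: omits nextCursor once no further pages exist."""
    start = int(cursor) if cursor is not None else 0
    end = min(start + limit, total)
    body = {"items": list(range(start, end)), "responseMetaData": {}}
    if end < total:
        body["responseMetaData"]["nextCursor"] = str(end)
    return body


def fetch_all(api: Callable[[Optional[str], int], dict], limit: int = 100):
    """Repeat requests until the cursor is missing or null; return items and request count."""
    items, request_count, cursor = [], 0, None
    while True:
        response = api(cursor, limit)  # corresponds to ?cursor=...&limit=100
        request_count += 1
        items.extend(response["items"])
        next_cursor = response.get("responseMetaData", {}).get("nextCursor")
        if next_cursor is None:
            break  # no subsequent page: paging is complete
        if next_cursor == cursor:
            # Mirrors the failure case described on this page.
            raise RuntimeError(
                "The requested cursor parameters and the response cursor "
                "parameters are the same."
            )
        cursor = next_cursor
    return items, request_count


items, request_count = fetch_all(fake_api, limit=100)
print(request_count, len(items))  # 6 550
```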

