Data Source - GitHub GraphQL API

    Help page for configuring the GitHub GraphQL API (v4) as a Data Source in ETL Configuration.

    With the GitHub GraphQL API v4, you write a GraphQL query and retrieve its results as JSON records.

    Obtainable data

    Any item that a GraphQL query can retrieve through the GitHub GraphQL API v4.
    You can narrow the query result (JSON) down to the fields you want to transfer by specifying the target path in JSON Path format.

    If the filtered result is an array, each element is imported as one row.
    Example: When a list of Issues is retrieved, specifying the path to the Issues array imports each Issue as a separate row.
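
    The filtering and row-splitting behavior described above can be sketched in Python. This is only an illustration under stated assumptions, not TROCCO's implementation: select_path and to_rows are hypothetical helpers, the response data is made up, and only simple dotted paths such as $.data.repository.issues are handled.

```python
import json


def select_path(document, json_path):
    """Resolve a simple dotted JSON Path such as "$.data.repository.issues"."""
    if not json_path.startswith("$"):
        raise ValueError("path must start with $")
    current = document
    # Skip the leading "$" and walk down one key at a time.
    for key in json_path.split(".")[1:]:
        current = current[key]
    return current


def to_rows(selected):
    """An array becomes one row per element; anything else is a single row."""
    return list(selected) if isinstance(selected, list) else [selected]


# Hypothetical query result containing two issues.
response = json.loads("""
{"data": {"repository": {"issues": [{"number": 1}, {"number": 2}]}}}
""")

rows = to_rows(select_path(response, "$.data.repository.issues"))
for row in rows:
    print(json.dumps(row))
```

    Each printed line corresponds to one imported row. A path that resolves to a single object would produce exactly one row.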

    For items that require paging, use TROCCO's built-in GraphQL variable ($__trocco_githubEndCursor__).
    TROCCO then repeats the API request, advancing the cursor each time.
    See "How Pagination Works" below for details.

    Setting items

    STEP1 Basic settings

    Item: GitHub Connection Configuration
    Required: Yes
    Default value: -
    Contents: Specify the GitHub Connection Configuration registered with TROCCO.

    Item: GraphQL Query
    Required: Yes
    Default value: Query sample available
    Contents: You can check the results of your query in the GitHub GraphQL API Explorer.
    When using pagination, use $__trocco_githubEndCursor__ as a variable to write a request that spans multiple pages.

    Item: Path to be captured (JSON Path)
    Required: Yes
    Default value: -
    Contents: Narrows the JSON result down to the part you want to acquire.

    Example:
    Suppose the following JSON is returned as the result of a GraphQL query.
    {
      "data": {
        "repository": {
          "issues": [{<#issue1>}, {<#issue2>}]
        }
      }
    }

    To extract only the issues, specify their location in JSON Path format.
    $.data.repository.issues
    The output is as follows.
    [{<#issue1>}, {<#issue2>}]
    Because the output is an array, issues 1 and 2 are captured as separate rows.

    Item: Pagination
    Required: Yes
    Default value: Disabled
    Contents: -

    Item: Path of endCursor (JSON Path)
    Required: No (Yes when pagination is enabled)
    Default value: -
    Contents: Specify the path of endCursor in the JSON of the query result.
    Note that pageInfo must be specified in the GraphQL query in order to include endCursor in the results.
    See "How Pagination Works" below for details.

    Example:
    $.data.repository.issues.pageInfo.endCursor

    Item: Path of hasNextPage (JSON Path)
    Required: No (Yes when pagination is enabled)
    Default value: -
    Contents: Specify the path of hasNextPage in the JSON of the query result.
    Note that hasNextPage must be specified in the GraphQL query in order to include hasNextPage in the results.
    See "How Pagination Works" below for details.

    Example:
    $.data.repository.issues.pageInfo.hasNextPage

    The GraphQL Query field is pre-filled with a sample query as its default value.
    (Screenshot: default sample query in the GraphQL Query field.)

    How Pagination Works

    In the GitHub GraphQL API v4, passing a cursor as the after argument sets the position from which edges are returned.
    TROCCO takes the end cursor (endCursor) from each query response and assigns it to the built-in variable $__trocco___githubEndCursor__.
    If a next page exists, TROCCO requests it with the built-in variable passed as the after argument of the query.
    A maximum of 10,000 pages can be retrieved.
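
    The cursor loop described above can be sketched in Python as follows. This is a hedged illustration rather than TROCCO's actual code: fetch_all, lookup, make_page, and fake_run_query are hypothetical names, and the two pages are fabricated in memory so that the sketch runs without calling the GitHub API. Only the 10,000-page limit and the endCursor / hasNextPage fields come from the description above.

```python
def lookup(document, json_path):
    """Resolve a simple dotted JSON Path such as "$.data.repository.issues"."""
    current = document
    for key in json_path.split(".")[1:]:  # skip the leading "$"
        current = current[key]
    return current


def fetch_all(run_query, end_cursor_path, has_next_page_path, max_pages=10_000):
    """Request pages one by one, following endCursor while hasNextPage is true."""
    responses = []
    cursor = None  # no cursor on the first request: start from the beginning
    for _ in range(max_pages):
        response = run_query(cursor)
        responses.append(response)
        if not lookup(response, has_next_page_path):
            break
        # The end cursor of this page becomes the "after" value of the next request.
        cursor = lookup(response, end_cursor_path)
    return responses


def make_page(titles, end_cursor, has_next_page):
    """Build a fake response shaped like an issues query with pageInfo."""
    return {"data": {"repository": {"issues": {
        "edges": [{"node": {"title": title}} for title in titles],
        "pageInfo": {"endCursor": end_cursor, "hasNextPage": has_next_page},
    }}}}


# Fabricated pages, keyed by the cursor that requests them.
FAKE_PAGES = {
    None: make_page(["Issue 1", "Issue 2"], "cursor-1", True),
    "cursor-1": make_page(["Issue 3"], "cursor-2", False),
}


def fake_run_query(after_cursor):
    """Stand-in for an API call that sends the query with the given cursor."""
    return FAKE_PAGES[after_cursor]


pages = fetch_all(
    fake_run_query,
    "$.data.repository.issues.pageInfo.endCursor",
    "$.data.repository.issues.pageInfo.hasNextPage",
)
titles = [
    edge["node"]["title"]
    for page in pages
    for edge in lookup(page, "$.data.repository.issues.edges")
]
print(titles)  # ['Issue 1', 'Issue 2', 'Issue 3']
```

    The loop stops either when hasNextPage is false or when the page limit is reached, whichever comes first.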

    Setting Example

    Retrieving the members of an Organization

    This example retrieves the login (account name), name, and role of each member of an Organization.

    GraphQL Query

    *Replace the login value of organization with the name of your own Organization.

    query {
      organization(login: "primenumber-dev") {
        membersWithRole(first: 2) {
          edges {
            node {
              login
              name
            }
            role
          }
        }
      }
    }
    

    If you check the query results in the GitHub GraphQL API Explorer, you will see the following JSON:

    Query results

    {
      "data": {
        "organization": {
          "membersWithRole": {
            "edges": [
              {
                "node": {
                  "login": "trocco-taro",
                  "name": "TROCCO Taro"
                },
                "role": "ADMIN"
              },
              {
                "node": {
                  "login": "trocco-hanako",
                  "name": "TROCCO Hanako"
                },
                "role": "ADMIN"
              }
            ]
          }
        }
      }
    }
    

    Because the information to capture is per member, specify data > organization > membersWithRole > edges as the JSON Path, as shown below.
    The data is then imported with one record per member.

    Path to be captured (JSON Path)

    $.data.organization.membersWithRole.edges
    Because edges is an array, each element is imported as a single record.

    As a result, the following records are imported.

    Records
    {"node": { "login": "trocco-taro", "name": "TROCCO Taro" }, "role": "ADMIN"}
    {"node": { "login": "trocco-hanako", "name": "TROCCO Hanako" }, "role": "ADMIN"}

    Retrieving all open Issues in a Repository (using pagination)

    GraphQL Query

    query($__trocco__githubEndCursor__:String){
      repository(owner: "<#input your organization>", name: "<#input your repository>") {
        issues(first: 100, states: OPEN, after:$__trocco__githubEndCursor__) {
          edges {
            node {
              title
              url
              number
              updatedAt
            }
          }
          pageInfo {
            endCursor
            hasNextPage
          }
        }
      }
    }
    

    Paths to specify

    Path to be captured: $.data.repository.issues.edges[*].node
    Path of pageInfo.endCursor: $.data.repository.issues.pageInfo.endCursor
    Path of pageInfo.hasNextPage: $.data.repository.issues.pageInfo.hasNextPage
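
    The effect of the [*] wildcard in the capture path can be illustrated with a small Python sketch. It rests on assumptions: select is a hypothetical helper that supports only dotted keys and a trailing [*] on a key, not a full JSON Path implementation, and the response is made-up data shaped like the result of the query above.

```python
def select(document, json_path):
    """Resolve dotted keys; a key ending in "[*]" fans out over every array element."""
    matches = [document]
    for step in json_path.split(".")[1:]:  # skip the leading "$"
        wildcard = step.endswith("[*]")
        key = step[:-3] if wildcard else step
        next_matches = []
        for match in matches:
            value = match[key]
            if wildcard:
                next_matches.extend(value)  # one match per array element
            else:
                next_matches.append(value)
        matches = next_matches
    return matches


# Made-up response shaped like the result of the issues query.
response = {"data": {"repository": {"issues": {
    "edges": [
        {"node": {"title": "First issue", "number": 1}},
        {"node": {"title": "Second issue", "number": 2}},
    ],
    "pageInfo": {"endCursor": "cursor-1", "hasNextPage": False},
}}}}

rows = select(response, "$.data.repository.issues.edges[*].node")
for row in rows:
    print(row)
```

    With edges[*].node, each row is the content of a node itself, so the node wrapper does not appear in the imported records. Pointing the path at edges instead would keep the wrapper in every record.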
