Data Source - GitHub(GraphQL)
  • 27 Dec 2022

Note

This is a machine-translated version of the original Japanese article.
Please understand that some of the information contained on this page may be inaccurate.

Overview

This is a help page for setting up data transfer using GitHub API v4 (GraphQL).

Using GitHub API v4 (GraphQL), you can get various query results as JSON records by writing GraphQL queries.

Importable Data

Any data that can be retrieved with a GraphQL query against GitHub API v4 can be imported.
You can narrow down which part of the query result (JSON) is transferred by specifying the target path in JSON Path format.

If the filtered result is an array, each element is imported as a separate record.
Example: when you retrieve a list of issues, specifying the path to the array of issues imports each issue as its own record.

For items that require paging, trocco can send API requests while advancing the cursor by means of the built-in GraphQL variable ($__trocco__githubEndCursor__).
For details, see "How pagination works" below.

Settings

STEP 1: General Settings

GitHub connection information
  Required: Yes
  Default value: -
  Description: Specify the GitHub connection information registered with trocco.

GraphQL queries
  Required: Yes
  Default value: * Query sample available
  Description: You can check the query results in the GitHub GraphQL API Explorer.
  When using pagination, use the $__trocco__githubEndCursor__ variable to write a request that spans multiple pages.

Path to import (JSON Path)
  Required: Yes
  Default value: -
  Description: Narrows down which part of the JSON result is imported.

  e.g.)
  Suppose the GraphQL query returns the following JSON.
  {
    "data": {
      "repository": {
        "issues": [{<#issue1>}, {<#issue2>}]
      }
    }
  }

  To import only the issues, specify the location of issues in JSON Path format.
  $.data.repository.issues
  The output is as follows.
  [{<#issue1>}, {<#issue2>}]
  Because the output is an array, issue1 and issue2 are imported as separate records.

Pagination
  Required: Yes
  Default value: Disabled
  Description: -

endCursor Path
  Required: No (Yes when pagination is enabled)
  Default value: -
  Description: Specifies the path of endCursor in the JSON of the query result.
  To include endCursor in the result, you must request pageInfo in the GraphQL query.
  For details, see "How pagination works" below.

  e.g.)
  $.data.repository.issues.pageInfo.endCursor

hasNextPage Path
  Required: No (Yes when pagination is enabled)
  Default value: -
  Description: Specifies the path of hasNextPage in the JSON of the query result.
  To include hasNextPage in the result, you must request hasNextPage in the GraphQL query.
  For details, see "How pagination works" below.

  e.g.)
  $.data.repository.issues.pageInfo.hasNextPage

* The GraphQL queries field is pre-filled with a sample query as its default value.

How pagination works

In GitHub GraphQL API v4, passing a cursor as the after argument sets the position from which edges are returned.
trocco reads the end cursor (endCursor) from each query response and assigns its value to the built-in variable $__trocco__githubEndCursor__.
Because the query passes this built-in variable as after, the following request returns the next page when one exists.
At most 10,000 pages can be retrieved.

Configuration Example

Get members belonging to an organization

Let's retrieve the account name (login), name, and role of each member of an organization.

GraphQL queries

* Please change the organization login to your own organization name.

query {
  organization(login: "primenumber-dev") {
    membersWithRole(first: 2) {
      edges {
        node {
          login
          name
        }
        role
      }
    }
  }
}

If you check the query results in the GitHub GraphQL API Explorer, you can see that the JSON looks like this:

Query Results

{
  "data": {
    "organization": {
      "membersWithRole": {
        "edges": [
          {
            "node": {
              "login": "trocco-taro",
              "name": "trocco Taro"
            },
            "role": "ADMIN"
          },
          {
            "node": {
              "login": "trocco-hanako",
              "name": "trocco Hanako"
            },
            "role": "ADMIN"
          }
        ]
      }
    }
  }
}

Since we want to import the information of each member, specify data > organization > membersWithRole > edges as the JSON Path, as shown below.
Each member is then imported as one record.

Specify the path to be ingested (JSON Path)

$.data.organization.membersWithRole.edges
In this case, since edges is an array, each element is imported as one record.

The data is ultimately imported as the following records:

records
{"node": { "login": "trocco-taro", "name": "trocco Taro" }, "role": "ADMIN"}
{"node": { "login": "trocco-hanako", "name": "trocco Hanako" }, "role": "ADMIN"}
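
To make the path-to-records step concrete, here is a minimal Python sketch that applies the path to the query result of this example. It is an illustration under assumptions rather than trocco's internal logic: resolve and to_records are hypothetical helper names, only simple dotted paths are handled, and the field name membersWithRole is taken from the query above.

```python
import json
from typing import Any, List


def resolve(doc: Any, path: str) -> Any:
    """Resolve a simple dotted JSON Path such as '$.data.a.b' (no wildcards)."""
    node = doc
    for key in path.split(".")[1:]:  # skip the leading "$"
        node = node[key]
    return node


def to_records(doc: Any, path: str) -> List[Any]:
    """Return one record per array element, or a single record otherwise."""
    target = resolve(doc, path)
    return target if isinstance(target, list) else [target]


# The query result from this example.
response = {
    "data": {
        "organization": {
            "membersWithRole": {
                "edges": [
                    {"node": {"login": "trocco-taro", "name": "trocco Taro"}, "role": "ADMIN"},
                    {"node": {"login": "trocco-hanako", "name": "trocco Hanako"}, "role": "ADMIN"},
                ]
            }
        }
    }
}

MEMBERS_PATH = "$.data.organization.membersWithRole.edges"

if __name__ == "__main__":
    # Prints one JSON line per member.
    for record in to_records(response, MEMBERS_PATH):
        print(json.dumps(record))
```

If the path pointed at membersWithRole instead of edges, the target would be a single object, so the sketch would return it as one record rather than one record per member.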

Import All Issues in a Repository (via pagination)

GraphQL queries

query($__trocco__githubEndCursor__: String) {
  repository(owner: "<#input your organization>", name: "<#input your repository>") {
    issues(first: 100, states: OPEN, after: $__trocco__githubEndCursor__) {
      edges {
        node {
          title
          url
          number
          updatedAt
        }
      }
      pageInfo {
        endCursor
        hasNextPage
      }
    }
  }
}

Specifying the Path

Path to import: $.data.repository.issues.edges[*].node
endCursor Path: $.data.repository.issues.pageInfo.endCursor
hasNextPage Path: $.data.repository.issues.pageInfo.hasNextPage
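
The import path here uses the [*] wildcard, so each record is the node object itself rather than the enclosing edge. The sketch below shows how the three paths behave on a single page of results. It is a simplified illustration only: select is a hypothetical helper that supports just .name and [*] steps (not full JSON Path), and the sample page contains made-up values.

```python
import re
from typing import Any, List

# One path step: either ".name" or the "[*]" wildcard.
_STEP = re.compile(r"\.([A-Za-z_]\w*)|\[\*\]")


def select(doc: Any, path: str) -> List[Any]:
    """Return every value matched by a path built from '.name' and '[*]' steps."""
    if not path.startswith("$"):
        raise ValueError(f"path must start with '$': {path!r}")
    nodes = [doc]
    pos = 1
    while pos < len(path):
        match = _STEP.match(path, pos)
        if match is None:
            raise ValueError(f"unsupported syntax at offset {pos}: {path!r}")
        if match.group(1) is not None:
            # ".name": descend into that key of every current node.
            nodes = [node[match.group(1)] for node in nodes]
        else:
            # "[*]": replace every current array with its elements.
            nodes = [item for node in nodes for item in node]
        pos = match.end()
    return nodes


# Made-up single page shaped like the response to the query above.
page = {
    "data": {
        "repository": {
            "issues": {
                "edges": [
                    {"node": {"title": "First issue", "number": 1}},
                    {"node": {"title": "Second issue", "number": 2}},
                ],
                "pageInfo": {"endCursor": "cursor-2", "hasNextPage": False},
            }
        }
    }
}

if __name__ == "__main__":
    for issue in select(page, "$.data.repository.issues.edges[*].node"):
        print(issue["number"], issue["title"])
```

Without [*].node, the path $.data.repository.issues.edges would still yield one record per issue, but each record would keep the surrounding node key, as in the member example.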
