Commit cac5d62

committed
Limit commits_level requests to three branches
1 parent d4d5e09 commit cac5d62

2 files changed

Lines changed: 58 additions & 3 deletions

README.md

Lines changed: 1 addition & 1 deletion
@@ -225,7 +225,7 @@ There are two components to each request:
225225
2. `files_editability`: Basic information about how "editable" the CAD files are in this repository.
226226
3. `license`: The license for the repository.
227227
4. `tags`: Aggregated tags for the repository and any associated with the maintainers of that repository.
228-
5. `commits_level`: The hash identifier (contribution `id` for Wikifactory projects) and timestamp of each commit to the repository. This can be used to graph the commit activity level in a frontend visualisation.
228+
5. `commits_level`: The hash identifier (contribution `id` for Wikifactory projects) and timestamp of each commit to the repository. This can be used to graph the commit activity level in a frontend visualisation. **Note:** This is based on commits from the first three detected branches in the repository, including the default branch, because requesting commits across many branches takes a long time and APIs might time out.
229229
6. `issues_level`: Similar to `commits_level`, but for all issues in the repository.
230230

231231
The following is an example request that could be sent to the API for three Wikifactory projects:
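
Review note: the three-branch rule described in the `commits_level` note can be sketched as a small pure function. This is only an illustration under stated assumptions, not the code in this commit: `cap_branches` and its `limit` parameter are hypothetical names, and unlike the committed code it tolerates a default branch that is missing from the detected list.

```python
from typing import List


def cap_branches(branches: List[str], default_branch: str, limit: int = 3) -> List[str]:
    """Return at most `limit` branch names, always keeping `default_branch`.

    Hypothetical helper for illustration; assumes `limit` is at least 1.
    """
    if len(branches) <= limit:
        # Nothing to trim: return a copy of the list unchanged.
        return list(branches)
    # Drop the default branch (if listed) so it is not counted twice.
    others = [name for name in branches if name != default_branch]
    # Keep the first `limit - 1` other branches, then append the default branch.
    return others[: limit - 1] + [default_branch]
```

For example, `cap_branches(["a", "main", "b", "c"], "main")` keeps `a` and `b` and appends `main`, which matches the order produced by the remove, slice, and append steps in `oshminer/GitHub.py`.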

oshminer/GitHub.py

Lines changed: 57 additions & 2 deletions
@@ -334,8 +334,9 @@ async def get_commits_level(project: dict, session) -> dict:
334334
Each commit is a `dict` containing as `str`s: `oid`, `messageHeadline`,
335335
`committedDate`, and `url`.
336336
337-
The commits are aggregated and de-duplicated across all branches up to 100
338-
branches.
337+
The commits are aggregated and de-duplicated across up to 3 branches,
338+
including the default branch. The limit of 3 exists because requests
339+
across more branches would take too long.
339340
"""
340341

341342
#
@@ -387,6 +388,60 @@ async def get_commits_level(project: dict, session) -> dict:
387388
# Add branches from query results to list of branches
388389
for node in branches_results:
389390
branches_list.append(node["name"])
391+
392+
#
393+
# Identify default branch name
394+
#
395+
396+
#
397+
# Determine name of default branch
398+
#
399+
400+
# This can be done with GitHub v4 GraphQL API
401+
query_default_branch = gql(
402+
"""
403+
query ($owner: String!, $name: String!) {
404+
repository(owner: $owner, name: $name) {
405+
defaultBranchRef {
406+
name
407+
}
408+
}
409+
}
410+
"""
411+
)
412+
# Query variables
413+
params: dict = {
414+
"owner": project["owner"],
415+
"name": project["name"]
416+
}
417+
# Execute query on the transport
418+
# `session.execute()` is a coroutine here, so it is awaited
419+
try:
420+
query_default_branch_response: dict = await session.execute(query_default_branch, variable_values = params)
421+
except Exception as exc:
422+
# Very hacky workaround for now:
423+
# When there's an authorisation error such as a bad personal access token,
424+
# GitHub would return a 401 error response. The `gql` library would throw the
425+
# `gql.transport.exceptions.TransportServerError` exception.
426+
# But for some reason `except gql.transport.exceptions.TransportServerError`
427+
# would fail, so for now I'm manually catching the 401 code here.
428+
if getattr(exc, "code", None) == 401:
429+
raise exceptions.BadGitHubTokenError()
430+
else:
431+
# For all other errors, continue throwing an exception to stop execution.
432+
raise
433+
default_branch_name: str = query_default_branch_response["repository"]["defaultBranchRef"]["name"]
434+
435+
#
436+
# Keep 3 branches including default branch
437+
#
438+
439+
if len(branches_list) > 3:
440+
print("Keeping 3 branches to query", file=sys.stderr)
441+
# Keep two branches plus default branch
442+
if default_branch_name in branches_list: branches_list.remove(default_branch_name)
443+
branches_list = branches_list[:2]
444+
branches_list.append(default_branch_name)
390445

391446
#
392447
# Get list of commits in each branch
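
Review note: the docstring says commits are aggregated and de-duplicated across the selected branches, but that step is not visible in this diff. A minimal sketch of one way to do it follows, assuming each commit is a `dict` with the `oid` key named in the docstring. `merge_commits` is a hypothetical name, and keying on `oid` is an assumption about how duplicates are recognised.

```python
from typing import Dict, Iterable, List


def merge_commits(per_branch: Iterable[List[Dict[str, str]]]) -> List[Dict[str, str]]:
    """Combine commit lists from several branches, keeping one entry per `oid`.

    Hypothetical helper for illustration; not the function in this commit.
    """
    seen: Dict[str, Dict[str, str]] = {}
    for commits in per_branch:
        for commit in commits:
            # The first occurrence of an `oid` wins; later duplicates are ignored.
            seen.setdefault(commit["oid"], commit)
    # Dicts preserve insertion order, so commits come back in first-seen order.
    return list(seen.values())
```

Because branches usually share most of their history, capping the branch list at three mostly reduces the number of requests rather than the number of unique commits returned.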
