Skip to content

fix: use webhook payload SHAs in list_changed_files to avoid race condition#1107

Open
myakove wants to merge 2 commits into
mainfrom
fix/issue-1096-webhook-payload-shas
Open

fix: use webhook payload SHAs in list_changed_files to avoid race condition#1107
myakove wants to merge 2 commits into
mainfrom
fix/issue-1096-webhook-payload-shas

Conversation

@myakove

@myakove myakove commented Jun 9, 2026

Copy link
Copy Markdown
Collaborator

PR Summary by Qodo

Fix changed-files diff race by using webhook payload base/head SHAs
🐞 Bug fix 🧪 Tests 🕐 20-40 Minutes

Grey Divider

Walkthroughs

User Description

Summary

Replace live PyGithub API calls with webhook payload SHAs in list_changed_files() to eliminate a race condition where base branch receives new commits between clone and API call.

  • Prefer webhook payload SHAs for pull_request events (no race condition)
  • Fall back to PullRequest API object for non-PR events (issue_comment, check_run, etc.)
  • Store pr_base_sha/pr_head_sha on GithubWebhook instance during process()
  • Remove pull_request parameter from initialize() and list_changed_files()
  • Add symmetric guards for both base and head SHA validation

Closes #1096

AI Description
• Persist PR base/head SHAs from webhook payload to prevent base-branch race conditions.
• Fall back to PullRequest API SHAs for non-PR webhook event types.
• Simplify OWNERS handler initialization by removing PullRequest parameter plumbing.
Diagram
graph TD
  A["GitHub webhook payload"] --> B["GithubWebhook.process"] --> C["Store PR base/head SHAs"] --> D[("Local clone")] --> E["OwnersFileHandler.list_changed_files"] --> F["git diff --name-only"]
  B --> G["PullRequest API (fallback)"] --> C
Loading
High-Level Assessment

Using webhook payload SHAs for pull_request events is the most reliable way to keep the local clone and diff base/head aligned and eliminate the observed race. Alternatives like always querying live PR/base refs or diffing against the current base branch would reintroduce timing drift; passing SHAs through additional parameters instead of storing on the per-request GithubWebhook instance would add plumbing without changing the core correctness.

Grey Divider

File Changes

Bug fix (2)
github_api.py Persist PR base/head SHAs from payload with API fallback +21/-5

Persist PR base/head SHAs from payload with API fallback

• Stores pr_base_sha/pr_head_sha on the GithubWebhook instance during process(), preferring pull_request payload SHAs to avoid timing drift. Falls back to PullRequest base.sha/head.sha for non-pull_request event payloads and updates OwnersFileHandler initialization calls to the new signature.

webhook_server/libs/github_api.py


owners_files_handler.py Read diff SHAs from GithubWebhook; remove PullRequest parameter +10/-11

Read diff SHAs from GithubWebhook; remove PullRequest parameter

• Removes the PullRequest parameter from initialize() and list_changed_files(). list_changed_files() now reads base/head SHAs from github_webhook.pr_base_sha/pr_head_sha and documents the race-condition rationale.

webhook_server/libs/handlers/owners_files_handler.py


Tests (2)
test_github_api.py Add tests for SHA storage from payload and API fallback +107/-0

Add tests for SHA storage from payload and API fallback

• Introduces coverage verifying payload SHAs are preferred for pull_request events and that non-PR events fall back to PullRequest API SHAs. Mocks cloning/handler initialization to focus assertions on SHA persistence behavior.

webhook_server/tests/test_github_api.py


test_owners_files_handler.py Update tests for new handler signatures and SHA source +8/-10

Update tests for new handler signatures and SHA source

• Updates initialize() and list_changed_files() tests to match the removed PullRequest parameter. Adjusts list_changed_files test setup to provide SHAs via the mocked GithubWebhook instance.

webhook_server/tests/test_owners_files_handler.py


Grey Divider

Qodo Logo

@qodo-code-review

qodo-code-review Bot commented Jun 9, 2026

Copy link
Copy Markdown

Code Review by Qodo

🐞 Bugs (3) 📘 Rule violations (2) 📎 Requirement gaps (0)

Context used

Grey Divider


Action required

1. pr_base_sha/pr_head_sha typing incomplete ✓ Resolved 📘 Rule violation ⚙ Maintainability
Description
GithubWebhook sets pr_base_sha/pr_head_sha only inside process() (and only one branch uses
inline annotations), but the attributes are not declared on the class or in __init__, which breaks
strict mypy expectations and can cause attr-defined errors where these fields are accessed (e.g.,
in OwnersFileHandler). This reduces type safety and can fail CI type checks.
Code

webhook_server/libs/github_api.py[R599-614]

+            # Store PR SHAs: prefer webhook payload (avoids race condition with live API)
+            # Fall back to PullRequest object for non-pull_request events (issue_comment, check_run, etc.)
+            pr_payload = self.hook_data.get("pull_request")
+            if (
+                isinstance(pr_payload, dict)
+                and isinstance(pr_payload.get("base"), dict)
+                and isinstance(pr_payload.get("head"), dict)
+            ):
+                # pull_request event — base.sha and head.sha guaranteed by GitHub webhook spec
+                self.pr_base_sha: str = pr_payload["base"]["sha"]
+                self.pr_head_sha: str = pr_payload["head"]["sha"]
+            else:
+                self.pr_base_sha, self.pr_head_sha = await asyncio.gather(
+                    github_api_call(lambda: pull_request.base.sha, logger=self.logger, log_prefix=self.log_prefix),
+                    github_api_call(lambda: pull_request.head.sha, logger=self.logger, log_prefix=self.log_prefix),
+                )
Evidence
Compliance rule 9 requires complete type hints under strict mypy. The PR introduces new instance
attributes (pr_base_sha/pr_head_sha) but does not declare them in __init__ or as class-level
annotations; instead they are conditionally created in process(), and only the payload branch uses
inline annotations, which is not sufficient for strict attribute typing.

CLAUDE.md: Use Complete Type Hints (Mypy Strict Mode)
webhook_server/libs/github_api.py[599-614]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`GithubWebhook.pr_base_sha`/`GithubWebhook.pr_head_sha` are introduced but not declared as attributes on the class (or in `__init__`), and only one assignment branch includes an inline annotation. In strict mypy mode this commonly triggers `attr-defined`/incomplete attribute typing issues.

## Issue Context
These attributes are later read by `OwnersFileHandler.list_changed_files()`, so the class should explicitly declare them (preferably via class-level annotations without fake default values).

## Fix Focus Areas
- webhook_server/libs/github_api.py[109-134]
- webhook_server/libs/github_api.py[599-614]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


2. Defensive pr_payload.get checks 📘 Rule violation ≡ Correctness
Description
The new SHA selection logic uses .get()/isinstance()/"sha" in ... guards and silently falls
back to live PR API SHAs when webhook payload fields are missing. This hides malformed
pull_request payloads instead of failing fast per the webhook spec expectations.
Code

webhook_server/libs/github_api.py[R601-613]

+            pr_payload = self.hook_data.get("pull_request", {})
+            if (
+                isinstance(pr_payload, dict)
+                and "sha" in pr_payload.get("base", {})
+                and "sha" in pr_payload.get("head", {})
+            ):
+                self.pr_base_sha: str = pr_payload["base"]["sha"]
+                self.pr_head_sha: str = pr_payload["head"]["sha"]
+            else:
+                self.pr_base_sha, self.pr_head_sha = await asyncio.gather(
+                    github_api_call(lambda: pull_request.base.sha, logger=self.logger, log_prefix=self.log_prefix),
+                    github_api_call(lambda: pull_request.head.sha, logger=self.logger, log_prefix=self.log_prefix),
+                )
Evidence
The checklist requires failing fast on malformed webhook payloads and discourages unnecessary
defensive programming for required webhook fields. The added logic uses .get() and key-existence
checks and then falls back to API-derived SHAs, which prevents malformed payloads from surfacing as
errors.

CLAUDE.md: Webhook Payload Handling Must Follow the GitHub Webhook Specification (Fail on Malformed Payloads)
CLAUDE.md: Avoid Unnecessary Defensive Programming (Fail-Fast by Default)
webhook_server/libs/github_api.py[601-613]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The code defensively checks for `pull_request.base.sha`/`pull_request.head.sha` in the webhook payload and falls back to live API values when missing. For `pull_request` events, these fields are required and malformed payloads should fail fast (raise `KeyError`/`TypeError`) rather than being silently masked.

## Issue Context
This defensive fallback can hide webhook payload/schema problems and undermines the guarantee that required webhook fields are present per spec.

## Fix Focus Areas
- webhook_server/libs/github_api.py[601-613]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


3. Payload SHAs not fetched 🐞 Bug ☼ Reliability
Description
OwnersFileHandler.list_changed_files() now diffs using GithubWebhook.pr_base_sha/pr_head_sha from
the webhook payload, but _clone_repository() still fetches only the current base ref and current PR
ref from the live PullRequest object. If the payload SHAs aren’t present in the local clone (e.g.,
PR retarget/force-push before processing), git diff fails and list_changed_files raises
RuntimeError, aborting initialization/processing.
Code

webhook_server/libs/handlers/owners_files_handler.py[R102-109]

+        # SHAs are stored on the GithubWebhook instance during process():
+        # - From webhook payload for pull_request events (avoids race condition with live API)
+        # - From PullRequest object for other event types (issue_comment, check_run, etc.)
+        base_sha = self.github_webhook.pr_base_sha
+        head_sha = self.github_webhook.pr_head_sha

        # Run git diff command on cloned repository
        # Quote clone_repo_dir to handle paths with spaces or special characters
Evidence
The PR changes list_changed_files to use stored webhook SHAs, but the clone/fetch logic still
fetches refs based on the live PullRequest object (base ref + PR head ref). Since git diff requires
both SHAs to resolve locally and the code raises RuntimeError on failure, a mismatch between fetched
refs and payload SHAs breaks processing.

webhook_server/libs/github_api.py[595-614]
webhook_server/libs/handlers/owners_files_handler.py[102-147]
webhook_server/libs/github_api.py[360-388]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`OwnersFileHandler.list_changed_files()` uses `github_webhook.pr_base_sha/pr_head_sha` (often sourced from webhook payload) to run `git diff`. However, `_clone_repository()` does not ensure those exact SHAs exist in the local clone; it fetches the current base branch ref and current PR ref from the live PR object.

When the payload SHAs aren’t reachable from the fetched refs, `git diff {base}...{head}` fails and the handler raises `RuntimeError`, stopping webhook processing.

### Issue Context
- SHAs are now stored from webhook payload in `GithubWebhook.process()`.
- Cloning/fetching still uses `pull_request.base.ref` and `refs/pull/<n>/head`, which may not contain the payload SHAs.

### Fix Focus Areas
- webhook_server/libs/github_api.py[595-620]
- webhook_server/libs/github_api.py[360-388]
- webhook_server/libs/handlers/owners_files_handler.py[102-147]

### Implementation sketch
1. After setting `self.pr_base_sha/self.pr_head_sha` (and before invoking `OwnersFileHandler.initialize()`), ensure the clone contains these commits:
  - Option A (preferred): update `_clone_repository()` to fetch by SHA in addition to refs, e.g. `git fetch origin <base_sha> <head_sha>` (or `git fetch origin <sha>` for each) when `self.pr_base_sha/pr_head_sha` are set.
  - Option B: in `list_changed_files()`, verify both SHAs exist via `git cat-file -e <sha>^{commit}`; if missing, `git fetch origin <sha>` and retry diff.
2. Keep the existing webhook-payload preference, but make local availability deterministic so `git diff` doesn’t fail due to missing objects.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Remediation recommended

4. Empty default pr_*_sha 📘 Rule violation ⚙ Maintainability ⭐ New
Description
pr_base_sha/pr_head_sha are initialized to empty strings, which are placeholder defaults that
can mask missing/invalid state if these fields are used before being set. This weakens fail-fast
behavior and can lead to confusing downstream errors (e.g., git diff against empty SHAs).
Code

webhook_server/libs/github_api.py[R118-119]

+        self.pr_base_sha: str = ""
+        self.pr_head_sha: str = ""
Evidence
The compliance rule forbids fabricated defaults such as empty strings for required/architecturally
guaranteed data; the new code initializes both SHA fields to "", which can conceal missing
initialization and delay failures to later stages.

CLAUDE.md: Eliminate unnecessary defensive programming; use fail-fast errors instead of fake defaults
webhook_server/libs/github_api.py[118-119]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`GithubWebhook.pr_base_sha`/`pr_head_sha` are initialized with fake defaults (`""`). Per compliance, missing required data should fail fast rather than being masked by placeholder values.

## Issue Context
These SHAs are intended to be populated during `process()`. Initializing them to `None` (and validating before use) prevents accidental use of empty SHAs and makes failures explicit.

## Fix Focus Areas
- webhook_server/libs/github_api.py[118-119]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


5. Missing sha key validation 🐞 Bug ☼ Reliability ⭐ New
Description
In GithubWebhook.process(), the branch that prefers webhook payload SHAs only checks that base/head
are dicts, then directly indexes pr_payload["base"]["sha"] and pr_payload["head"]["sha"]. If a
non-conforming payload includes base/head dicts but omits "sha", webhook processing will raise
KeyError and abort instead of falling back to the PullRequest API SHAs.
Code

webhook_server/libs/github_api.py[R610-611]

+                self.pr_base_sha = pr_payload["base"]["sha"]
+                self.pr_head_sha = pr_payload["head"]["sha"]
Evidence
The code path checks base/head are dicts but does not verify the presence of the sha key
before direct indexing, which can raise KeyError when sha is absent.

webhook_server/libs/github_api.py[601-616]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`GithubWebhook.process()` directly indexes `pr_payload["base"]["sha"]` / `pr_payload["head"]["sha"]` after only checking that `base` and `head` are dicts. If `sha` is missing in either dict (malformed/partial payload), this raises `KeyError` and aborts processing.

## Issue Context
The code intends to prefer webhook payload SHAs to avoid a race, while falling back to PullRequest API SHAs for non-PR/partial payloads.

## Fix Focus Areas
- webhook_server/libs/github_api.py[603-616]

## Suggested fix
- Extend the `if` condition to also validate that:
 - `"sha" in pr_payload["base"]` and `"sha" in pr_payload["head"]`
 - and that both values are non-empty strings.
- Otherwise, take the existing fallback path that fetches SHAs from the `PullRequest` object (or raise a clear, logged error if you prefer fail-fast).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


6. Fragile payload SHA check 🐞 Bug ☼ Reliability
Description
GithubWebhook.process uses membership tests like 'sha' in pr_payload.get('base', {}) without
validating that base/head are dicts; if those fields are present but non-mapping (e.g., None),
the membership test can raise TypeError and abort processing. Because webhook signature verification
is conditional, malformed payload shapes can reach this code path and crash early.
Code

webhook_server/libs/github_api.py[R601-609]

+            pr_payload = self.hook_data.get("pull_request", {})
+            if (
+                isinstance(pr_payload, dict)
+                and "sha" in pr_payload.get("base", {})
+                and "sha" in pr_payload.get("head", {})
+            ):
+                self.pr_base_sha: str = pr_payload["base"]["sha"]
+                self.pr_head_sha: str = pr_payload["head"]["sha"]
+            else:
Evidence
The new conditional uses "sha" in pr_payload.get("base", {}) / ...get("head", {}) and then
indexes into pr_payload["base"]["sha"], which can throw if base/head aren’t dicts. The webhook
endpoint only verifies signatures when a secret is configured, so malformed payloads can reach this
logic.

webhook_server/libs/github_api.py[599-609]
webhook_server/app.py[384-393]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The new PR-SHA selection logic assumes `pull_request.base` and `pull_request.head` are dict-like when doing membership tests and indexing. If either is present but not a dict (e.g., `None`), `"sha" in ...` can raise `TypeError`, or later indexing can fail.

### Issue Context
Webhook signature verification is optional (only when `webhook-secret` is configured), so the service should be resilient to malformed payload shapes.

### Fix Focus Areas
- webhook_server/libs/github_api.py[599-614]
- webhook_server/app.py[384-393]

### Implementation sketch
Replace the membership tests with explicit type-safe extraction:
```py
pr_payload = self.hook_data.get("pull_request")
base = pr_payload.get("base") if isinstance(pr_payload, dict) else None
head = pr_payload.get("head") if isinstance(pr_payload, dict) else None
base_sha = base.get("sha") if isinstance(base, dict) else None
head_sha = head.get("sha") if isinstance(head, dict) else None

if isinstance(base_sha, str) and base_sha and isinstance(head_sha, str) and head_sha:
   self.pr_base_sha = base_sha
   self.pr_head_sha = head_sha
else:
   # current API fallback
```
Optionally: validate SHA format (hex) before accepting it.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

Previous review results

Review updated until commit 5a40c0f

Results up to commit 7ff7d45


🐞 Bugs (2) 📘 Rule violations (1) 📎 Requirement gaps (0) 🎨 UX issues (0)


Action required
1. Payload SHAs not fetched 🐞 Bug ☼ Reliability
Description
OwnersFileHandler.list_changed_files() now diffs using GithubWebhook.pr_base_sha/pr_head_sha from
the webhook payload, but _clone_repository() still fetches only the current base ref and current PR
ref from the live PullRequest object. If the payload SHAs aren’t present in the local clone (e.g.,
PR retarget/force-push before processing), git diff fails and list_changed_files raises
RuntimeError, aborting initialization/processing.
Code

webhook_server/libs/handlers/owners_files_handler.py[R102-109]

+        # SHAs are stored on the GithubWebhook instance during process():
+        # - From webhook payload for pull_request events (avoids race condition with live API)
+        # - From PullRequest object for other event types (issue_comment, check_run, etc.)
+        base_sha = self.github_webhook.pr_base_sha
+        head_sha = self.github_webhook.pr_head_sha

        # Run git diff command on cloned repository
        # Quote clone_repo_dir to handle paths with spaces or special characters
Evidence
The PR changes list_changed_files to use stored webhook SHAs, but the clone/fetch logic still
fetches refs based on the live PullRequest object (base ref + PR head ref). Since git diff requires
both SHAs to resolve locally and the code raises RuntimeError on failure, a mismatch between fetched
refs and payload SHAs breaks processing.

webhook_server/libs/github_api.py[595-614]
webhook_server/libs/handlers/owners_files_handler.py[102-147]
webhook_server/libs/github_api.py[360-388]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`OwnersFileHandler.list_changed_files()` uses `github_webhook.pr_base_sha/pr_head_sha` (often sourced from webhook payload) to run `git diff`. However, `_clone_repository()` does not ensure those exact SHAs exist in the local clone; it fetches the current base branch ref and current PR ref from the live PR object.

When the payload SHAs aren’t reachable from the fetched refs, `git diff {base}...{head}` fails and the handler raises `RuntimeError`, stopping webhook processing.

### Issue Context
- SHAs are now stored from webhook payload in `GithubWebhook.process()`.
- Cloning/fetching still uses `pull_request.base.ref` and `refs/pull/<n>/head`, which may not contain the payload SHAs.

### Fix Focus Areas
- webhook_server/libs/github_api.py[595-620]
- webhook_server/libs/github_api.py[360-388]
- webhook_server/libs/handlers/owners_files_handler.py[102-147]

### Implementation sketch
1. After setting `self.pr_base_sha/self.pr_head_sha` (and before invoking `OwnersFileHandler.initialize()`), ensure the clone contains these commits:
  - Option A (preferred): update `_clone_repository()` to fetch by SHA in addition to refs, e.g. `git fetch origin <base_sha> <head_sha>` (or `git fetch origin <sha>` for each) when `self.pr_base_sha/pr_head_sha` are set.
  - Option B: in `list_changed_files()`, verify both SHAs exist via `git cat-file -e <sha>^{commit}`; if missing, `git fetch origin <sha>` and retry diff.
2. Keep the existing webhook-payload preference, but make local availability deterministic so `git diff` doesn’t fail due to missing objects.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


2. Defensive pr_payload.get checks 📘 Rule violation ≡ Correctness
Description
The new SHA selection logic uses .get()/isinstance()/"sha" in ... guards and silently falls
back to live PR API SHAs when webhook payload fields are missing. This hides malformed
pull_request payloads instead of failing fast per the webhook spec expectations.
Code

webhook_server/libs/github_api.py[R601-613]

+            pr_payload = self.hook_data.get("pull_request", {})
+            if (
+                isinstance(pr_payload, dict)
+                and "sha" in pr_payload.get("base", {})
+                and "sha" in pr_payload.get("head", {})
+            ):
+                self.pr_base_sha: str = pr_payload["base"]["sha"]
+                self.pr_head_sha: str = pr_payload["head"]["sha"]
+            else:
+                self.pr_base_sha, self.pr_head_sha = await asyncio.gather(
+                    github_api_call(lambda: pull_request.base.sha, logger=self.logger, log_prefix=self.log_prefix),
+                    github_api_call(lambda: pull_request.head.sha, logger=self.logger, log_prefix=self.log_prefix),
+                )
Evidence
The checklist requires failing fast on malformed webhook payloads and discourages unnecessary
defensive programming for required webhook fields. The added logic uses .get() and key-existence
checks and then falls back to API-derived SHAs, which prevents malformed payloads from surfacing as
errors.

CLAUDE.md: Webhook Payload Handling Must Follow the GitHub Webhook Specification (Fail on Malformed Payloads)
CLAUDE.md: Avoid Unnecessary Defensive Programming (Fail-Fast by Default)
webhook_server/libs/github_api.py[601-613]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The code defensively checks for `pull_request.base.sha`/`pull_request.head.sha` in the webhook payload and falls back to live API values when missing. For `pull_request` events, these fields are required and malformed payloads should fail fast (raise `KeyError`/`TypeError`) rather than being silently masked.

## Issue Context
This defensive fallback can hide webhook payload/schema problems and undermines the guarantee that required webhook fields are present per spec.

## Fix Focus Areas
- webhook_server/libs/github_api.py[601-613]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Remediation recommended
3. Fragile payload SHA check 🐞 Bug ☼ Reliability
Description
GithubWebhook.process uses membership tests like 'sha' in pr_payload.get('base', {}) without
validating that base/head are dicts; if those fields are present but non-mapping (e.g., None),
the membership test can raise TypeError and abort processing. Because webhook signature verification
is conditional, malformed payload shapes can reach this code path and crash early.
Code

webhook_server/libs/github_api.py[R601-609]

+            pr_payload = self.hook_data.get("pull_request", {})
+            if (
+                isinstance(pr_payload, dict)
+                and "sha" in pr_payload.get("base", {})
+                and "sha" in pr_payload.get("head", {})
+            ):
+                self.pr_base_sha: str = pr_payload["base"]["sha"]
+                self.pr_head_sha: str = pr_payload["head"]["sha"]
+            else:
Evidence
The new conditional uses "sha" in pr_payload.get("base", {}) / ...get("head", {}) and then
indexes into pr_payload["base"]["sha"], which can throw if base/head aren’t dicts. The webhook
endpoint only verifies signatures when a secret is configured, so malformed payloads can reach this
logic.

webhook_server/libs/github_api.py[599-609]
webhook_server/app.py[384-393]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The new PR-SHA selection logic assumes `pull_request.base` and `pull_request.head` are dict-like when doing membership tests and indexing. If either is present but not a dict (e.g., `None`), `"sha" in ...` can raise `TypeError`, or later indexing can fail.

### Issue Context
Webhook signature verification is optional (only when `webhook-secret` is configured), so the service should be resilient to malformed payload shapes.

### Fix Focus Areas
- webhook_server/libs/github_api.py[599-614]
- webhook_server/app.py[384-393]

### Implementation sketch
Replace the membership tests with explicit type-safe extraction:
```py
pr_payload = self.hook_data.get("pull_request")
base = pr_payload.get("base") if isinstance(pr_payload, dict) else None
head = pr_payload.get("head") if isinstance(pr_payload, dict) else None
base_sha = base.get("sha") if isinstance(base, dict) else None
head_sha = head.get("sha") if isinstance(head, dict) else None

if isinstance(base_sha, str) and base_sha and isinstance(head_sha, str) and head_sha:
   self.pr_base_sha = base_sha
   self.pr_head_sha = head_sha
else:
   # current API fallback
```
Optionally: validate SHA format (hex) before accepting it.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Results up to commit 92438ea


🐞 Bugs (0) 📘 Rule violations (0) 📎 Requirement gaps (0) 🎨 UX issues (0)


Action required
1. pr_base_sha/pr_head_sha typing incomplete ✓ Resolved 📘 Rule violation ⚙ Maintainability
Description
GithubWebhook sets pr_base_sha/pr_head_sha only inside process() (and only one branch uses
inline annotations), but the attributes are not declared on the class or in __init__, which breaks
strict mypy expectations and can cause attr-defined errors where these fields are accessed (e.g.,
in OwnersFileHandler). This reduces type safety and can fail CI type checks.
Code

webhook_server/libs/github_api.py[R599-614]

+            # Store PR SHAs: prefer webhook payload (avoids race condition with live API)
+            # Fall back to PullRequest object for non-pull_request events (issue_comment, check_run, etc.)
+            pr_payload = self.hook_data.get("pull_request")
+            if (
+                isinstance(pr_payload, dict)
+                and isinstance(pr_payload.get("base"), dict)
+                and isinstance(pr_payload.get("head"), dict)
+            ):
+                # pull_request event — base.sha and head.sha guaranteed by GitHub webhook spec
+                self.pr_base_sha: str = pr_payload["base"]["sha"]
+                self.pr_head_sha: str = pr_payload["head"]["sha"]
+            else:
+                self.pr_base_sha, self.pr_head_sha = await asyncio.gather(
+                    github_api_call(lambda: pull_request.base.sha, logger=self.logger, log_prefix=self.log_prefix),
+                    github_api_call(lambda: pull_request.head.sha, logger=self.logger, log_prefix=self.log_prefix),
+                )
Evidence
Compliance rule 9 requires complete type hints under strict mypy. The PR introduces new instance
attributes (pr_base_sha/pr_head_sha) but does not declare them in __init__ or as class-level
annotations; instead they are conditionally created in process(), and only the payload branch uses
inline annotations, which is not sufficient for strict attribute typing.

CLAUDE.md: Use Complete Type Hints (Mypy Strict Mode)
webhook_server/libs/github_api.py[599-614]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`GithubWebhook.pr_base_sha`/`GithubWebhook.pr_head_sha` are introduced but not declared as attributes on the class (or in `__init__`), and only one assignment branch includes an inline annotation. In strict mypy mode this commonly triggers `attr-defined`/incomplete attribute typing issues.

## Issue Context
These attributes are later read by `OwnersFileHandler.list_changed_files()`, so the class should explicitly declare them (preferably via class-level annotations without fake default values).

## Fix Focus Areas
- webhook_server/libs/github_api.py[109-134]
- webhook_server/libs/github_api.py[599-614]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Qodo Logo

@myakove-bot

Copy link
Copy Markdown
Collaborator

Report bugs in Issues

Welcome! 🎉

This pull request will be automatically processed with the following features:

🔄 Automatic Actions

  • Reviewer Assignment: Reviewers are automatically assigned based on the OWNERS file in the repository root
  • Size Labeling: PR size labels (XS, S, M, L, XL, XXL) are automatically applied based on changes
  • Issue Creation: Disabled for this repository
  • Pre-commit Checks: pre-commit runs automatically if .pre-commit-config.yaml exists
  • Branch Labeling: Branch-specific labels are applied to track the target branch
  • Auto-verification: Auto-verified users have their PRs automatically marked as verified
  • Labels: All label categories are enabled (default configuration)

📋 Available Commands

PR Status Management

  • /wip - Mark PR as work in progress (adds WIP: prefix to title)
  • /wip cancel - Remove work in progress status
  • /hold - Block PR merging (approvers only)
  • /hold cancel - Unblock PR merging
  • /verified - Mark PR as verified
  • /verified cancel - Remove verification status
  • /reprocess - Trigger complete PR workflow reprocessing (useful if webhook failed or configuration changed)
  • /regenerate-welcome - Regenerate this welcome message

Review & Approval

  • /lgtm - Approve changes (looks good to me)
  • /approve - Approve PR (approvers only)
  • /automerge - Enable automatic merging when all requirements are met (maintainers and approvers only)
  • /assign-reviewers - Assign reviewers based on OWNERS file
  • /assign-reviewer @username - Assign specific reviewer
  • /check-can-merge - Check if PR meets merge requirements

Testing & Validation

  • /retest tox - Run Python test suite with tox
  • /retest build-container - Rebuild and test container image
  • /retest python-module-install - Test Python package installation
  • /retest pre-commit - Run pre-commit hooks and checks
  • /retest conventional-title - Validate commit message format
  • /retest all - Run all available tests

Container Operations

  • /build-and-push-container - Build and push container image (tagged with PR number)
    • Supports additional build arguments: /build-and-push-container --build-arg KEY=value

Cherry-pick Operations

  • /cherry-pick <branch> - Schedule cherry-pick to target branch when PR is merged
    • Multiple branches: /cherry-pick branch1 branch2 branch3

Label Management

  • /<label-name> - Add a label to the PR
  • /<label-name> cancel - Remove a label from the PR

✅ Merge Requirements

This PR will be automatically approved when the following conditions are met:

  1. Approval: /approve from at least one approver
  2. LGTM Count: Minimum 1 /lgtm from reviewers
  3. Status Checks: All required status checks must pass
  4. No Blockers: No wip, hold, has-conflicts labels and PR must be mergeable (no conflicts)
  5. Verified: PR must be marked as verified

📊 Review Process

Approvers and Reviewers

Approvers:

  • myakove
  • rnetser

Reviewers:

  • myakove
  • rnetser
Available Labels
  • hold
  • verified
  • wip
  • lgtm
  • approve
  • automerge
AI Features
  • Conventional Title: Mode: fix (claude/claude-opus-4-6[1m])
  • Cherry-Pick Conflict Resolution: Enabled (claude/claude-opus-4-6[1m])
  • Test Oracle: Triggers: approved (claude/claude-opus-4-6[1m]); /test-oracle can be used anytime

💡 Tips

  • WIP Status: Use /wip when your PR is not ready for review
  • Verification: The verified label is removed on new commits unless the push is detected as a clean rebase
  • Cherry-picking: Cherry-pick labels are processed when the PR is merged
  • Container Builds: Container images are automatically tagged with the PR number
  • Permission Levels: Some commands require approver permissions
  • Auto-verified Users: Certain users have automatic verification and merge privileges

For more information, please refer to the project documentation or contact the maintainers.

Comment thread webhook_server/libs/github_api.py Outdated
Comment thread webhook_server/libs/handlers/owners_files_handler.py
Comment thread webhook_server/libs/github_api.py Outdated
…dition

- Replace live PyGithub API calls with webhook payload SHAs for pull_request events
- Fall back to API for non-PR events (issue_comment, check_run, etc.)
- Store pr_base_sha/pr_head_sha on GithubWebhook instance during process()
- Remove pull_request parameter from initialize() and list_changed_files()
- Add symmetric guards for both base and head SHA validation

Closes #1096
@myakove myakove force-pushed the fix/issue-1096-webhook-payload-shas branch from 7ff7d45 to 92438ea Compare June 9, 2026 10:36
@qodo-code-review

qodo-code-review Bot commented Jun 9, 2026

Copy link
Copy Markdown

Code review by qodo was updated up to the latest commit 92438ea

@myakove

myakove commented Jun 9, 2026

Copy link
Copy Markdown
Collaborator Author

@qodo-code-review[bot]

The following review comments were reviewed and a decision was made:

webhook_server/libs/github_api.py:601 (qodo rule violation) — Defensive pr_payload.get checks

Addressed: Fixed in commit 92438ea — replaced .get()/"sha" in guards with isinstance() dict checks (defensive only for non-PR events) followed by direct ["sha"] access (fail-fast for PR events where sha is guaranteed by GitHub webhook spec). This aligns with AGENTS.md anti-defensive programming: defensive checks only for genuinely optional parameters (non-PR event types), fail-fast for guaranteed fields.

webhook_server/libs/handlers/owners_files_handler.py:102 (qodo bug) — Payload SHAs not fetched

Addressed: This is a pre-existing race condition that existed before this PR — the old code also used SHAs that could become stale on force-push. The _clone_repository() fetches refs/pull/N/head which contains the head SHA, and the base SHA is reachable via the fetched base ref. The existing git diff error handling already catches and raises RuntimeError if SHAs aren't reachable. Added a docstring note documenting this as a known limitation. Updated issue #1096 spec.

webhook_server/libs/github_api.py:601 (qodo bug) — Fragile payload SHA check

Addressed: Fixed in the same commit 92438ea — the isinstance(pr_payload.get("base"), dict) and isinstance(pr_payload.get("head"), dict) checks handle the case where base/head are non-dict (including None). Only when both are dicts do we access ["sha"] directly.

Comment thread webhook_server/libs/github_api.py
Declare pr_base_sha and pr_head_sha in __init__() with empty string defaults
so mypy strict mode has clear type declarations. Remove redundant inline
annotations in process() since the class-level ones cover typing.
@qodo-code-review

qodo-code-review Bot commented Jun 9, 2026

Copy link
Copy Markdown

Code review by qodo was updated up to the latest commit 5a40c0f

Comment thread webhook_server/libs/github_api.py
Comment thread webhook_server/libs/github_api.py
@myakove

myakove commented Jun 9, 2026

Copy link
Copy Markdown
Collaborator Author

@qodo-code-review[bot]

The following review comments were reviewed and a decision was made:

webhook_server/libs/github_api.py:118 (qodo rule violation) — Empty default pr_*_sha

Addressed: By design — empty string defaults provide mypy strict type safety. Using None would require Optional[str] and None-checks everywhere, violating anti-defensive programming. Attributes are always overwritten in process() before any handler reads them.

webhook_server/libs/github_api.py:610 (qodo bug) — Missing sha key validation

Addressed: Already addressed — isinstance(pr_payload.get('base'), dict) guards validate structure, then direct ['sha'] access is fail-fast per AGENTS.md. GitHub webhook spec guarantees these fields for PR events.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fix: list_changed_files uses live API SHAs instead of webhook payload SHAs

2 participants