
fix(backfill): drop unparseable timestamps from the latestDate reduction #1715

Open
RenzoMXD wants to merge 1 commit into JSONbored:main from RenzoMXD:fix/backfill-latest-date-guard

Conversation

@RenzoMXD

Contributor

Summary

latestDate (src/github/backfill.ts) selects the contributor's latest activity timestamp with a lexicographic .sort().at(-1) and only a truthiness filter:

function latestDate(values: Array<string | null | undefined>): string | undefined {
  return values.filter(Boolean).sort().at(-1) ?? undefined;
}

A malformed or sentinel GitHub timestamp whose first character sorts after "2" (e.g. "bad-date", "pending", "not-a-date") therefore wins over a real 2026-... ISO stamp and is persisted into contributor_repo_stats.last_activity_at. Every downstream freshness check then evaluates Date.parse("bad-date") -> NaN and treats a genuinely active repo as stale/missing.
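The ordering problem can be reproduced in isolation. The snippet below is a minimal sketch, not code from the repository: the input array and the variable names are made up for illustration, and it only assumes the default string comparison of Array.prototype.sort.

```typescript
// Illustrative inputs only; "bad-date" stands in for any malformed value.
const candidates: string[] = [
  "2026-05-24T00:00:00Z",
  "bad-date",
  "2026-01-02T00:00:00Z",
];

// The default sort compares strings by UTF-16 code units,
// so "b" (0x62) sorts after "2" (0x32).
const sortedCandidates = [...candidates].sort();
const lexicographicMax = sortedCandidates[sortedCandidates.length - 1];

// The malformed value ends up last, and parsing it yields NaN.
console.log(lexicographicMax);
console.log(Number.isNaN(Date.parse(lexicographicMax)));
```

Any truthy string that begins with a character above the digits (most letters, for instance) would outrank an ISO stamp in the same way.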

Fix

Mirror the guarded canonical form already in src/signals/data-quality.ts (newest()/oldest()):

return values?.filter((value): value is string => Boolean(value && Number.isFinite(Date.parse(value)))).sort().at(-1);

Also require Number.isFinite(Date.parse(value)) in latestDate's filter, so unparseable values are dropped from the reduction. No behavior change when every value is a well-formed ISO timestamp.
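As a rough sketch of what the guarded reduction might look like when the predicate is pulled out into a named function: the names isParseableDateString and latestDateGuarded are hypothetical and chosen for illustration, and the actual change in src/github/backfill.ts may be written inline instead.

```typescript
// Hypothetical helper: true only for non-empty strings that Date.parse accepts.
function isParseableDateString(value: string | null | undefined): value is string {
  return typeof value === "string" && value.length > 0 && Number.isFinite(Date.parse(value));
}

// Hypothetical stand-in for the fixed latestDate: drop unparseable values,
// then take the lexicographic maximum of what remains.
function latestDateGuarded(values: Array<string | null | undefined>): string | undefined {
  const parseable = values.filter(isParseableDateString).sort();
  return parseable.length > 0 ? parseable[parseable.length - 1] : undefined;
}

const latest = latestDateGuarded([
  "2026-01-02T00:00:00Z",
  "bad-date",
  null,
  "2026-05-24T00:00:00Z",
]);
console.log(latest);
```

The lexicographic maximum matches chronological order only when the surviving strings share one format, such as UTC ISO 8601 stamps ending in Z. The guard removes unparseable values but does not normalize mixed formats.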

Regression proof

An existing test (test/unit/backfill.test.ts) feeds a fixture mixing updatedAt: "bad-date" with valid ISO stamps and previously asserted lastActivityAt: "bad-date", which encoded the bug. That assertion is corrected to the real latest stamp (2026-05-24T00:00:00Z). The flip is the regression proof: pre-fix, the malformed value won the lexicographic max; post-fix, it is dropped.

Scope

Validation

  • git diff --check
  • npm run actionlint
  • npm run db:migrations:check
  • npm run typecheck
  • npm run test:coverage -- test/unit/backfill.test.ts: 94/94 pass; the changed filter is covered (both arms: the malformed bad-date is dropped, the valid ISO stamps are kept).

Targeted run:

npx vitest run test/unit/backfill.test.ts
# 94/94 passed

Safety

  • No secrets, wallet details, hotkeys, coldkeys, user PATs, private keys, raw trust scores, private rankings, or private maintainer evidence are exposed.
  • Public GitHub text stays sanitized, low-noise, and does not imply compensation guarantees or optimization tactics.
  • No auth, cookie, CORS, GitHub App, Cloudflare, or session changes -- a pure-helper timestamp reduction on the contributor-activity ingestion path, using a guard pattern already established in the same codebase.
  • No UI changes.
  • No docs/changelog changes.

UI Evidence

Not applicable -- backend timestamp-reduction helper with no visible UI, frontend, docs, or extension surface.

Closes #1714

latestDate selected the contributor's latest activity with a lexicographic
.sort().at(-1) and only a truthiness filter, so a malformed/sentinel
GitHub timestamp whose first character sorts after "2" (e.g. "bad-date",
"pending") outranked a real 2026-... ISO stamp and was persisted into
contributor_repo_stats.last_activity_at. Every downstream freshness check
then did Date.parse("bad-date") -> NaN and treated a genuinely active
repo as stale/missing.

Mirror the guarded newest()/oldest() in signals/data-quality.ts: also
require Number.isFinite(Date.parse(value)) in the filter, so unparseable
values are dropped from the reduction. No behavior change when every
value is a well-formed ISO timestamp.

An existing test asserted lastActivityAt: "bad-date" against a fixture
that mixes "bad-date" with valid ISO stamps; that assertion encoded the
bug and is corrected to the real latest stamp (2026-05-24T00:00:00Z).

Closes JSONbored#1714
@RenzoMXD RenzoMXD requested a review from JSONbored as a code owner June 29, 2026 06:17
@dosubot dosubot Bot added the size:XS This PR changes 0-9 lines, ignoring generated files. label Jun 29, 2026
@superagent-security superagent-security Bot removed the size:XS This PR changes 0-9 lines, ignoring generated files. label Jun 29, 2026
@gittensory-orb

gittensory-orb Bot commented Jun 29, 2026


Warning

🟨🟨🟨🟨🟨🟨🟨🟨🟨🟨🟨🟨

⏸️ Gittensory review result - manual review recommended

Review updated: 2026-06-29 14:12:43 UTC

2 files · 1 AI reviewer · no blockers · readiness 55/100 · CI green · clean

⏸️ Suggested Action - Manual Review

  • Touches a guarded path — held for manual review

Review summary
This changes the backfill `latestDate` helper to ignore truthy but unparseable activity timestamps before selecting the lexicographic maximum. For GitHub ISO `updatedAt` values, the ordering remains correct while malformed values like `bad-date` no longer win and get stored as `lastActivityAt`; the updated test drives the real `refreshContributorActivity` path. I do not see a reachable correctness defect in this diff.

Nits — 8 non-blocking
  • src/github/backfill.ts:2809: nit: The inline predicate is long enough to obscure the actual reduction; a small named helper would match the existing `newest()` pattern and make the guard reusable.
  • test/unit/backfill.test.ts:287: nit: The mixed-validity regression is covered, but there is no explicit all-invalid case documenting that `latestDate` should return `undefined` rather than preserving any malformed timestamp.
  • src/github/backfill.ts:2809: Extract the parseability check into a local `isParseableDateString` helper or share the same helper shape used by `src/signals/data-quality.ts:newest`.
  • test/unit/backfill.test.ts:287: Add a focused assertion where every candidate timestamp is invalid so the no-valid-date behavior is locked down.
  • PR author also opened the linked issue — Link an issue that was opened by a different contributor, or provide a rationale for why this self-authored issue represents genuine discovery work.
  • Code changes lack test evidence — Add focused regression tests or explain why existing coverage is sufficient.
  • Readiness score is below the configured threshold — Use the readiness panel as advisory maintainer context; the score does not block this PR.
  • Touches a guarded path — held for manual review — A maintainer must review and merge this change.
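The all-invalid case raised in the nits above could be pinned down with a check along these lines. This is a sketch under stated assumptions: latestParseable is a hypothetical stand-in that copies the predicate quoted in the PR description, and the check uses plain throw statements rather than the repository's test runner.

```typescript
// Stand-in for the guarded reduction; not the repository's actual code.
function latestParseable(values: Array<string | null | undefined>): string | undefined {
  const kept = values
    .filter((value): value is string => Boolean(value && Number.isFinite(Date.parse(value))))
    .sort();
  return kept.length > 0 ? kept[kept.length - 1] : undefined;
}

// Every candidate is malformed, empty, or missing, so nothing survives the filter.
const allInvalidResult = latestParseable(["bad-date", "", null, undefined]);
const emptyInputResult = latestParseable([]);

if (allInvalidResult !== undefined || emptyInputResult !== undefined) {
  throw new Error("expected undefined when no candidate is a parseable timestamp");
}
console.log("all-invalid and empty inputs both reduce to undefined");
```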
| Signal | Result | Evidence |
| --- | --- | --- |
| Code review | ✅ No blockers | 1 reviewer |
| Linked issue | ✅ Linked | #1714 |
| Related work | ⚠️ 2 scoped overlaps | Top overlaps are listed below; lower-confidence bulk is hidden. |
| Review load | ❌ 8/20 | Readiness component derived from cached public PR metadata and labels. |
| Validation evidence | ❌ 5/25 | Cached preflight status is hold. |
| Open PR queue | ❌ 3/10 | 22 open PR(s), 10 likely reviewable, 12 unlinked. |
| Contributor context | ✅ Confirmed | Gittensor contributor RenzoMXD; Gittensor profile; 44 PR(s), 9 issue(s). |
| Gate result | ⚠️ Not blocking | Advisory; not blocking this PR. |
Review context
Contributor next steps
  • Review top overlaps.
  • Add scope summary.
  • Fix blocker.
  • Expect slower review.
  • Refresh registry data or choose a registered active repo.
  • Check active issues and PRs before submitting.
Signal definitions
  • Related work = same linked issue, overlapping active PRs, or title/path similarity.
  • Review load = cached public PR metadata such as size labels, changed paths, and preflight status.
  • Open PR queue = repo-wide review pressure; it is not a PR quality failure.
  • Contributor context = public GitHub/Gittensor identity context; non-Gittensor status is not a blocker.

🟩 Safe / merged · 🟦 Advisory · 🟨 Held for review · 🟥 Blocked / closed



Checked by Gittensory, a quiet PR intelligence layer for OSS maintainers.


@codecov

codecov Bot commented Jun 29, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 95.58%. Comparing base (59cf13f) to head (a9e26be).
⚠️ Report is 1 commits behind head on main.
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1715   +/-   ##
=======================================
  Coverage   95.58%   95.58%           
=======================================
  Files         204      204           
  Lines       22313    22313           
  Branches     8065     8064    -1     
=======================================
  Hits        21328    21328           
  Misses        408      408           
  Partials      577      577           
| Files with missing lines | Coverage Δ |
| --- | --- |
| src/github/backfill.ts | 92.95% <100.00%> (ø) |

@gittensory-orb gittensory-orb Bot added gittensor Gittensor contributor context gittensor:bug Gittensor-scored bug fix - worth 0.5x multiplier. labels Jun 29, 2026

Labels

gittensor:bug Gittensor-scored bug fix - worth 0.5x multiplier. gittensor Gittensor contributor context

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fix(backfill): a malformed GitHub updatedAt must not outrank a real timestamp in lastActivityAt

1 participant