Fix dbt YAML review file classification by anandgupta42 · Pull Request #920 · AltimateAI/altimate-code

anandgupta42 · 2026-06-08T21:00:29Z

PINEAPPLE

Summary

- Classify only dbt resource YAML and conventional dbt property YAML as schema.yml.
- Avoid treating non-dbt YAML, such as GitHub workflow files, as dbt schema metadata.
- Add a regression test for the workflow YAML false positive.

Validation

- `bun test --timeout 30000 test/altimate/review-dbt-patterns.test.ts`
- `bun run typecheck`
- pre-push `bun turbo typecheck` passed

Summary by CodeRabbit

Bug Fixes
- Improved YAML file classification accuracy. YAML files are now properly categorized as schema files only when located in specific directories or matching schema-related naming patterns. YAML files in workflow directories are no longer incorrectly flagged as schema changes.
Tests
- Added test coverage to ensure YAML files in workflow directories are correctly excluded from schema file detection.

coderabbitai · 2026-06-08T21:00:44Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 4fc3c0a1-d8da-4292-9811-ed3c0e9c0686

📥 Commits

Reviewing files that changed from the base of the PR and between 6185bfd and 0ac9a2a.

📒 Files selected for processing (2)

packages/opencode/src/altimate/review/diff-filter.ts
packages/opencode/test/altimate/review-dbt-patterns.test.ts

🚧 Files skipped from review as they are similar to previous changes (2)

packages/opencode/test/altimate/review-dbt-patterns.test.ts
packages/opencode/src/altimate/review/diff-filter.ts

📝 Walkthrough

The PR narrows dbt YAML detection: classifyDbtFile now computes isYaml and returns schema_yml only for YAML files under dbt model/snapshot/seed/test directories or matching schema-like filename/path patterns. A test ensures workflow YAML (e.g., .github/workflows/*) is excluded.
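
The narrowed rule described above can be sketched roughly as follows. This is a minimal illustration, not the code in `diff-filter.ts`: the function name `classifyYaml`, the `DbtFileKind` type, and both regular expressions are assumptions, since the exact directory and filename patterns used by the PR are not shown in this thread.

```typescript
// Hypothetical sketch; the real classifyDbtFile in diff-filter.ts may differ.
type DbtFileKind = "schema_yml" | "other"

// Assumed dbt resource directories (model/snapshot/seed/test, per the walkthrough).
const DBT_RESOURCE_DIR = /(^|\/)(models|snapshots|seeds|tests)\//

// Assumed "schema-like" filenames; the PR's actual patterns are not shown.
const SCHEMA_LIKE_NAME = /(^|\/)_?(schema|sources|properties)\.ya?ml$/i

function classifyYaml(path: string): DbtFileKind {
  // Normalize Windows separators so the patterns only deal with "/".
  const normalized = path.replace(/\\/g, "/")
  const isYaml = /\.ya?ml$/i.test(normalized)
  if (!isYaml) return "other"
  // YAML counts as schema metadata only inside a dbt resource directory
  // or when the filename itself looks like a dbt property file.
  if (DBT_RESOURCE_DIR.test(normalized) || SCHEMA_LIKE_NAME.test(normalized)) {
    return "schema_yml"
  }
  return "other"
}
```

Under these assumptions, `.github/workflows/ci.yml` falls through to `"other"` because it is YAML but matches neither pattern, which is the false positive the regression test guards against.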

Changes

YAML Classification Precision

| Layer / File(s) | Summary |
|---|---|
| YAML classification refinement and validation: `packages/opencode/src/altimate/review/diff-filter.ts`, `packages/opencode/test/altimate/review-dbt-patterns.test.ts` | `classifyDbtFile` introduces an `isYaml` flag and tightens `schema_yml` classification to require YAML located in model/snapshot/seed/test directories or matching schema-like filename/path patterns. Added test asserts `.github/workflows/*.yml` are not classified as schema YAML. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I sniffed the paths both near and far,
YAML truth now shines — workflow files don't mar.
Schema names and dbt nests I mind,
Only real schema_yml I now find. 🥕

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly and concisely describes the main change: fixing how dbt YAML files are classified in the review process, distinguishing actual dbt schema files from unrelated YAML files. |
| Description check | ✅ Passed | The description includes the required PINEAPPLE marker, explains the summary and changes, and documents validation steps. All required template sections are substantially covered. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/opencode/test/altimate/data-diff-duckdb-e2e.test.ts`:
- Around line 10-16: The test unconditionally deletes
ALTIMATE_TELEMETRY_DISABLED in afterAll which clobbers any pre-existing
environment value; modify the setup/teardown to capture the prior value in
beforeAll (e.g., const prevTelemetry = process.env.ALTIMATE_TELEMETRY_DISABLED)
then set process.env.ALTIMATE_TELEMETRY_DISABLED = "true", and in afterAll
restore that saved value (if prevTelemetry is undefined delete the env var,
otherwise set it back to prevTelemetry) before calling Registry.reset(); update
references in the beforeAll/afterAll blocks accordingly.
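
The save-and-restore pattern described in this comment could look something like the sketch below. The helper name `overrideEnv` and the `Env` type are illustrative assumptions and are not part of the test file; the helper takes the environment object as a parameter so it can be exercised without touching the real `process.env`.

```typescript
// Hypothetical helper; not taken from data-diff-duckdb-e2e.test.ts.
type Env = Record<string, string | undefined>

// Sets env[key] to value and returns a function that restores the prior state.
function overrideEnv(env: Env, key: string, value: string): () => void {
  const previous = env[key]
  env[key] = value
  return () => {
    // If the variable was unset before, remove it again; otherwise put the old value back.
    if (previous === undefined) delete env[key]
    else env[key] = previous
  }
}

// Intended use inside the test file (shown as comments, since beforeAll,
// afterAll and Registry come from the project and its test runner):
//
//   let restoreTelemetry: () => void
//   beforeAll(() => {
//     restoreTelemetry = overrideEnv(process.env, "ALTIMATE_TELEMETRY_DISABLED", "true")
//   })
//   afterAll(() => {
//     restoreTelemetry()
//     Registry.reset()
//   })
```

Returning a restore closure keeps the captured value next to the code that set it, so the teardown cannot clobber a value that a developer or CI job had exported beforehand.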

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: e73537bc-66fa-47be-ae98-c2425126aaba

📥 Commits

Reviewing files that changed from the base of the PR and between c2019ba and b0c0fce.

📒 Files selected for processing (16)

docs/docs/usage/dbt-pr-review.md
docs/internal/2026-06-08-dbt-pr-review-self-improvement-loop.md
docs/internal/dbt-pr-review-demo-scenario-corpus.md
packages/drivers/src/duckdb.ts
packages/opencode/src/altimate/native/altimate-core.ts
packages/opencode/src/altimate/native/types.ts
packages/opencode/src/altimate/review/dbt-patterns.ts
packages/opencode/src/altimate/review/diff-filter.ts
packages/opencode/src/altimate/review/orchestrate.ts
packages/opencode/src/altimate/review/runner.ts
packages/opencode/src/cli/cmd/review.ts
packages/opencode/test/altimate/data-diff-duckdb-e2e.test.ts
packages/opencode/test/altimate/review-ci.test.ts
packages/opencode/test/altimate/review-dbt-patterns.test.ts
packages/opencode/test/altimate/review-runner.test.ts
packages/opencode/test/altimate/review.test.ts

cubic-dev-ai

4 issues found across 16 files

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/opencode/src/altimate/review/dbt-patterns.ts">

<violation number="1" location="packages/opencode/src/altimate/review/dbt-patterns.ts:822">
P2: `detectSchemaYmlPatterns` now double-reports removed `relationships` tests by combining an existing hard-coded detector with the catalog rule for the same condition.</violation>
</file>

<file name="docs/internal/dbt-pr-review-demo-scenario-corpus.md">

<violation number="1" location="docs/internal/dbt-pr-review-demo-scenario-corpus.md:6">
P3: Machine-specific absolute paths in the runbook make the documented validation command non-reproducible for other developers.</violation>

<violation number="2" location="docs/internal/dbt-pr-review-demo-scenario-corpus.md:68">
P3: Implemented 50-Scenario Corpus table is inconsistent with the defined metadata schema: the evidence column uses the header `deterministic evidence` (taken from the meaning text) instead of the schema field name `evidence`, and the required `validation` column is missing entirely.</violation>
</file>

<file name="packages/opencode/src/altimate/review/orchestrate.ts">

<violation number="1" location="packages/opencode/src/altimate/review/orchestrate.ts:1152">
P1: Fallback PII findings are suppressed based on capability, even when diff-scoped core PII classification produced no result, which can hide real PII exposure findings on classifier failure.</violation>
</file>

cubic-dev-ai · 2026-06-08T21:10:41Z

+  const diffScopedPiiFiles = new Set<string>()
+  if (input.runner.classifyPii && input.runner.columnLineage) {
+    for (const ctx of ctxByPath.values()) {
+      if (ctx.engineNewSql && ctx.file.status !== "deleted") diffScopedPiiFiles.add(ctx.file.path)


P1: Fallback PII findings are suppressed based on capability, even when diff-scoped core PII classification produced no result, which can hide real PII exposure findings on classifier failure.
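
One way to read this finding: suppression should be keyed on whether the core classifier actually returned a result for a file, not on whether the capability exists. The sketch below illustrates that idea only. `PiiResult`, `filesCoveredByCorePii`, and `keepFallbackFinding` are hypothetical names, and the real types in `orchestrate.ts` are not shown in this thread.

```typescript
// Hypothetical shapes; the real orchestrate.ts types are not shown here.
type PiiResult = { columns: string[] } | undefined

// Collect only the files for which the core classifier produced a result.
// A file whose classification failed (undefined) is deliberately left out.
function filesCoveredByCorePii(results: Map<string, PiiResult>): Set<string> {
  const covered = new Set<string>()
  results.forEach((result, path) => {
    if (result !== undefined) covered.add(path)
  })
  return covered
}

// Fallback findings are kept unless the core classifier covered that file.
function keepFallbackFinding(path: string, covered: Set<string>): boolean {
  return !covered.has(path)
}
```

With this shape, a classifier failure leaves the file outside `covered`, so its fallback PII findings still surface.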

Prompt for AI agents

Check if this issue is valid; if so, understand the root cause and fix it. At packages/opencode/src/altimate/review/orchestrate.ts, line 1152:

<comment>Fallback PII findings are suppressed based on capability, even when diff-scoped core PII classification produced no result, which can hide real PII exposure findings on classifier failure.</comment>

<file context>
@@ -1138,6 +1146,12 @@ export async function runReview(input: OrchestrateInput): Promise<VerdictEnvelop
+  const diffScopedPiiFiles = new Set<string>()
+  if (input.runner.classifyPii && input.runner.columnLineage) {
+    for (const ctx of ctxByPath.values()) {
+      if (ctx.engineNewSql && ctx.file.status !== "deleted") diffScopedPiiFiles.add(ctx.file.path)
+    }
+  }
</file context>

Addressed by rebuilding PR #920 to the scoped production change only. This file/comment is from the previous b0c0fce version and is no longer in the PR diff. Current PR changes only packages/opencode/src/altimate/review/diff-filter.ts and packages/opencode/test/altimate/review-dbt-patterns.test.ts.

Thanks for the heads-up — I’ll skip that obsolete comment.

cubic-dev-ai · 2026-06-08T21:10:41Z

+    })
+    out.push(f)
+  }
+  out.push(...evaluateCatalog(file, "", added, removed, rubric))


P2: detectSchemaYmlPatterns now double-reports removed relationships tests by combining an existing hard-coded detector with the catalog rule for the same condition.

Prompt for AI agents

Check if this issue is valid; if so, understand the root cause and fix it. At packages/opencode/src/altimate/review/dbt-patterns.ts, line 822:

<comment>`detectSchemaYmlPatterns` now double-reports removed `relationships` tests by combining an existing hard-coded detector with the catalog rule for the same condition.</comment>

<file context>
@@ -794,29 +794,33 @@ export function detectSchemaYmlPatterns(file: ChangedFile, rubric: Rubric): Find
+    })
+    out.push(f)
+  }
+  out.push(...evaluateCatalog(file, "", added, removed, rubric))
+  return out
+    .map((f) => ({ ...f, severity: clampSeverity(f.category, f.severity, f.confidence) }))
</file context>

Suggested change

-  out.push(...evaluateCatalog(file, "", added, removed, rubric))
+  out.push(
+    ...evaluateCatalog(file, "", added, removed, rubric).filter(
+      (f) =>
+        !(
+          genuinelyRemoved.some((l) => /\brelationships\b/i.test(l)) &&
+          f.evidence?.tool === "rule-catalog" &&
+          (f.evidence?.result as any)?.rule === "relationship-test-removed-yml"
+        ),
+    ),
+  )
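
As an alternative to filtering out one specific catalog rule, duplicates could also be dropped generically after both detectors have run. The sketch below is a different technique from the suggested change and rests on an assumption: that the hard-coded detector and the catalog rule can be mapped to the same `rule` identifier for a given file. The `Finding` shape here is simplified and hypothetical.

```typescript
// Simplified, hypothetical finding shape; the project's Finding type has more fields.
type Finding = { file: string; rule: string; message: string }

// Keep the first finding for each (file, rule) pair and drop later repeats.
function dedupeFindings(findings: Finding[]): Finding[] {
  const seen = new Set<string>()
  const out: Finding[] = []
  for (const f of findings) {
    const key = `${f.file}::${f.rule}`
    if (seen.has(key)) continue
    seen.add(key)
    out.push(f)
  }
  return out
}
```

Because the first occurrence wins, the order in which detectors push their findings decides which message the reader sees.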

Addressed by rebuilding PR #920 to the scoped production change only. This file/comment is from the previous b0c0fce version and is no longer in the PR diff. Current PR changes only packages/opencode/src/altimate/review/diff-filter.ts and packages/opencode/test/altimate/review-dbt-patterns.test.ts.

Thanks for the update — I’ll ignore that obsolete comment.

cubic-dev-ai · 2026-06-08T21:10:41Z

+with DuckDB, `dbt build`, `dbt compile`, `dbt docs generate`, and the default
+focused review command above. AI must remain disabled for demo acceptance.
+
+| id | branch | status | title | category | expected | deterministic evidence | artifact_needs | demo_script |


P3: Implemented 50-Scenario Corpus table is inconsistent with the defined metadata schema: the evidence column uses the header deterministic evidence (taken from the meaning text) instead of the schema field name evidence, and the required validation column is missing entirely.
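
A header check of this kind can be automated. The sketch below is a hypothetical helper that reports which required field names are absent from a Markdown table header row. The full metadata schema is not reproduced in this thread, so the list of required fields is passed in by the caller rather than hard-coded.

```typescript
// Hypothetical helper; the required field list must come from the document's own schema.
function missingColumns(headerRow: string, required: string[]): string[] {
  // Split a Markdown header row such as "| id | branch |" into trimmed cell names.
  const cells = headerRow
    .split("|")
    .map((cell) => cell.trim())
    .filter((cell) => cell.length > 0)
  // Report every required name that has no exact match among the cells.
  return required.filter((name) => cells.indexOf(name) === -1)
}
```

Matching is exact, so a header cell named `deterministic evidence` does not satisfy a required `evidence` field, which is the mismatch this comment describes.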

Prompt for AI agents

Check if this issue is valid; if so, understand the root cause and fix it. At docs/internal/dbt-pr-review-demo-scenario-corpus.md, line 68:

<comment>Implemented 50-Scenario Corpus table is inconsistent with the defined metadata schema: the evidence column uses the header `deterministic evidence` (taken from the meaning text) instead of the schema field name `evidence`, and the required `validation` column is missing entirely.</comment>

<file context>
@@ -0,0 +1,217 @@
+with DuckDB, `dbt build`, `dbt compile`, `dbt docs generate`, and the default
+focused review command above. AI must remain disabled for demo acceptance.
+
+| id | branch | status | title | category | expected | deterministic evidence | artifact_needs | demo_script |
+|---|---|---|---|---|---|---|---|---|
+| s001 | `demo/safe-refactor` | implemented | Safe CTE refactor is approved | safe_refactor | APPROVE | no findings | manifest, catalog, target-base | Proves the reviewer stays quiet for harmless SQL cleanup. |
</file context>

Addressed by rebuilding PR #920 to the scoped production change only. This file/comment is from the previous b0c0fce version and is no longer in the PR diff. Current PR changes only packages/opencode/src/altimate/review/diff-filter.ts and packages/opencode/test/altimate/review-dbt-patterns.test.ts.

Got it, thanks for the scope update.

cubic-dev-ai · 2026-06-08T21:10:41Z

+Status: active corpus build, started 2026-06-08.
+
+Goal: build 50 customer-demo-ready dbt PR scenarios in
+`/Users/anandgupta/codebase/altimate-code/demo/dbt-pr-review-demo`. Each


P3: Machine-specific absolute paths in the runbook make the documented validation command non-reproducible for other developers.
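
A simple lint could catch paths like this before they land in docs. The sketch below is an illustration only: the function name and the regular expression are assumptions, and the pattern only covers common home-directory prefixes on macOS, Linux, and Windows.

```typescript
// Hypothetical lint helper; matches common home-directory absolute paths only.
const MACHINE_PATH = /(?:\/Users\/|\/home\/|[A-Za-z]:\\Users\\)[^\s`'")]+/g

// Returns every machine-specific path found in the given Markdown text.
function findMachineSpecificPaths(markdown: string): string[] {
  return markdown.match(MACHINE_PATH) ?? []
}
```

Repo-relative paths such as `demo/dbt-pr-review-demo` pass through untouched, so a runbook can be made reproducible by rewriting the flagged paths relative to the repository root.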

Prompt for AI agents

Check if this issue is valid; if so, understand the root cause and fix it. At docs/internal/dbt-pr-review-demo-scenario-corpus.md, line 6:

<comment>Machine-specific absolute paths in the runbook make the documented validation command non-reproducible for other developers.</comment>

<file context>
@@ -0,0 +1,217 @@
+Status: active corpus build, started 2026-06-08.
+
+Goal: build 50 customer-demo-ready dbt PR scenarios in
+`/Users/anandgupta/codebase/altimate-code/demo/dbt-pr-review-demo`. Each
+scenario should be a small PR branch that either demonstrates a deterministic
+reviewer catch or proves the reviewer correctly stays quiet for a safe change.
</file context>

Addressed by rebuilding PR #920 to the scoped production change only. This file/comment is from the previous b0c0fce version and is no longer in the PR diff. Current PR changes only packages/opencode/src/altimate/review/diff-filter.ts and packages/opencode/test/altimate/review-dbt-patterns.test.ts.

Got it, thanks for the update.

dev-punia-altimate · 2026-06-09T05:44:28Z

❌ Tests — Failures Detected

TypeScript — 15 failure(s)

connection_refused [1.00ms]
timeout
permission_denied
parse_error
network_error
auth_failure
rate_limit
internal_error
empty_error
connection_refused
timeout
permission_denied [1.00ms]
parse_error
network_error
auth_failure

Next Step

Please address the failing cases above and re-run verification.

cc @anandgupta42

anandgupta42 · 2026-06-10T00:06:44Z

Addressed review feedback for the current scope:

- PR #920 (Fix dbt YAML review file classification) was rebuilt from the earlier 16-file/internal-doc version to the scoped production fix only.
- Stale inline comments on files no longer in the PR have been replied to individually.
- The PR body now starts with the required PINEAPPLE marker.
- Current changed files are only:
  - packages/opencode/src/altimate/review/diff-filter.ts
  - packages/opencode/test/altimate/review-dbt-patterns.test.ts
- Local validation passed: `bun test --timeout 30000 test/altimate/review-dbt-patterns.test.ts` and `bun run typecheck`.

The older centralized test failure comment was for the prior PR contents; the current branch has been force-updated and checks are re-running on commit 6185bfd.

github-actions Bot added the contributor label Jun 8, 2026

coderabbitai Bot reviewed Jun 8, 2026

Comment thread packages/opencode/test/altimate/data-diff-duckdb-e2e.test.ts Outdated

cubic-dev-ai Bot reviewed Jun 8, 2026

anandgupta42 force-pushed the codex/dbt-review-demo-corpus branch from b0c0fce to 6185bfd Compare June 10, 2026 00:01

anandgupta42 changed the title ~~Finalize dbt PR review demo corpus~~ Fix dbt YAML review file classification Jun 10, 2026

Fix dbt YAML review file classification

0ac9a2a

anandgupta42 force-pushed the codex/dbt-review-demo-corpus branch from 6185bfd to 0ac9a2a Compare June 10, 2026 00:13

anandgupta42 merged commit 954cc32 into main Jun 10, 2026
15 checks passed

2 participants