Skip to content

fix(engine): validate processor plugin impls#609

Merged
johnnygreco merged 2 commits intomainfrom
johnny/fix-processor-plugin-validation
May 6, 2026
Merged

fix(engine): validate processor plugin impls#609
johnnygreco merged 2 commits intomainfrom
johnny/fix-processor-plugin-validation

Conversation

@johnnygreco
Copy link
Copy Markdown
Contributor

@johnnygreco johnnygreco commented May 6, 2026

📋 Summary

assert_valid_plugin now validates processor plugin implementations against the Processor base class. This catches malformed processor plugins before they reach the processor registry and keeps plugin-type implementation checks in one maintainable mapping.

🔗 Related Issue

N/A

🔄 Changes

  • Added PluginType.PROCESSOR implementation validation in engine/testing/utils.py.
  • Made plugin implementation base checks table-driven for column generators, seed readers, and processors.
  • Replaced raw assert statements with explicit AssertionError raises so validation still runs under optimized Python.
  • Added parametrized tests covering valid and invalid implementation checks for column generator, seed reader, and processor plugins.
  • Added a processor-specific regression test that rejects a ConfigurableTask implementation that is not a Processor subclass.

🧪 Testing

  • uv run --package data-designer-engine pytest packages/data-designer-engine/tests --collect-only -q
  • uv run --package data-designer-engine pytest packages/data-designer-engine/tests/engine/testing/test_plugin_testing_utils.py packages/data-designer-engine/tests/engine/processing/processors/test_registry.py
  • uv run --package data-designer-engine ruff check packages/data-designer-engine/src/data_designer/engine/testing/utils.py packages/data-designer-engine/tests/engine/testing/test_plugin_testing_utils.py
  • uv run --package data-designer-engine ruff format --check packages/data-designer-engine/src/data_designer/engine/testing/utils.py packages/data-designer-engine/tests/engine/testing/test_plugin_testing_utils.py
  • Unit tests added/updated
  • E2E tests N/A — testing utility change only

✅ Checklist

  • Follows commit message conventions
  • Commits are signed off (DCO)
  • Architecture docs updated (N/A — no architecture changes)

@johnnygreco johnnygreco requested a review from a team as a code owner May 6, 2026 15:03
@johnnygreco johnnygreco deployed to agentic-ci May 6, 2026 15:03 — with GitHub Actions Active
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 6, 2026

Greptile Summary

This PR refactors assert_valid_plugin in the engine testing utilities to be table-driven and adds missing PluginType.PROCESSOR implementation validation. The changes also replace bare assert statements with explicit raise AssertionError calls so validation remains active under Python's -O optimization flag.

  • testing/utils.py: Introduces _PLUGIN_IMPLEMENTATION_BASES, a PluginType → base class dict; adds a module-level completeness guard that raises at import time if the map doesn't cover all PluginType members; and collapses the old if/elif dispatch into a single issubclass lookup, adding processor validation for the first time.
  • test_plugin_testing_utils.py: Adds parametrized tests for all three plugin types (valid and invalid impl) plus a dedicated regression test that rejects a ConfigurableTask subclass (that is not a Processor) when registered as a processor plugin.

Confidence Score: 5/5

Safe to merge — the changes are a clean, additive refactor with no altered external contracts and comprehensive test coverage.

Both files touch only test/validation utilities. The module-level completeness guard means any future PluginType addition that misses the map will be caught at import time. The explicit raise AssertionError replacement correctly preserves validation under optimized Python. Tests cover all three plugin types for valid and invalid impls, including the key regression case.

No files require special attention.

Important Files Changed

Filename Overview
packages/data-designer-engine/src/data_designer/engine/testing/utils.py Table-driven refactor of assert_valid_plugin; adds Processor validation and module-level PluginType completeness guard. Logic is correct and future-safe.
packages/data-designer-engine/tests/engine/testing/test_plugin_testing_utils.py New parametrized test suite covering all three plugin types for both valid and invalid impl checks, plus a targeted regression test for TaskButNotProcessor. Coverage is complete and assertions are well-scoped.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[assert_valid_plugin called] --> B{config_cls subclass of ConfigBase?}
    B -- No --> C[raise AssertionError: config not ConfigBase]
    B -- Yes --> D[lookup implementation_base\nfrom _PLUGIN_IMPLEMENTATION_BASES]
    D --> E{impl_cls subclass of\nimplementation_base?}
    E -- No --> F[raise AssertionError: wrong impl base]
    E -- Yes --> G[validation passes]

    subgraph "Module import guard"
        H[utils.py imported] --> I{_PLUGIN_IMPLEMENTATION_BASES\ncovers all PluginType members?}
        I -- No --> J[raise AssertionError at import time]
        I -- Yes --> K[module ready]
    end

    subgraph "_PLUGIN_IMPLEMENTATION_BASES"
        L[COLUMN_GENERATOR → ConfigurableTask]
        M[SEED_READER → SeedReader]
        N[PROCESSOR → Processor]
    end
Loading

Reviews (5): Last reviewed commit: "test(engine): require plugin base map co..." | Re-trigger Greptile

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 6, 2026

Review of PR #609fix(engine): validate processor plugin impls

Summary

Small, well-scoped fix to assert_valid_plugin in packages/data-designer-engine/src/data_designer/engine/testing/utils.py:

  • Adds the missing PluginType.PROCESSOR branch, so processor plugins are now validated against the Processor base class (previously they silently passed).
  • Refactors the previous if/elif chain into a table-driven lookup (_PLUGIN_IMPLEMENTATION_BASES) so the mapping of plugin type → expected base class lives in one place.
  • Replaces bare assert with raise AssertionError(...) so validation still runs under python -O.
  • Adds three focused unit tests: a valid processor plugin, a non-Processor impl, and a ConfigurableTask subclass that isn't a Processor (nice corner case).

+87 / -8, testing-utility-only change. Very low blast radius.

Findings

Correctness

  • The new behavior is strictly stricter than before: previously the elif chain fell through silently for PluginType.PROCESSOR. Now a missing base produces a clear AssertionError. That matches the intent of a function called assert_valid_plugin.
  • _PLUGIN_IMPLEMENTATION_BASES covers all three current PluginType values (COLUMN_GENERATOR, SEED_READER, PROCESSOR). If a future PluginType is added without updating the table, the dict lookup will raise KeyError — louder failure than the old silent pass, which is the right direction. Worth noting only because the error surface changes from AssertionError to KeyError; a small .get() + explicit raise could yield a clearer message, but this is a nit on a test utility.
  • Error-message wording is preserved for the pre-existing branches: "Column generator plugin impl class must be a subclass of ConfigurableTask" and "Seed reader plugin impl class must be a subclass of SeedReader" still match via display_name.capitalize(). Good — keeps existing tests / external assertions stable.
  • Switching from assert to raise AssertionError is a reasonable hardening; the function name implies assertion semantics, and runs correctly under -O.

Style / project conventions

  • from __future__ import annotations present, absolute imports, modern typing (dict[...], tuple[...]) — follows AGENTS.md / STYLEGUIDE.md.
  • Import direction is preserved (engine importing engine/config — fine, no reverse imports).
  • Minor nit: the tuple[type[object], str] carries a redundant string — base_cls.__name__ on the class itself would drop a column:
    _PLUGIN_IMPLEMENTATION_BASES: dict[PluginType, type[object]] = {
        PluginType.COLUMN_GENERATOR: ConfigurableTask,
        PluginType.SEED_READER: SeedReader,
        PluginType.PROCESSOR: Processor,
    }
    ...
    f"... must be a subclass of {implementation_base.__name__}"
    Not blocking — keeping the name explicit may be deliberate if the display name is ever decoupled from the class name.
  • The _assert_subclass helper is a thin wrapper (3 lines). Readable, no objection.

Tests

  • New file tests/engine/testing/test_utils.py has correct SPDX header, matches the package test layout.
  • Three tests cover: happy path, impl isn't a class hierarchy member at all (NonProcessor), and a ConfigurableTask that isn't a Processor (the tricky case that the previous code missed entirely). Coverage matches the change.
  • pytest.raises(AssertionError, match=...) asserts the error message surface too — protects the user-facing wording.
  • No test for the PluginType.COLUMN_GENERATOR / SEED_READER branches being re-validated through the new table. Existing registry tests likely exercise these indirectly, but a couple more parametrized cases here would cheaply confirm that the refactor didn't regress the pre-existing branches.

Security / performance

  • No security implications (test utility, no I/O, no untrusted input).
  • Performance — negligible; single dict lookup on plugin registration.

Docs

  • No doc changes needed; the PR description accurately captures the behavior change.

Verdict

Looks good to merge. The change closes a real validation gap for PluginType.PROCESSOR, keeps pre-existing error messages stable, and is well tested. Two minor, optional improvements:

  1. Drop the redundant class-name string from _PLUGIN_IMPLEMENTATION_BASES and use base_cls.__name__.
  2. Add parametrized cases for the COLUMN_GENERATOR and SEED_READER branches so the refactor itself is test-covered, not just the new branch.

Neither is blocking.

@johnnygreco johnnygreco force-pushed the johnny/fix-processor-plugin-validation branch from e466d03 to 8f2b375 Compare May 6, 2026 15:07
Add the processor implementation base to assert_valid_plugin so
processor plugins are checked against Processor instead of only the
generic config contract. Keep plugin type validation table-driven and
raise explicit AssertionError messages so checks are not skipped under
optimized Python.

Signed-off-by: Johnny Greco <[email protected]>
@johnnygreco johnnygreco force-pushed the johnny/fix-processor-plugin-validation branch from 8f2b375 to 1b110f2 Compare May 6, 2026 15:13
@johnnygreco
Copy link
Copy Markdown
Contributor Author

Addressed the optional follow-ups:

  • Simplified _PLUGIN_IMPLEMENTATION_BASES to store only the base class and use base_cls.__name__ in the error message.
  • Added parametrized assert_valid_plugin tests covering both valid and invalid implementation cases for COLUMN_GENERATOR, SEED_READER, and PROCESSOR.
  • Kept the processor-specific regression case that rejects a ConfigurableTask implementation when it is not a Processor subclass.

Validation run locally:

  • uv run --package data-designer-engine pytest packages/data-designer-engine/tests --collect-only -q
  • uv run --package data-designer-engine pytest packages/data-designer-engine/tests/engine/testing/test_plugin_testing_utils.py packages/data-designer-engine/tests/engine/processing/processors/test_registry.py
  • uv run --package data-designer-engine ruff check packages/data-designer-engine/src/data_designer/engine/testing/utils.py packages/data-designer-engine/tests/engine/testing/test_plugin_testing_utils.py
  • uv run --package data-designer-engine ruff format --check packages/data-designer-engine/src/data_designer/engine/testing/utils.py packages/data-designer-engine/tests/engine/testing/test_plugin_testing_utils.py

CI is green for the substantive checks after the update.

@andreatgretel
Copy link
Copy Markdown
Contributor

(posting as a regular comment because GitHub is currently failing new inline review threads, see status)

nit on packages/data-designer-engine/src/data_designer/engine/testing/utils.py:16: would lock the map to PluginType so a future plugin type can't slip past. non-blocking, wdyt?

}
assert set(_PLUGIN_IMPLEMENTATION_BASES) == set(PluginType), "Plugin implementation base map must cover all plugin types"

andreatgretel
andreatgretel previously approved these changes May 6, 2026
Copy link
Copy Markdown
Contributor

@andreatgretel andreatgretel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! added one nit only for future-proofing.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 6, 2026

All contributors have signed the DCO ✍️ ✅
Posted by the DCO Assistant Lite bot.

@johnnygreco
Copy link
Copy Markdown
Contributor Author

johnnygreco commented May 6, 2026

@andreatgretel good call, addressed in ef37911. I added the coverage check after _PLUGIN_IMPLEMENTATION_BASES; I used an explicit AssertionError instead of a bare assert to keep it consistent with the rest of this utility.

@johnnygreco johnnygreco force-pushed the johnny/fix-processor-plugin-validation branch from c72f2cd to ef37911 Compare May 6, 2026 16:47
@johnnygreco johnnygreco merged commit 9214637 into main May 6, 2026
49 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants