feat(torch): generate PyTorch backend from local ATen schemas by voltjia · Pull Request #595 · InfiniTensor/InfiniOps

voltjia · 2026-05-09T07:58:26Z

Summary

Add pyyaml to [build-system].requires so the build can parse the generated-backend allowlist during pip install.
Add scripts/generate_torch_ops.py, a YAML-driven generator that reads the locally installed PyTorch / torchgen packaged native_functions.yaml.
Add scripts/torch_ops.yaml with 525 allowlisted ATen base op names for generated PyTorch backend coverage.
Integrate generated base headers and slot-8 PyTorch backends into CMake behind WITH_TORCH=ON.
Update scripts/generate_wrappers.py so generated base and backend files participate in Python bindings and dispatch metadata.
Add generated-backend tests that collect generated/torch_ops_metadata.json and execute each active PyTorch backend slot.

Motivation

InfiniOps already supports multiple native/vendor backends. This PR adds a generated PyTorch C++ backend path so a large set of ATen _out operators can be exposed through the same operator/binding surface without hand-writing hundreds of base and backend files.

The generator intentionally uses the local PyTorch installation instead of downloading native_functions.yaml: enabling WITH_TORCH already requires PyTorch to be installed, and using the local schema keeps generated code matched to the PyTorch headers and libraries being compiled.
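As a rough illustration of the idea above (not the actual `scripts/generate_torch_ops.py`), the sketch below locates a packaged `native_functions.yaml` next to an installed `torchgen` and filters schema strings by an allowlist. The helper names, the assumed `packaged/ATen/native/` layout, the sample schema strings, and the "overload name ends in `out`" rule are all assumptions made for this example.

```python
import importlib.util
import re
from pathlib import Path
from typing import Dict, Iterable, List, Optional, Set


def find_local_native_functions() -> Optional[Path]:
    """Return the packaged schema path if `torchgen` is installed, else None."""
    spec = importlib.util.find_spec("torchgen")
    if spec is None or not spec.submodule_search_locations:
        return None
    package_dir = Path(list(spec.submodule_search_locations)[0])
    # Assumed layout of the schema file inside the `torchgen` package.
    candidate = package_dir / "packaged" / "ATen" / "native" / "native_functions.yaml"
    return candidate if candidate.is_file() else None


# Matches `base.overload(` or `base(` at the start of a schema string.
_SCHEMA_RE = re.compile(r"^(?P<base>[A-Za-z_]\w*)(?:\.(?P<overload>\w+))?\(")


def select_out_variants(funcs: Iterable[str], allowlist: Set[str]) -> Dict[str, List[str]]:
    """Group allowlisted out-variant schemas by base op name."""
    selected: Dict[str, List[str]] = {}
    for func in funcs:
        match = _SCHEMA_RE.match(func)
        if match is None:
            continue
        base = match.group("base")
        overload = match.group("overload") or ""
        # Simplified rule (assumption): treat `out` and `*_out` overloads as out variants.
        is_out = overload == "out" or overload.endswith("_out")
        if is_out and base in allowlist:
            selected.setdefault(base, []).append(func)
    return selected


# Illustrative schema strings written in the `native_functions.yaml` style.
SAMPLE_FUNCS = [
    "add.Tensor(Tensor self, Tensor other, *, Scalar alpha=1) -> Tensor",
    "add.out(Tensor self, Tensor other, *, Scalar alpha=1, Tensor(a!) out) -> Tensor(a!)",
    "abs.out(Tensor self, *, Tensor(a!) out) -> Tensor(a!)",
    "mul.out(Tensor self, Tensor other, *, Tensor(a!) out) -> Tensor(a!)",
]

selected = select_out_variants(SAMPLE_FUNCS, allowlist={"add", "abs"})
print(sorted(selected))
print(find_local_native_functions())
```

Because the lookup only inspects the local installation, the schema it finds is the one shipped with the PyTorch that the build compiles against, which is the property the paragraph above relies on.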

Closes N/A.

Type of Change

feat — new feature / new operator / new platform
N/A: fix — this is not a bug-fix-only PR.
N/A: perf — no runtime hot-path optimization is intended.
N/A: refactor — the primary change is new generated PyTorch backend support.
test — adds generated-backend coverage.
N/A: docs — this is not documentation-only.
build / ci — integrates generation into the build.
N/A: chore — this is not tooling-only.
N/A: Breaking change.

Platforms Affected

Test Results on Supported Platforms

All supported platforms were validated with WITH_TORCH=ON and pytest -v; generated PyTorch backend tests were included in the collected pytest set on every platform below.

Timing columns were measured inside the test container with base image build skipped:

build/install = pip install .[dev] --no-build-isolation, including generated source creation and native extension compilation.
pytest = full pytest run.
total = build/install + pytest; source sync and launcher overhead are excluded.

| Platform | Built | `pytest` result | build/install time | pytest time | total time | Notes |
| --- | --- | --- | --- | --- | --- | --- |
| NVIDIA | Yes | `9206 passed, 8664 skipped, 104 warnings in 177.98s (0:02:57)` | 889s | 182s | 1074s | Full suite passed. |
| Iluvatar | Yes | `7410 passed, 8942 skipped, 100 warnings in 150.05s (0:02:30)` | 629s | 153s | 783s | Full suite passed. |
| MetaX | Yes | `8698 passed, 7654 skipped, 102 warnings in 226.55s (0:03:46)` | 1301s | 245s | 1547s | Full suite passed. |
| Cambricon | Yes | `5899 passed, 10069 skipped, 317 warnings in 461.84s (0:07:41)` | 2055s | 469s | 2526s | Full suite passed. |
| Moore | Yes | `8471 passed, 7899 skipped, 119 warnings in 237.29s (0:03:57)` | 2019s | 244s | 2263s | Full suite passed. |
| Ascend | Yes | `7406 passed, 8904 skipped, 108 warnings in 232.62s (0:03:52)` | 1041s | 246s | 1367s | Pytest summary passed; outer wrapper returned `137` after the summary. |
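For readers interpreting the Ascend note: by common shell convention, an exit status above 128 encodes `128 + signal number`, so `137` would correspond to signal 9 (`SIGKILL` on Linux). The PR does not state what terminated the wrapper, so the sketch below only decodes the number; the function name is hypothetical.

```python
from typing import Optional


def signal_from_exit_status(status: int) -> Optional[int]:
    """Return the signal number implied by a shell exit status, or None."""
    # Shell convention: statuses above 128 mean "terminated by signal (status - 128)".
    if status > 128:
        return status - 128
    return None


print(signal_from_exit_status(137))
```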

Full `pytest` output (optional)

NVIDIA: 9206 passed, 8664 skipped, 104 warnings in 177.98s (0:02:57)
Iluvatar: 7410 passed, 8942 skipped, 100 warnings in 150.05s (0:02:30)
MetaX: 8698 passed, 7654 skipped, 102 warnings in 226.55s (0:03:46)
Cambricon: 5899 passed, 10069 skipped, 317 warnings in 461.84s (0:07:41)
Moore: 8471 passed, 7899 skipped, 119 warnings in 237.29s (0:03:57)
Ascend: 7406 passed, 8904 skipped, 108 warnings in 232.62s (0:03:52); outer wrapper exit code 137 after pytest summary

Benchmark / Performance Impact

N/A for runtime hot paths. Generated PyTorch backends call the corresponding ATen _out implementation.

The validation run above records build/install and pytest wall times to help track generated-backend build cost. Build/install time is currently the dominant cost on every platform.

The latest validation also saved verbose pytest logs and generated source trees under the local result directory ci-results/remote/torch-codegen-pr595-reviewfix-vgenerated-20260517/. The generated source trees are also archived locally as /tmp/torch-codegen-pr595-reviewfix-generated-20260517.tar.gz.

Notes for Reviewers

The branch was rebased onto the latest master after #611 (fix(torch): make gemm fallback portable) and #612 (fix(tests): run causal_softmax reference on CPU) were merged.
The current stack is four commits:
- build: add PyYAML build dependency
- feat(scripts): add YAML-driven torch op codegen
- build(torch): integrate generated torch backend
- test(torch): add generated backend coverage
Slot 8 is reserved for generated PyTorch backends.
Hand-written src/base/<op>.h files continue to shadow generated base headers.
Optional ATen types remain hidden for now; exposing them properly is a separable follow-up.
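The shadowing rule for base headers can be pictured with a small lookup sketch. This is only an illustration of the precedence described above: the function name and the generated-header directory are hypothetical, and the real resolution happens in the build and generator scripts.

```python
import tempfile
from pathlib import Path


def resolve_base_header(op: str, handwritten_dir: Path, generated_dir: Path) -> Path:
    """Prefer a hand-written base header; fall back to the generated one."""
    handwritten = handwritten_dir / f"{op}.h"
    if handwritten.is_file():
        return handwritten
    generated = generated_dir / f"{op}.h"
    if generated.is_file():
        return generated
    raise FileNotFoundError(f"No base header found for `{op}`.")


# Small demonstration in a temporary tree (directory names are assumptions).
with tempfile.TemporaryDirectory() as tmp:
    src_base = Path(tmp) / "src" / "base"
    gen_base = Path(tmp) / "generated" / "base"
    src_base.mkdir(parents=True)
    gen_base.mkdir(parents=True)
    (src_base / "add.h").write_text("// hand-written\n")
    (gen_base / "add.h").write_text("// generated\n")
    (gen_base / "abs.h").write_text("// generated\n")

    add_header = resolve_base_header("add", src_base, gen_base)
    abs_header = resolve_base_header("abs", src_base, gen_base)
    print(add_header.parent.name, abs_header.parent.parent.name)
```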

Checklist

Title, Branch, and Commits

PR title follows Conventional Commits.
Branch name follows <type>/xxx-yyyy-zzzz — feat/torch-codegen.
Each commit message follows Conventional Commits.
Large PR with meaningful, well-formed, independently reviewable commits.
No stray merge commits from master — branch is rebased cleanly on top of current master.
No fixup! / squash! / wip commits remain.

Scope and Design

Changes are scoped to PyTorch backend codegen, build integration, wrapper generation, and generated-backend tests.
No dead code, commented-out blocks, debug prints, or ownerless TODOs were added.
No unrelated formatting churn that would obscure the diff.
Public API changes are intentional: generated infini::ops::<Op> classes and slot-8 PyTorch backends are part of this feature.

General Code Hygiene

The code is self-explanatory; comments were added only where the why is non-obvious.
Every modified or added file ends with a single trailing newline.
No trailing whitespace, tab/space mixing, or stray BOMs.
Identifiers in comments and error messages are wrapped in backticks.
All comments and error messages are in English.
Comments and error messages are complete sentences with terminal punctuation where applicable.

C++ Specific

N/A: no committed .h, .cc, .cuh, or .mlu files are modified by this PR.

Python Specific

ruff format --check passed for modified Python files in the CI container.
ruff check passed for modified Python files in the CI container.
GitHub ruff workflow is green on the latest pushed commit.
Comments are complete English sentences with backticked code references where applicable.
pytest.skip messages follow existing test conventions.
Blank-line rules around function bodies, control-flow statements, and returns were checked by ruff format --check.
Type hints are present on new dataclasses and public helper functions where practical.

Testing

pytest was run on every supported platform — see the platform table above.
Generated PyTorch backend tests are included in the full-suite runs.
New functionality has matching tests under tests/.
Tests use pytest.mark.parametrize for generated op metadata, shapes, dtypes, and tolerances.
Default dtype/device parameterization is reused where appropriate.
Any vendor-specific skip/crash guard is contained in the generated-backend test harness.
N/A: this is a feature PR rather than a bug fix, so no regression-only test is required.
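To make the metadata-driven parameterization more concrete, here is a minimal sketch of turning op metadata into test cases. The JSON layout, field names, and shapes are hypothetical; the real `generated/torch_ops_metadata.json` may be structured differently, and the actual tests pass such cases to `pytest.mark.parametrize`.

```python
import itertools
import json
from typing import List, Tuple

# Hypothetical metadata layout, used only for this example.
SAMPLE_METADATA = """
{
  "ops": [
    {"name": "abs", "slot": 8, "active": true, "dtypes": ["float32", "float16"]},
    {"name": "add", "slot": 8, "active": true, "dtypes": ["float32"]},
    {"name": "mul", "slot": 8, "active": false, "dtypes": ["float32"]}
  ]
}
"""

SHAPES = [(4,), (2, 3)]

Case = Tuple[str, int, str, Tuple[int, ...]]


def build_cases(metadata_text: str, shapes: List[Tuple[int, ...]]) -> List[Case]:
    """Expand each active op into (name, slot, dtype, shape) test cases."""
    metadata = json.loads(metadata_text)
    cases: List[Case] = []
    for op in metadata["ops"]:
        if not op["active"]:
            continue  # Inactive backend slots are not exercised.
        for dtype, shape in itertools.product(op["dtypes"], shapes):
            cases.append((op["name"], op["slot"], dtype, shape))
    return cases


cases = build_cases(SAMPLE_METADATA, SHAPES)
print(len(cases))
```

A list like `cases` could be handed to `pytest.mark.parametrize("name, slot, dtype, shape", cases)`, so each generated op shows up as its own test ID in the report.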

Build, CI, and Tooling

Builds from a fresh source copy with pip install .[dev] --no-build-isolation on every supported platform.
compile_commands.json generation remains enabled through existing CMake configuration.
WITH_TORCH auto-detection was updated to use an installed Python with torch.
Existing CUDA-like GPU backend mutual exclusion remains in place.
GitHub clang-format workflow is green on the latest pushed commit.
GitHub ruff workflow is green on the latest pushed commit.
New build dependency pyyaml is added to pyproject.toml's [build-system].requires.
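One way to picture the `WITH_TORCH` auto-detection is a probe that asks the Python interpreter whether `torch` is importable without actually importing it. This is a sketch under that assumption; the function names are hypothetical, and the real detection lives in the project's build configuration.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Report whether a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


def default_with_torch() -> str:
    """Suggest a default for a WITH_TORCH-style switch (illustrative only)."""
    return "ON" if module_available("torch") else "OFF"


print(default_with_torch())
```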

Documentation

N/A: no user-facing documentation file was changed.

Security and Safety

No secrets, access tokens, internal URLs, customer data, or personal hardware identifiers have been committed.
No third-party code is vendored.
No unsafe pointer arithmetic, uninitialized reads, or missing bounds checks were introduced.

voltjia requested a review from a team May 9, 2026 07:58

voltjia force-pushed the feat/torch-codegen branch from 9cb7b73 to be71261 on May 15, 2026 14:51

build: add PyYAML build dependency

5d6e1d8

voltjia force-pushed the feat/torch-codegen branch from 0897fc9 to 26082cd on May 17, 2026 06:49

voltjia changed the title ~~feat: YAML-driven torch op codegen with canonical naming and exposed semantic params~~ feat(torch): generate PyTorch backend from local ATen schemas May 17, 2026

voltjia added 3 commits May 17, 2026 17:00

feat(scripts): add YAML-driven torch op codegen

21d376f

build(torch): integrate generated torch backend

d7319fb

test(torch): add generated backend coverage

0016a27

voltjia force-pushed the feat/torch-codegen branch from 26082cd to 0016a27 on May 17, 2026 09:02

voltjia wants to merge 4 commits into master from feat/torch-codegen
