Ensure unique node names and enable CPU fallback for OpenVINO #192

Open
zhaixuejun1993 wants to merge 88 commits into ravi9:dev_backend_openvino from zhaixuejun1993:xuejun/openvino-fallback-cpu-v2

@zhaixuejun1993
Collaborator

This pull request introduces several improvements and fixes related to tensor source tracking, OpenVINO backend compatibility, and dynamic dimension handling in the ggml library. The main changes include adding an org_src pointer to tensors for tracking original sources, ensuring unique tensor names for OpenVINO graphs, and refining logic for dynamic dimension inference and test configuration.

Tensor source tracking improvements:

  • Added a new org_src pointer to struct ggml_tensor, initialized to NULL, to keep track of the original source tensor, especially for in-place operations. This is reflected in both the struct definition and tensor initialization.
  • When copying tensors during backend graph splitting, the org_src field is set to reference the source tensor, enabling better tracking of tensor origins.
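The tracking idea in the two bullets above can be sketched roughly as follows. This is a minimal illustration only: `toy_tensor`, `make_tensor`, `copy_for_split`, and `origin_of` are hypothetical stand-ins invented here and are not the actual ggml definitions or the code in this PR.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for ggml_tensor; only the fields needed for the sketch.
struct toy_tensor {
    const char * name;
    toy_tensor * org_src;  // original source tensor, nullptr when not a copy
};

// Create a tensor with no recorded origin, mirroring the NULL default.
toy_tensor make_tensor(const char * name) {
    return toy_tensor{name, nullptr};
}

// Create a copy (as graph splitting might) and remember where it came from.
toy_tensor copy_for_split(toy_tensor & src, const char * copy_name) {
    toy_tensor copy = make_tensor(copy_name);
    copy.org_src = &src;
    return copy;
}

// Resolve a tensor to its recorded origin, or to itself if it is not a copy.
// Only one level of indirection is followed in this sketch.
const toy_tensor * origin_of(const toy_tensor & t) {
    return t.org_src != nullptr ? t.org_src : &t;
}
```

A consumer such as a decoder could then call `origin_of` on a copied tensor to decide whether it corresponds to a known model input.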

OpenVINO backend compatibility:

  • Implemented a workaround to ensure unique tensor names in OpenVINO graphs by appending node IDs to tensor names, preventing issues with duplicate names in certain models.
  • Updated test configuration logic to avoid adding the "Meta" device configuration if OpenVINO is present, preventing redundant or conflicting test setups.
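As a rough illustration of the unique-name workaround described above, the sketch below appends each node's index to its name. The helper `make_unique_names` and the `#` separator are assumptions made for this example; the PR may format the suffix differently.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: suffix every name with its node index so that
// repeated names become distinct. The "#" separator is an assumption.
std::vector<std::string> make_unique_names(const std::vector<std::string> & names) {
    std::vector<std::string> out;
    out.reserve(names.size());
    for (std::size_t i = 0; i < names.size(); ++i) {
        out.push_back(names[i] + "#" + std::to_string(i));
    }
    return out;
}
```

Because the index is unique per node, two nodes that share a name still end up with different strings, at the cost of names no longer matching the original tensor names exactly.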

Dynamic dimension and input detection fixes:

  • Updated dynamic dimension inference and input detection logic in OpenVINO decoder code to consider the new org_src field, improving accuracy in identifying input tokens and positions.

## Overview

Additional information

Requirements

zhaixuejun1993 and others added 30 commits May 20, 2026 16:08
* added translate_1to1_match_1_input function and updated gelu and tanh translations

* Remove unused translation function calls

---------

Co-authored-by: Mustafa Cavus
* OpenVINO backend: refactor VIEW related operation

* Enable VIEW handling in following ops

* OpenVINO backend does not support GGML_OP_NORM & GGML_OP_L2_NORM with VIEW input because of an accuracy issue in OpenVINO
zhaixuejun1993 and others added 20 commits May 25, 2026 10:49
Enable T5 model for architecture testing in OpenVINO backend
Enable jamba and kimi-linear for architecture tests
…-oss

Fix accuracy issue and enable Arctic and Grok for arch tests
* Initial gemma4 npu support

* temp. fix for gemma4 accuracy bug on npu

* Remove hardcoded names for npu-fold handling

* revert static n tokens for cont translation as it is not needed

* removed unused variable
…der cache. Add environment variable GGML_OPENVINO_ENABLE_CACHE (default: YES). When set to NO, the decoder_cache is bypassed and models are rebuilt from the cgraph on every inference call in both dynamic and static compute paths. This is useful for debugging and verifying correctness without caching interference.
…model_env

Add GGML_OPENVINO_ENABLE_CACHE env var for decoder cache control
…_log

Disable debug log printing in OpenVINO backend
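
The GGML_OPENVINO_ENABLE_CACHE switch mentioned in the commits above (default YES, bypass when set to NO) could be read along these lines. This is only a sketch: `cache_enabled_from` and `decoder_cache_enabled` are hypothetical names, and treating only the exact string "NO" as disabling is an assumption, since the commit message does not say how other values or letter case are handled.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper: interpret the raw value of GGML_OPENVINO_ENABLE_CACHE.
// An unset variable (nullptr) keeps the default of enabled; in this sketch
// only the exact string "NO" disables the cache (an assumption).
bool cache_enabled_from(const char * value) {
    if (value == nullptr) {
        return true;
    }
    return std::string(value) != "NO";
}

// Read the variable from the process environment.
bool decoder_cache_enabled() {
    return cache_enabled_from(std::getenv("GGML_OPENVINO_ENABLE_CACHE"));
}
```

Splitting the parsing from the `std::getenv` call keeps the decision logic easy to check without modifying the environment.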
@ravi9 ravi9 force-pushed the dev_backend_openvino branch from d4aa38a to 5cdd4f0 Compare June 2, 2026 19:17
5 participants