Ensure unique node names and enable CPU fallback for OpenVINO by zhaixuejun1993 · Pull Request #192 · ravi9/llama.cpp

zhaixuejun1993 · 2026-05-27T06:04:15Z

This pull request introduces several improvements and fixes related to tensor source tracking, OpenVINO backend compatibility, and dynamic dimension handling in the ggml library. The main changes include adding an org_src pointer to tensors for tracking original sources, ensuring unique tensor names for OpenVINO graphs, and refining logic for dynamic dimension inference and test configuration.

Tensor source tracking improvements:

Added a new org_src pointer to the struct ggml_tensor, initialized to NULL, to keep track of the original source tensor, especially for in-place operations. This is reflected in both the struct definition and tensor initialization. [1] [2]
When copying tensors during backend graph splitting, the org_src field is set to reference the source tensor, enabling better tracking of tensor origins.
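
As a rough illustration of the idea above, the sketch below uses a hypothetical `Tensor` struct (not the real `ggml_tensor` layout) with an `org_src` pointer. The helper names `dup_for_split` and `resolve_origin` are assumptions made for this example and are not taken from the PR. It shows a copy recording its source, and a lookup that walks back to the tensor that was never copied.

```cpp
#include <string>

// Hypothetical stand-in for a tensor; NOT the real ggml_tensor layout.
struct Tensor {
    std::string    name;
    const Tensor * org_src = nullptr;  // original source; null if this tensor is not a copy
};

// Copy a tensor the way a graph splitter might, remembering where it came from.
// The caller must keep `src` alive for as long as the copy is used.
Tensor dup_for_split(const Tensor & src, const std::string & suffix) {
    Tensor copy;
    copy.name    = src.name + suffix;
    copy.org_src = &src;
    return copy;
}

// Follow org_src links back to the tensor that was never copied.
const Tensor & resolve_origin(const Tensor & t) {
    const Tensor * cur = &t;
    while (cur->org_src != nullptr) {
        cur = cur->org_src;
    }
    return *cur;
}
```

With a pointer like this, code that recognizes inputs by name can check the name of the resolved origin rather than the name of the copy, which may have been altered during splitting.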

OpenVINO backend compatibility:

Implemented a workaround to ensure unique tensor names in OpenVINO graphs by appending node IDs to tensor names, preventing issues with duplicate names in certain models.
Updated test configuration logic to avoid adding the "Meta" device configuration if OpenVINO is present, preventing redundant or conflicting test setups.
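
The unique-name workaround described above can be sketched as follows. This is a minimal sketch, assuming the node ID is the node's index in the graph and that it is appended after an underscore. The function name `make_unique_node_names` and the exact name format are hypothetical, not the PR's actual implementation.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns one name per node, made unique by appending the node's index.
// Assumption: the "node ID" is the position of the node in the graph.
std::vector<std::string> make_unique_node_names(const std::vector<std::string> & names) {
    std::vector<std::string> out;
    out.reserve(names.size());
    for (std::size_t i = 0; i < names.size(); ++i) {
        out.push_back(names[i] + "_" + std::to_string(i));
    }
    return out;
}
```

Because the text after the last underscore is always the index, two different nodes cannot end up with the same result, even if their original names were identical or already ended in digits.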

Dynamic dimension and input detection fixes:

Updated dynamic dimension inference and input detection logic in OpenVINO decoder code to consider the new org_src field, improving accuracy in identifying input tokens and positions. [1] [2] [3]

Additional information

Requirements

I have read and agree with the contributing guidelines
AI usage disclosure:

…der cache. Add environment variable GGML_OPENVINO_ENABLE_CACHE (default: YES). When set to NO, the decoder_cache is bypassed and models are rebuilt from the cgraph on every inference call in both dynamic and static compute paths. This is useful for debugging and verifying correctness without caching interference.
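
A switch like this might be read as in the sketch below. Only the variable name `GGML_OPENVINO_ENABLE_CACHE` and the "enabled unless set to NO" behavior come from the commit message. The helper names and the choice to also treat `0` as "disabled" are assumptions for illustration.

```cpp
#include <cstdlib>
#include <string>

// Pure helper so the parsing rule can be checked without touching the environment.
// A null pointer models "variable not set".
bool cache_enabled_from_value(const char * value) {
    if (value == nullptr) {
        return true;  // default per the commit message: cache enabled
    }
    const std::string v(value);
    return !(v == "NO" || v == "0");  // assumed spellings for "disabled"
}

// Reads the environment variable named in the commit message.
bool decoder_cache_enabled() {
    return cache_enabled_from_value(std::getenv("GGML_OPENVINO_ENABLE_CACHE"));
}
```

Keeping the parsing separate from the `std::getenv` call makes the rule easy to test, and a caller could consult `decoder_cache_enabled()` once before deciding whether to reuse a cached decoder or rebuild from the cgraph.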

zhaixuejun1993 and others added 30 commits May 20, 2026 16:08

Add interface is_model_splitted() to check the c-graph is splited or not

a8f15fb

Infer and propagate dynamic-dimension indices for all tensors in the …

c8c3bd4

…GGML graph in api compute_model_outputs()

Only do this for fallback sub graph

6c855e7

Move dynamic dims compute in graph missmatch

c7af12b

ggml-openvino: fix tensor data handling for PERMUTE/VIEW ops in split…

2a118eb

… models

ggml-openvino:add comments

54fe67e

ggml-openvino: override VIEW op_case to 0 for split model inputs

74ba8fd

openvino backend: Handle unsupported VIEW shape-mismatch in OpenVINO …

5ec12bd

…backend

Enable additional mul_mat tests and add tensor data saving function (r…

6f3e20f

…avi9#81)

ggml-openvino: fix CONT/TRANSPOSE mapping and improve dynamic-dimensi…

713bcb0

…on handling

OpenVINO: add NORM/TANH support and rework SOFT_MAX translation

4fbc557

ggml-openvino: extend VIEW handling

015b607

Enable -fa off (ravi9#118)

9e0f352

Enable --context-shift

8f05691

Fix llm param compute error for normal softmax not the softmax in att…

4c9b609

…ention

OpenVINO backend: fix error for attention size compute in llm param

1ba5fd8

use tensor->extra in infer_request i/o

644dbea

OpenVINO backend: refacter the compute_llm_params() func add get_atte…

a979e24

…ntion_pattern_case to easy extand

OpenVINO backend: clean unused code

3f433c5

1to1 match op update (ravi9#146)

3bc7e76

* added translate_1to1_match_1_input function and updated gelu and tanh translations * Remove unused translation function calls --------- Co-authored-by: Mustafa Cavus <[email protected]>

initial gemma4 support

19c79fd

removed hardcoded names for kv cache slicing

7897870

OpenVINO backend: Add new attention pattern for llm parameters compute

329c4b5

flash attn Q shape static conversion

f1e32c5

Remove slice in permute translation when n_seq is 1

33a2160

return optional in extract_layer_from_name

05c0385

OpenVINO backend: refactor VIEW related operation (ravi9#148)

bdc858d

* OpenVINO backend: refactor VIEW related operation * Enable VIEW handling in following ops * OpenVINO backend does not support GGML_OP_NORM & GGML_OP_L2_NORM with VIEW input accuracy issue from OpenVINO

OpenVINO backend: Add ops l2_norm & pad

51114e5

OpenVINO backend does not support CPY with non-contiguous data or mis…

05ff7d0

…matched types

add op SSM_CONV GATED_DELTA_NET

322bb87

zhaixuejun1993 and others added 20 commits May 25, 2026 10:49

OpenVINO backend: enable t5 for arch test

c2c5fe7

Merge pull request ravi9#181 from zhaixuejun1993/xuejun/arch-test-t5

58e411d

Enable T5 model for architecture testing in OpenVINO backend

OpenVINO backend: enable jamba for arch test

e2e143d

OpenVINO backend: remove warning for tmp

0e80117

OpenVINO backend: enable kimi-linear for arch test

1564679

Remove unused

2e7bb2f

Merge pull request ravi9#182 from zhaixuejun1993/xuejun/arch-test-jamba

24393b2

Enable jamba and kimi-linear for architecture tests

Fix gpt-oss accuracy issue

25cd873

OpenVINO backend: enable arctic for arch test

5dd95ea

OpenVINO backend: enable grok for arch test

6665562

Merge pull request ravi9#183 from zhaixuejun1993/xuejun/arch-test-gpt…

48ef5fe

…-oss Fix accuracy issue and enable Arctic and Grok for arch tests

Gemma4 initial npu support (ravi9#179)

0d29a9c

* Initiall gemma4 npu support * temp. fix for gemma4 accuracy bug on npu * Remove hardcoded names for npu-fold handling * revert static n tokens for cont translation as it is not needed * removed unused variable

Merge pull request ravi9#185 from zhaixuejun1993/xuejun/enable_cache_…

7466c28

…model_env Add GGML_OPENVINO_ENABLE_CACHE env var for decoder cache control

OpenVINO backend: disable debug log print

5f868d1

Revert "Gemma4 initial npu support (ravi9#179)"

fff8cd7

This reverts commit 0d29a9c.

Merge pull request ravi9#187 from zhaixuejun1993/xuejun/disable_debug…

f84c065

…_log Disable debug log printing in OpenVINO backend

Update TBB discovery. Delegated to OpenVINOs own config.

b2bcc3b

OpenVINO backend: GGML_OPENVINO_ENABLE_CACHE YES -> 1

c933520

Merge pull request ravi9#191 from zhaixuejun1993/xuejun/cache-modify

5d51822

Enable OpenVINO cache support

zhaixuejun1993 requested review from cavusmustafa and wine99 as code owners May 27, 2026 06:04

zhaixuejun1993 mentioned this pull request May 27, 2026

Enable OpenVINO backend fallback to CPU backend #184

Closed

github-actions Bot added the OpenVINO, ggml, and testing labels May 27, 2026

zhaixuejun1993 added 3 commits May 27, 2026 14:06

OpenVINO backend: 1) ensure unique node names for OpenVINO; 2) add or…

9ae7e83

…g_src to recorde the src ggml tensor for OpenVINO dynamic shape infer

OpenVINO backend: enable fallback for openVINO to CPU backend

cec9de2

OpenVINO backend: fprintf -> GGML_LOG_INFO

800a338

ravi9 force-pushed the dev_backend_openvino branch from d4aa38a to 5cdd4f0 on June 2, 2026 19:17

zhaixuejun1993 wants to merge 88 commits into ravi9:dev_backend_openvino from zhaixuejun1993:xuejun/openvino-fallback-cpu-v2


5 participants
