Summary
is_valid_plugin() in chatbot-core/api/tools/utils.py opens, reads, and JSON-parses plugin_names.json from disk on every invocation. Since this function is called during every search_plugin_docs() tool execution, it adds unnecessary disk I/O and JSON parsing latency to every plugin-related query.
Present Behavior
def is_valid_plugin(plugin_name: str) -> bool:
    def tokenize(item: str) -> str:
        item = item.replace('-', '')
        return item.replace(' ', '').lower()

    list_plugin_names_path = os.path.join(os.path.abspath(__file__),
                                          "..", "..", "data", "raw", "plugin_names.json")
    with open(list_plugin_names_path, "r", encoding="utf-8") as f:
        list_plugin_names = json.load(f)  # ← disk I/O + JSON parse every call
    for name in list_plugin_names:
        if tokenize(plugin_name) == tokenize(name):  # ← linear scan every call
            return True
    return False
Why This Is a Problem
plugin_names.json is static data: it never changes at runtime, so re-reading it from disk is wasted I/O.
Each call performs a full linear scan over the entire plugin list, with tokenization applied to every element.
In the agentic architecture (_get_reply_simple_query_pipeline), tool calls can be invoked multiple times per query due to the relevance reformulation loop (up to max_reformulate_iterations), amplifying the overhead.
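To get a rough feel for the per-call overhead described above, here is a small timing sketch. It is not taken from the repository: the temporary JSON file, the generated plugin names, and the function names is_valid_uncached / is_valid_cached are all hypothetical stand-ins, and absolute timings will vary by machine and by the size of the real plugin_names.json.

```python
import functools
import json
import os
import tempfile
import timeit

# Hypothetical stand-in data; the real plugin_names.json is not reproduced here.
names = [f"sample-plugin-{i}" for i in range(2000)]
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "plugin_names.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump(names, f)


def tokenize(item: str) -> str:
    return item.replace("-", "").replace(" ", "").lower()


def is_valid_uncached(plugin_name: str) -> bool:
    # Same shape as the present behavior: read, parse, and scan on every call.
    with open(path, "r", encoding="utf-8") as f:
        plugin_names = json.load(f)
    for name in plugin_names:
        if tokenize(plugin_name) == tokenize(name):
            return True
    return False


@functools.lru_cache(maxsize=1)
def load_tokenized_names() -> frozenset:
    # Read and tokenize once; later calls return the cached frozenset.
    with open(path, "r", encoding="utf-8") as f:
        return frozenset(tokenize(name) for name in json.load(f))


def is_valid_cached(plugin_name: str) -> bool:
    return tokenize(plugin_name) in load_tokenized_names()


query = "Sample Plugin 1999"  # matches the last entry, the worst case for the scan
uncached_s = timeit.timeit(lambda: is_valid_uncached(query), number=200)
cached_s = timeit.timeit(lambda: is_valid_cached(query), number=200)
print(f"uncached: {uncached_s:.4f}s  cached: {cached_s:.4f}s  (200 calls each)")
```

Both variants return the same answer for any input; only the amount of work per call differs.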
Proposed Fix
import functools


@functools.lru_cache(maxsize=1)
def _load_plugin_names() -> frozenset:
    """Load and cache the set of known plugin names (tokenized)."""
    def tokenize(item: str) -> str:
        return item.replace('-', '').replace(' ', '').lower()

    list_plugin_names_path = os.path.join(
        os.path.abspath(__file__), "..", "..", "data", "raw", "plugin_names.json"
    )
    with open(list_plugin_names_path, "r", encoding="utf-8") as f:
        list_plugin_names = json.load(f)
    return frozenset(tokenize(name) for name in list_plugin_names)


def is_valid_plugin(plugin_name: str) -> bool:
    def tokenize(item: str) -> str:
        return item.replace('-', '').replace(' ', '').lower()

    return tokenize(plugin_name) in _load_plugin_names()
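Benefits:
The file is read once per process, not on every call.
frozenset gives average O(1) lookup instead of an O(n) linear scan.
lru_cache is thread-safe in that its internal cache stays coherent under concurrent calls, so it is safe with the existing threading model; at worst, the loader runs more than once if several threads hit a cold cache at the same time. ...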