perf: cache plugin_names.json with lru_cache for O(1) lookups#353

Open
sharma-sugurthi wants to merge 2 commits into jenkinsci:main from sharma-sugurthi:perf/cache-plugin-names

Conversation

@sharma-sugurthi
Contributor

Summary

Fixes #352

is_valid_plugin() re-read plugin_names.json from disk and performed an O(n) linear scan on every invocation. Since this function is called during every search_plugin_docs() tool execution, it added unnecessary disk I/O latency to every plugin-related query.

Changes

chatbot-core/api/tools/utils.py

  • add _load_plugin_names() - reads plugin_names.json once on first access using @functools.lru_cache(maxsize=1), stores tokenized names in a frozenset for O(1) membership checks.
  • add _tokenize_plugin_name() - shared module-level helper that normalizes plugin names (strips hyphens, spaces, lowercases). Eliminates the duplicate tokenize() closures previously defined in both is_valid_plugin() and filter_retrieved_data().
  • simplify is_valid_plugin() - now a one-liner using cached set lookup.
  • update filter_retrieved_data() - uses the shared _tokenize_plugin_name() helper.

chatbot-core/tests/unit/tools/test_utils.py [NEW]

  • 15 unit tests covering:
    • _tokenize_plugin_name() - case, hyphens, spaces, empty string
    • _load_plugin_names() - frozenset return, tokenization, caching (file read once)
    • is_valid_plugin() - exact match, case-insensitive, hyphen-insensitive, invalid names
    • filter_retrieved_data() - matching, no match, empty input

@sharma-sugurthi requested a review from a team as a code owner on April 16, 2026 05:15
@berviantoleo
Contributor

Looks good. I believe in this case, we will need an integration test to check if it really accesses the cache for the second, third, and more calls.

Comment thread chatbot-core/api/tools/utils.py Outdated
Comment thread chatbot-core/tests/unit/tools/test_utils.py Outdated
- Add load_plugin_names() with @lru_cache to read file once at first access
- Replace O(n) linear scan with frozenset O(1) membership check
- Extract shared tokenize_plugin_name() helper (removes duplicate tokenize())
- Add 10 tests: public API validation + integration tests for cache behavior

Fixes jenkinsci#352
@sharma-sugurthi force-pushed the perf/cache-plugin-names branch from e7bc690 to 47c19ed on April 18, 2026 16:04
@sharma-sugurthi
Contributor Author

Removed the TestTokenizePluginName and TestLoadPluginNames classes entirely; all tests now exercise only the public API (is_valid_plugin, load_plugin_names, filter_retrieved_data). Also added a TestPluginNameCacheIntegration class with 3 integration tests that verify the cache works correctly (file read once across multiple calls, shared cache across functions, cache_clear forces reload), as you suggested.
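
For readers following along, two of the cache checks described above could look something like the sketch below. It is not the test code from this PR: the loader is a stand-in that assumes a flat JSON list in plugin_names.json, the tests are plain functions rather than a TestPluginNameCacheIntegration class, and SAMPLE_JSON is made-up data. It patches builtins.open with unittest.mock.mock_open and counts how often the file is opened.

```python
import functools
import json
from unittest import mock

# Made-up sample data standing in for the real plugin_names.json contents.
SAMPLE_JSON = '["git-client", "blue-ocean"]'


def _tokenize(name: str) -> str:
    return name.replace("-", "").replace(" ", "").lower()


@functools.lru_cache(maxsize=1)
def load_plugin_names() -> frozenset:
    # Stand-in loader; the file name and JSON shape are assumptions.
    with open("plugin_names.json", encoding="utf-8") as handle:
        return frozenset(_tokenize(name) for name in json.load(handle))


def is_valid_plugin(plugin_name: str) -> bool:
    return _tokenize(plugin_name) in load_plugin_names()


def test_file_read_once_across_calls():
    load_plugin_names.cache_clear()  # start from a cold cache
    fake_open = mock.mock_open(read_data=SAMPLE_JSON)
    with mock.patch("builtins.open", fake_open):
        assert is_valid_plugin("Git Client")
        assert is_valid_plugin("blueocean")
        assert not is_valid_plugin("missing")
    # Three lookups, but the file was opened only for the first one.
    assert fake_open.call_count == 1


def test_cache_clear_forces_reload():
    load_plugin_names.cache_clear()
    fake_open = mock.mock_open(read_data=SAMPLE_JSON)
    with mock.patch("builtins.open", fake_open):
        load_plugin_names()
        load_plugin_names.cache_clear()  # drop the cached frozenset
        load_plugin_names()
    # One open per cold call.
    assert fake_open.call_count == 2
```

Counting calls to open() checks the caching behavior directly without timing anything, so the result does not depend on disk speed or on the size of the real data file.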

@berviantoleo added the enhancement and maintenance labels on Apr 22, 2026
Labels

enhancement: For changelog: Minor enhancement. Use `major-rfe` for changes to be highlighted
maintenance: Targets chores, refactors and cleanups

Development

Successfully merging this pull request may close these issues.

[Performance] is_valid_plugin() re-reads plugin_names.json from disk on every call - no caching

2 participants