fix(extraction): plumb custom_extraction_instructions into entity summary prompts #1578
Open
maxziro wants to merge 1 commit into
…mary prompts

custom_extraction_instructions reaches the node and edge extraction prompts but not the entity summary stage: extract_attributes_from_nodes and its summary batch helpers had no way to receive it, so the extract_summaries_batch / extract_entity_summaries_from_episodes prompts ran unconstrained. With instructions like 'write in Italian', names and facts honored the directive while freshly composed summaries came back in English.

Thread the parameter through extract_attributes_from_nodes -> _extract_entity_summaries_batch -> _process_summary_flight into both summary prompt templates (empty string when unset, matching the other extraction prompts), and pass it at both call sites (add_episode and the bulk path via _resolve_nodes_and_edges_bulk).

Co-Authored-By: Claude Fable 5 <[email protected]>
Problem
custom_extraction_instructions is honored by the node and edge extraction prompts, but it never reaches the entity summary stage: extract_attributes_from_nodes and its summary helpers have no way to receive it, so the extract_summaries_batch / extract_entity_summaries_from_episodes prompts always run unconstrained.

Practical effect (how we hit this): with instructions like "Perform all extraction in Italian", entity names and edge facts come back in Italian as expected, while freshly composed entity summaries come back in English, even when the source episode is entirely Italian. Summaries produced by the fact-appending shortcut look fine (the facts are already constrained), so the gap surfaces only intermittently, on summaries the LLM actually composes.
Change
Thread the existing parameter through the summary path, mirroring how the extraction prompts already handle it:
- extract_attributes_from_nodes(..., custom_extraction_instructions=None) → _extract_entity_summaries_batch → _process_summary_flight → prompt context ('' when unset).
- {context['custom_extraction_instructions']} in both summary prompt templates (extract_summaries_batch and extract_entity_summaries_from_episodes), same bare-line pattern as extract_message / extract_json / extract_text.
- Passed at both call sites: add_episode and the bulk path (_resolve_nodes_and_edges_bulk, called by add_episode_bulk).

No behavior change when the parameter is unset (empty string interpolation, like the other prompts). The Go-mirrored system prompt (_entity_episode_summary_system_prompt) is intentionally untouched; the injection lands in the user message.

Tests
Three unit tests in tests/utils/maintenance/test_node_operations.py, mirroring the existing custom-instructions plumbing tests in test_bulk_utils.py. Among the cases covered:
- the path with skip_fact_appending=True receives the instructions too,
- extract_attributes_from_nodes passes them through to the batch helper.

tests/utils/maintenance/test_node_operations.py: 36 passed (33 existing + 3 new).

🤖 Generated with Claude Code
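For reviewers who want the shape of the change at a glance, the plumbing described under Change can be sketched roughly as below. This is a simplified, synchronous illustration, not the code in this PR: extract_attributes_sketch, summaries_batch_sketch, summary_flight_sketch and render_summary_prompt are hypothetical stand-ins for extract_attributes_from_nodes, _extract_entity_summaries_batch, _process_summary_flight and the summary prompt templates, and the prompt wording and node shape are assumptions.

```python
from typing import Optional


def render_summary_prompt(context: dict) -> str:
    """Hypothetical stand-in for a summary prompt template (user message)."""
    return (
        'Summarize the entity using only the facts below.\n'
        f"ENTITY: {context['entity_name']}\n"
        f"FACTS: {'; '.join(context['facts'])}\n"
        # Bare-line interpolation: an empty string leaves the prompt unchanged.
        f"{context['custom_extraction_instructions']}"
    )


def summary_flight_sketch(
    nodes: list[dict], custom_extraction_instructions: Optional[str] = None
) -> list[str]:
    """Innermost helper: builds the prompt context for each node."""
    prompts = []
    for node in nodes:
        context = {
            'entity_name': node['name'],
            'facts': node['facts'],
            # Normalize None to '' so the template never renders the text "None".
            'custom_extraction_instructions': custom_extraction_instructions or '',
        }
        prompts.append(render_summary_prompt(context))
    return prompts


def summaries_batch_sketch(
    nodes: list[dict], custom_extraction_instructions: Optional[str] = None
) -> list[str]:
    """Middle layer: only forwards the parameter."""
    return summary_flight_sketch(nodes, custom_extraction_instructions)


def extract_attributes_sketch(
    nodes: list[dict], custom_extraction_instructions: Optional[str] = None
) -> list[str]:
    """Entry point that the call sites would pass the instructions to."""
    return summaries_batch_sketch(nodes, custom_extraction_instructions)


if __name__ == '__main__':
    nodes = [{'name': 'Roma', 'facts': ['Roma is the capital of Italy']}]
    print(extract_attributes_sketch(nodes, 'Perform all extraction in Italian')[0])
```

In this sketch the only layer that interprets the value is the innermost one, which maps None to an empty string; the outer layers just forward it, so a call that omits the parameter renders the same prompt as before.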