What
Currently, the existing tools return the full API response unchanged.
We need to trim the output schema so it only includes the fields that are actually relevant and required.
Why
When an agent starts making MCP calls, the output of each tool call gets added to the input tokens/context. This can easily blow up the context size and start poisoning the LLM context; it is the classic output poisoning problem. It also adds to the API cost ($$).
We need to make sure we reduce the size of the tool responses.
How
For Repo search tool, maybe only:
- url
- full_text (something like the supplemented full text field that already exists in the SDE response; this is what SDE actually indexed, so we can just return it)
- reliability score
- github metadata
For SDE Search Tool, maybe:
- url
- document/text
- relevance score
I think that much is good enough for the agent.
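
A minimal sketch of what the trimming could look like, assuming each tool gets a list of result dicts back from the API. The key names (`full_text`, `reliability_score`, `github_metadata`, `document`, `relevance_score`) and the helper names are hypothetical and would need to match the real response fields:

```python
from typing import Any

# Hypothetical whitelists; the actual key names depend on the API responses.
REPO_SEARCH_FIELDS = ("url", "full_text", "reliability_score", "github_metadata")
SDE_SEARCH_FIELDS = ("url", "document", "relevance_score")


def minify_result(result: dict[str, Any], fields: tuple[str, ...]) -> dict[str, Any]:
    """Keep only the whitelisted keys that are present in a raw result."""
    return {key: result[key] for key in fields if key in result}


def minify_results(
    results: list[dict[str, Any]], fields: tuple[str, ...]
) -> list[dict[str, Any]]:
    """Apply the whitelist to every result before returning it to the agent."""
    return [minify_result(result, fields) for result in results]


# Example: a raw SDE-style result with extra fields the agent does not need.
raw_results = [
    {
        "url": "https://example.com/doc",
        "document": "some indexed text",
        "relevance_score": 0.9,
        "highlight": "<em>some</em> indexed text",
        "internal_id": 123,
    }
]
trimmed = minify_results(raw_results, SDE_SEARCH_FIELDS)
```

A whitelist is used here rather than a blacklist so that new fields added to the upstream API do not silently leak into the agent context.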
cc: @sanzog03 @iamsims