
[Feature Request] FileTools: regex-capable content search via ripgrep with Python fallback #7645

@Vigtu


Problem Description

FileTools.search_content today has several limitations that make it unsuitable for anything beyond trivial lookups:

  • Substring-only matching — no regex. An agent can't search for r"def \w+", r"class [A-Z]", or r"TODO.*priority".
  • Pure Python — walks the tree, opens each file, scans line by line. Slow on monorepos.
  • Hardcoded file types — only files with an extension in TEXT_EXTENSIONS are searched. .proto, .tf, .nix, .go, and everything else is invisible.
  • Snippet-based output — returns file path + size + a character snippet around the first match. Useful for a human reader, not for an agent that wants structured per-line hits it can reason about.
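To make the first limitation concrete, here is a small illustrative comparison of substring matching against regex matching on a few in-memory lines. The sample lines are invented for this example and do not come from Agno; only the patterns are taken from the list above.

```python
import re

# Hypothetical sample lines, invented for illustration only.
lines = [
    "def load_config(path):",
    "class ConfigLoader:",
    "# TODO: cache results, priority high",
    "x = definitely_not_a_function",
]

# Substring search: "def" also hits "definitely", which is noise for an agent
# that is looking for function definitions.
substring_hits = [ln for ln in lines if "def" in ln]

# Regex search: expresses the structure the agent actually cares about.
regex_hits = [ln for ln in lines if re.search(r"def \w+", ln)]
todo_hits = [ln for ln in lines if re.search(r"TODO.*priority", ln)]

print(len(substring_hits), len(regex_hits), len(todo_hits))
```

The substring search returns two lines (one of them a false positive), while each regex returns exactly the one line that matches its pattern.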

Claude Code, Cursor, and other modern coding assistants reach for ripgrep as the search primitive. Agno should too, with a pure-Python fallback for users who don't have rg installed so the tool remains zero-dependency.

Proposed Solution

Add grep to FileTools with the following shape:

def grep(
    self,
    pattern: str,
    path: Optional[str] = None,
    output_mode: Literal["files_with_matches", "content", "count"] = "files_with_matches",
    include: Optional[str] = None,
    ignore_case: bool = False,
    context: int = 0,
    limit: int = 250,
    multiline: bool = False,
) -> str

Key properties:

  1. ripgrep when available, Python re fallback otherwise. shutil.which("rg") resolved once at __init__; the JSON result shape is identical across backends so callers never special-case.
  2. Three output modes. files_with_matches (default, paths only), content (per-line hits with line numbers), count (per-file totals). Gives the agent an explicit verbosity knob.
  3. include glob — filename filter like "*.py" or "**/*.tsx".
  4. multiline — pattern can cross line boundaries when needed (ripgrep -U --multiline-dotall, Python re.DOTALL).
  5. Hard cap at GREP_MAX_LIMIT=1000 — protects the agent's context from accidental blow-up; docstring instructs the agent to narrow the search when truncated: true rather than raise limit.
  6. search_content stays untouched: grep is additive, not a replacement.
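As a rough sketch of how properties 1 to 5 could map onto a ripgrep invocation, the snippet below builds an argument list from the proposed parameters. The helper names (resolve_backend, clamp_limit, build_rg_args) are hypothetical and are not taken from #7642; the ripgrep flags used are standard rg options, but the actual implementation may choose different ones (for example --json output).

```python
import shutil
from typing import List, Optional

GREP_MAX_LIMIT = 1000  # hard cap named in the proposal


def resolve_backend() -> str:
    """Return "ripgrep" if an rg binary is on PATH, otherwise "python"."""
    return "ripgrep" if shutil.which("rg") else "python"


def clamp_limit(limit: int) -> int:
    """Keep the requested limit within 1..GREP_MAX_LIMIT."""
    return max(1, min(limit, GREP_MAX_LIMIT))


def build_rg_args(
    pattern: str,
    path: str = ".",
    output_mode: str = "files_with_matches",
    include: Optional[str] = None,
    ignore_case: bool = False,
    context: int = 0,
    multiline: bool = False,
) -> List[str]:
    """Translate the proposed grep parameters into an rg argument list."""
    args = ["rg", "--color", "never"]
    if output_mode == "files_with_matches":
        args.append("--files-with-matches")
    elif output_mode == "count":
        args.append("--count")
    elif output_mode == "content":
        args += ["--line-number", "--with-filename", "--no-heading"]
        if context > 0:
            args += ["--context", str(context)]
    else:
        raise ValueError(f"unknown output_mode: {output_mode!r}")
    if include:
        args += ["--glob", include]
    if ignore_case:
        args.append("--ignore-case")
    if multiline:
        args += ["--multiline", "--multiline-dotall"]
    # "--" keeps a pattern that starts with "-" from being parsed as a flag.
    args += ["--", pattern, path]
    return args
```

Passing the list to subprocess.run without shell=True avoids quoting problems with regex metacharacters in the pattern.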

Parameter names (ignore_case, include, context, limit) are aligned with the existing CodingTools.grep for consistent vocabulary.
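For the fallback path, a minimal pure-Python sketch using the same parameter vocabulary might look like the following. This is not the code in #7642: the function name grep_fallback and the JSON keys (results, file, line, text, count) are assumptions made for illustration, context lines are omitted, and include is matched against the file's basename only, so a pattern like "**/*.tsx" would need extra handling.

```python
import fnmatch
import json
import os
import re
from typing import Iterator, Optional

GREP_MAX_LIMIT = 1000  # hard cap named in the proposal


def _iter_files(path: str, include: Optional[str]) -> Iterator[str]:
    """Yield file paths under path in a stable order, filtered by basename glob."""
    for root, dirs, files in os.walk(path):
        dirs.sort()
        for name in sorted(files):
            if include is None or fnmatch.fnmatch(name, include):
                yield os.path.join(root, name)


def grep_fallback(
    pattern: str,
    path: str = ".",
    output_mode: str = "files_with_matches",
    include: Optional[str] = None,
    ignore_case: bool = False,
    limit: int = 250,
    multiline: bool = False,
) -> str:
    flags = re.IGNORECASE if ignore_case else 0
    if multiline:
        flags |= re.DOTALL  # let "." cross line boundaries
    regex = re.compile(pattern, flags)
    limit = max(1, min(limit, GREP_MAX_LIMIT))

    entries = []
    for full in _iter_files(path, include):
        try:
            with open(full, encoding="utf-8") as fh:
                text = fh.read()
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        if multiline:
            # Search the whole file; report the line where each match starts.
            hits = [(text.count("\n", 0, m.start()) + 1, m.group(0))
                    for m in regex.finditer(text)]
        else:
            hits = [(n, line) for n, line in enumerate(text.splitlines(), 1)
                    if regex.search(line)]
        if not hits:
            continue
        if output_mode == "files_with_matches":
            entries.append({"file": full})
        elif output_mode == "count":
            entries.append({"file": full, "count": len(hits)})
        else:  # "content"
            entries.extend({"file": full, "line": n, "text": t} for n, t in hits)
        if len(entries) > limit:
            break  # enough collected to know the result is truncated

    return json.dumps({"results": entries[:limit],
                       "truncated": len(entries) > limit})
```

Returning a JSON string with an explicit truncated field keeps the result shape the same regardless of which backend produced it, which is the property the proposal asks for.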

Alternatives Considered

  • Extend search_content with a use_regex=True flag plus output-mode flags. Rejected: this piles flags onto a method whose name doesn't promise regex. Adding grep as a sibling is cleaner.
  • Hard ripgrep dependency. Rejected: adds a non-Python dependency to what is today a zero-dep toolkit. The fallback path keeps Agno installable without any binary.
  • Use fff.nvim or similar. Rejected: no Python binding, no standalone CLI, requires Neovim.

Additional Context

Implementation and test coverage in #7642.

Would you like to work on this?

  • Yes, I'd love to work on it! (PR open)
