-
Notifications
You must be signed in to change notification settings - Fork 3.4k
feat: ✨ Refactor and improve grepTool #7370
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat: ✨ Refactor and improve grepTool #7370
Conversation
@RomneyDa thoughts on these additional enhancements? I wanted to solve for a number of use cases I saw attempted but failed. The big ones were -A -B, -l and path scoped searches. Note often I see failed grep attempts by Claude Sonnet with fallback to full file reads or terminal use. |
aede3e7
to
bfae80b
Compare
try { | ||
// const { query, warning } = prepareQueryForRipgrep(rawQuery); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@RomneyDa in my testing, I noted adding this check in before processing reduces the number of successful queries slightly. Does it make sense to include it? Without it I see a higher degree of success, now this is with Claude Sonnet 4 which might be more reliable at building regex.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In which cases does it reduce successful queries? Might be a specific bug with the prepareQueryForRipgrep function? We could look at fixing/adding a test case. The preparation was introduced because even claude 4 regularly outputs queries in a way where the JSON parsed pattern is invalid regex. Not sure if it's a prompting issue or model issue or parsing issue but prepareQueryForRipgrep
resolves most cases
Open to other approaches especially if seems to be a prompting issue or JSON parsing issue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note comment from first review about the systemMessageDescription
Partly requesting changes because we're still not sure we want another tool that uses array args for now, because of issues with array args when using system message tools, and many models just aren't great with array args. Seems like a definite improvement for e.g. claude 4. I think we'd want a simple pattern grep search tool available for some models. Might make sense to more or less duplicate and use config-dependent to only add the fleshed out grep search tool for great models. Thoughts?
exampleArgs: [["query", ".*main_services.*"]], | ||
exampleArgs: [ | ||
["query", ".*main_services.*"], | ||
["query", "throw new Error"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
path?
EDIT: to clarify, I mean the system message tools example for this will not be valid. missing args "path" and "args"
try { | ||
// const { query, warning } = prepareQueryForRipgrep(rawQuery); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In which cases does it reduce successful queries? Might be a specific bug with the prepareQueryForRipgrep function? We could look at fixing/adding a test case. The preparation was introduced because even claude 4 regularly outputs queries in a way where the JSON parsed pattern is invalid regex. Not sure if it's a prompting issue or model issue or parsing issue but prepareQueryForRipgrep
resolves most cases
Open to other approaches especially if seems to be a prompting issue or JSON parsing issue
@RomneyDa got it! Why don't I look at if it's possible to scrape the args and path out of the query string too. I like the idea of maybe two tools and applying by model. I'll take your lead on what you are doing for the search and replace work if that makes sense? |
Description
Performed extensive analysis of common grep scenarios being attempted by the LLM and researched common features enabled in grep tools in other code assistants.
Implementing recommended changes to allow the LLM to request specific arguments like -A and -B or -l for example to research code just before or just after.
Implementating scoping by add an optional path parameter so the LLM doesn't need to always search the whole repo which might result in truncation. Also allow scoping searches to specific files to avoid the need to always read the whole file.
AI Code Review
@continue-general-review
or@continue-detailed-review
Checklist
Screen recording or screenshot
[ When applicable, please include a short screen recording or screenshot - this makes it much easier for us as contributors to review and understand your changes. See this PR as a good example. ]
Tests
Added exstensive tests.
Perfomed extensive manual testing by asking the LLM to generate common lsTool type queries against the test repo and observed the outcomes expecting full passes.
Summary by cubic
Refactored grep search to accept full ripgrep args and added path scoping, context flags, and -l support. This makes searches more precise, reduces noise, and helps avoid repo-wide scans.
New Features
Migration