Omit state from the Assist LLM prompts #141034
Merged
Proposed change
Omit state from the Assist LLM prompts and instead rely on the GetHomeState tool to get the current device state when needed. This removes the need for a separate stateless API in MCP.
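To illustrate the idea, here is a minimal sketch of a prompt that lists exposed devices without their state, plus a tool handler the model can call when it actually needs state. The `Entity` class, `build_prompt`, the sample entities, and the handler signature are all hypothetical and are not the code changed in this PR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entity:
    """Hypothetical exposed entity; not the Home Assistant entity model."""

    name: str
    domain: str
    area: str
    state: str


# Made-up sample data, for illustration only.
ENTITIES = [
    Entity("Living Room Thermostat", "climate", "Living Room", "21.5"),
    Entity("Kitchen Light", "light", "Kitchen", "off"),
]


def build_prompt(entities: list, include_state: bool) -> str:
    """Render the device overview, optionally with each entity's state."""
    lines = ["Exposed devices:"]
    for entity in entities:
        line = f"- {entity.name} (domain: {entity.domain}, area: {entity.area})"
        if include_state:
            line += f", state: {entity.state}"
        lines.append(line)
    return "\n".join(lines)


def get_home_state(entities: list, area: Optional[str] = None) -> list:
    """Hypothetical tool handler: return state only when the model asks."""
    return [
        {"name": entity.name, "state": entity.state}
        for entity in entities
        if area is None or entity.area == area
    ]
```

With `include_state=False` the prompt is shorter and does not go stale between turns; a question such as the living room temperature is answered by calling `get_home_state(ENTITIES, area="Living Room")` instead.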
Quality
All changes to tool calling are within the confidence interval, because none of the commands in our eval require the current state of the home to issue a command. In manual testing, models are able to use the get_home_state tool to get the current state when a question needs it (e.g. "what is the temperature in the living room"). These types of queries are not currently part of the eval set.
On the assist-mini dataset we see that all models are still within the confidence interval.
Token Stats
Using token stats computed by #141118
assist
We see a 49% reduction in token count, with quality increased though still within the confidence interval. (The "before" run was lower and the "after" run was higher. The current leaderboard has accuracy at 91.2% with a CI of +/-6.2, which is fairly wide, so both the before and after runs are basically within range.)
Before:
After:
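As a rough illustration of the "within range" reasoning, the sketch below computes the interval implied by 91.2% +/-6.2 and checks whether a run falls inside it. The function names and the sample accuracies passed to `within_ci` are hypothetical; they are not the before/after numbers from this PR.

```python
def ci_bounds(accuracy: float, half_width: float) -> tuple:
    """Return the (low, high) bounds of a symmetric confidence interval."""
    return (accuracy - half_width, accuracy + half_width)


def within_ci(value: float, accuracy: float, half_width: float) -> bool:
    """Check whether a run's accuracy lies inside the interval."""
    low, high = ci_bounds(accuracy, half_width)
    return low <= value <= high


# Leaderboard figures quoted above: 91.2% with a CI of +/-6.2,
# which gives an interval of roughly 85.0 to 97.4.
low, high = ci_bounds(91.2, 6.2)

# Hypothetical run accuracies, for illustration only.
print(within_ci(88.0, 91.2, 6.2))
print(within_ci(93.0, 91.2, 6.2))
```

Any two runs that both land inside this band cannot be distinguished by this eval alone, which is why a higher "after" score is not claimed as an improvement.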
assist-mini
We see a 16% reduction in token count with quality still in the confidence interval:
Before:
After:
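For reference, a reduction percentage like the ones quoted here can be computed as follows. This is a minimal sketch: `percent_reduction` is a hypothetical helper, and the token counts in the usage line are made up, since the actual before/after totals are not reproduced in this text.

```python
def percent_reduction(before: int, after: int) -> float:
    """Return the reduction from before to after, as a percentage of before."""
    if before <= 0:
        raise ValueError("before must be a positive token count")
    return (before - after) / before * 100


# Made-up totals: 1000 tokens before and 840 after is a 16% reduction.
print(round(percent_reduction(1000, 840), 1))
```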
Type of change
Additional information
Checklist
ruff format homeassistant tests
If user exposed functionality or configuration variables are added/changed:
If the code communicates with devices, web services, or third-party tools:
Updated and included derived files by running: python3 -m script.hassfest.
requirements_all.txt. Updated by running python3 -m script.gen_requirements_all.
To help with the load of incoming pull requests: