Saving state #5440

Answered by ggerganov
HarewVlad asked this question in Q&A

Yes, it's also more correct since kv_head does not necessarily correspond to the size of the KV cache. It works for main since the generation is (almost) always single-sequence.

It will use more data though, since we will be storing the full cache regardless of whether it has been used. There should be some logic to skip empty cells. Overall, saving the state could be improved.
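To illustrate the "skip empty cells" idea, here is a minimal toy sketch in Python. It is not llama.cpp's actual state format or API: `Cell`, `CELL_BYTES`, `save_full`, `save_sparse`, and `load_sparse` are all hypothetical names, and the convention that `pos == -1` marks an empty cell is an assumption made for this example only.

```python
import struct
from dataclasses import dataclass

CELL_BYTES = 8  # hypothetical fixed payload size per cell


@dataclass
class Cell:
    pos: int = -1      # assumption: -1 marks an empty (unused) cell
    data: bytes = b""  # stand-in for the K/V data held by this cell


def save_full(cells):
    """Serialize every cell, whether or not it is in use."""
    out = bytearray(struct.pack("<I", len(cells)))
    for c in cells:
        out += struct.pack("<i", c.pos)
        out += c.data.ljust(CELL_BYTES, b"\0")
    return bytes(out)


def save_sparse(cells):
    """Serialize only occupied cells, each tagged with its index."""
    used = [(i, c) for i, c in enumerate(cells) if c.pos >= 0]
    out = bytearray(struct.pack("<II", len(cells), len(used)))
    for i, c in used:
        out += struct.pack("<Ii", i, c.pos)
        out += c.data.ljust(CELL_BYTES, b"\0")
    return bytes(out)


def load_sparse(blob):
    """Rebuild the full cell array; cells not in the blob stay empty."""
    n_cells, n_used = struct.unpack_from("<II", blob, 0)
    cells = [Cell() for _ in range(n_cells)]
    off = 8
    for _ in range(n_used):
        i, pos = struct.unpack_from("<Ii", blob, off)
        off += 8
        cells[i] = Cell(pos, blob[off:off + CELL_BYTES])
        off += CELL_BYTES
    return cells
```

Storing the cell index next to each entry means occupied cells do not have to form a contiguous prefix, which matters once more than one sequence shares the cache. The cost is a few extra bytes per stored cell.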

Answer selected by HarewVlad