-
-
Notifications
You must be signed in to change notification settings - Fork 301
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Better normalization cache #738
base: main
Are you sure you want to change the base?
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎
|
They key seems to be too specific. Specially by using the prop, which basically makes it redudant to cache tokens that are found in different props. The goal of that cache seems to be to trade memory for time, but right now seems to be storing equal computations in different keys which basically is inefficient. The only thing that the prop is needed for is the `stemmerSkipProperties`.
6303b59
to
4ded31a
Compare
Hi there! Thank you so much for your PR. Were you able to run the tests locally? |
hey, not really! I also found another positive side-effect of simplifying the normalization cache. Right now, these are always cache misses, which means that using the highlight plugin is pretty slow: https://github.com/askorama/orama/blob/7512ca936ffa1543a0e267ae9e9b3d4187be0bdd/packages/plugin-match-highlight/src/index.ts#L74 |
I can't run tests locally, I get this when doing
|
so it looks like the regression was introduced here: https://github.com/askorama/orama/pull/350/files I'm currently working it around doing my own tokenizer, but it's not ideal |
You should use |
They key seems to be too specific. Specially by using the prop, which
basically makes it redudant to cache tokens that are found in different
props. The goal of that cache seems to be to trade memory for time, but
right now seems to be storing equal computations in different keys which
basically is inefficient. The only thing that the prop is needed for is
the
stemmerSkipProperties
.