NativeQuery surprisingly slow #412

Closed
enkelmedia opened this issue Mar 31, 2025 · 3 comments
@enkelmedia

enkelmedia commented Mar 31, 2025

I'm not sure if this is an Examine or a Lucene.NET issue or if it might be expected.

I have a query that looks like this:

( ((pageTitle:lorem~0.8^10 OR pageTitle_sv-se:lorem~0.8^10) (FullTextContent:lorem~0.8 OR FullTextContent_sv-se:lorem~0.8) ) ) AND (FullTextPath:1051) AND ((__VariesByCulture:y AND __Published_sv-se:y) OR (__VariesByCulture:n AND __Published:y)) AND __IndexType:content AND -(hideFromSearch_sv-se:1 OR hideFromSearch:1) AND -(templateID:0)

It executes in Luke in 1534 µs, but when I execute the same query using the Examine abstractions in Umbraco, it takes over 250 ms.

I wrapped some of the code in a simple Stopwatch, and it seems like index.Searcher.CreateQuery().NativeQuery(nativeQuery) is taking up most of the time here.

  1. The call to NativeQuery takes 215ms
  2. The search takes 28ms

Is this to be expected, or am I doing something wrong here?

EDIT:
After doing some more research, it seems like the underlying query parser in Lucene.NET is very slow. I tried building the same query from types like FuzzyQuery and BooleanQuery, using a strongly typed query to avoid the parser. After this, the query is created in no time.

Should this be interpreted as "avoid raw string queries"?
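
The strongly typed approach described in the edit above could look roughly like the sketch below. It assumes the Lucene.NET 4.8 API and rebuilds only a fragment of the original query (the two pageTitle clauses plus the __IndexType filter). The class name TypedQuerySketch and the maxEdits parameter are hypothetical; the parser derives the edit distance from the ~0.8 similarity and the term length, so the right value is left to the caller here.

```csharp
using Lucene.Net.Index;
using Lucene.Net.Search;

public static class TypedQuerySketch
{
    // Builds a fragment of the query from this issue without the query parser:
    //   (pageTitle:<term>~ ^10 OR pageTitle_sv-se:<term>~ ^10) AND __IndexType:content
    // maxEdits is an assumption supplied by the caller; Lucene supports 0 to 2.
    public static Query Build(string term, int maxEdits)
    {
        // Either title field may match (SHOULD corresponds to OR).
        var titles = new BooleanQuery();
        foreach (var field in new[] { "pageTitle", "pageTitle_sv-se" })
        {
            var fuzzy = new FuzzyQuery(new Term(field, term), maxEdits) { Boost = 10f };
            titles.Add(fuzzy, Occur.SHOULD);
        }

        // Both the title group and the index-type filter are required (MUST corresponds to AND).
        var root = new BooleanQuery();
        root.Add(titles, Occur.MUST);
        root.Add(new TermQuery(new Term("__IndexType", "content")), Occur.MUST);
        return root;
    }
}
```

Note that typed queries bypass the analyzer, so the term has to be passed in the form it was indexed in (for example, already lowercased).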

@Shazwazza
Owner

There are a lot of assumptions here :) Lucene vs Lucene.NET can't be directly compared, nor can Luke and Examine. Also, Lucene.NET is still in beta and there are some performance issues being actively worked on (e.g. apache/lucenenet#1151). There's a lot going on in Examine compared to just a simple query: there are multi-field query parsers, NRT, SearchManagers, etc.

Thanks for doing further research into the query parser aspect of this though :) If this can be properly profiled, then we can see where the bottleneck is and either invest in updating Examine if applicable, or invest in Lucene.NET itself.

I'll close this for now, but feel free to provide further information here if you get time to run some profiling.
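
As a starting point before reaching for a full profiler, a minimal timing sketch like the one below can show whether the cost is paid on every call or only on the first one. It uses only System.Diagnostics.Stopwatch; the QueryTiming class and its Measure method are hypothetical helpers, not part of Examine or Lucene.NET. A single measurement can be dominated by one-time costs such as JIT compilation, so comparing the first run with later runs is usually more informative than one Stopwatch reading.

```csharp
using System;
using System.Diagnostics;

public static class QueryTiming
{
    // Runs `action` `runs` times and returns the elapsed milliseconds of each run,
    // so the first (cold) call can be compared with the later (warm) calls.
    public static double[] Measure(Action action, int runs)
    {
        var timings = new double[runs];
        for (var i = 0; i < runs; i++)
        {
            var sw = Stopwatch.StartNew();
            action();
            sw.Stop();
            timings[i] = sw.Elapsed.TotalMilliseconds;
        }
        return timings;
    }
}

// Possible usage with the call from this issue (index and nativeQuery come from the caller):
//   var ms = QueryTiming.Measure(
//       () => index.Searcher.CreateQuery().NativeQuery(nativeQuery), 10);
//   Console.WriteLine(string.Join(", ", ms));
```

If only the first entry is large, the time is probably going to one-time initialization rather than to parsing itself; if every entry is large, a profiler trace of the parsing call would be the next step.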

@enkelmedia
Author

Thanks for your answer @Shazwazza,

I think you're right. I spent 2 hours trying to replicate the issue with a unit test in both Examine and the Lucene.NET project without success, and suddenly I cannot reproduce the issue in the project where it first surfaced. It just "works" 🤣 I hate when this happens.

Sorry for the false alarm. Have a great day!

@Shazwazza
Owner

Ugh, those investigations are the worst ☹️ On one hand, it's good when poor performance can be replicated so it can be fixed; on the other, it's good that it doesn't actually perform poorly 😅
