Is your feature request related to a problem or challenge?
When running sort queries with a large LIMIT and a small memory limit, execution can run out of memory:
select *
from tbl
order by c1
limit 1000000000
It's possible to add spilling capability to the TopK data structure so that such queries can complete.
Describe the solution you'd like
Approach 1
Add the fetch field back to ExternalSorter to support external TopK queries. It was removed in #15525 because it is currently unused.
This approach can be slightly faster, but it requires adding a configuration option to switch to the ExternalSorter path for large LIMIT values instead of the default TopK path (or letting the optimizer decide when to switch automatically, though that is also tricky).
Approach 2
Add spilling capability inside the TopK executor. When the memory limit is reached, it can fall back to out-of-core execution without introducing a new configuration option.
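To make Approach 2 more concrete, here is a rough, self-contained sketch of the idea. It is not DataFusion code: `SpillingTopK`, `max_in_memory`, and the use of an in-memory `Vec` of sorted runs in place of real spill files are all assumptions made for illustration. Rows are buffered until a row budget is hit, the buffer is sorted and truncated to `k` rows, "spilled" as a run, and the runs are merged at the end.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Hypothetical top-k accumulator (ascending order) with a row budget.
/// `spilled_runs` stands in for spill files; a real operator would write
/// sorted runs to disk and track bytes rather than row counts.
struct SpillingTopK {
    k: usize,
    max_in_memory: usize,
    buffer: Vec<i64>,
    spilled_runs: Vec<Vec<i64>>,
}

impl SpillingTopK {
    fn new(k: usize, max_in_memory: usize) -> Self {
        Self { k, max_in_memory, buffer: Vec::new(), spilled_runs: Vec::new() }
    }

    /// Buffer one value; spill once the (assumed) row budget is reached.
    fn insert(&mut self, value: i64) {
        self.buffer.push(value);
        if self.buffer.len() >= self.max_in_memory {
            self.spill();
        }
    }

    /// Sort the buffer, keep at most `k` rows, and store it as a run.
    /// Rows beyond the k-th of any single run can never be in the result.
    fn spill(&mut self) {
        if self.buffer.is_empty() {
            return;
        }
        let mut run = std::mem::take(&mut self.buffer);
        run.sort_unstable();
        run.truncate(self.k);
        self.spilled_runs.push(run);
    }

    /// Flush the remaining rows, then k-way merge the runs and stop
    /// as soon as `k` rows have been produced.
    fn finish(mut self) -> Vec<i64> {
        self.spill();
        // Min-heap of (value, run index, position within run).
        let mut heap = BinaryHeap::new();
        for (run_idx, run) in self.spilled_runs.iter().enumerate() {
            if let Some(&v) = run.first() {
                heap.push(Reverse((v, run_idx, 0usize)));
            }
        }
        let mut out = Vec::new();
        while out.len() < self.k {
            let Some(Reverse((v, run_idx, pos))) = heap.pop() else { break };
            out.push(v);
            if let Some(&next) = self.spilled_runs[run_idx].get(pos + 1) {
                heap.push(Reverse((next, run_idx, pos + 1)));
            }
        }
        out
    }
}
```

With a small LIMIT every run is truncated to `k` rows, so spilled data stays small; with a very large LIMIT the runs grow, which is the case where an out-of-core merge like this would matter.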
Describe alternatives you've considered
No response
Additional context
The sort + limit query is usually run with a small LIMIT count, so it's mostly memory-efficient. @alamb is referring to this issue as a sort of exploratory idea, so perhaps someone with real usage knows better how to get it implemented 🤔