[Approximation Framework] Multifold Improvement in Multi-Clause Boolean Query (Window Scoring Approach)

### Is your feature request related to a problem? Please describe

# Implement Windowed Bulk Scoring Approach for ApproximateBooleanQuery

## Problem Statement

The current ResumableDISI approach for ApproximateBooleanQuery outlined in [this RFC](https://github.com/opensearch-project/OpenSearch/issues/18784) has a fundamental issue: **BKD trees traverse documents by point values, not docID order**. This causes the conjunction to miss lower docIDs that appear later in the traversal sequence, resulting in fewer hits than expected.

### Root Cause
- BKD trees organize leaves by **spatial/temporal proximity** (point values), not docIDs
- Lower docIDs can appear later in BKD traversal order
- When ResumableDISI stops at 10k documents, it may miss lower docIDs that come later
- ConjunctionDISI positions past these missed docIDs, permanently excluding potential matches

### Example Issue
```
BKD Traversal Order: [737k, 825k, 847k, ..., 698k, 699k, 700k, ...]
                      ↑                        ↑
                   Early stop              Dense cluster (missed!)

ConjunctionDISI Position: 737348 → Can never go back to find 698k cluster
```

### Describe the solution you'd like

## Implemented Solution: Windowed Approach

Based on **Alternative 1** from the RFC, we implemented a windowed bulk scoring approach:

### Key Features
1. **Window Size Heuristic**: `min(minCost, maxCost/2^7)`
2. **Progressive Expansion**: 3x growth (10k → 30k → 90k → 270k)
3. **Fresh Iterators**: Each window creates completely new iterators (no BKD state)
4. **CancellableBulkScorer Compatible**: Returns lower docIDs when window expands

### Implementation Details
- **ApproximateBooleanScorerSupplier**: Manages windowed expansion logic
- **ApproximatePointRangeQuery**: Modified to accept dynamic window sizes via `getWithSize()`
- **Fresh BKD Traversal**: Each window starts from root, avoiding ResumableDISI and BKD state

## Future Optimization Opportunity

The current windowed approach has **repeated work overhead** - each window re-processes documents from the beginning.

### Proposed Enhancement
Store a copy of the most recent DISI as an instance variable and use `public void visit(DocIdSetIterator iterator)` in ApproximatePointRangeQuery to:

1. **Resume from last position**: Add the stored DISI to the currently constructed one using BKDState
2. **Eliminate repeated collection**: Only collect new documents beyond the previous window
3. **Minimize overhead**: Only leapfrogging conjunction work remains, no document re-processing

### Benefits
- **Performance**: Eliminates redundant document collection across windows
- **Efficiency**: Maintains windowed approach benefits while reducing computational overhead
- **Scalability**: Better performance for queries requiring large windows

## Advanced Expansion Strategies

The current **3x expansion** is arbitrary and may not be optimal for all use cases. Future enhancements could include:

### 1. Cost-Based Fast Expansion
Expand iterators in order of their costs for dense hit scenarios:
- **Example**: Iterators with costs `(200k, 800k, 10M, 30M)`
- **Strategy**: Expand by 200k first, then 800k, then 10M, etc.
- **Benefit**: Dense datasets find hits quickly with smaller expansions

### 2. Hit-Based Adaptive Expansion
Scale expansion based on conjunction hit density:
- **Initial**: `max(10k, minCost/2^7)`
- **Adaptive Scaling**:
  - 1k hits found → expand by 10x
  - 5k hits found → expand by 2x
  - High hit density → smaller expansions
  - Low hit density → larger expansions

### 3. User-Configurable Strategies
Allow users to choose expansion strategy based on their data characteristics:
- **Fast Expansion**: Cost-ordered for dense datasets
- **Adaptive Expansion**: Hit-density based scaling
- **Fixed Expansion**: Current 3x approach (default)
- **Custom**: User-defined expansion factors

### Implementation Considerations
- **Strategy Selection**: Query-level or index-level configuration
- **Performance Metrics**: Track expansion efficiency for auto-tuning
- **Fallback Logic**: Default to fixed expansion if adaptive strategies fail

### Related component

Search:Performance

### Describe alternatives you've considered

_No response_

### Additional context

_No response_

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Approximation Framework] Multifold Improvement in Multi-Clause Boolean Query (Window Scoring Approach) #19045

Is your feature request related to a problem? Please describe

Implement Windowed Bulk Scoring Approach for ApproximateBooleanQuery

Problem Statement

Root Cause

Example Issue

Describe the solution you'd like

Implemented Solution: Windowed Approach

Key Features

Implementation Details

Future Optimization Opportunity

Proposed Enhancement

Benefits

Advanced Expansion Strategies

1. Cost-Based Fast Expansion

2. Hit-Based Adaptive Expansion

3. User-Configurable Strategies

Implementation Considerations

Related component

Describe alternatives you've considered

Additional context

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[Approximation Framework] Multifold Improvement in Multi-Clause Boolean Query (Window Scoring Approach) #19045

Description

Is your feature request related to a problem? Please describe

Implement Windowed Bulk Scoring Approach for ApproximateBooleanQuery

Problem Statement

Root Cause

Example Issue

Describe the solution you'd like

Implemented Solution: Windowed Approach

Key Features

Implementation Details

Future Optimization Opportunity

Proposed Enhancement

Benefits

Advanced Expansion Strategies

1. Cost-Based Fast Expansion

2. Hit-Based Adaptive Expansion

3. User-Configurable Strategies

Implementation Considerations

Related component

Describe alternatives you've considered

Additional context

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions