Support backwards index scan and seeks + utilize indexes in removing ORDER BY #1209

Merged (12 commits) on Apr 9, 2025

@jussisaurio (Collaborator) commented Mar 29, 2025

Main stuff

  • Support iterating an index backwards
  • Support scanning an index (instead of seeking with a condition)
  • Support backwards index seeks
  • Support backwards rowid seeks
  • Fix existing backwards iteration logic for table btrees
  • Remove ORDER BY entirely if any index satisfies the ordering
  • Add fuzz tests for rowid seeks, 1 and 2 column index seeks
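
A minimal sketch of the ORDER BY elimination check from the list above. All types and names here (`SortOrder`, `IterationDirection`, `IndexColumn`, `OrderByTerm`, `index_satisfies_order_by`) are hypothetical and not taken from Limbo's source. The assumed rule: the ORDER BY terms must be a prefix of the index key columns, and every term must either match the index sort order (scan forwards) or be its exact opposite (scan backwards).

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum SortOrder {
    Asc,
    Desc,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum IterationDirection {
    Forwards,
    Backwards,
}

// Hypothetical description of one key column of an index.
pub struct IndexColumn<'a> {
    pub name: &'a str,
    pub order: SortOrder,
}

// Hypothetical description of one ORDER BY term.
pub struct OrderByTerm<'a> {
    pub column: &'a str,
    pub order: SortOrder,
}

/// Returns the direction in which to walk the index so that rows come out
/// already sorted, or `None` if a separate sort step is still required.
pub fn index_satisfies_order_by(
    index: &[IndexColumn<'_>],
    order_by: &[OrderByTerm<'_>],
) -> Option<IterationDirection> {
    if order_by.is_empty() || order_by.len() > index.len() {
        return None;
    }
    let mut direction: Option<IterationDirection> = None;
    for (idx_col, term) in index.iter().zip(order_by) {
        // The ORDER BY terms must name the index columns in key order.
        if idx_col.name != term.column {
            return None;
        }
        let needed = if idx_col.order == term.order {
            IterationDirection::Forwards
        } else {
            IterationDirection::Backwards
        };
        // Every term has to agree on one direction; mixed ASC/DESC
        // relative to the index cannot be served by a single scan.
        if direction.is_some() && direction != Some(needed) {
            return None;
        }
        direction = Some(needed);
    }
    direction
}
```

With an index on `(d ASC, c ASC)`, this returns `Forwards` for `order by d,c`, `Backwards` for `order by d desc,c desc`, and `None` for `order by d desc,c`. The sketch ignores joins, expressions, and collations, which a real planner would also have to consider.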

Bytecode examples (note the lack of order by sorting):

one column index order by, forwards:

limbo> explain select first_name from users order by age;
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     13    0                    0   Start at 13
1     OpenReadAsync      0     2     0                    0   table=users, root=2
2     OpenReadAwait      0     0     0                    0   
3     OpenReadAsync      1     274   0                    0   table=age_idx, root=274
4     OpenReadAwait      0     0     0                    0   
5     RewindAsync        1     0     0                    0   
6     RewindAwait        1     12    0                    0   Rewind table age_idx
7       DeferredSeek     1     0     0                    0   
8       Column           0     1     1                    0   r[1]=users.first_name
9       ResultRow        1     1     0                    0   output=r[1]
10    NextAsync          1     0     0                    0   
11    NextAwait          1     7     0                    0   
12    Halt               0     0     0                    0   
13    Transaction        0     0     0                    0   write=false
14    Goto               0     1     0                    0 

one column index order by, backwards:

limbo> explain select first_name from users order by age desc;
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     13    0                    0   Start at 13
1     OpenReadAsync      0     2     0                    0   table=users, root=2
2     OpenReadAwait      0     0     0                    0   
3     OpenReadAsync      1     274   0                    0   table=age_idx, root=274
4     OpenReadAwait      0     0     0                    0   
5     LastAsync          1     0     0                    0   
6     LastAwait          1     0     0                    0   
7       DeferredSeek     1     0     0                    0   
8       Column           0     1     1                    0   r[1]=users.first_name
9       ResultRow        1     1     0                    0   output=r[1]
10    PrevAsync          1     0     0                    0   
11    PrevAwait          1     0     0                    0   
12    Halt               0     0     0                    0   
13    Transaction        0     0     0                    0   write=false
14    Goto               0     1     0                    0 
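
The two one-column plans above differ only in how the index cursor is positioned and stepped (Rewind/Next versus Last/Prev). A rough model of that, using `BTreeSet` and `BTreeMap` from the standard library as stand-ins for the index and table btrees. The function name and the `(age, rowid)` key layout are assumptions made for illustration, not Limbo internals.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// `users` maps rowid -> first_name (stand-in for the table btree).
/// `age_idx` holds (age, rowid) keys (stand-in for the index btree).
pub fn first_names_ordered_by_age(
    users: &BTreeMap<i64, String>,
    age_idx: &BTreeSet<(i64, i64)>,
    descending: bool,
) -> Vec<String> {
    // Rough analogue of DeferredSeek + Column: take the rowid from the
    // index entry and fetch the row from the table.
    let lookup = |&(_age, rowid): &(i64, i64)| users.get(&rowid).cloned();
    if descending {
        // Analogue of Last + Prev: walk the index from its largest key down.
        age_idx.iter().rev().filter_map(lookup).collect()
    } else {
        // Analogue of Rewind + Next: walk the index from its smallest key up.
        age_idx.iter().filter_map(lookup).collect()
    }
}
```

No sort step appears in either branch: the result order comes entirely from the iteration direction over the already ordered index.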

rowid seek, backwards:

limbo> explain select * from users where id < 100 order by id desc;
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     19    0                    0   Start at 19
1     OpenReadAsync      0     2     0                    0   table=users, root=2
2     OpenReadAwait      0     0     0                    0   
3     Integer            100   11    0                    0   r[11]=100
4     SeekLT             0     18    11                   0   
5       RowId            0     1     0                    0   r[1]=users.rowid
6       Column           0     1     2                    0   r[2]=users.first_name
7       Column           0     2     3                    0   r[3]=users.last_name
8       Column           0     3     4                    0   r[4]=users.email
9       Column           0     4     5                    0   r[5]=users.phone_number
10      Column           0     5     6                    0   r[6]=users.address
11      Column           0     6     7                    0   r[7]=users.city
12      Column           0     7     8                    0   r[8]=users.state
13      Column           0     8     9                    0   r[9]=users.zipcode
14      Column           0     9     10                   0   r[10]=users.age
15      ResultRow        1     10    0                    0   output=r[1..10]
16    PrevAsync          0     0     0                    0   
17    PrevAwait          0     0     0                    0   
18    Halt               0     0     0                    0   
19    Transaction        0     0     0                    0   write=false
20    Goto               0     1     0                    0 
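
As a sketch of what the SeekLT + Prev pair in the plan above computes: position on the largest rowid strictly below the bound, then step towards smaller rowids. `BTreeMap` stands in for the table btree here, and `rows_below_desc` is a hypothetical helper name.

```rust
use std::collections::BTreeMap;

/// Returns all rows with rowid < `bound`, largest rowid first.
pub fn rows_below_desc<V: Clone>(table: &BTreeMap<i64, V>, bound: i64) -> Vec<(i64, V)> {
    table
        .range(..bound) // exclusive upper bound, like SeekLT
        .rev() // step backwards, like Prev
        .map(|(&rowid, row)| (rowid, row.clone()))
        .collect()
}
```

If no rowid is below the bound the range is empty, which corresponds to SeekLT jumping straight to Halt.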

two column order by, setup:

cargo run dualpk.db

Limbo v0.0.18-pre.3
Enter ".help" for usage hints.
limbo> .schema
CREATE TABLE a(b,c,d,e, primary key (d,c));

two column order by, forwards:

limbo> explain select * from a order by d,c;
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     16    0                    0   Start at 16
1     OpenReadAsync      0     2     0                    0   table=a, root=2
2     OpenReadAwait      0     0     0                    0   
3     OpenReadAsync      1     3     0                    0   table=sqlite_autoindex_a_1, root=3
4     OpenReadAwait      0     0     0                    0   
5     RewindAsync        1     0     0                    0   
6     RewindAwait        1     15    0                    0   Rewind table sqlite_autoindex_a_1
7       DeferredSeek     1     0     0                    0   
8       Column           0     0     1                    0   r[1]=a.b
9       Column           0     1     2                    0   r[2]=a.c
10      Column           0     2     3                    0   r[3]=a.d
11      Column           0     3     4                    0   r[4]=a.e
12      ResultRow        1     4     0                    0   output=r[1..4]
13    NextAsync          1     0     0                    0   
14    NextAwait          1     7     0                    0   
15    Halt               0     0     0                    0   
16    Transaction        0     0     0                    0   write=false
17    Goto               0     1     0                    0 

two column order by, forwards with index seek:

limbo> explain select * from a where d > 100 order by d,c;
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     16    0                    0   Start at 16
1     OpenReadAsync      0     2     0                    0   table=a, root=2
2     OpenReadAwait      0     0     0                    0   
3     OpenReadAsync      1     3     0                    0   table=sqlite_autoindex_a_1, root=3
4     OpenReadAwait      0     0     0                    0   
5     Integer            100   5     0                    0   r[5]=100
6     SeekGT             1     15    5                    0   
7       DeferredSeek     1     0     0                    0   
8       Column           0     0     1                    0   r[1]=a.b
9       Column           0     1     2                    0   r[2]=a.c
10      Column           0     2     3                    0   r[3]=a.d
11      Column           0     3     4                    0   r[4]=a.e
12      ResultRow        1     4     0                    0   output=r[1..4]
13    NextAsync          1     0     0                    0   
14    NextAwait          1     7     0                    0   
15    Halt               0     0     0                    0   
16    Transaction        0     0     0                    0   write=false
17    Goto               0     1     0                    0 

two column order by, forwards with index scan and termination condition:

limbo> explain select * from a where d < 100 order by d,c;
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     18    0                    0   Start at 18
1     OpenReadAsync      0     2     0                    0   table=a, root=2
2     OpenReadAwait      0     0     0                    0   
3     OpenReadAsync      1     3     0                    0   table=sqlite_autoindex_a_1, root=3
4     OpenReadAwait      0     0     0                    0   
5     Null               0     5     0                    0   r[5]=NULL
6     SeekGT             1     17    5                    0   
7       Integer          100   6     0                    0   r[6]=100
8       IdxGE            1     17    6                    0   
9       DeferredSeek     1     0     0                    0   
10      Column           0     0     1                    0   r[1]=a.b
11      Column           0     1     2                    0   r[2]=a.c
12      Column           0     2     3                    0   r[3]=a.d
13      Column           0     3     4                    0   r[4]=a.e
14      ResultRow        1     4     0                    0   output=r[1..4]
15    NextAsync          1     0     0                    0   
16    NextAwait          1     7     0                    0   
17    Halt               0     0     0                    0   
18    Transaction        0     0     0                    0   write=false
19    Goto               0     1     0                    0 
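
One reading of the plan above, stated as an assumption: the Null + SeekGT pair positions the cursor just past any NULL values of `d` (NULLs sort first in the index), and IdxGE stops the scan at the first key with `d >= 100`. A sketch of that shape, where `Option<i64>` models a nullable column because `None` orders before every `Some(_)` in Rust. The key layout `(d, c, rowid)` and the function name are hypothetical.

```rust
use std::collections::BTreeSet;

// Hypothetical index key: (d, c, rowid); None models SQL NULL.
type Key = (Option<i64>, Option<i64>, i64);

/// Forward scan of the index for `d < upper`, in (d, c) order.
pub fn scan_d_less_than(index: &BTreeSet<Key>, upper: i64) -> Vec<Key> {
    // Smallest possible key whose `d` is not NULL: the scan starts here,
    // so entries with a NULL `d` are skipped without visiting them.
    let first_non_null: Key = (Some(i64::MIN), None, i64::MIN);
    index
        .range(first_non_null..)
        // Termination condition: stop at the first key with d >= upper.
        .take_while(|(d, _, _)| *d < Some(upper))
        .copied()
        .collect()
}
```

Because the keys are visited in index order, stopping at the first failing key is enough; nothing after it can satisfy `d < upper`.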

two column order by, backwards:

limbo> explain select * from a order by d desc,c desc;
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     16    0                    0   Start at 16
1     OpenReadAsync      0     2     0                    0   table=a, root=2
2     OpenReadAwait      0     0     0                    0   
3     OpenReadAsync      1     3     0                    0   table=sqlite_autoindex_a_1, root=3
4     OpenReadAwait      0     0     0                    0   
5     LastAsync          1     0     0                    0   
6     LastAwait          1     0     0                    0   
7       DeferredSeek     1     0     0                    0   
8       Column           0     0     1                    0   r[1]=a.b
9       Column           0     1     2                    0   r[2]=a.c
10      Column           0     2     3                    0   r[3]=a.d
11      Column           0     3     4                    0   r[4]=a.e
12      ResultRow        1     4     0                    0   output=r[1..4]
13    PrevAsync          1     0     0                    0   
14    PrevAwait          1     0     0                    0   
15    Halt               0     0     0                    0   
16    Transaction        0     0     0                    0   write=false
17    Goto               0     1     0                    0 

two column order by, backwards with index seek:

limbo> explain select * from a where d < 100 order by d desc,c desc;
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     16    0                    0   Start at 16
1     OpenReadAsync      0     2     0                    0   table=a, root=2
2     OpenReadAwait      0     0     0                    0   
3     OpenReadAsync      1     3     0                    0   table=sqlite_autoindex_a_1, root=3
4     OpenReadAwait      0     0     0                    0   
5     Integer            100   5     0                    0   r[5]=100
6     SeekLT             1     15    5                    0   
7       DeferredSeek     1     0     0                    0   
8       Column           0     0     1                    0   r[1]=a.b
9       Column           0     1     2                    0   r[2]=a.c
10      Column           0     2     3                    0   r[3]=a.d
11      Column           0     3     4                    0   r[4]=a.e
12      ResultRow        1     4     0                    0   output=r[1..4]
13    PrevAsync          1     0     0                    0   
14    PrevAwait          1     0     0                    0   
15    Halt               0     0     0                    0   
16    Transaction        0     0     0                    0   write=false
17    Goto               0     1     0                    0

two column order by, backwards with index scan and termination condition:

limbo> explain select * from a where d > 100 order by d desc,c desc;
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     18    0                    0   Start at 18
1     OpenReadAsync      0     2     0                    0   table=a, root=2
2     OpenReadAwait      0     0     0                    0   
3     OpenReadAsync      1     3     0                    0   table=sqlite_autoindex_a_1, root=3
4     OpenReadAwait      0     0     0                    0   
5     LastAsync          1     0     0                    0   
6     LastAwait          1     0     0                    0   
7       Integer          100   6     0                    0   r[6]=100
8       IdxLE            1     17    6                    0   
9       DeferredSeek     1     0     0                    0   
10      Column           0     0     1                    0   r[1]=a.b
11      Column           0     1     2                    0   r[2]=a.c
12      Column           0     2     3                    0   r[3]=a.d
13      Column           0     3     4                    0   r[4]=a.e
14      ResultRow        1     4     0                    0   output=r[1..4]
15    PrevAsync          1     0     0                    0   
16    PrevAwait          1     0     0                    0   
17    Halt               0     0     0                    0   
18    Transaction        0     0     0                    0   write=false
19    Goto               0     1     0                    0 
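
The backwards counterpart can be sketched the same way: start from the last index entry (Last), step backwards (Prev), and stop at the first key with `d <= 100` (IdxLE). As before, the `(d, c, rowid)` key layout with `Option<i64>` for nullable columns and the function name are assumptions for illustration only.

```rust
use std::collections::BTreeSet;

// Hypothetical index key: (d, c, rowid); None models SQL NULL.
type Key = (Option<i64>, Option<i64>, i64);

/// Backward scan of the index for `d > lower`, in (d DESC, c DESC) order.
pub fn scan_d_greater_than_desc(index: &BTreeSet<Key>, lower: i64) -> Vec<Key> {
    index
        .iter()
        .rev() // start at the largest key and walk down
        // Termination condition: stop at the first key with d <= lower.
        // NULL (None) compares below every Some(_), so it also stops here.
        .take_while(|(d, _, _)| *d > Some(lower))
        .copied()
        .collect()
}
```

No seek is needed at the start because the qualifying rows sit at the tail of the index; only the lower end of the range needs a check.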

@jussisaurio changed the title from "Index fixes" to "Index querying fixes" on Mar 29, 2025
@jussisaurio marked this pull request as ready for review on March 29, 2025 16:13
@jussisaurio (Collaborator, Author) commented:

@PThorpe92 GitHub won't let me add you as a reviewer for some reason, so I'm adding you this way.

@jussisaurio marked this pull request as draft on March 30, 2025 20:08
@jussisaurio force-pushed the index-fixes branch 7 times, most recently from 51ffa6f to 190c0ae, on April 8, 2025 08:29
@jussisaurio marked this pull request as ready for review on April 8, 2025 08:32
@jussisaurio changed the title from "Index querying fixes" to "Support backwards index scan and seeks + utilize indexes in removing ORDER BY" on Apr 8, 2025
@PThorpe92 (Contributor) left a comment

After very extensive review and local testing, sir, I can confidently say that this does indeed LGTM 🫡

The fuzz tests ran for so long that I thought they might have caught an infinite loop, but they ended up passing.

@jussisaurio merged commit aa6e2d8 into main on Apr 9, 2025
38 checks passed
@jussisaurio deleted the index-fixes branch on April 9, 2025 09:03