Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: move symbolic and boolean algebra code into new crate #3570

Merged
merged 4 commits into from
Dec 17, 2024

Conversation

kevinzwang
Copy link
Member

@kevinzwang kevinzwang commented Dec 13, 2024

Moving some of the code we have around symbolic/boolean algebra on expressions into its own crate, because I anticipate that we will be building more of this kind of thing, so it would be nicer to consolidate it as well as make it easier to reuse. It also allows us to better test these things in isolation of the context they are being used in.

For example, we'll be building some optimization rules that more intelligently finds predicates for filter pushdown into joins, and that may use both split_conjunction as well as some expression simplification logic.

Also took this opportunity to fix a typo (conjuct -> conjunct) and rename conjunct (which is an adjective) to combine_conjunction. Otherwise everything else is pretty much a straightforward move

Copy link

codspeed-hq bot commented Dec 13, 2024

CodSpeed Performance Report

Merging #3570 will not alter performance

Comparing kevin/daft-algebra (7a97eee) with main (47f5897)

Summary

✅ 27 untouched benchmarks

Copy link

codecov bot commented Dec 14, 2024

Codecov Report

Attention: Patch coverage is 81.97674% with 31 lines in your changes missing coverage. Please review.

Project coverage is 77.81%. Comparing base (35ed63c) to head (7a97eee).
Report is 6 commits behind head on main.

Files with missing lines Patch % Lines
src/daft-algebra/src/simplify.rs 77.34% 29 Missing ⚠️
...al-plan/src/optimization/rules/push_down_filter.rs 87.50% 2 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff            @@
##             main    #3570    +/-   ##
========================================
  Coverage   77.81%   77.81%            
========================================
  Files         716      718     +2     
  Lines       87987    88303   +316     
========================================
+ Hits        68463    68712   +249     
- Misses      19524    19591    +67     
Files with missing lines Coverage Δ
src/common/scan-info/src/expr_rewriter.rs 47.87% <100.00%> (ø)
src/daft-algebra/src/boolean.rs 100.00% <100.00%> (ø)
src/daft-dsl/src/optimization.rs 100.00% <ø> (+1.88%) ⬆️
...lan/src/optimization/rules/simplify_expressions.rs 92.75% <ø> (+9.04%) ⬆️
...cal-plan/src/optimization/rules/unnest_subquery.rs 90.08% <100.00%> (-0.05%) ⬇️
src/daft-sql/src/planner.rs 73.88% <100.00%> (+1.10%) ⬆️
...al-plan/src/optimization/rules/push_down_filter.rs 94.29% <87.50%> (ø)
src/daft-algebra/src/simplify.rs 77.34% <77.34%> (ø)

... and 15 files with indirect coverage changes

Copy link
Contributor

@universalmind303 universalmind303 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really have any strong opinions on this, but wondering why this needs to be it's own crate instead of a module of daft-dsl?

@kevinzwang
Copy link
Member Author

I don't really have any strong opinions on this, but wondering why this needs to be it's own crate instead of a module of daft-dsl?

I considered this too and I don't have a strong opinion either. Just thought that this was a good enough abstraction to be able to separate out into its own crate, and if we use any libraries for symbolic algebra in the future, that can be isolated into this crate. But other than that I don't really have much reason.

@universalmind303
Copy link
Contributor

I considered this too and I don't have a strong opinion either. Just thought that this was a good enough abstraction to be able to separate out into its own crate, and if we use any libraries for symbolic algebra in the future, that can be isolated into this crate. But other than that I don't really have much reason.

👍 I'm fine with these changes then as long as no one else has strong objections.

@kevinzwang
Copy link
Member Author

@samster25 any thoughts? Otherwise I will merge this in

@kevinzwang
Copy link
Member Author

kevinzwang commented Dec 16, 2024

@samster25 follow-up on weekly kickoff discussion, daft-core is only used in daft-algebra for the Schema struct since that is necessary to get the datatype of an expression. I could also get it from daft-schema instead, was just following convention of importing it via daft_core::prelude::*.

@samster25
Copy link
Member

@kevinzwang It would be better to get it from daft-schema! We want to minimize the dependancies on daft-core since we are going to split up daft-dsl soon

src/common/scan-info/src/expr_rewriter.rs Show resolved Hide resolved
src/daft-algebra/Cargo.toml Outdated Show resolved Hide resolved
@kevinzwang kevinzwang enabled auto-merge (squash) December 17, 2024 20:01
@kevinzwang kevinzwang merged commit 5165e5e into main Dec 17, 2024
41 checks passed
@kevinzwang kevinzwang deleted the kevin/daft-algebra branch December 17, 2024 20:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants