Enable Garbage Collection for Interned Values #602
base: master
Conversation
✅ Deploy Preview for salsa-rs ready!
CodSpeed Performance Report: merging #602 will degrade performance by 22.49%.
Nice. This looks good to me after adding a few tests.
Yes, looks good to me!
I see there is a test failure... have you investigated it at all?
My hunch is that the test is testing behavior that's no longer relevant, but I can look.
@nikomatsakis I'm also hitting the assertion error in this example:

```rust
#[salsa::tracked]
fn function<'db>(db: &'db dyn Database, input: Input) -> Interned<'db> {
    Interned::new(db, 0)
}

let input = Input::new(&db, 0);
function(&db, input);
input.set_field1(&mut db).to(1);
function(&db, input);
```

The function does not read the input, so the interned value is memoized, meaning it is not re-interned in R2. What should be done in this case (and the similar failing test)? I'm also a little unclear about the testing instructions. Should I add a log event for when values are re-interned, or is there another way I can test that? I'm also not sure how to access the
That would sound reasonable to me, similar to the backdate event. We could also consider adding an event for when a value is garbage collected.
If I understand it correctly, the problem is that interned ingredients created outside of a query are never revalidated because there's no corresponding query that creates them. I see two options:
I took a closer look. I think the problem here is that
We had a brief discussion in our meeting today. @ibraheemdev I think values that are interned outside of any salsa query have to be considered immortal.
This one seems trickier. The query does not have a dependency on its input. However, changing the input still causes the database revision to be advanced. I'm not exactly sure why, but I believe the second query call only performs a shallow memoization check, meaning that the deep verification is skipped.
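To make the shallow-vs-deep distinction concrete, here is a minimal sketch of the idea being discussed. The types `Db`, `Memo`, `Revision`, and `Durability` are hypothetical stand-ins, not salsa's real internals: a memo passes the "shallow" check when nothing of its durability class has changed since it was last verified, so its dependencies (and any interned values it created) are never walked again.

```rust
// Hypothetical sketch (NOT salsa's actual implementation): shallow
// verification compares a memo's `verified_at` revision against the last
// revision in which anything of the memo's durability changed.

#[derive(Clone, Copy, PartialEq, PartialOrd)]
struct Revision(u32);

#[derive(Clone, Copy)]
enum Durability {
    Low,
    High,
}

struct Memo {
    verified_at: Revision,
    durability: Durability,
}

struct Db {
    // Last revision in which something of the given durability changed.
    last_changed_low: Revision,
    last_changed_high: Revision,
}

impl Db {
    fn last_changed(&self, d: Durability) -> Revision {
        match d {
            Durability::Low => self.last_changed_low,
            Durability::High => self.last_changed_high,
        }
    }

    /// Shallow check: if nothing of the memo's durability changed since it
    /// was last verified, reuse the memo without touching its dependencies.
    fn shallow_verify(&self, memo: &Memo) -> bool {
        memo.verified_at >= self.last_changed(memo.durability)
    }
}

fn main() {
    // A low-durability input changed in R2; nothing HIGH changed since R1.
    let db = Db {
        last_changed_low: Revision(2),
        last_changed_high: Revision(1),
    };
    // A query with an empty dependency set gets MAX durability, so it passes
    // the shallow check even though the database moved to a new revision --
    // and the interned value it produced is never re-interned.
    let memo = Memo { verified_at: Revision(1), durability: Durability::High };
    assert!(db.shallow_verify(&memo));
    // A low-durability memo from R1 fails and would be deeply verified.
    let low = Memo { verified_at: Revision(1), durability: Durability::Low };
    assert!(!db.shallow_verify(&low));
    println!("ok");
}
```

This mirrors the failing example above: because `function` never reads `input`, its memo has maximal durability and the second call never reaches the interning code.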
src/interned.rs (outdated):

```rust
// Record a dependency on this value.
let index = self.database_key_index(id);
zalsa_local.report_tracked_read(index, Durability::MAX, current_revision);
```
Just making sure: is `current_revision` the correct argument to pass here?
This feels incorrect; should it be the `last_interned_at` value?
Oh, this is interesting. I have to take a closer look. We probably have a similar problem when the input has a higher durability and Salsa skips the deep verification because of it.
Okay, I think I now understand the underlying problem. I'm not yet sure what the right fix is. I'll reach out in https://salsa.zulipchat.com/#narrow/channel/333573-Contributing-to-Salsa/topic/LRU.20.28GC.29.20for.20interned.20values.20.23597
I thought more about this and I feel somewhat confident that we should store the

What's different for interned values is that multiple queries can create the same tracked struct, and the queries might also change their durability over time.

For now, I'd suggest that we store the highest durability across all reads, because we have to be pessimistic: if we have one HIGH and two low durability queries that both create the same interned value, we must assume the interned value only gets recreated after a HIGH durability change. The read path should compare against the most recent revision for the interned value's durability instead of using the current revision. This should unblock this PR.

What's unclear is how garbage collection would work, because there's no guarantee that a query has to run in every revision. That means it's possible for interned values to never have been read in the last 1000 revisions, and they might get garbage collected because of it. However, the result of a cached query might still depend on the interned value (or even reference it in the output).
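A rough sketch of this proposal, with assumed names (`InternedValue`, `record_intern`, `is_fresh` are illustrative, not salsa's API): each interned value tracks the highest durability of any query that created it, and a read counts as fresh when the value was re-interned at or after the last change of that durability class.

```rust
// Illustrative sketch of the proposed bookkeeping, not salsa's real types.

#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Revision(u32);

#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Durability {
    Low,
    High,
}

struct InternedValue {
    last_interned_at: Revision,
    // Highest durability across all queries that created this value;
    // pessimistic, since a HIGH-durability creator may be the only one
    // that re-runs and re-interns the value.
    max_durability: Durability,
}

impl InternedValue {
    // Called whenever a query (re-)creates this interned value.
    fn record_intern(&mut self, at: Revision, durability: Durability) {
        self.last_interned_at = at;
        self.max_durability = self.max_durability.max(durability);
    }

    // Read path: compare against the last change of this value's durability
    // class instead of against the current revision.
    fn is_fresh(&self, last_changed: impl Fn(Durability) -> Revision) -> bool {
        self.last_interned_at >= last_changed(self.max_durability)
    }
}

fn main() {
    // One HIGH- and some LOW-durability queries intern the same value in R1.
    let mut v = InternedValue {
        last_interned_at: Revision(1),
        max_durability: Durability::Low,
    };
    v.record_intern(Revision(1), Durability::High);

    // A LOW-durability input changes in R2; nothing HIGH changed since R1.
    let last_changed = |d| match d {
        Durability::Low => Revision(2),
        Durability::High => Revision(1),
    };

    // The value is still fresh relative to HIGH-durability changes, so the
    // read does not require the value to have been re-interned in R2.
    assert!(v.is_fresh(last_changed));
    println!("ok");
}
```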
@MichaReiser @ibraheemdev, just to confirm: this is still pending the changes we discussed in our last sync meeting, correct?
@nikomatsakis yes, I'll be working on it this week. |
Hmm, the accumulator regression is surprising. Did we accidentally break the optimization that skips traversing queries that are known not to have any accumulated values? |
I don't see anything obvious that should have caused the regression, but I suspect it's simply the fact that we now need to look up the revision from the active query? This otherwise looks good to me. Do we need to remove any code? I remember you mentioned that we don't need the "immortal" handling.
I'm going to try to add some more tests before this gets merged. |
```rust
        id
    }

    // We won any races so should intern the data
    Err(slot) => {
        let zalsa = db.zalsa();
```
nit: we already have `zalsa` in scope here

```rust
let zalsa = db.zalsa();
```
```rust
let internal_data = db.zalsa().table().get::<Value<C>>(id);
let last_changed_revision = db.zalsa().last_changed_revision(internal_data.durability());
```
Suggested change:

```rust
let zalsa = db.zalsa();
let internal_data = zalsa.table().get::<Value<C>>(id);
let last_changed_revision = zalsa.last_changed_revision(internal_data.durability());
```

to reduce the number of dynamically dispatched calls.
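For illustration, here is a self-contained toy version of the point behind this suggestion. `Database` and `Zalsa` below are stand-ins (not the real salsa types), with a call counter to show that each `db.zalsa()` through the trait object is a separate virtual call, while a local binding pays for the dispatch once:

```rust
use std::cell::Cell;

// Stand-in types, not salsa's real API, to illustrate hoisting a
// dynamically dispatched call into a local.
trait Database {
    fn zalsa(&self) -> &Zalsa;
}

struct Zalsa {
    calls: Cell<u32>,
}

struct DbImpl {
    zalsa: Zalsa,
}

impl Database for DbImpl {
    fn zalsa(&self) -> &Zalsa {
        // Count how many times the virtual method actually runs.
        self.zalsa.calls.set(self.zalsa.calls.get() + 1);
        &self.zalsa
    }
}

fn main() {
    let db_impl = DbImpl { zalsa: Zalsa { calls: Cell::new(0) } };
    let db: &dyn Database = &db_impl;

    // Before: every use goes through dynamic dispatch again.
    let _ = db.zalsa();
    let _ = db.zalsa();
    assert_eq!(db_impl.zalsa.calls.get(), 2);

    // After: one dispatched call, reused through a local binding.
    let zalsa = db.zalsa();
    let _table = &zalsa.calls;
    let _rev = &zalsa.calls;
    assert_eq!(db_impl.zalsa.calls.get(), 3);
    println!("ok");
}
```

The compiler generally cannot fold repeated calls through a `&dyn` trait object, so hoisting the result into a local is the reliable way to avoid the extra indirect calls.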
```rust
// Trigger a new revision.
let _zalsa_mut = db.zalsa_mut();
```
Suggested change:

```rust
// Trigger a new revision.
db.zalsa_mut().new_revision();
```
The behavior here has changed on master.
```diff
-pub struct OutputDependencyIndex {
+pub struct DatabaseKeyIndex {
```
What is the motivation behind discarding the type-level differentiation of input and output keys, by the way?
I'd have to take a closer look, but I think the reason is simply that they're the same now? See #597.
Per the mentoring instructions in #597.
Still needs some tests. Note this does not actually implement garbage collection but adds the necessary plumbing to do so.