-
Notifications
You must be signed in to change notification settings - Fork 75
Make chunk map as global side metadata, and each space only lists its own chunks using chunk map #1304
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
6c236b8
to
4472263
Compare
I don't quite follow what you mean. Do you mean that chunks may be used so local metadata is mapped? Because how can it be unmapped if the chunk is used? Or do you mean some chunks in the range may not be used so the side metadata is unmapped. |
Yeah. That is a typo. I meant chunks may not be used. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The major question I have with this PR is that why do we want to associate free chunks with a space? A free chunk is free and to me it doesn't make sense to associate with a space. In my branch I just used 0x0
to mark a chunk as completely free and any space could use it theoretically.
When I implemented this in my branch I didn't do any performance evaluation or think about alternatives. I would at least like to make sure this doesn't degrade performance much for Immix
or MarkSweep
.
Also I would like if the PR title was made less verbose. Something like: "Make chunk map a global metadata" or "Fix bugs with chunk map being shared across spaces" is sufficient I think.
/// Set chunk state | ||
pub fn set(&self, chunk: Chunk, state: ChunkState) { | ||
/// Set a chunk as allocated, or as free. | ||
pub fn set_allocated(&self, chunk: Chunk, allocated: bool) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel this function name is confusing. set_allocated
to me sounds like this function will set the chunk to the allocated state, but it can actually set it to both allocated or free.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The boolean argument in set_allocated(bool)
should be pretty clear that you can either set it to 'allocated' or 'not allocated'. Do you have a better suggestion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure why set
is not good enough? The previous one's signature is pretty much the same except state
is now a bool
. Maybe @wks has a better idea. Or if he's fine with the new name, then that's also okay.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I felt after changing the argument from an enum to a boolean, chunk_map.set(chunk, true)
was not as clear as chunk_map.set_allocated(chunk, true)
so I made the change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough 👍 I'll wait for @wks to see if he has a better suggestion.
} | ||
|
||
/// Get chunk state, regardless of the space. This should always be private. | ||
fn get_any(&self, chunk: Chunk) -> ChunkState { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get_any
is not really descriptive of what it's doing. Should be something like get_internal
or get_private
I feel like.
/// A range of all chunks in the heap. | ||
pub fn all_chunks(&self) -> impl Iterator<Item = Chunk> + '_ { | ||
let chunk_range = self.chunk_range.lock(); | ||
RegionIterator::<Chunk>::new(chunk_range.start, chunk_range.end) | ||
.filter(|c| self.get(*c).is_some()) | ||
} | ||
|
||
/// A range of all chunks in the heap. | ||
pub fn all_chunks(&self) -> RegionIterator<Chunk> { | ||
pub fn all_allocated_chunks(&self) -> impl Iterator<Item = Chunk> + '_ { | ||
let chunk_range = self.chunk_range.lock(); | ||
RegionIterator::<Chunk>::new(chunk_range.start, chunk_range.end) | ||
.filter(|c| self.get(*c).is_some_and(|state| state.is_allocated())) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really understand why we need this. The issue here is that when a chunk is freed, we still associate a space index with it. But why is that the case? We can just have the 0x0
space descriptor associated with it (I think 0x0
is reserved for the empty space). This would mean we don't need to have two functions for iterating through chunks for a space since we will automatically skip the free ones.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am just making sure all_chunks
does the same thing as before, which lists all the chunks for the space no matter it is allocated or free.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just did a grep of where the all_chunks
function is used. Most of the times it's used as if all_allocated_chunks
. The only times it's used as actually all chunks regardless of allocation state is when we are iterating through the chunks to bulk clear/set metadata bits. But I think it doesn't make sense to do that for free chunks since that (i.e. bulk clearing/setting metadata) should happen when we either free or allocate a (new) chunk.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also I don't think the above is a good idea anyway. We shouldn't be going through free chunks when clearing/setting metadata. It should almost always be the allocated chunks. Do you have a use-case in mind where it makes sense to have a space associated with a free chunk?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure. I will try replace the current use of all_chunks
to all_allocated_chunks
and see if it works.
} | ||
|
||
/// A byte-map to record all the allocated chunks. | ||
/// A plan can use this to maintain records for the chunks that they used, and the states of the chunks. | ||
/// Any plan that uses the chunk map should include the `ALLOC_TABLE` spec in their local sidemetadata specs | ||
/// Any plan that uses the chunk map should include the `ALLOC_TABLE` spec in their local sidemetadata specs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// Any plan that uses the chunk map should include the `ALLOC_TABLE` spec in their local sidemetadata specs. | |
/// Any plan that uses the chunk map should include the `ALLOC_TABLE` spec in their global sidemetadata specs. |
@@ -60,6 +60,8 @@ define_side_metadata_specs!( | |||
MS_ACTIVE_CHUNK = (global: true, log_num_of_bits: 3, log_bytes_in_region: LOG_BYTES_IN_CHUNK), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my branch I did a quick fix and didn't bother to coalesce MS_ACTIVE_CHUNK
and CHUNK_MARK
but I think we should for this PR. It's not useful to have the same metadata under different names (they basically are the same metadata).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MS_ACTIVE_CHUNK
only cares about chunks with malloc in them. But with this PR, we should be able to distinguish between a chunk mapped by the non-moving immix space and the malloc space (for example). I feel like MS_ACTIVE_CHUNK
can be subsumed by CHUNK_MARK
now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be possible. We could simply replace MS_ACTIVE_CHUNK
in malloc mark sweep to CHUNK_MARK
(which might be an issue if we have other spaces that use ChunkMap
). Or we can start using ChunkMap
in malloc mark sweep, but this may not be an easy refactoring. Either way, I will give it a try after this PR.
@@ -60,6 +60,8 @@ define_side_metadata_specs!( | |||
MS_ACTIVE_CHUNK = (global: true, log_num_of_bits: 3, log_bytes_in_region: LOG_BYTES_IN_CHUNK), | |||
// Track the index in SFT map for a chunk (only used for SFT sparse chunk map) | |||
SFT_DENSE_CHUNK_MAP_INDEX = (global: true, log_num_of_bits: 3, log_bytes_in_region: LOG_BYTES_IN_CHUNK), | |||
// Mark chunks (any plan that uses the chunk map should include this spec in their local sidemetadata specs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Mark chunks (any plan that uses the chunk map should include this spec in their local sidemetadata specs) | |
// Mark chunks (any plan that uses the chunk map should include this spec in their global side-metadata specs) |
I have the same concern as @k-sareen's. But I'd like to make clear what purpose
The second purpose can be supported by The first purpose, however, cannot always be supported by If we make use of
I did some Git archaeology into the JikesRVM repo. @steveblackburn developed its predecessor |
After this PR, So I think it should be sound to simply record the space ID in each And if we can use both |
I think I can answer this. In my branch,
The old |
The chunk map records the lowest chunk and the highest chunk used by a space, and accesses all the chunk states (local side metadata) in the range. There are multiple issues when we use discontiguous spaces.
usedunused, thus the local side metadata may be unmapped.This PR makes the chunk state side metadata as a global side metadata. We need the chunk map for plans that use the chunk map. As we will use mark sweep or nonmoving immix as the non moving policy, they both use the chunk map, and the non moving policy is included in every plan, we just include the side metadata for every plan.
This PR fixes #1197.