-
Notifications
You must be signed in to change notification settings - Fork 7.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Decouple memory allocations from aligned memory allocations #87003
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one question/comment about alignment in z_heap_aligned_alloc(). Everything else looks good to me.
__ASSERT((align & (align - 1)) == 0, | ||
"align must be a power of 2"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need a minimum alignment? I note that if align is 0, this assert still passes. I think that lib/heap/heap.c uses a minimum alignment of 1 and k_aligned_alloc() had a minimum aligment of sizeof(void *).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minimum alignment from heap.c is 4 bytes on 32-bits archs and 8 bytes on 64-bits archs already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems very reasonable. Doesn't look like the impact on code size should be very high, but might be worth measuring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. Just one suggestion, but not a blocker (use IS_POWER_OF_TWO()
).
Re: measurements, I was curious about the size difference and as well, and also a little curious about timing. If there are fewer conditions on a potentially hot path like this, there could be some perf gain.
Would be really great to see some memory allocation benchmarks down the road.
Due to popular demand, here's some code size comparisons using the tests/lib/heap_align:
tests/kernel/mem_heap/k_heap_api:
samples/userspace/hello_world_user:
So... it seems there isn't much difference as code size goes. If anything, |
78b7c1a
And here's some perf numbers using qemu_riscv64 which I hope can be Before malloc changes:
With the whole series applied:
In short, |
Useful to evaluate malloc performance changes. Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Let's avoid division and modulus operations as they're costly... and even more so when they're unnecessary as in this case the main constraint is about the alignment being a power of 2 which is a very small subset of sizeof(void *) multiples. Then move the assertion to common code for wider coverage. Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
When k_heap_alloc() is expressed in terms of k_heap_aligned_alloc() it invokes a longer aligned allocation code path with an extra runtime overhead even though no alignment is necessary. Let's reference and invoke the aligned allocation code path only when an actual aligned allocation is requested. This opens the possibility for the linker to garbage-collect the aligning code otherwise. Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
In .../modules/debug/percepio/TraceRecorder/kernelports/Zephyr/include/\ tracing_tracerecorder.h there is a concealed non-parameterized direct reference to a local variable that is no longer in scope. Provide a dummy stub for compilation to succeed until that module's layering violation is fixed, after which this could be reverted. Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
When k_malloc() is expressed in terms of k_aligned_alloc() it invokes a longer aligned allocation code path with an extra runtime overhead even though no alignment is necessary. Let's reference and invoke the aligned allocation code path only when an actual aligned allocation is requested. This opens the possibility for the linker to garbage-collect the aligning code otherwise. Also bypass k_heap_malloc() and friends given they're invoked with K_NO_WAIT. Go directly to sys_heap_*() instead to cut some more unneeded overhead. Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
When sys_heap_realloc() is expressed in terms of sys_heap_aligned_realloc() it invokes a longer aligned allocation code path with an extra runtime overhead even though no alignment is necessary. Let's reference and invoke the aligned allocation code path only when an actual aligned allocation is requested. This opens the possibility for the linker to garbage-collect the aligning code otherwise. Improve realloc documentation while at it. Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
@andyross, @peter-mitsis: ping |
This is no April Fools' prank: assignees' (re)approval is still needed. ;-) |
When a simple memory allocation is accomplished in terms of an aligned memory
allocation, a longer aligned allocation code path with an extra runtime
overhead is involved even though no alignment is necessary.
Let's rework the code so the aligned allocation code path is invoked only
when an actual aligned allocation is requested. This opens the possibility
for the linker to garbage-collect the aligning code otherwise.