cpu: aarch64: Enable stateless ACL LayerNorm #2804
base: main
Could you post the benchdnn line you used to test your lnorm changes, along with performance numbers and oneDNN verbose output for before and after, please? Thanks.
Please squash your commits as well. The first one doesn't build without the fixes in the second.
Without the change:
OMP_NUM_THREADS=1 ONEDNN_VERBOSE=all ./benchdnn --lnorm --dir=FWD_I --dt=f32:s8 --tag=axb 256x768_n"lnorm_ci_0d:2" --mode=P
With the change:
OMP_NUM_THREADS=1 ONEDNN_VERBOSE=all ./benchdnn --lnorm --dir=FWD_I --dt=f32:s8 --tag=axb 256x768_n"lnorm_ci_0d:2" --mode=P
Done.
Could you show the results for |
Perf numbers without the change:
OMP_NUM_THREADS=16 ./benchdnn --lnorm --mode=P --dir=FWD_I --dt=f32:s8 --tag=axb 257x768_n"lnorm_ci_0d:2"
Perf numbers with the change:
OMP_NUM_THREADS=16 ./benchdnn --lnorm --mode=P --dir=FWD_I --dt=f32:s8 --tag=axb 257x768_n"lnorm_ci_0d:2"
5d12bcc to 0a4d13a
Perf looks good to me.
Thanks for the patch. In addition to my review comments, can we take this opportunity to move all function definitions into the .cpp file, please? Thank you.
Done.
Description
Make layernorm op use stateless ACL interface.
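As a rough illustration of what "stateless" means here, the sketch below shows a layer normalization object that keeps only immutable configuration and receives its tensors on every call. All names (`layer_norm_config`, `stateless_layer_norm`, `run`) are hypothetical and chosen for this example; this is not the ACL or oneDNN API, and it omits scale, shift, and the s8 output conversion used in the benchmarks above.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical illustration only: these types are NOT part of ACL or oneDNN.
struct layer_norm_config {
    std::size_t channels; // size of the normalized (innermost) dimension
    float epsilon;        // added to the variance for numerical stability
};

// "Stateless" in the sense that the object stores only immutable
// configuration. Source and destination buffers are supplied per call,
// so one instance can be shared by concurrent callers without locking.
class stateless_layer_norm {
public:
    explicit stateless_layer_norm(layer_norm_config cfg) : cfg_(cfg) {}

    // const member function: no member is modified during execution.
    void run(const float *src, float *dst, std::size_t rows) const {
        assert(src != nullptr && dst != nullptr);
        const std::size_t c = cfg_.channels;
        for (std::size_t r = 0; r < rows; ++r) {
            const float *x = src + r * c;
            float *y = dst + r * c;

            // Mean over the normalized dimension.
            float mean = 0.0f;
            for (std::size_t i = 0; i < c; ++i) mean += x[i];
            mean /= static_cast<float>(c);

            // Biased variance over the same dimension.
            float var = 0.0f;
            for (std::size_t i = 0; i < c; ++i) {
                const float d = x[i] - mean;
                var += d * d;
            }
            var /= static_cast<float>(c);

            const float inv_std = 1.0f / std::sqrt(var + cfg_.epsilon);
            for (std::size_t i = 0; i < c; ++i)
                y[i] = (x[i] - mean) * inv_std;
        }
    }

private:
    const layer_norm_config cfg_;
};
```

A caller would construct the object once, for example with `channels = 768`, and then invoke `run` with different buffers from any thread. A stateful design would instead bind the tensors to the object at configuration time, which generally forces either one object per thread or a lock around execution.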
Checklist
General
Do all unit and benchdnn tests (make test and make test_benchdnn_*) pass locally for each commit?