-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
src: cpu: x64: jit_brdgmm_dw_conw: make brdgmm support fusing sum op #2337
Conversation
5d7c753
to
d4a0ca9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
It would be nice to see the following:
|
Got it, thanks for the clarification |
before
after
|
Thanks for providing the logs, I think I see what's going on now.
but once it's specified, even to the actual
It looks quite unexpected to me, but it's probably not your war, so I'm fine to pass this change. Hope you verified it's actually faster with brdgmm because the verbose from you doesn't show it is (though it's a first run, but still). @mgouicem, we have a debate to talk through. |
d4a0ca9
to
8924b67
Compare
Brdgmm kernel could not be selected when using the --attr-post-ops=sum flag. jit_dw was used instead.
This PR makes it possible to use brdgmm kernel with sum post-op
The changes look trivial, In fact, there was already support, but brdgmm could not be dispatched because the function(has_default_values) signature was updated a long time ago but the function call was not updated. As a result, data_type_undef was always used inside, which prevented the brdgmm kernel from being selected