[WIP][GPU] Expand matmul decomp cases #2916
});

add_mode_matches(fpmath_bf16, [](Type dt) -> const char * {
    if (dt.isInt8() || dt.isInt4()) return "B";
This is invalid; it will just lead to out-of-register issues like the ones fixed in #2467, as kernels optimized for `B` may not have enough spare registers for upconversion. If some of our kernel strategies on `B` are functional for `[OB]`, then they should just be marked as such.
Additionally, we should not need multiple calls to `add_mode_matches`; instead, we should get
add_mode_matches(fpmath_bf16, [](Type dt) -> const char * {
    if (dt == Type::f32) { return "[SB]"; }
    if (dt.isInt8() || dt.isInt4()) return "[OB]";
    if (dt.isF8()) return "B"; // This seems invalid and could lead to out-of-register issues; need to determine the appropriate specifier for GEMM.
});
I see what you mean, but we need to strike a balance between supporting only specifically optimized cases and providing out-of-the-box functionality. There is an issue with the current approach: the selection process has no fallback from "[OB]" type strategies to "[B]" type strategies.
I suggest we take the opposite approach and mark strategies that will not tolerate upconversion. Optimized strategies with "[OB]" tags should be preferred on a performance basis when they are usable.
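The preference order proposed here can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual selection code: `Strategy`, its two flags, and `select_strategy` are hypothetical names, and the flags stand in for the "[OB]" tag and for a marker meaning "does not tolerate upconversion".

```cpp
#include <string>
#include <vector>

// Hypothetical model of a kernel strategy entry; the real catalog differs.
struct Strategy {
    std::string name;
    bool optimized_for_int;      // stands in for an "[OB]"-style tag
    bool tolerates_upconversion; // false models a strategy marked as unable to upconvert
};

// Returns the preferred strategy for an integer-input problem, or nullptr if
// none is usable. Optimized entries win; otherwise the first entry that
// tolerates upconversion is used as a fallback.
const Strategy *select_strategy(const std::vector<Strategy> &catalog) {
    const Strategy *fallback = nullptr;
    for (const auto &s : catalog) {
        if (s.optimized_for_int) return &s;
        if (!fallback && s.tolerates_upconversion) fallback = &s;
    }
    return fallback;
}
```

With this shape, a catalog that contains no tuned entry still yields a functional strategy, while entries marked as intolerant of upconversion are never picked as a fallback.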
> I suggest we take the opposite approach and mark strategies that will not tolerate up conversion.

This is what we are doing already; it is just that `B` is how we mark that the kernel will not support upconversion. 😉
Refactored: renamed all `"O", "H", "S"` strategies to `"[OH]", "H", "S"` to conform with the naming convention.
… on it Case with different mask is not supported if only both scales were specified.
Temporary "const char *" objects can disappear while getting to the parser internals. Moving strings to parse into a permanent container solves the problem.
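The lifetime issue described in this commit message can be illustrated with a small sketch. It is not the code from the PR: `StrategyStore` and `intern` are hypothetical names. The idea is to copy each string into a container that owns it, so the `const char *` handed to a parser stays valid. A `std::deque` is assumed here because its `push_back` does not invalidate references to existing elements, whereas a `std::vector` may move its strings on reallocation.

```cpp
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical helper, not part of oneDNN: owns copies of strings so that
// the returned pointers outlive the temporaries they were built from.
class StrategyStore {
public:
    // Copies `s` into permanent storage and returns a pointer that stays
    // valid for as long as the store itself is alive.
    const char *intern(const std::string &s) {
        storage_.push_back(s);
        return storage_.back().c_str();
    }

    std::size_t size() const { return storage_.size(); }

private:
    // push_back on a deque never invalidates references to existing elements.
    std::deque<std::string> storage_;
};
```

A caller would pass `store.intern(prefix + suffix)` to the parser instead of `(prefix + suffix).c_str()`, whose buffer is destroyed at the end of the full expression.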
make test
Description
Expands the cases where inputs are decompressed to include integer activations as well as weights. Integer weights are still required, but when an fpmath setting is present, activations will also be upconverted.
This means that cases intending to use integer accumulation must not supply e.g. `attr-fpmath=f16:true`, as all such cases must be upconverted per https://jira.devtools.intel.com/browse/MFDNN-13380.
Fixes # (github issue)
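The rule stated above can be summarized in a short sketch. It reflects one reading of the description rather than the implementation: `DecompCase` and both predicates are hypothetical, and `fpmath_applies_to_int` is assumed to model the `:true` part of `attr-fpmath=f16:true`.

```cpp
// Hypothetical summary of the rule in the description; not oneDNN code.
struct DecompCase {
    bool weights_are_int;
    bool activations_are_int;
    bool fpmath_applies_to_int; // assumed meaning of ":true" in attr-fpmath=f16:true
};

// Integer activations are upconverted only when integer weights are present
// and the fpmath setting applies to integer inputs.
bool activations_upconverted(const DecompCase &c) {
    return c.weights_are_int && c.activations_are_int && c.fpmath_applies_to_int;
}

// Integer accumulation remains possible only when the fpmath setting does
// not apply to integer inputs.
bool integer_accumulation_possible(const DecompCase &c) {
    return c.weights_are_int && c.activations_are_int && !c.fpmath_applies_to_int;
}
```

Under this reading the two predicates are mutually exclusive, which is why a case that wants integer accumulation has to drop the fpmath attribute.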
Checklist
General
Do all unit and benchdnn tests (`make test` and `make test_benchdnn_*`) pass locally for each commit?