graph: fix the intermediate data types in SDPA patterns #2894
base: main
Conversation
```diff
@@ -52,7 +52,6 @@ Graph operations support bf16 and f16 data types.
 
 A TypeCast operation performing down conversion should be inserted clearly to
 indicate the use of low numeric precision. oneDNN Graph implementation fully
-honors the API-specified numeric precision and only performs the computation
-using the API-specified or higher numeric precision.
+honors the API-specified numeric precision.
```
Just to make sure we are aligned: this still allows using f32 values to store f16/bf16 data, as long as we respect rounding to f16/bf16 accuracy, right?
Yes, in my understanding that is still allowed for backend implementations. From this perspective, it seems I need to keep the original statement. My intention here was to align the implementations, as the original statement sounds as if different backends (e.g. DNNL & GC, CPU & GPU) could have different numerical behaviors.
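As a minimal, hypothetical sketch of that reading (plain C++, not oneDNN code): an implementation may hold the intermediate in an f32 buffer as long as every stored value is rounded back to the API-specified (here bf16) accuracy.

```cpp
// Conceptual illustration only: "storing bf16 data in f32" while still
// respecting bf16 rounding. NaN/Inf handling is omitted for brevity.
#include <cstdint>
#include <cstdio>
#include <cstring>

// Round an f32 value to bf16 accuracy (round-to-nearest-even on the
// discarded 16 mantissa bits) and return it widened back to f32.
static float round_to_bf16_accuracy(float x) {
    uint32_t bits;
    std::memcpy(&bits, &x, sizeof(bits));
    bits += 0x7FFFu + ((bits >> 16) & 1u); // round to nearest even
    bits &= 0xFFFF0000u;                   // keep the bf16-representable part
    float y;
    std::memcpy(&y, &bits, sizeof(y));
    return y;
}

int main() {
    // The arithmetic and storage use f32, but each stored intermediate is
    // snapped back to bf16 accuracy, which is the "rounding" discussed above.
    float acc = round_to_bf16_accuracy(1.0f / 3.0f);
    acc = round_to_bf16_accuracy(acc + round_to_bf16_accuracy(0.1f));
    std::printf("bf16-accurate intermediate held in f32: %.8f\n", acc);
    return 0;
}
```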
```cpp
.set_type_constraints(
        "T2", {data_type::f32, data_type::bf16, data_type::f16})
.set_type_constraints(
        "T3", {data_type::f32, data_type::bf16, data_type::f16})
```
This requires some documentation about type promotion, as users might wonder what happens, for example, with f16 <- f16 + bf16.
In fact, we don't allow f16 + bf16. It's mentioned in the "supported data types" section in the op document. When src0 and src1 have different data types, one of them should be f32 and the other one (f16 or bf16) will be promoted to f32 for calculation.
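A small, hypothetical sketch of that promotion rule (illustrative names only, not the oneDNN implementation):

```cpp
// Hypothetical helper mirroring the rule described above: same types pass
// through, f32 + f16/bf16 promotes to f32, and f16 + bf16 is rejected.
#include <cstdio>
#include <optional>

enum class data_type { f32, bf16, f16 };

std::optional<data_type> promoted_type(data_type src0, data_type src1) {
    if (src0 == src1) return src0; // same type: no promotion needed
    if (src0 == data_type::f32 || src1 == data_type::f32)
        return data_type::f32;     // the f16/bf16 side is promoted to f32
    return std::nullopt;           // mixed f16 + bf16: not allowed
}

int main() {
    std::printf("f32 + f16  -> %s\n",
            promoted_type(data_type::f32, data_type::f16) ? "computed in f32"
                                                          : "unsupported");
    std::printf("f16 + bf16 -> %s\n",
            promoted_type(data_type::f16, data_type::bf16) ? "ok"
                                                           : "unsupported");
    return 0;
}
```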
LGTM. Moving approval responsibility to @mgouicem. :)
Address MFDNN-13091
The purpose is to align the intermediate data types in the pattern with those in ATen SDPA and GPU ukernel SDPA.
With these changes, a typical f16 SDPA that can be dispatched to the GPU ukernel SDPA now looks like the following:
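(The pattern diagram itself is not reproduced here. As a rough, hypothetical illustration only, and not the exact pattern from this PR, an ATen-style mixed-precision f16 SDPA keeps the score/softmax intermediates in f32 and casts back to f16 before the second MatMul:)

```cpp
// Hypothetical op chain with illustrative names; the intermediate data types
// follow the ATen-style convention described above, not this PR's diagram.
#include <cstdio>
#include <string>
#include <vector>

struct node { std::string op; std::string out_dt; };

int main() {
    // Q, K, V inputs are f16; score and softmax intermediates stay in f32.
    const std::vector<node> sdpa = {
            {"MatMul (Q x K^T)", "f32"},
            {"Scale", "f32"},
            {"Add (attention mask)", "f32"},
            {"SoftMax", "f32"},
            {"TypeCast (down-convert)", "f16"},
            {"MatMul (x V)", "f16"},
    };
    for (const auto &n : sdpa)
        std::printf("%-26s -> %s\n", n.op.c_str(), n.out_dt.c_str());
    return 0;
}
```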