graph: introduce internal dnnl_sdpa op #2930

Draft: wants to merge 3 commits into main

xiang1guo (Contributor) commented Mar 21, 2025

Description

  • This is the first part of the work to support the GQA pattern refinement requested by PyTorch upstream, and it addresses MFDNN-12871. The PR adds a new internal dnnl_sdpa op for better fusion and closer alignment with the sdpa primitive. With this op, internal compilation transforms a subgraph into a single sdpa op; see the following picture.
  • The refactor also aims to reduce graph compilation time: a simplified internal sdpa op cuts the time spent on layout propagation and memory planning.

image
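For readers without the picture, the computation that the fused op stands for can be sketched as a plain reference implementation. This is a minimal single-head float sketch, not the oneDNN implementation: the `Matrix` helper and `sdpa_reference` name are hypothetical, the 1/sqrt(d) scale is the conventional choice rather than something this PR states, and masking is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper for this sketch: a row-major float matrix.
struct Matrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<float> data;

    Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0f) {}

    float &at(std::size_t i, std::size_t j) { return data[i * cols + j]; }
    float at(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Reference SDPA for one head: softmax(Q * K^T * scale) * V.
// Assumption: scale = 1 / sqrt(head_dim); no attention mask is applied.
Matrix sdpa_reference(const Matrix &q, const Matrix &k, const Matrix &v) {
    assert(q.cols == k.cols && k.rows == v.rows);
    const float scale = 1.0f / std::sqrt(static_cast<float>(q.cols));
    Matrix out(q.rows, v.cols);
    std::vector<float> scores(k.rows);

    for (std::size_t i = 0; i < q.rows; ++i) {
        // Scaled dot products of query row i against every key row.
        float max_score = std::numeric_limits<float>::lowest();
        for (std::size_t j = 0; j < k.rows; ++j) {
            float dot = 0.0f;
            for (std::size_t d = 0; d < q.cols; ++d) dot += q.at(i, d) * k.at(j, d);
            scores[j] = dot * scale;
            max_score = std::max(max_score, scores[j]);
        }
        // Numerically stable softmax: subtract the row maximum before exp.
        float denom = 0.0f;
        for (std::size_t j = 0; j < k.rows; ++j) {
            scores[j] = std::exp(scores[j] - max_score);
            denom += scores[j];
        }
        // Weighted sum of value rows, normalized by the softmax denominator.
        for (std::size_t c = 0; c < v.cols; ++c) {
            float acc = 0.0f;
            for (std::size_t j = 0; j < k.rows; ++j) acc += scores[j] * v.at(j, c);
            out.at(i, c) = acc / denom;
        }
    }
    return out;
}
```

Expressing the whole chain as one op is what lets the compiler skip per-op layout propagation and memory planning for the intermediate score tensor.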

Works

  • Added a dnnl_sdpa op. Currently, only float SDPA is supported.
  • Added a new sdp_primitive_v1 kernel for simplicity. The final goal is to merge this kernel with the sdp_primitive kernel.
  • Moved the sdpa primitive ukernel creation process into op_executable, which aligns it with how other kernels work.

Follow-up

There will be another PR to refine the GQA pattern based on this new internal dnnl_sdpa.

Validation

A total of 66 cases can be supported by the GPU ukernel. The 42 float SDPA cases now run into sdp_primitive_v1_kernel_t; the other 24 cases are quantized SDPA, which all run into sdp_primitive_kernel_t.

Kernel                       PR    main branch
sdp_primitive_v1_kernel_t    42    0
sdp_primitive_kernel_t       24    66
larger_partition_kernel_t    76    76
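The split in the PR column can be read as a simple dispatch rule. The sketch below is an illustration inferred from the table, not the actual selection code in oneDNN: `PatternKind`, `Case`, `select_kernel`, and `tally` are hypothetical names, and the assumption that every case outside the ukernel-supported set falls back to larger_partition_kernel_t is mine.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical classification of a validation case.
enum class PatternKind { FloatSdpa, QuantizedSdpa, Other };

struct Case {
    PatternKind kind;
    bool ukernel_supported;  // assumption: whether the GPU ukernel can handle it
};

// Assumed dispatch rule for the PR branch, inferred from the table above.
std::string select_kernel(const Case &c) {
    if (c.ukernel_supported && c.kind == PatternKind::FloatSdpa)
        return "sdp_primitive_v1_kernel_t";
    if (c.ukernel_supported && c.kind == PatternKind::QuantizedSdpa)
        return "sdp_primitive_kernel_t";
    return "larger_partition_kernel_t";  // assumed fallback for everything else
}

// Count how many cases land on each kernel.
std::map<std::string, int> tally(const std::vector<Case> &cases) {
    std::map<std::string, int> counts;
    for (const Case &c : cases) ++counts[select_kernel(c)];
    return counts;
}
```

Feeding it 42 supported float cases, 24 supported quantized cases, and 76 unsupported cases reproduces the PR column; on the main branch all 66 ukernel-supported cases go to sdp_primitive_kernel_t instead.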

Performance test is WIP.

@github-actions github-actions bot added the component:graph-api Codeowner: @oneapi-src/onednn-graph label Mar 21, 2025
@xiang1guo xiang1guo self-assigned this Mar 21, 2025