Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GPU] Shapeless convolution: add non-group specialization #2962

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

echeresh
Copy link
Contributor

Found some issues while looking at performance data from MFDNN-13246.

List of changes:

  • Added g1 specialization - to substitute g = 1 (one group). This results in more broad usage of block 2D based kernels
  • Fixed Stream-K batch offsets (found while running the Jira shapes)
  • Minor refactoring to reuse common macros for bitmask enums

@echeresh echeresh added the platform:gpu-intel Codeowner: @oneapi-src/onednn-gpu-intel label Mar 26, 2025
@echeresh echeresh requested a review from a team as a code owner March 26, 2025 19:32
@echeresh echeresh changed the title [GPU] Shapeless convolution: add non-group specializaiton [GPU] Shapeless convolution: add non-group specialization Mar 26, 2025
@echeresh
Copy link
Contributor Author

make test
set test_scope=NIGHTLY
disable test_device_cpu
disable benchdnn_all
enable benchdnn_conv
enable benchdnn_deconv
enable benchdnn_matmul
enable arch_gpu_xe-hpc
enable arch_gpu_xe-hpg-atsm
enable arch_gpu_xe-hpg-dg2
enable arch_gpu_xe-lp
enable arch_gpu_xe-lpg
enable arch_gpu_xe-lpg+
enable arch_gpu_xe2-hpg-bmg
enable arch_gpu_xe2-lpg
enable arch_gpu_xe3-lpg

@echeresh echeresh force-pushed the echeresh/fix-v2 branch 2 times, most recently from d249e79 to 791f007 Compare March 28, 2025 23:38
@echeresh
Copy link
Contributor Author

make test
set test_scope=NIGHTLY
disable test_device_cpu
disable benchdnn_all
enable benchdnn_conv
enable benchdnn_deconv
enable arch_gpu_xe-hpc
enable arch_gpu_xe-hpg-atsm
enable arch_gpu_xe-hpg-dg2
enable arch_gpu_xe-lp
enable arch_gpu_xe-lpg
enable arch_gpu_xe-lpg+
enable arch_gpu_xe2-hpg-bmg
enable arch_gpu_xe2-lpg
enable arch_gpu_xe3-lpg

@echeresh
Copy link
Contributor Author

echeresh commented Apr 2, 2025

make test
set test_scope=NIGHTLY
disable test_device_cpu
disable benchdnn_all
enable benchdnn_conv
enable benchdnn_deconv
enable arch_gpu_xe-hpc
enable arch_gpu_xe-hpg-atsm
enable arch_gpu_xe-hpg-dg2
enable arch_gpu_xe-lp
enable arch_gpu_xe-lpg
enable arch_gpu_xe-lpg+
enable arch_gpu_xe2-hpg-bmg
enable arch_gpu_xe2-lpg
enable arch_gpu_xe3-lpg

@echeresh
Copy link
Contributor Author

echeresh commented Apr 2, 2025

make test
set test_scope=NIGHTLY
disable test_device_cpu
disable benchdnn_all
enable benchdnn_conv
enable benchdnn_deconv
enable benchdnn_matmul
enable arch_gpu_xe-hpc
enable arch_gpu_xe-hpg-atsm
enable arch_gpu_xe-hpg-dg2
enable arch_gpu_xe-lp
enable arch_gpu_xe-lpg
enable arch_gpu_xe-lpg+
enable arch_gpu_xe2-hpg-bmg
enable arch_gpu_xe2-lpg
enable arch_gpu_xe3-lpg

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
platform:gpu-intel Codeowner: @oneapi-src/onednn-gpu-intel
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants