Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gpu: miopen: pooling: Fixes strides in 1D pooling #1812

Merged
merged 1 commit into from
Mar 13, 2024

Conversation

dylanangus
Copy link

Description

Currently, when performing 1D pooling in the AMD backend, the src/dst tensor strides are set as [c*w, w, 1, 0] which is incorrect because the dimensions are set to be [n, c, 1, w]. Thus, this MR updates the strides to the correct format of [c*w, w, w, 1].

@vpirogov vpirogov added this to the v3.5 milestone Feb 28, 2024
@@ -86,9 +86,13 @@ struct miopen_pooling_impl_base_t {
// [n, c, w, 1]
dims_[src][3] = dims_[src][2];
dims_[src][2] = 1;
strides_[src][2] = dims_[src][3];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you see some test cases failed because of this issue?

We have the same logic in the cuDNN based implementation, do you see any failed cases when it's used?
https://github.com/oneapi-src/oneDNN/blob/de69d44024ab4f64b20deb7aa066a65c867f1123/src/gpu/nvidia/cudnn_pooling_impl.hpp#L99-L112

Copy link
Author

@dylanangus dylanangus Mar 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Notably the test case --pool --engine=gpu ic64iw32ow16kw3sw2pw0 fails in benchdnn for AMD and not for NVIDIA

AMD output:

benchdnn --pool --engine=gpu ic64iw32ow16kw3sw2pw0
MIOpen Error: /long_pathname_so_that_rpms_can_package_the_debug_info/src/extlibs/MLOpen/src/hipoc/hipoc_kernel.cpp:104: Failed to launch kernel: invalid argument
onednn_verbose,info,oneDNN v3.5.0 (commit 5ce1799d497a97d374ed9195fe6d45a808e72279)
onednn_verbose,info,cpu,runtime:DPC++,nthr:2
onednn_verbose,info,cpu,isa:Intel AVX2
onednn_verbose,info,gpu,runtime:DPC++
onednn_verbose,info,cpu,engine,0,backend:OpenCL,name:AMD EPYC 7402 24-Core Processor                ,driver_version:2023.16.12,binary_kernels:disabled
onednn_verbose,info,gpu,engine,0,backend:AMD,name:AMD Instinct MI210,driver_version:0.0.0,binary_kernels:disabled
onednn_verbose,primitive,info,template:operation,engine,primitive,implementation,prop_kind,memory_descriptors,attributes,auxiliary,problem_desc,exec_time
terminate called after throwing an instance of 'dnnl::impl::gpu::amd::miopen_error'
  what():  At :/home/dangus/repos/oneDNN/src/gpu/amd/miopen_pooling_impl.hpp : 202miopenPoolingForward : miopenStatusUnknownError
Aborted

NVIDIA output:

enchdnn --pool --engine=gpu ic64iw32ow16kw3sw2pw0
0:PASSED __REPRO: --pool --engine=gpu ic64iw32ow16kw3sw2pw0
tests:1 passed:1 skipped:0 mistrusted:0 unimplemented:0 invalid_arguments:0 failed:0 listed:0
total: 1.99s; fill: 0.66s (33%); compute_ref: 0.00s (0%); compare: 0.00s (0%);

Build command

cmake .. -DCMAKE_BUILD_TYPE=Debug -DDNNL_CPU_RUNTIME=DPCPP -DDNNL_GPU_RUNTIME=DPCPP -DDNNL_GPU_VENDOR=<AMD/NVIDIA> -DONEDNN_BUILD_GRAPH=OFF

Tested on a Mi210 for AMD and A100 for NVIDIA

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My guess is that MIOpen takes strides for descriptors creation, whereas cuDNN takes formats.

I would be curious what format gets passed there though.

@dylanangus
Copy link
Author

Is there anything else needed, or can this be merged? @densamoilov

@dylanangus dylanangus merged commit 69c90fc into uxlfoundation:main Mar 13, 2024
10 checks passed
@vpirogov vpirogov added the platform:gpu-amd Codeowner: @oneapi-src/onednn-gpu-amd label May 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
platform:gpu-amd Codeowner: @oneapi-src/onednn-gpu-amd
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants