-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gpu: miopen: pooling: Fixes strides in 1D pooling #1812
Conversation
@@ -86,9 +86,13 @@ struct miopen_pooling_impl_base_t { | |||
// [n, c, w, 1] | |||
dims_[src][3] = dims_[src][2]; | |||
dims_[src][2] = 1; | |||
strides_[src][2] = dims_[src][3]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you see some test cases failed because of this issue?
We have the same logic in the cuDNN based implementation, do you see any failed cases when it's used?
https://github.com/oneapi-src/oneDNN/blob/de69d44024ab4f64b20deb7aa066a65c867f1123/src/gpu/nvidia/cudnn_pooling_impl.hpp#L99-L112
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Notably the test case --pool --engine=gpu ic64iw32ow16kw3sw2pw0
fails in benchdnn for AMD and not for NVIDIA
AMD output:
benchdnn --pool --engine=gpu ic64iw32ow16kw3sw2pw0
MIOpen Error: /long_pathname_so_that_rpms_can_package_the_debug_info/src/extlibs/MLOpen/src/hipoc/hipoc_kernel.cpp:104: Failed to launch kernel: invalid argument
onednn_verbose,info,oneDNN v3.5.0 (commit 5ce1799d497a97d374ed9195fe6d45a808e72279)
onednn_verbose,info,cpu,runtime:DPC++,nthr:2
onednn_verbose,info,cpu,isa:Intel AVX2
onednn_verbose,info,gpu,runtime:DPC++
onednn_verbose,info,cpu,engine,0,backend:OpenCL,name:AMD EPYC 7402 24-Core Processor ,driver_version:2023.16.12,binary_kernels:disabled
onednn_verbose,info,gpu,engine,0,backend:AMD,name:AMD Instinct MI210,driver_version:0.0.0,binary_kernels:disabled
onednn_verbose,primitive,info,template:operation,engine,primitive,implementation,prop_kind,memory_descriptors,attributes,auxiliary,problem_desc,exec_time
terminate called after throwing an instance of 'dnnl::impl::gpu::amd::miopen_error'
what(): At :/home/dangus/repos/oneDNN/src/gpu/amd/miopen_pooling_impl.hpp : 202miopenPoolingForward : miopenStatusUnknownError
Aborted
NVIDIA output:
enchdnn --pool --engine=gpu ic64iw32ow16kw3sw2pw0
0:PASSED __REPRO: --pool --engine=gpu ic64iw32ow16kw3sw2pw0
tests:1 passed:1 skipped:0 mistrusted:0 unimplemented:0 invalid_arguments:0 failed:0 listed:0
total: 1.99s; fill: 0.66s (33%); compute_ref: 0.00s (0%); compare: 0.00s (0%);
Build command
cmake .. -DCMAKE_BUILD_TYPE=Debug -DDNNL_CPU_RUNTIME=DPCPP -DDNNL_GPU_RUNTIME=DPCPP -DDNNL_GPU_VENDOR=<AMD/NVIDIA> -DONEDNN_BUILD_GRAPH=OFF
Tested on a Mi210 for AMD and A100 for NVIDIA
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My guess is that MIOpen takes strides for descriptors creation, whereas cuDNN takes formats.
I would be curious what format gets passed there though.
Is there anything else needed, or can this be merged? @densamoilov |
Description
Currently, when performing 1D pooling in the AMD backend, the src/dst tensor strides are set as
[c*w, w, 1, 0]
which is incorrect because the dimensions are set to be[n, c, 1, w]
. Thus, this MR updates the strides to the correct format of[c*w, w, w, 1]
.