common, xe: fix missing -cl-intel-greater-than-4GB-buffer-required OpenCL flag #2718

Merged
rjoursler merged 2 commits into main from rjoursle/fix_concat
Feb 28, 2025

Conversation

rjoursler
Contributor

When offset0 is set, the maximum offset addressed by an OpenCL kernel is offset0 + buffer_size. If this value exceeds 4GB, stateless addressing must be used, which requires setting the -cl-intel-greater-than-4GB-buffer-required flag. This PR adds the missing checks against offset0. Doing so required modifying memory_desc_wrapper::size(), which previously returned 0 when offset0 was set, to return an appropriate size in that case.

Fixes MFDNN-13205.
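
For readers unfamiliar with the check described above, here is a minimal standalone sketch of the idea. It is not the code from this PR: the helper names (`max_addressed_bytes`, `needs_stateless_addressing`, `build_options`) are hypothetical, both arguments are assumed to be in bytes, and the exact comparison (strictly greater than 4 GiB) is an assumption based on the wording "exceeds".

```cpp
#include <cstdint>
#include <limits>
#include <string>

// Assumed threshold: 4 GiB, the limit named in the PR description.
constexpr std::uint64_t k4GiB = std::uint64_t{1} << 32;

// Largest byte offset a kernel may address from the base pointer:
// offset0 + buffer_size. Saturates instead of wrapping on overflow.
inline std::uint64_t max_addressed_bytes(
        std::uint64_t offset0_bytes, std::uint64_t buffer_bytes) {
    const std::uint64_t max = std::numeric_limits<std::uint64_t>::max();
    if (buffer_bytes > max - offset0_bytes) return max;
    return offset0_bytes + buffer_bytes;
}

// Hypothetical predicate: true when the addressed range exceeds 4 GiB,
// so stateless addressing would be needed.
inline bool needs_stateless_addressing(
        std::uint64_t offset0_bytes, std::uint64_t buffer_bytes) {
    return max_addressed_bytes(offset0_bytes, buffer_bytes) > k4GiB;
}

// Hypothetical helper that appends the OpenCL build flag when required.
inline std::string build_options(
        std::uint64_t offset0_bytes, std::uint64_t buffer_bytes) {
    std::string opts;
    if (needs_stateless_addressing(offset0_bytes, buffer_bytes))
        opts += "-cl-intel-greater-than-4GB-buffer-required";
    return opts;
}
```

The point of the sketch is that a small buffer can still need the flag: a 16-byte view that starts 8 bytes below the 4 GiB mark ends past it, so checking the buffer size alone would miss the case.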

@rjoursler rjoursler requested review from a team as code owners February 18, 2025 18:42
@github-actions github-actions bot added the platform:gpu-intel Codeowner: @oneapi-src/onednn-gpu-intel label Feb 18, 2025
@rjoursler rjoursler force-pushed the rjoursle/fix_concat branch 3 times, most recently from 1005547 to bfaf989 Compare February 18, 2025 18:56
@rjoursler rjoursler force-pushed the rjoursle/fix_concat branch 2 times, most recently from 707336e to 13a3261 Compare February 25, 2025 12:23
@rjoursler
Contributor Author

make test
enable test_device_cpu
enable test_device_gpu

The behavior of returning 0 when offset0 is set is not aligned with how primitives check for large buffer support.

Stateful loads cannot address buffers that exceed 4GB in offset from the base pointer.
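
To illustrate why a size of 0 interacts badly with large-buffer checks, here is a toy sketch. It does not reproduce `memory_desc_wrapper` or its real signature: `toy_md_t`, `size_zero_on_offset`, `size_with_offset`, and `exceeds_4gb` are hypothetical names, and treating offset0 as a count of elements is an assumption made for this example.

```cpp
#include <cstdint>

// Toy stand-in for a dense memory descriptor (hypothetical type).
struct toy_md_t {
    std::uint64_t nelems;     // number of elements in the view
    std::uint64_t elem_bytes; // size of one element in bytes
    std::uint64_t offset0;    // start offset in elements (assumption)
};

// Models the behavior described as the problem: report 0 whenever
// offset0 is set.
inline std::uint64_t size_zero_on_offset(const toy_md_t &md) {
    if (md.offset0 != 0) return 0;
    return md.nelems * md.elem_bytes;
}

// Offset-aware size: bytes from the base pointer to the end of the view.
inline std::uint64_t size_with_offset(const toy_md_t &md) {
    return (md.offset0 + md.nelems) * md.elem_bytes;
}

// The kind of check a primitive might run on the reported size.
inline bool exceeds_4gb(std::uint64_t bytes) {
    return bytes > (std::uint64_t{1} << 32);
}
```

With a 2 GiB view of 4-byte elements that starts just past the 2 GiB mark, `exceeds_4gb(size_zero_on_offset(md))` is false while `exceeds_4gb(size_with_offset(md))` is true, so a check built on the first form would silently skip the large-buffer path.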
@rjoursler
Contributor Author

make test
enable test_device_cpu
enable test_device_gpu

@rjoursler rjoursler merged commit 20bd7b5 into main Feb 28, 2025
22 of 23 checks passed
@rjoursler rjoursler deleted the rjoursle/fix_concat branch February 28, 2025 17:22
3 participants