-
Notifications
You must be signed in to change notification settings - Fork 273
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Synthesizing CUDA tests #1183
Synthesizing CUDA tests #1183
Conversation
… is on CPU. Allow bounds check to zero index. Update docs.
…nvidia/slang into feature/cuda-test-synth
…int32_t vector array types.
…eature/cuda-test-synth # Conflicts: # tools/render-test/cuda/cuda-compute-util.h
|
||
* That the count has to be stored elsewhere (unlike with CUDA) | ||
* On some GPU targets there is no bounds checking - accessing outside the bound values can cause *undefined behavior* | ||
* The elements may be laid out *contiguously* on GPU |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what a better term is here, but it seems clear that the elements are laid out contiguously above (all the elements are stored in contiguous memory pointed to by data
). There's an additional detail of how the storage is "flat" or "tail-allocated" or "internal to the parent," but not none of those terms is really good at capturing the intention.
Also: eventually it would be good to clean up our behavior for unsized arrays to make sure they are consistent across targets.
{ | ||
case BaseType::Bool: | ||
{ | ||
// In memory a bool is a byte. BUT when in a vector or matrix it will actually be a int32_t |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is a bit confusing here, because it is describing an exception to the rules that comes along later. Reading the comment I had assumed this code was going to return sizeof(uint32_t)
, so I skimmed past the definition at first, and then only scrolled back later when I saw the code in the vector/matrix cases and thought "so wait, when does it actually get the byte-sized thing?"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also: it might make sense to factor the commonalities between CPU and CUDA layout into a base class, since I'm guessing the two cases overlap a fair bit.
Tests are automatically synthesized for CUDA target if they are available for CPU.
All CPU tests work on CUDA, except for
tests/compute/cbuffer-legalize.slang.2 syn (cuda)
tests/compute/entry-point-uniform-params.slang.4 syn (cuda)
tests/compute/global-init.slang.2 syn (cuda)
tests/compute/semantic.slang.4 syn (cuda)