Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Synthesizing CUDA tests #1183

Merged
merged 11 commits into from
Jan 28, 2020

Conversation

jsmall-zzz
Copy link
Contributor

Tests are automatically synthesized for CUDA target if they are available for CPU.

  • Fix unsized array for CUDA targets
  • Special handling on CUDA for bool type around vector/matrix. It appears that bool is int8_t sized normally in memory. Not so in vector/matrix.
  • Small fixes for some tests.

All CPU tests work on CUDA, except for

tests/compute/cbuffer-legalize.slang.2 syn (cuda)
tests/compute/entry-point-uniform-params.slang.4 syn (cuda)
tests/compute/global-init.slang.2 syn (cuda)
tests/compute/semantic.slang.4 syn (cuda)

@jsmall-zzz jsmall-zzz changed the title Synthesising CUDA tests SynthesizingCUDA tests Jan 27, 2020
@jsmall-zzz jsmall-zzz changed the title SynthesizingCUDA tests Synthesizing CUDA tests Jan 27, 2020

* That the count has to be stored elsewhere (unlike with CUDA)
* On some GPU targets there is no bounds checking - accessing outside the bound values can cause *undefined behavior*
* The elements may be laid out *contiguously* on GPU
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what a better term is here, but it seems clear that the elements are laid out contiguously above (all the elements are stored in contiguous memory pointed to by data). There's an additional detail of how the storage is "flat" or "tail-allocated" or "internal to the parent," but not none of those terms is really good at capturing the intention.

Also: eventually it would be good to clean up our behavior for unsized arrays to make sure they are consistent across targets.

{
case BaseType::Bool:
{
// In memory a bool is a byte. BUT when in a vector or matrix it will actually be a int32_t
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is a bit confusing here, because it is describing an exception to the rules that comes along later. Reading the comment I had assumed this code was going to return sizeof(uint32_t), so I skimmed past the definition at first, and then only scrolled back later when I saw the code in the vector/matrix cases and thought "so wait, when does it actually get the byte-sized thing?"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also: it might make sense to factor the commonalities between CPU and CUDA layout into a base class, since I'm guessing the two cases overlap a fair bit.

@jsmall-zzz jsmall-zzz merged commit b3e0b0d into shader-slang:master Jan 28, 2020
@jsmall-zzz jsmall-zzz deleted the feature/cuda-test-synth branch February 11, 2020 16:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants