-
Notifications
You must be signed in to change notification settings - Fork 263
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Invalid intermediate CUDA code is generated with new compilation API #6507
Comments
For reference, the following is generated with
|
Relevant IR snippet when compiling with the deprecated compile request API (via slangc):
Relevant IR snippet when compiling the hacked unit test with the new API:
Doing a swizzle on a ptr type directly leads to generating the problematic code. Of course we still need to fix the bug, but it would be good to find out the answer to this question before continuing with the original task #4760. |
I found that |
It's from here, but I believe the more promising lead right now is the fact that old API produces an entry point with the signature (The ptr being swizzled is a temporary variable containing the function parameter, which is actually |
The difference in the entry point parameter being constref vs not in old vs new API results comes from slang-lower-to-ir.cpp/collectParameterLists via In other words, for some reason the new API entrypoint does not have the EntryPointAttribute decoration, whereas the old one has. There is a comment explaining that constref is needed, so I assume the fix is to make sure that the new API will also end up adding the attribute to the entry point. EDIT: On second thought, maybe the AST should more reflect what the user wrote. I can just look for other attributes that indicate the function is an entrypoint, instead. For example numthreads in this case. |
…wring For shaders like tests/compute/simple.slang, which have a 'numthreads' attribute but no 'shader' attribute, the old compile request API would add an EntryPointAttribute to the AST node of the entry point. However, the new API doesn't, and so a certain ConstRef hack doesn't get applied when using the new API, leading to subsequent code generation issues. This patch also checks for a 'numthreads' attribute when deciding whether to apply the ConstRef hack. This closes issue shader-slang#6507 and helps to resolve issue shader-slang#4760.
…wring For shaders like tests/compute/simple.slang, which have a 'numthreads' attribute but no 'shader' attribute, the old compile request API would add an EntryPointAttribute to the AST node of the entry point. However, the new API doesn't, and so a certain ConstRef hack doesn't get applied when using the new API, leading to subsequent code generation issues. This patch also checks for a 'numthreads' attribute when deciding whether to apply the ConstRef hack. This closes issue shader-slang#6507 and helps to resolve issue shader-slang#4760.
* Add cuda codegen bug repro This just compiles tests/compute/simlpe.slang for PTX with the new compilation API, in order to reproduce a code generation bug. * Detect entrypoint more robustly when applying ConstRef hack during lowring For shaders like tests/compute/simple.slang, which have a 'numthreads' attribute but no 'shader' attribute, the old compile request API would add an EntryPointAttribute to the AST node of the entry point. However, the new API doesn't, and so a certain ConstRef hack doesn't get applied when using the new API, leading to subsequent code generation issues. This patch also checks for a 'numthreads' attribute when deciding whether to apply the ConstRef hack. This closes issue #6507 and helps to resolve issue #4760. * Add expected failure list for GitHub runners Our GitHub runners don't have the CUDA toolkits installed, so they can't run all tests.
Fixed by #6506 |
Here is a WIP branch that modifies one unit test to try to compile the shader from
tests/compute/simple.slang
with the new compilation API, targeting PTX.Here is the invalid intermediate CUDA code that gets generated:
The line
uint _S2 = (&_S1).x;
is invalid and does not compile withnvrtc
:The text was updated successfully, but these errors were encountered: