Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unit test failure on gfx-unit-test-tool/precompiledTargetModule2Vulkan #6170

Closed
jkwak-work opened this issue Jan 24, 2025 · 11 comments
Closed
Assignees
Labels
goal:forward looking Feature needed at a later date, not connected to a specific use case. kind:regression

Comments

@jkwak-work
Copy link
Collaborator

gfx-unit-test-tool/precompiledTargetModule2Vulkan is failing.
Not sure when it started failing.

Note that the test is passing when running with a Release build with "-use-test-server" option incorrectly. That's why CI has been passing; incorrectly.

You can reproduce it with debug build or when running with release build, don't use "-use-test-server" option.

D:\sbf\git\slang\jkwak>build\Release\bin\slang-test.exe gfx-unit-test-tool/precompiledTargetModule2Vulkan
Supported backends: fxc dxc glslang spirv-dis clang visualstudio genericcpp nvrtc llvm spirv-opt metal tint
error: abs(result[i] - expectedResult[i]) <= 0.01 - D:\sbf\git\slang\jkwak\tools\gfx-unit-test\gfx-test-util.cpp (240)
error: abs(result[i] - expectedResult[i]) <= 0.01 - D:\sbf\git\slang\jkwak\tools\gfx-unit-test\gfx-test-util.cpp (240)
error: abs(result[i] - expectedResult[i]) <= 0.01 - D:\sbf\git\slang\jkwak\tools\gfx-unit-test\gfx-test-util.cpp (240)
error: abs(result[i] - expectedResult[i]) <= 0.01 - D:\sbf\git\slang\jkwak\tools\gfx-unit-test\gfx-test-util.cpp (240)
FAILED test: 'gfx-unit-test-tool/precompiledTargetModule2Vulkan.internal'

===
0% of tests passed (0/1)
===

failing tests:
---
gfx-unit-test-tool/precompiledTargetModule2Vulkan.internal
---

When debugged, the "result" variable holds four zero values.

@jkwak-work jkwak-work added goal:forward looking Feature needed at a later date, not connected to a specific use case. kind:regression labels Jan 24, 2025
@cheneym2
Copy link
Collaborator

The test needs to be updated to match development in slang for the spirv backend. I don't understand the issue that was causing the test to "pass" when running on Release with "-use-test-server" incorrectly, but it's clear the test should not be passing.

When the test was written, getEntryPointCode() would return the whole spirv program even if one module was precompiled, but Slang was updated to only return "glue" spirv, and the precompiled bits have to be separately queried. I've got the bit to query the remaining blobs of spirv and am working on getting them linked using SPIRV-Tools. In the past, I've been able to use "spirv-link.exe" to link in test apps, but for the CI test, it has to use the library version of spirv-tools.

The app is linking to SPIRV-Tools.lib now and some symbols from the library are resolved, but Link() is not resolved, still investigating that.

@jkwak-work
Copy link
Collaborator Author

The intermittence became higher after upgrading the graphics driver, although I am not sure if it is really related to the graphics driver or not.
I put it in tests/expected-failure-githug.txt to unblock CI tests.

gfx-unit-test-tool/precompiledTargetModule2Vulkan.internal

#6228

@cheneym2
Copy link
Collaborator

It's fine to have it disabled while I'm fixing the test.

@cheneym2
Copy link
Collaborator

The SPIRV-Tools-link.lib based linker is loaded properly now, but Link() reports an error:

Linking 2 binaries
SPIRV-TOOLS: Invalid opcode: 64768
SPIRV-TOOLS: input
SPIRV-TOOLS: 21:0
SPIRV-TOOLS: Failed to build module 1 out of 2.
SPIRV-TOOLS: input
SPIRV-TOOLS: 0:0
error: Exception was thrown during execution

Dumping the same binaries to disk and using "spirv-link.exe" shows that the bins are legal and linkable, so it must be API usage.

@cheneym2
Copy link
Collaborator

The API expects spv sizes to be reported in units of uint32_t instead of bytes, so Link() was reading past the end of the valid data. Linkage is working now and producing valid kernel code. Need to get it loaded as the replacement kernel code now.

@jkwak-work
Copy link
Collaborator Author

Another test keeps failing too.

gfx-unit-test-tool/precompiledModuleCacheVulkan.internal

@cheneym2
Copy link
Collaborator

Interesting, there may be multiple issues then.

@cheneym2
Copy link
Collaborator

I have what appears to be a working fix for the Vulkan tests, but D3D tests are failing. Need to investigate that. This draft PR needs cleaned up still: #6236

@jkwak-work What was the cause of the false "passing" status of this test, could you tell? I didn't catch the "incorrect" aspect of this:

Note that the test is passing when running with a Release build with "-use-test-server" option incorrectly. That's why CI has been passing; incorrectly.

cheneym2 added a commit to cheneym2/slang that referenced this issue Jan 31, 2025
In the SPIR-V backend of Slang, compiling a shader
that contains some modules with precompiled target
blobs will produce only a "glue" SPIR-V output which
needs to be linked with the assorted precompiled
blobs to be complete.

Closes shader-slang#6170
cheneym2 added a commit to cheneym2/slang that referenced this issue Jan 31, 2025
In the SPIR-V backend of Slang, compiling a shader
that contains some modules with precompiled target
blobs will produce only a "glue" SPIR-V output which
needs to be linked with the assorted precompiled
blobs to be complete.

Closes shader-slang#6170
@cheneym2 cheneym2 added this to the Q1 2025 (Winter) milestone Feb 3, 2025
cheneym2 added a commit to cheneym2/slang that referenced this issue Feb 3, 2025
In the SPIR-V backend of Slang, compiling a shader
that contains some modules with precompiled target
blobs will produce only a "glue" SPIR-V output which
needs to be linked with the assorted precompiled
blobs to be complete.

Closes shader-slang#5435
Closes shader-slang#6170
@cheneym2 cheneym2 marked this as a duplicate of #5435 Feb 3, 2025
@cheneym2
Copy link
Collaborator

cheneym2 commented Feb 5, 2025

The proposed fix is #6253 but the change is causing "error 127" failures in examples that use the "compileShaders" function from gfx that was fixed by the change.

Reducing priority for the next couple weeks.

@cheneym2
Copy link
Collaborator

https://github.com/shader-slang/slang/pull/6455/files fixes the issue and passes CI. Just needs to be rebased.

@cheneym2
Copy link
Collaborator

Fixed by #6455

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
goal:forward looking Feature needed at a later date, not connected to a specific use case. kind:regression
Projects
None yet
Development

No branches or pull requests

2 participants