-
Notifications
You must be signed in to change notification settings - Fork 293
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Invalid VK_EXT_surface_maintenance1 behavior in presence of multiple ICD (nvidia + mesa-vulkan-drivers) #1557
Comments
To clarify, in the step (4) above |
I'm sorry, but excuse my ignorance. What exactly is invalid? When I tried to reproduce this locally, with nvidia driver 550 I found that I do reproduce that the struct is empty, but isn't that to be expected if the driver version doesn't support an extension?
The way I confirmed this is by setting VK_LOADER_DRIVERS_SELECT=nvidia and running vulkaninfo --summary to show me the list of available instance extensions with the 535 nvidia driver.
Whereas with the 550 driver the above command would have |
Focusing on version 535, without
That makes sense:
This does not make sense: We think this behavior is incorrect because:
Note that removing llvmpipe from the system (or using the VK_ICD_FILENAMES workaround) results in correct behavior: support for |
Hmm, I wonder if there is an issue with the way we are querying supported extensions. What would be the vk api equivalent of |
I think the bug is both in the spec - making it impossible for an application to know if an instance extension is supported by a particular physical device, and with vulkaninfo - where it is printing the struct regardless of the values returned. Vulkaninfo could check if the structs were filled out, and if not, not print the structs. That said, the crash in the 550 driver is still concerning and may be a loader bug so I don't want to rule that out. I do wonder if this is all because the spec has an instance extensions which may not be supported by all physical devices. The loader typically "fixes" any instance level functions when called on drivers that don't support it. The loader could do the same here by removing the surface maintenance1 structs from pnext chains, but that's not ideal as modifying pnext chains is likely to break more things in the process. Maybe there should be a way to query if a specific physical device supports an instance extension so applications can do that checking themselves? That puts the onus on app devs which is unfortunate (yet another thing to have to manage). In short I think the output of vulkaninfo is misleading - the nvidia 535 driver doesn't support the surface maintenance extension yet vulkaninfo makes it appear that it does. |
Checking whether a struct was filled or not really feels like a hack though, probably because this isn't something APIs normally have to rely on and there isn't an obvious way to do it. If the spec required that a field has to change in a specific manner in response to queries (0->1 or any other way) that would be well defined. Otherwise, a zero initialized struct might be a correct response as well as non-zero-initialized, but unchanged value that was passed in.
I don't see a good way how vulkaninfo would be supposed to handle this in the general case. Also, we hit this same issue in ANGLE. We could workaround with hacks like checking for unchanged out-of-range values but is there something we (Khronos) could do to improve the situation in the longer run, for all users? |
There is a much easier solution for this particular problem. Removing the struct from the pnext chain doesn't help unsuspecting apps, because they queried and were ensured surface_maint1 exists, but after selecting the device the query doesn't work. The problem in particular stems from the fact that the loader takes the instance extensions supported by all the devices (like nvidia and lavapipe), and advertises the union to the application. The solution is easy, the loader can fill this struct itself if the actual device doesn't support the extension. |
As to the general problem of instance extensions, either the loader should advertise the intersection as opposed to the union of instance extensions advertised by ICDs (pessimistic behavior), or it should make sure it has a meaningful implementation itself if an ICD is chosen that doesn't advertise the extension. My preference is the latter where possible. |
Is it even meaningful to query extensions without selecting a driver first? What is the purpose of this aggregation? |
That's unfortunately how Vulkan is, the instance extensions are supposed to be independent of the driver, or at least that's what the intention originally was (as I understand). In this case, technically surface_maint1 should be something that depends on the window system alone (like wayland or X11), but is somehow implemented by the driver instead of some other driver-agnostic component in the system (which is the way Android's implementation is). |
Seems that this needs a loader fix to look for surface maintenance 1 structs in the pNext chain, and fill them out with appropriate fall back info. And yes, this is unfortunately the outcome of the choice to make surfaces an "instance" level object. A similar issue comes up with vkCreateDisplayPlaneSurfaceKHR, where the function has a given physical device that it is operating on, but due to the use of a surface, it is being called on every driver. #1541 for more details. |
While implementing the emulation of VK_EXT_surface_maintenance1 in the loader, I've hit a bit of a roadblock.
Maybe I missed something, but in the meantime I am simply setting the values to 0 (which acts as memset'ting the struct to 0). |
Oh that one is easy too, just populate them with the |
I have the emulation PR up. While I have locally validated that it fixes the issue of vulkaninfo reporting erroneous values, additional testing is always welcome. |
Thank you, @romanl-g would you do the honors? :) @charles-lunarg, FWIW, you should be able to run CTS over your implementation, hopefully the tests don't make any assumptions about swapchain_maint1 being supported at the same time! |
ANGLE: I didn't test thoroughly but in a basic test with that PR |
After running through local CTS testing, I didn't encounter any glaring failures. All the tests which weren't skipped passed, and on multiple drivers. |
Describe the bug
According to @ShabbyX this appears to be a loader bug, was previously discussed with LunarG.
Environment (please complete the following information):
To Reproduce
This results in
VK_EXT_surface_maintenance1
reported to be supported by the driver butvkGetPhysicalDeviceSurfaceCapabilities2KHR
not updating the data structure. In ANGLE we hit it here:https://crsrc.org/c/third_party/angle/src/libANGLE/renderer/vulkan/SurfaceVk.cpp;drc=1b4d6185c4de95f1fed15b08cc7d044eba78ece3;l=1684
Workaround1:
VK_ICD_FILENAMES=/usr/share/vulkan/icd.d/nvidia_icd.json
Workaround2:
sudo apt remove mesa-vulkan-drivers
The text was updated successfully, but these errors were encountered: