Replies: 1 comment
It appears to me that a recent change triggered a problem: before this change, …
https://github.com/NVIDIA/cutlass/blob/main/include/cute/arch/copy_sm90_desc.hpp#L425
In the code linked above, the function tma_descriptor_cp_fence_release is declared CUTE_HOST_DEVICE, yet it calls cast_smem_ptr_to_uint, which is declared CUTE_DEVICE. How can the host version of tma_descriptor_cp_fence_release call the device-only function cast_smem_ptr_to_uint?
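
To make the question concrete, here is a minimal sketch of one pattern that would allow such a call to compile: the device-only call sits behind a preprocessor guard, so the host compilation pass never sees it. This is an assumption about the general technique, not a confirmed description of the linked CUTLASS code. The names SKETCH_HOST_DEVICE, SKETCH_DEVICE, device_only_cast, and fence_release_sketch are hypothetical stand-ins for CUTE_HOST_DEVICE, CUTE_DEVICE, cast_smem_ptr_to_uint, and tma_descriptor_cp_fence_release.

```cpp
#include <cstdint>

// Hypothetical stand-ins for CUTE_HOST_DEVICE / CUTE_DEVICE (assumed shape,
// not copied from CUTLASS). Under a plain host compiler they reduce to inline.
#if defined(__CUDACC__)
#  define SKETCH_HOST_DEVICE __host__ __device__ inline
#  define SKETCH_DEVICE      __device__ inline
#else
#  define SKETCH_HOST_DEVICE inline
#  define SKETCH_DEVICE      inline
#endif

// Device-only helper (stand-in for cast_smem_ptr_to_uint). It exists only in
// the device compilation pass, where __CUDA_ARCH__ is defined.
#if defined(__CUDA_ARCH__)
SKETCH_DEVICE std::uint32_t device_only_cast(void const* ptr) {
  // Convert a generic pointer to a shared-memory address.
  return static_cast<std::uint32_t>(__cvta_generic_to_shared(ptr));
}
#endif

// Host/device function (stand-in for tma_descriptor_cp_fence_release).
// Returns 1 when the device path was compiled in, 0 for the host path.
SKETCH_HOST_DEVICE int fence_release_sketch(void const* smem_ptr) {
#if defined(__CUDA_ARCH__)
  // Device pass: the device-only helper is visible and callable here.
  std::uint32_t addr = device_only_cast(smem_ptr);
  (void)addr;  // real code would pass this address to inline PTX
  return 1;
#else
  // Host pass: the device call above is removed by the preprocessor, so the
  // host version of this function never references device-only code.
  (void)smem_ptr;
  return 0;
#endif
}
```

Calling fence_release_sketch from host code takes the host branch and returns 0. The same source compiled for the device takes the other branch. If the linked code uses a guard of this kind, the host version would not contain the call to cast_smem_ptr_to_uint at all.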