Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

batched - dense: Testing and fixing Serial QR #2342

Merged
merged 10 commits into from
Mar 11, 2025

Conversation

lucbv
Copy link
Contributor

@lucbv lucbv commented Sep 19, 2024

The serial QR algorithms does not have unit-tests and is failing for non square matrices. See issue #2328.
This first commit fixes the issue with rectangular matrices and adds a basic test for that use case.
Next will work on adding a test that exercises the interfaces on multiple matrices of different sizes within a parallel_for. Finally equivalent tests will be added for the square case as well.

Tasks:

  • analytical test on rectangular matrix (SerialQR, ApplyQ, FormQ, Q*Q^t=I)
    • SerialQR
    • ApplyQ
    • FormQ
  • multiple variable size rectangular matrices (same API as above)
    • SerialQR
    • ApplyQ
    • FormQ
  • analytical test on square matrix
    • SerialQR
    • ApplyQ
    • FormQ
  • multiple variable size square matrices
    • SerialQR
    • ApplyQ
    • FormQ

Copy link
Contributor

@cwpearson cwpearson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

edit - duplicated comment

@lucbv lucbv self-assigned this Oct 15, 2024
@lucbv lucbv force-pushed the batched_QR_fixes branch 2 times, most recently from 18fc349 to b24686b Compare November 6, 2024 20:45
@lucbv lucbv force-pushed the batched_QR_fixes branch 2 times, most recently from 2b4e03b to f4dfbe9 Compare December 4, 2024 20:39
@lucbv lucbv removed the AT: WIP label Dec 4, 2024
The serial QR algorithms does not have unit-tests and is failing
for non square matrices. See issue kokkos#2328.
This first commit fixes the issue with rectangular matrices and
adds a basic test for that use case. Next will work on adding a
test that exercises the interfaces on multiple matrices of different
sizes within a parallel_for. Finally equivalent tests will be added
for the square case as well.
Fixing unused variable error
It looks like the Left NoTranspose ApplyQ is not doing the correct
thing. Will have a look at that next.

Spliting the tests a bit better, looking at Q and Qt
together to detect issues with consistency between the two.
Also eventually allows to use GEMM to figure out which one
is wrong.

Final fixes and adjusting the tolerance

Signed-off-by: Luc Berger-Vergiat <lberge@sandia.gov>
@lucbv lucbv force-pushed the batched_QR_fixes branch from 6aa3beb to f3460b6 Compare March 7, 2025 22:11
lucbv added 5 commits March 7, 2025 15:32
Signed-off-by: Luc Berger-Vergiat <lberge@sandia.gov>
Signed-off-by: Luc Berger-Vergiat <lberge@sandia.gov>
Signed-off-by: Luc Berger-Vergiat <lberge@sandia.gov>
Signed-off-by: Luc Berger-Vergiat <lberge@sandia.gov>
Signed-off-by: Luc Berger-Vergiat <lberge@sandia.gov>
@lucbv lucbv requested review from cwpearson and yasahi-hpc March 8, 2025 00:08
Copy link
Contributor

@yasahi-hpc yasahi-hpc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I requested small changes. I also have three additional points.

  1. It seems that Side::Right specialization of KokkosBatched::SerialApplyQ is not tested. Is it possible to add a test case for this specialization?
  2. Could you also consider to add docstrings in APIs under KokkosBatched_ApplyQ_Decl.hpp and KokkosBatched_QR_Decl.hpp? It would be helpful to generate API reference
  3. Could you also add static_assertion to check the template argument is valid, e.g., check if ArgTrans is NoTranspose or Transpose.

Checking that the side, mode and algo are all valid.
Cleaning up the test by using create_mirror_view_and_copy
when appropriate.

Somehow the semantic of create_mirror_view_and_copy is different
from the regular deep_copy so need to make a few changes after
testing on GPUs...

Signed-off-by: Luc Berger-Vergiat <lberge@sandia.gov>
@lucbv lucbv force-pushed the batched_QR_fixes branch from dd9ecda to a9b516e Compare March 10, 2025 21:11
Signed-off-by: Luc Berger-Vergiat <lberge@sandia.gov>
lucbv added 2 commits March 10, 2025 17:18
Signed-off-by: Luc Berger-Vergiat <lberge@sandia.gov>
Signed-off-by: Luc Berger-Vergiat <lberge@sandia.gov>
@lucbv lucbv requested a review from yasahi-hpc March 10, 2025 23:22
@lucbv
Copy link
Contributor Author

lucbv commented Mar 10, 2025

I addressed 1. and 3. however for the documentation I will delay until we have it setup in the repo for batched algorithms.
This should not take too long as I am in the process of writing the main page for that and then we will be able to add the APIs one by one as we get to them.

Copy link
Contributor

@yasahi-hpc yasahi-hpc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
As for the documentation, I can contribute if needed.
Let me know if you decided how to implement.

@lucbv lucbv merged commit d343008 into kokkos:develop Mar 11, 2025
19 checks passed
@ndellingwood
Copy link
Contributor

ndellingwood commented Mar 12, 2025

@lucbv some of these changes were incompatible with usage in Trilinos (intrepid2), this is showing in the nightly integration builds:

17:53:53 [ 66%] Building CXX object packages/intrepid2/unit-test/Projection/CMakeFiles/Intrepid2_unit-test_Projection_test_project_fields_Serial_DOUBLE.dir/test_project_fields_Serial_DOUBLE.cpp.o
17:53:56 In file included from /home/jenkins/blake-new/workspace/KokkosEco_Trilinos_Blake_OneAPI2023_2_0_ICPX_Serial/Build/packages/intrepid2/unit-test/Projection/test_project_fields_Serial_DOUBLE.cpp:15:
17:53:56 In file included from /home/jenkins/blake-new/workspace/KokkosEco_Trilinos_Blake_OneAPI2023_2_0_ICPX_Serial/Trilinos/packages/intrepid2/unit-test/Projection/test_project_fields.hpp:34:
17:53:56 /home/jenkins/blake-new/workspace/KokkosEco_Trilinos_Blake_OneAPI2023_2_0_ICPX_Serial/Trilinos/packages/intrepid2/src/Projection/Intrepid2_ProjectionTools.hpp:527:9: error: no member named 'SerialQR_Internal' in namespace 'KokkosBatched'; did you mean 'KokkosBatched::Impl::SerialQR_Internal'?
17:53:56         KokkosBatched::SerialQR_Internal::invoke(A0_host.extent(0), A0_host.extent(1),
17:53:56         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
17:53:56         KokkosBatched::Impl::SerialQR_Internal
17:53:56 /home/jenkins/blake-new/workspace/KokkosEco_Trilinos_Blake_OneAPI2023_2_0_ICPX_Serial/Trilinos/kokkos-kernels/batched/dense/impl/KokkosBatched_QR_Serial_Internal.hpp:34:8: note: 'KokkosBatched::Impl::SerialQR_Internal' declared here
17:53:56 struct SerialQR_Internal {
17:53:56        ^
17:53:56 In file included from /home/jenkins/blake-new/workspace/KokkosEco_Trilinos_Blake_OneAPI2023_2_0_ICPX_Serial/Build/packages/intrepid2/unit-test/Projection/test_project_fields_Serial_DOUBLE.cpp:15:
17:53:56 In file included from /home/jenkins/blake-new/workspace/KokkosEco_Trilinos_Blake_OneAPI2023_2_0_ICPX_Serial/Trilinos/packages/intrepid2/unit-test/Projection/test_project_fields.hpp:34:
17:53:56 /home/jenkins/blake-new/workspace/KokkosEco_Trilinos_Blake_OneAPI2023_2_0_ICPX_Serial/Trilinos/packages/intrepid2/src/Projection/Intrepid2_ProjectionTools.hpp:583:11: error: no member named 'SerialQR_Internal' in namespace 'KokkosBatched'; did you mean 'KokkosBatched::Impl::SerialQR_Internal'?
17:53:56           KokkosBatched::SerialQR_Internal::invoke(A.extent(0), A.extent(1),
17:53:56           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
17:53:56           KokkosBatched::Impl::SerialQR_Internal
17:53:56 /home/jenkins/blake-new/workspace/KokkosEco_Trilinos_Blake_OneAPI2023_2_0_ICPX_Serial/Trilinos/kokkos-kernels/batched/dense/impl/KokkosBatched_QR_Serial_Internal.hpp:34:8: note: 'KokkosBatched::Impl::SerialQR_Internal' declared here
17:53:56 struct SerialQR_Internal {

Should I be able to add a guard on version and include updates to include the Impl namespace to resolve?

ndellingwood added a commit to ndellingwood/Trilinos that referenced this pull request Mar 12, 2025
Update in Intrepid2_ProjectionTools.hpp for compatibility changes
in kokkos/kokkos-kernels#2342

Signed-off-by: Nathan Ellingwood <ndellin@sandia.gov>
@ndellingwood
Copy link
Contributor

Yeah, simple namespace update was all that was needed, PR up trilinos/Trilinos#13876

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants