-
Notifications
You must be signed in to change notification settings - Fork 101
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
batched - dense: Testing and fixing Serial QR #2342
Conversation
94496d2
to
4bfb715
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
edit - duplicated comment
18fc349
to
b24686b
Compare
2b4e03b
to
f4dfbe9
Compare
cefa092
to
e2771fd
Compare
The serial QR algorithms does not have unit-tests and is failing for non square matrices. See issue kokkos#2328. This first commit fixes the issue with rectangular matrices and adds a basic test for that use case. Next will work on adding a test that exercises the interfaces on multiple matrices of different sizes within a parallel_for. Finally equivalent tests will be added for the square case as well. Fixing unused variable error It looks like the Left NoTranspose ApplyQ is not doing the correct thing. Will have a look at that next. Spliting the tests a bit better, looking at Q and Qt together to detect issues with consistency between the two. Also eventually allows to use GEMM to figure out which one is wrong. Final fixes and adjusting the tolerance Signed-off-by: Luc Berger-Vergiat <lberge@sandia.gov>
Signed-off-by: Luc Berger-Vergiat <lberge@sandia.gov>
Signed-off-by: Luc Berger-Vergiat <lberge@sandia.gov>
Signed-off-by: Luc Berger-Vergiat <lberge@sandia.gov>
Signed-off-by: Luc Berger-Vergiat <lberge@sandia.gov>
Signed-off-by: Luc Berger-Vergiat <lberge@sandia.gov>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I requested small changes. I also have three additional points.
- It seems that
Side::Right
specialization ofKokkosBatched::SerialApplyQ
is not tested. Is it possible to add a test case for this specialization? - Could you also consider to add docstrings in APIs under
KokkosBatched_ApplyQ_Decl.hpp
andKokkosBatched_QR_Decl.hpp
? It would be helpful to generate API reference - Could you also add
static_assertion
to check the template argument is valid, e.g., check ifArgTrans
isNoTranspose
orTranspose
.
Checking that the side, mode and algo are all valid. Cleaning up the test by using create_mirror_view_and_copy when appropriate. Somehow the semantic of create_mirror_view_and_copy is different from the regular deep_copy so need to make a few changes after testing on GPUs... Signed-off-by: Luc Berger-Vergiat <lberge@sandia.gov>
Signed-off-by: Luc Berger-Vergiat <lberge@sandia.gov>
Signed-off-by: Luc Berger-Vergiat <lberge@sandia.gov>
Signed-off-by: Luc Berger-Vergiat <lberge@sandia.gov>
I addressed 1. and 3. however for the documentation I will delay until we have it setup in the repo for batched algorithms. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
As for the documentation, I can contribute if needed.
Let me know if you decided how to implement.
@lucbv some of these changes were incompatible with usage in Trilinos (intrepid2), this is showing in the nightly integration builds:
Should I be able to add a guard on version and include updates to include the Impl namespace to resolve? |
Update in Intrepid2_ProjectionTools.hpp for compatibility changes in kokkos/kokkos-kernels#2342 Signed-off-by: Nathan Ellingwood <ndellin@sandia.gov>
Yeah, simple namespace update was all that was needed, PR up trilinos/Trilinos#13876 |
The serial QR algorithms does not have unit-tests and is failing for non square matrices. See issue #2328.
This first commit fixes the issue with rectangular matrices and adds a basic test for that use case.
Next will work on adding a test that exercises the interfaces on multiple matrices of different sizes within a parallel_for. Finally equivalent tests will be added for the square case as well.
Tasks: