Release v2024.02.2 · LLNL/RAJA

This release contains a bugfix and new execution policies that improve performance for GPU kernels with reductions.

Please download the RAJA-v2024.02.2.tar.gz file below. The others, generated by GitHub, may not work for you due to RAJA's dependencies on git submodules.

Notable changes include:

New features / API changes:
- RAJA::loop_exec and associated policies (loop_reduce, etc.) have been removed. They were deprecated in an earlier release and type-aliased to RAJA::seq_exec and its associated policies, which behave the same way the loop_exec policies did. When you update to this version of RAJA, please change uses of loop_exec to seq_exec in your code.
- New GPU execution policies for CUDA and HIP have been added that improve performance for GPU kernels with reductions. Please see the RAJA User Guide for more information. Short summary:
  - Option added to change max grid size in policies that use the occupancy calculator.
  - Policies added to run with max occupancy, with a fraction of the max occupancy, or with a "concretizer", which lets a user decide how to run based on what the occupancy calculator determines about a kernel.
  - Additional options to tune kernels containing reductions, such as
    - an option to initialize data on host for reductions that use atomic operations
    - an option to avoid device scope memory fences
- Changed the SYCL thread index ordering in RAJA::launch to follow the SYCL "row-major" convention. Please see the RAJA User Guide for more information.
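
To illustrate the "row-major" convention mentioned in the SYCL item above, here is a minimal standalone sketch in plain C++. It does not use RAJA or SYCL; the names `Index3` and `linearize_row_major` are hypothetical and exist only for this illustration. It shows the linearization rule SYCL uses for a 3-D range, where the last dimension varies fastest. How RAJA::launch maps its x/y/z thread indices onto those dimensions is described in the RAJA User Guide and is not reproduced here.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper type for this sketch: a 3-D index or extent.
using Index3 = std::array<std::size_t, 3>;

// Row-major linearization of a 3-D index: dimension 2 (the last one)
// varies fastest and dimension 0 varies slowest.
std::size_t linearize_row_major(const Index3& id, const Index3& range)
{
  return (id[0] * range[1] + id[1]) * range[2] + id[2];
}
```

With a range of {2, 3, 4}, stepping the last index by one moves the linear index by 1, stepping the middle index moves it by 4, and stepping the first index moves it by 12. Code that assumed a different fastest-varying dimension may therefore see a different memory access pattern after this change.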

Build changes/improvements:
- NONE.

Bug fixes/improvements:
- Fixed issue in bump-style allocator used internally in RAJA::launch.
