v2024.02.2

@rhornung67 rhornung67 released this 08 May 17:57
· 318 commits to main since this release
593f756

This release contains a bugfix and new execution policies that improve performance for GPU kernels with reductions.

Please download the RAJA-v2024.02.2.tar.gz file below. The others, generated by GitHub, may not work for you due to RAJA's dependencies on git submodules.

Notable changes include:

  • New features / API changes:

    • RAJA::loop_exec and associated policies (loop_reduce, etc.) have been removed. These were deprecated in an earlier release and type aliased to RAJA::seq_exec and its associated policies, which behave the same way the loop_exec policies did. When you update to this version of RAJA, please change uses of loop_exec to seq_exec in your code.
    • New GPU execution policies for CUDA and HIP have been added that improve performance for GPU kernels with reductions. Please see the RAJA User Guide for more information. Short summary:
      • Option added to change max grid size in policies that use the occupancy calculator.
      • Policies added to run with max occupancy, with a fraction of the max occupancy, or with a "concretizer", which lets a user decide how to run based on what the occupancy calculator determines about a kernel.
      • Additional options to tune kernels containing reductions, such as
        • an option to initialize data on host for reductions that use atomic operations
        • an option to avoid device scope memory fences
    • Changed the SYCL thread index ordering in RAJA::launch to follow the SYCL "row-major" convention. Please see the RAJA User Guide for more information.
  • Build changes/improvements:

    • NONE.
  • Bug fixes/improvements:

    • Fixed issue in bump-style allocator used internally in RAJA::launch.
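
As a rough illustration of the loop_exec removal noted above, the sketch below shows what the change might look like in user code. The helper name scale_in_place is hypothetical and not part of RAJA. The `__has_include` guard and the plain-loop fallback are assumptions added only so the snippet compiles on a machine without RAJA installed; real code would simply include RAJA/RAJA.hpp.

```cpp
// Migration sketch: RAJA::loop_exec was removed; use RAJA::seq_exec instead.
#include <vector>

#if __has_include("RAJA/RAJA.hpp")
#include "RAJA/RAJA.hpp"
#define SKETCH_HAVE_RAJA 1
#else
#define SKETCH_HAVE_RAJA 0
#endif

// Hypothetical helper (not part of RAJA): multiplies every element of
// `data` by `factor` using a sequential loop.
inline void scale_in_place(std::vector<double>& data, double factor)
{
  double* ptr = data.data();
  const int n = static_cast<int>(data.size());
#if SKETCH_HAVE_RAJA
  // Before this release the policy here could be RAJA::loop_exec.
  // After updating, RAJA::seq_exec is the policy to use.
  RAJA::forall<RAJA::seq_exec>(RAJA::TypedRangeSegment<int>(0, n),
                               [=](int i) { ptr[i] *= factor; });
#else
  // Fallback (assumption for this sketch only): plain sequential loop,
  // used when RAJA headers are not available.
  for (int i = 0; i < n; ++i) {
    ptr[i] *= factor;
  }
#endif
}
```

In most codebases the update is a mechanical rename of the policy type in each RAJA::forall, RAJA::kernel, or RAJA::launch call; the loop bodies themselves should not need to change.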