You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Clean-ups related to expanded standard library coverage (shader-slang#1269)
This change continues the work already started in moving the definitions of many built-in functions to the standard library.
The main focus in this change was reducing the number of operations that had to be special-cased on the CPU and CUDA targets by making sure that the scalar cases of built-in functions map to the proper names in the prelude (e.g., `F32_sin()`) via the ordinary `__target_intrinsic` mechanism. In some cases this cleanup meant that special-case logic that was constructing definitions for those functions using C++ code could be scrapped.
Additional changes made along the way:
* A few scalar functions that were missing in the CPU/CUDA preludes got added: `round`, hyperbolic trigonometric functions, `frexp`, `modf`, and `fma`
* The floating-point `min()` and `max()` definitions in the preludes were changed to use intrinsic operations on the target (which are likely to follow IEEE semantics, while our definitions did not)
* For the CUDA target, many of the functions had their names translated during code emit from, e.g., `sin` to `sinf`. This change makes the CUDA target more closely match the C++/CPU target in using names like `F32_sin` consistently.
* For the CUDA target, a few additional functions have intrinsics that don't exist (portably) on CPU: `sincos()` and `rsqrt()`.
* For the Slang stdlib definitions to work, a new `$P` replacement was defined for `__targert_intrinsic` that expands to a type based on the first operand of the function (e.g., `F32` for `float`).
* I removed the dedicated opcodes for matrix-matrix, matrix-vector, and vector-matrix multiplication, and instead turned them into ordinary functions with definitions and `__target_intrinsic` modifiers to map them appropriately for HLSL and GLSL. This is realistically how we would have implemented these if we'd had `__target_intrinsic` from the start.
Notes about possible follow-on work:
* The `ldexp` function is still left in the Slang stdlib because it has to account for a floating-point exponent and the `math.h` version only handles integers for the exponent. It is possible that we can/should define another overload for `ldexp` (and `frexp`) that uses an integer for exponent, and then have that one be a built-in on CPU/CUDA, with the HLSL `frexp` being defined in the stdlib to delegate to the correct `frexp` for those targets.
* The `firstbithigh` and related functions are missing for our CPU and CUDA targets, and will need to be added. It is worth nothing that `firstbithigh` apparently has some very odd functionality around signed integer arguments (which are supported, despite MSDN being unclear on that point). General cleanup will be required for those functions.
* Maxing the various matrix and vector products no longer be intrinsic ops might affect how we emit code for them as sub-expressions (both whether we fold them into use sites and how we parenthize them). This doesn't seem to affect any of our existing tests, but we could consider marking these functions with `[__readNone]` to ensure they can be folded, and then also adding whatever modifier(s) we might invent to control precdence and parentheses insertion during emit.
0 commit comments