Math functions renaming table for GPU backends to support vectorized evaluation of math functions. #8595

mcourteaux · 2025-03-17T13:45:32Z

Fixes #8594.
This clears up all the preprocessor prologues or wrapper function prologues.
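To illustrate the idea for readers of this thread, a renaming table can be pictured as a lookup from an extern math function name to the backend's built-in name, applied while emitting the call rather than through a wrapper in the preamble. Built-ins in GPU shading languages are typically overloaded for vector types, which is what makes vectorized calls possible. This is a standalone sketch, not the code in this PR: the table entries and the helper `rename_math_call` are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical rename table for a single backend:
// extern math function name -> native built-in name.
// The entries are illustrative and are not copied from the PR.
static const std::unordered_map<std::string, std::string> kRenameTable = {
    {"sqrt_f32", "sqrt"},
    {"sin_f32", "sin"},
    {"floor_f32", "floor"},
};

// Returns the built-in name if the call is listed in the table,
// otherwise returns the original name unchanged.
std::string rename_math_call(const std::string &name) {
    assert(!name.empty());  // precondition: callers pass a real function name
    auto it = kRenameTable.find(name);
    return it != kRenameTable.end() ? it->second : name;
}
```

Under these assumptions, `rename_math_call("sqrt_f32")` yields `"sqrt"`, while a name that is not in the table passes through untouched.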

Update: pow() for WGSL has been removed from the preamble; its behavior is now emulated during codegen in the WGSL_CodeGen backend. That emulation could potentially be simplified again if meaningful bounds on the expressions are inferred, but I'd consider that future work. Removing pow() from the preamble and emulating it with Halide's IR enables support for vectorized calls.

There is one caveat: WGSL says there are no NaNs, yet its own builtin pow() function returns NaN for negative inputs to x and y. Emulating that was tricky, because I think the WGSL simplifier removes those? I'm not sure why I couldn't simply use make_const(x.type(), std::nanf("")). I had to resort to make_const(x.type(), 1.0f) * Call::make(Float(32), "nan_f32", {}, Call::PureExtern) to get the NaN value in.
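As a rough illustration of what emulating pow() per lane can look like, here is a standalone C++ model. It is only a sketch under assumed C-like pow() semantics for negative bases (integer exponents keep a sign, non-integer exponents give NaN); the exact rules used in the PR may differ, and the names `emulated_pow` and `emulated_pow_lanes` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>

// Scalar model of the emulation. Assumes a finite exponent y;
// infinities are not treated specially in this sketch.
float emulated_pow(float x, float y) {
    if (x >= 0.0f) {
        return std::pow(x, y);  // assumed safe to forward to the native builtin
    }
    if (y != std::trunc(y)) {
        // Negative base with a non-integer exponent has no real result.
        return std::numeric_limits<float>::quiet_NaN();
    }
    // Negative base with an integer exponent: compute the magnitude,
    // then restore the sign for odd exponents.
    float magnitude = std::pow(-x, y);
    bool odd = std::fmod(y, 2.0f) != 0.0f;
    return odd ? -magnitude : magnitude;
}

// Lane-wise application, standing in for a vectorized call.
template <std::size_t N>
std::array<float, N> emulated_pow_lanes(const std::array<float, N> &x,
                                        const std::array<float, N> &y) {
    std::array<float, N> out{};
    for (std::size_t i = 0; i < N; i++) {
        out[i] = emulated_pow(x[i], y[i]);
    }
    return out;
}
```

Expressing the branches as data-dependent selections per lane, rather than as a scalar helper function in the preamble, is what lets the same logic apply to vector operands.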

mcourteaux · 2025-03-21T00:25:32Z

I seem to have created a few issues. https://buildbot.halide-lang.org/master/#/builders/98/builds/88 (halide-testbranch-main-llvm20-arm-64-osx-cmake) shows useful errors. Will address them.

mcourteaux · 2025-04-19T07:41:18Z

Is this Windows issue fixed by now?

JIT session error: could not register eh-frame: __register_frame function not found
Assertion failed: !G && "InFlight alloc neither abandoned nor finalized", file C:\build_bot\worker\llvm-20-x86-64-windows\llvm-project\llvm\lib\ExecutionEngine\JITLink\JITLinkMemoryManager.cpp, line 251

@steven-johnson Could you request a rebuild for those failed build bots?

abadams · 2025-04-23T17:47:13Z

Working on the Windows error in #8615.

mcourteaux · 2025-04-26T08:21:32Z

@alexreinking Can you fix mac-x86-worker-2? It reports that the Python package numpy is not found. If that one is missing, I'm assuming it's a fresh install and more stuff might be missing...

mcourteaux · 2025-05-01T07:08:06Z

@abadams can you assess the test failure on ARM? There is a Hexagon-related test that failed. I doubt it is related to my work.
@derek-gerstmann can you assess the test failure on Windows? I don't see any error.

derek-gerstmann · 2025-05-01T17:58:06Z

@mcourteaux The test failure on Windows was the correctness_math test segfaulting on the Vulkan backend. The cause was a failed compilation due to mismatched types for the built-in constants inf_f32, neg_inf_f32, and nan_f32. These are handled as extern calls, and the code assumed they were always scalars, which caused the pow vector x2 tests to fail. I've pushed fixes to make sure proper vector constants are generated.
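The scalar-versus-vector mismatch can be pictured with a small standalone model: the constant is looked up as a scalar and then broadcast to as many lanes as the requesting type has. This is only a sketch and not the Vulkan backend code; the helpers `scalar_constant` and `vector_constant` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <limits>
#include <string>

// Scalar value for one of the three named constants mentioned above.
float scalar_constant(const std::string &name) {
    if (name == "inf_f32") return std::numeric_limits<float>::infinity();
    if (name == "neg_inf_f32") return -std::numeric_limits<float>::infinity();
    assert(name == "nan_f32");  // only the three known names are supported
    return std::numeric_limits<float>::quiet_NaN();
}

// Broadcast the scalar to every lane so the constant's type matches
// the vector width of the expression that uses it.
template <std::size_t Lanes>
std::array<float, Lanes> vector_constant(const std::string &name) {
    std::array<float, Lanes> out;
    out.fill(scalar_constant(name));
    return out;
}
```

For a two-lane pow test, `vector_constant<2>("nan_f32")` would produce a NaN in both lanes instead of a single scalar NaN.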

abadams · 2025-05-06T19:38:55Z

I will defer to Derek for approval

derek-gerstmann

Do we need so many alias macros, or is there a cleaner way? Otherwise LGTM.

mcourteaux requested review from halidebuildbots and abadams March 17, 2025 13:46

mcourteaux force-pushed the math_funcs_table_gpu branch from 475c010 to 7b91926 Compare March 17, 2025 20:45

mcourteaux added 11 commits April 25, 2025 22:44

Rewrite function calls to math functions to the native built-in API function for GPU backends.

d2b811a

Test vectorized support for math functions in correctness/math.cpp

fc6fc0e

Clang format.

d971ff5

Add missing #include

af59780

Fix fast_inverse on Metal.

e1764ad

Fix two small mistakes.

f0ddb5b

Move WGSL emulation of pow to IR instead of a function in the preamble.

0c62813

Make distinction between backends where abs() returns an unsigned type or a signed typ.

8c3c7ae

Correct the type cast in WGSL pow().

a7f1452

Attempt to fix pow() on WGSL.

ce859f7

Attempt to make pow() return NaN on WebGPU.

1c730e5

mcourteaux force-pushed the math_funcs_table_gpu branch from e883584 to 1c730e5 Compare April 25, 2025 20:44

Trigger build.

06fd1f6

derek-gerstmann added 2 commits May 1, 2025 10:51

Make extern calls for nan_f32, inf_f32, neg_inf_f32 handle vector types.

cd90fc5

Clang format pass.

d675a4b

abadams requested a review from derek-gerstmann May 6, 2025 19:38

derek-gerstmann reviewed May 20, 2025
