Commit 935768c

Author: Tim Foley
Clean-ups related to expanded standard library coverage (shader-slang#1269)
This change continues the work of moving the definitions of many built-in functions into the standard library. The main focus was reducing the number of operations that have to be special-cased on the CPU and CUDA targets, by making sure that the scalar cases of built-in functions map to the proper names in the prelude (e.g., `F32_sin()`) via the ordinary `__target_intrinsic` mechanism. In some cases this cleanup meant that special-case logic that constructed definitions for those functions in C++ code could be scrapped.

Additional changes made along the way:

* A few scalar functions that were missing from the CPU/CUDA preludes were added: `round`, the hyperbolic trigonometric functions, `frexp`, `modf`, and `fma`.
* The floating-point `min()` and `max()` definitions in the preludes were changed to use intrinsic operations on the target (which are likely to follow IEEE semantics, while our definitions did not).
* For the CUDA target, many of the functions had their names translated during code emit from, e.g., `sin` to `sinf`. This change makes the CUDA target more closely match the C++/CPU target by using names like `F32_sin` consistently.
* For the CUDA target, a few additional functions have intrinsics that don't exist (portably) on CPU: `sincos()` and `rsqrt()`.
* For the Slang stdlib definitions to work, a new `$P` replacement was defined for `__target_intrinsic` that expands to a type prefix based on the first operand of the function (e.g., `F32` for `float`).
* I removed the dedicated opcodes for matrix-matrix, matrix-vector, and vector-matrix multiplication, and instead turned them into ordinary functions with definitions and `__target_intrinsic` modifiers that map them appropriately for HLSL and GLSL. This is realistically how we would have implemented these if we'd had `__target_intrinsic` from the start.
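The difference between the old comparison-based `min()` and a library-based one shows up when one operand is NaN. The following is a minimal sketch, not code from this commit: the function names `minByCompare` and `minByFmin` are hypothetical, and it assumes a compiler that is not running in a fast-math mode.

```cpp
#include <cmath>
#include <limits>

// Comparison-based minimum, in the style of the removed prelude definition.
// Any comparison with NaN is false, so the result depends on operand order.
inline float minByCompare(float a, float b) { return a < b ? a : b; }

// Library-based minimum, in the style of the new prelude definition.
// fmin returns the non-NaN operand when exactly one operand is NaN.
inline float minByFmin(float a, float b) { return std::fmin(a, b); }

// Convenience helper (hypothetical) that produces a quiet NaN.
inline float quietNaN() { return std::numeric_limits<float>::quiet_NaN(); }
```

With these definitions, `minByCompare(1.0f, quietNaN())` yields NaN while `minByCompare(quietNaN(), 1.0f)` yields `1.0f`; `minByFmin` yields `1.0f` in both orders.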
Notes about possible follow-on work:

* The `ldexp` function is still left in the Slang stdlib because it has to account for a floating-point exponent, and the `math.h` version only handles integers for the exponent. It is possible that we can/should define another overload for `ldexp` (and `frexp`) that uses an integer for the exponent, and then have that one be a built-in on CPU/CUDA, with the HLSL `frexp` being defined in the stdlib to delegate to the correct `frexp` for those targets.
* The `firstbithigh` and related functions are missing for our CPU and CUDA targets, and will need to be added. It is worth noting that `firstbithigh` apparently has some very odd functionality around signed integer arguments (which are supported, despite MSDN being unclear on that point). General cleanup will be required for those functions.
* Making the various matrix and vector products no longer be intrinsic ops might affect how we emit code for them as sub-expressions (both whether we fold them into use sites and how we parenthesize them). This doesn't seem to affect any of our existing tests, but we could consider marking these functions with `[__readNone]` to ensure they can be folded, and then also adding whatever modifier(s) we might invent to control precedence and parentheses insertion during emit.
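The `ldexp` note can be illustrated with a small sketch. The first function mirrors the floating-point-exponent definition that this commit removes from the prelude; the second is a hypothetical integer-exponent overload of the kind the note proposes, mapped directly onto the C library. Both names are made up for illustration.

```cpp
#include <cmath>

// Floating-point exponent, in the style of the removed F32_ldexp:
// computes m * 2^e, where e may be fractional.
inline float ldexpFloatExponent(float m, float e) { return m * std::pow(2.0f, e); }

// Hypothetical integer-exponent overload that could be a direct
// built-in on CPU/CUDA, since the C library's ldexp takes an int.
inline float ldexpIntExponent(float m, int e) { return std::ldexp(m, e); }
```

The two agree for integral exponents (for example, `3 * 2^2 = 12`), but only the first accepts a fractional exponent such as `0.5f`.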
1 parent b380b1a commit 935768c

12 files changed, +351 -482 lines changed

prelude/slang-cpp-scalar-intrinsics.h

+38 -44
@@ -46,12 +46,16 @@ SLANG_FORCE_INLINE float F32_calcSafeRadians(float radians)
 // Unary
 SLANG_FORCE_INLINE float F32_ceil(float f) { return ::ceilf(f); }
 SLANG_FORCE_INLINE float F32_floor(float f) { return ::floorf(f); }
+SLANG_FORCE_INLINE float F32_round(float f) { return ::roundf(f); }
 SLANG_FORCE_INLINE float F32_sin(float f) { return ::sinf(f); }
 SLANG_FORCE_INLINE float F32_cos(float f) { return ::cosf(f); }
 SLANG_FORCE_INLINE float F32_tan(float f) { return ::tanf(f); }
 SLANG_FORCE_INLINE float F32_asin(float f) { return ::asinf(f); }
 SLANG_FORCE_INLINE float F32_acos(float f) { return ::acosf(f); }
 SLANG_FORCE_INLINE float F32_atan(float f) { return ::atanf(f); }
+SLANG_FORCE_INLINE float F32_sinh(float f) { return ::sinhf(f); }
+SLANG_FORCE_INLINE float F32_cosh(float f) { return ::coshf(f); }
+SLANG_FORCE_INLINE float F32_tanh(float f) { return ::tanhf(f); }
 SLANG_FORCE_INLINE float F32_log2(float f) { return ::log2f(f); }
 SLANG_FORCE_INLINE float F32_log(float f) { return ::logf(f); }
 SLANG_FORCE_INLINE float F32_log10(float f) { return ::log10f(f); }
@@ -61,42 +65,39 @@ SLANG_FORCE_INLINE float F32_abs(float f) { return ::fabsf(f); }
 SLANG_FORCE_INLINE float F32_trunc(float f) { return ::truncf(f); }
 SLANG_FORCE_INLINE float F32_sqrt(float f) { return ::sqrtf(f); }
 SLANG_FORCE_INLINE float F32_rsqrt(float f) { return 1.0f / F32_sqrt(f); }
-SLANG_FORCE_INLINE float F32_rcp(float f) { return 1.0f / f; }
 SLANG_FORCE_INLINE float F32_sign(float f) { return ( f == 0.0f) ? f : (( f < 0.0f) ? -1.0f : 1.0f); }
-SLANG_FORCE_INLINE float F32_saturate(float f) { return (f < 0.0f) ? 0.0f : (f > 1.0f) ? 1.0f : f; }
 SLANG_FORCE_INLINE float F32_frac(float f) { return f - F32_floor(f); }
-SLANG_FORCE_INLINE float F32_radians(float f) { return f * 0.01745329222f; }
 
 SLANG_FORCE_INLINE bool F32_isnan(float f) { return isnan(f); }
 SLANG_FORCE_INLINE bool F32_isfinite(float f) { return isfinite(f); }
 SLANG_FORCE_INLINE bool F32_isinf(float f) { return isinf(f); }
 
 // Binary
-SLANG_FORCE_INLINE float F32_min(float a, float b) { return a < b ? a : b; }
-SLANG_FORCE_INLINE float F32_max(float a, float b) { return a > b ? a : b; }
+SLANG_FORCE_INLINE float F32_min(float a, float b) { return ::fminf(a, b); }
+SLANG_FORCE_INLINE float F32_max(float a, float b) { return ::fmaxf(a, b); }
 SLANG_FORCE_INLINE float F32_pow(float a, float b) { return ::powf(a, b); }
 SLANG_FORCE_INLINE float F32_fmod(float a, float b) { return ::fmodf(a, b); }
 SLANG_FORCE_INLINE float F32_remainder(float a, float b) { return ::remainderf(a, b); }
-SLANG_FORCE_INLINE float F32_step(float a, float b) { return float(b >= a); }
 SLANG_FORCE_INLINE float F32_atan2(float a, float b) { return float(::atan2(a, b)); }
 
-// TODO(JS):
-// Note C++ has ldexp, but it takes an integer for the exponent, it seems HLSL takes both as float
-SLANG_FORCE_INLINE float F32_ldexp(float m, float e) { return m * ::powf(2.0f, e); }
-
-// Ternary
-SLANG_FORCE_INLINE float F32_smoothstep(float min, float max, float x)
-{
-    const float t = x < min ? 0.0f : ((x > max) ? 1.0f : (x - min) / (max - min));
-    return t * t * (3.0 - 2.0 * t);
+SLANG_FORCE_INLINE float F32_frexp(float x, float& e)
+{
+    int ei;
+    float m = ::frexpf(x, &ei);
+    e = ei;
+    return m;
+}
+SLANG_FORCE_INLINE float F32_modf(float x, float& ip)
+{
+    return ::modff(x, &ip);
 }
-SLANG_FORCE_INLINE float F32_lerp(float x, float y, float s) { return x + s * (y - x); }
-SLANG_FORCE_INLINE float F32_clamp(float x, float min, float max) { return ( x < min) ? min : ((x > max) ? max : x); }
-SLANG_FORCE_INLINE void F32_sincos(float f, float& outSin, float& outCos) { outSin = F32_sin(f); outCos = F32_cos(f); }
 
 SLANG_FORCE_INLINE uint32_t F32_asuint(float f) { Union32 u; u.f = f; return u.u; }
 SLANG_FORCE_INLINE int32_t F32_asint(float f) { Union32 u; u.f = f; return u.i; }
 
+// Ternary
+SLANG_FORCE_INLINE float F32_fma(float a, float b, float c) { return ::fmaf(a, b, c); }
+
 // ----------------------------- F64 -----------------------------------------
 
 SLANG_FORCE_INLINE double F64_calcSafeRadians(double radians)
@@ -112,12 +113,16 @@ SLANG_FORCE_INLINE double F64_calcSafeRadians(double radians)
 // Unary
 SLANG_FORCE_INLINE double F64_ceil(double f) { return ::ceil(f); }
 SLANG_FORCE_INLINE double F64_floor(double f) { return ::floor(f); }
+SLANG_FORCE_INLINE double F64_round(double f) { return ::round(f); }
 SLANG_FORCE_INLINE double F64_sin(double f) { return ::sin(f); }
 SLANG_FORCE_INLINE double F64_cos(double f) { return ::cos(f); }
 SLANG_FORCE_INLINE double F64_tan(double f) { return ::tan(f); }
 SLANG_FORCE_INLINE double F64_asin(double f) { return ::asin(f); }
 SLANG_FORCE_INLINE double F64_acos(double f) { return ::acos(f); }
 SLANG_FORCE_INLINE double F64_atan(double f) { return ::atan(f); }
+SLANG_FORCE_INLINE double F64_sinh(double f) { return ::sinh(f); }
+SLANG_FORCE_INLINE double F64_cosh(double f) { return ::cosh(f); }
+SLANG_FORCE_INLINE double F64_tanh(double f) { return ::tanh(f); }
 SLANG_FORCE_INLINE double F64_log2(double f) { return ::log2(f); }
 SLANG_FORCE_INLINE double F64_log(double f) { return ::log(f); }
 SLANG_FORCE_INLINE double F64_log10(float f) { return ::log10(f); }
@@ -127,38 +132,32 @@ SLANG_FORCE_INLINE double F64_abs(double f) { return ::fabs(f); }
 SLANG_FORCE_INLINE double F64_trunc(double f) { return ::trunc(f); }
 SLANG_FORCE_INLINE double F64_sqrt(double f) { return ::sqrt(f); }
 SLANG_FORCE_INLINE double F64_rsqrt(double f) { return 1.0 / F64_sqrt(f); }
-SLANG_FORCE_INLINE double F64_rcp(double f) { return 1.0 / f; }
 SLANG_FORCE_INLINE double F64_sign(double f) { return (f == 0.0) ? f : ((f < 0.0) ? -1.0 : 1.0); }
-SLANG_FORCE_INLINE double F64_saturate(double f) { return (f < 0.0) ? 0.0 : (f > 1.0) ? 1.0 : f; }
 SLANG_FORCE_INLINE double F64_frac(double f) { return f - F64_floor(f); }
-SLANG_FORCE_INLINE double F64_radians(double f) { return f * 0.01745329222; }
 
 SLANG_FORCE_INLINE bool F64_isnan(double f) { return isnan(f); }
 SLANG_FORCE_INLINE bool F64_isfinite(double f) { return isfinite(f); }
 SLANG_FORCE_INLINE bool F64_isinf(double f) { return isinf(f); }
 
 // Binary
-SLANG_FORCE_INLINE double F64_min(double a, double b) { return a < b ? a : b; }
-SLANG_FORCE_INLINE double F64_max(double a, double b) { return a > b ? a : b; }
+SLANG_FORCE_INLINE double F64_min(double a, double b) { return ::fmin(a, b); }
+SLANG_FORCE_INLINE double F64_max(double a, double b) { return ::fmax(a, b); }
 SLANG_FORCE_INLINE double F64_pow(double a, double b) { return ::pow(a, b); }
 SLANG_FORCE_INLINE double F64_fmod(double a, double b) { return ::fmod(a, b); }
 SLANG_FORCE_INLINE double F64_remainder(double a, double b) { return ::remainder(a, b); }
-SLANG_FORCE_INLINE double F64_step(double a, double b) { return double(b >= a); }
 SLANG_FORCE_INLINE double F64_atan2(double a, double b) { return ::atan2(a, b); }
 
-// TODO(JS):
-// Note C++ has ldexp, but it takes an integer for the exponent, it seems HLSL takes both as float
-SLANG_FORCE_INLINE double F64_ldexp(double m, double e) { return m * ::pow(2.0, e); }
-
-// Ternary
-SLANG_FORCE_INLINE double F64_smoothstep(double min, double max, double x)
-{
-    const double t = x < min ? 0.0 : ((x > max) ? 1.0 : (x - min) / (max - min));
-    return t * t * (3.0 - 2.0 * t);
+SLANG_FORCE_INLINE double F64_frexp(double x, double& e)
+{
+    int ei;
+    double m = ::frexp(x, &ei);
+    e = ei;
+    return m;
+}
+SLANG_FORCE_INLINE double F64_modf(double x, double& ip)
+{
+    return ::modf(x, &ip);
 }
-SLANG_FORCE_INLINE double F64_lerp(double x, double y, double s) { return x + s * (y - x); }
-SLANG_FORCE_INLINE double F64_clamp(double x, double min, double max) { return (x < min) ? min : ((x > max) ? max : x); }
-SLANG_FORCE_INLINE void F64_sincos(double f, double& outSin, double& outCos) { outSin = F64_sin(f); outCos = F64_cos(f); }
 
 SLANG_FORCE_INLINE void F64_asuint(double d, uint32_t& low, uint32_t& hi)
 {
@@ -176,15 +175,16 @@ SLANG_FORCE_INLINE void F64_asint(double d, int32_t& low, int32_t& hi)
     hi = int32_t(u.u >> 32);
 }
 
+// Ternary
+SLANG_FORCE_INLINE double F64_fma(double a, double b, double c) { return ::fma(a, b, c); }
+
 // ----------------------------- I32 -----------------------------------------
 
 SLANG_FORCE_INLINE int32_t I32_abs(int32_t f) { return (f < 0) ? -f : f; }
 
 SLANG_FORCE_INLINE int32_t I32_min(int32_t a, int32_t b) { return a < b ? a : b; }
 SLANG_FORCE_INLINE int32_t I32_max(int32_t a, int32_t b) { return a > b ? a : b; }
 
-SLANG_FORCE_INLINE int32_t I32_clamp(int32_t x, int32_t min, int32_t max) { return ( x < min) ? min : ((x > max) ? max : x); }
-
 SLANG_FORCE_INLINE float I32_asfloat(int32_t x) { Union32 u; u.i = x; return u.f; }
 SLANG_FORCE_INLINE uint32_t I32_asuint(int32_t x) { return uint32_t(x); }
 SLANG_FORCE_INLINE double I32_asdouble(int32_t low, int32_t hi )
@@ -201,8 +201,6 @@ SLANG_FORCE_INLINE uint32_t U32_abs(uint32_t f) { return f; }
 SLANG_FORCE_INLINE uint32_t U32_min(uint32_t a, uint32_t b) { return a < b ? a : b; }
 SLANG_FORCE_INLINE uint32_t U32_max(uint32_t a, uint32_t b) { return a > b ? a : b; }
 
-SLANG_FORCE_INLINE uint32_t U32_clamp(uint32_t x, uint32_t min, uint32_t max) { return ( x < min) ? min : ((x > max) ? max : x); }
-
 SLANG_FORCE_INLINE float U32_asfloat(uint32_t x) { Union32 u; u.u = x; return u.f; }
 SLANG_FORCE_INLINE uint32_t U32_asint(int32_t x) { return uint32_t(x); }
 
@@ -238,8 +236,6 @@ SLANG_FORCE_INLINE uint64_t U64_abs(uint64_t f) { return f; }
 SLANG_FORCE_INLINE uint64_t U64_min(uint64_t a, uint64_t b) { return a < b ? a : b; }
 SLANG_FORCE_INLINE uint64_t U64_max(uint64_t a, uint64_t b) { return a > b ? a : b; }
 
-SLANG_FORCE_INLINE uint64_t U64_clamp(uint64_t x, uint64_t min, uint64_t max) { return ( x < min) ? min : ((x > max) ? max : x); }
-
 SLANG_FORCE_INLINE uint32_t U64_countbits(uint64_t v)
 {
 #if SLANG_GCC_FAMILY
@@ -264,8 +260,6 @@ SLANG_FORCE_INLINE int64_t I64_abs(int64_t f) { return (f < 0) ? -f : f; }
 SLANG_FORCE_INLINE int64_t I64_min(int64_t a, int64_t b) { return a < b ? a : b; }
 SLANG_FORCE_INLINE int64_t I64_max(int64_t a, int64_t b) { return a > b ? a : b; }
 
-SLANG_FORCE_INLINE int64_t I64_clamp(int64_t x, int64_t min, int64_t max) { return ( x < min) ? min : ((x > max) ? max : x); }
-
 #ifdef SLANG_PRELUDE_NAMESPACE
 }
 #endif
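
The `frexp` and `modf` wrappers added in this file report their second result through a floating-point reference rather than the `int*` or pointer argument the C library uses. The sketch below restates that pattern outside the prelude so it can be tried in isolation; `frexpFloatExponent` and `modfParts` are hypothetical names, not functions from the commit, and `SLANG_FORCE_INLINE` is replaced with plain `inline`.

```cpp
#include <cmath>

// Stand-in for the F32_frexp pattern: the C library writes the exponent
// as an int, and the wrapper converts it to float for the caller.
inline float frexpFloatExponent(float x, float& e)
{
    int ei = 0;
    float m = std::frexp(x, &ei);
    e = float(ei);
    return m;
}

// Stand-in for the F32_modf pattern: returns the fractional part and
// stores the integral part through the reference.
inline float modfParts(float x, float& ip)
{
    return std::modf(x, &ip);
}
```

For `x = 8.0f`, `frexpFloatExponent` returns a mantissa of `0.5f` with an exponent of `4.0f`, since 8 = 0.5 * 2^4. For `x = 3.25f`, `modfParts` returns `0.25f` and stores `3.0f`.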
