Commit 98cab6f

Conform non-suffixed integer literals (#5717)
* Make non-suffixed integer literal type resolution conform to C
* Update integer literal tests
* Clean up integer literal implementation a bit
* Update docs on integer literals
* Clean up docs update
* Clean up docs update
* Add comment on INT64_MIN edge case
* Fixed failing test, fixed formatting and cleaned up code

---------

Co-authored-by: Yong He <yonghe@outlook.com>
1 parent 600cce2 commit 98cab6f

11 files changed, +219 -61 lines changed

docs/64bit-type-support.md

+31 -10
@@ -5,7 +5,8 @@ Slang 64-bit Type Support
 
 * Not all targets support 64 bit types, or all 64 bit types
 * 64 bit integers generally require later APIs/shader models
-* When specifying 64 bit literals *always* use the type suffixes (ie `L`, `ULL`, `LL`)
+* When specifying 64 bit floating-point literals *always* use the type suffixes (ie `L`)
+* An integer literal will be interpreted as 64 bits if it cannot fit in a 32 bit value.
 * GPU target/s generally do not support all double intrinsics
 * Typically missing are trascendentals (sin, cos etc), logarithm and exponential functions
 * CUDA is the exception supporting nearly all double intrinsics
@@ -28,7 +29,7 @@ This also applies to vector and matrix versions of these types.
 
 Unfortunately if a specific target supports the type or the typical HLSL intrinsic functions (such as sin/cos/max/min etc) depends very much on the target.
 
-Special attention has to be made with respect to literal 64 bit types. By default float and integer literals if they do not have an explicit suffix are assumed to be 32 bit. There is a variety of reasons for this design choice - the main one being around by default behavior of getting good performance. The suffixes required for 64 bit types are as follows
+Special attention has to be made with respect to literal 64 bit types. By default float literals if they do not have an explicit suffix are assumed to be 32 bit. There is a variety of reasons for this design choice - the main one being around by default behavior of getting good performance. The suffixes required for 64 bit types are as follows
 
 ```
 // double - 'l' or 'L'
@@ -40,27 +41,47 @@ double b = 1.34e-200;
 // int64_t - 'll' or 'LL' (or combination of upper/lower)
 
 int64_t c = -5436365345345234ll;
-// WRONG!: This is the same as d = int64_t(int32_t(-5436365345345234)) which means d ! = -5436365345345234LL.
-// Will produce a warning.
-int64_t d = -5436365345345234;
 
 int64_t e = ~0LL; // Same as 0xffffffffffffffff
-// Does produce the same result as 'e' because equivalent int64_t(~int32_t(0))
-int64_t f = ~0;
 
 // uint64_t - 'ull' or 'ULL' (or combination of upper/lower)
 
 uint64_t g = 0x8000000000000000ull;
-// WRONG!: This is the same as h = uint64_t(uint32_t(0x8000000000000000)) which means h = 0
-// Will produce a warning.
-uint64_t h = 0x8000000000000000u;
 
 uint64_t i = ~0ull; // Same as 0xffffffffffffffff
 uint64_t j = ~0; // Equivalent to 'i' because uint64_t(int64_t(~int32_t(0)));
 ```
 
 These issues are discussed more on issue [#1185](https://github.com/shader-slang/slang/issues/1185)
 
+The type of a decimal non-suffixed integer literal is the first integer type from the list [`int`, `int64_t`]
+which can represent the specified literal value. If the value cannot fit, the literal is represented as an `uint64_t`
+and a warning is given.
+The type of a hexadecimal non-suffixed integer literal is the first type from the list [`int`, `uint`, `int64_t`, `uint64_t`]
+that can represent the specified literal value. A non-suffixed integer literal will be 64 bit if it cannot fit in 32 bits.
+```
+// Same as int64_t a = int(1), the value can fit into a 32 bit integer.
+int64_t a = 1;
+
+// Same as int64_t b = int64_t(2147483648), the value cannot fit into a 32 bit integer.
+int64_t b = 2147483648;
+
+// Same as int64_t c = uint64_t(18446744073709551615), the value is larger than the maximum value of a signed 64 bit
+// integer, and is interpreted as an unsigned 64 bit integer. Warning is given.
+uint64_t c = 18446744073709551615;
+
+// Same as uint64_t = int(0x7FFFFFFF), the value can fit into a 32 bit integer.
+uint64_t d = 0x7FFFFFFF;
+
+// Same as uint64_t = int64_t(0x7FFFFFFFFFFFFFFF), the value cannot fit into an unsigned 32 bit integer but
+// can fit into a signed 64 bit integer.
+uint64_t e = 0x7FFFFFFFFFFFFFFF;
+
+// Same as uint64_t = uint64_t(0xFFFFFFFFFFFFFFFF), the value cannot fit into a signed 64 bit integer, and
+// is interpreted as an unsigned 64 bit integer.
+uint64_t f = 0xFFFFFFFFFFFFFFFF;
+```
+
 Double support
 ==============
 

docs/user-guide/02-conventional-features.md

+5 -2
@@ -39,8 +39,11 @@ The following integer types are provided:
 
 All targets support the 32-bit `int` and `uint` types, but support for the other types depends on the capabilities of each target platform.
 
-Integer literals can be both decimal and hexadecimal, and default to the `int` type.
-A literal can be explicitly made unsigned with a `u` suffix.
+Integer literals can be both decimal and hexadecimal. An integer literal can be explicitly made unsigned
+with a `u` suffix, and explicitly made 64-bit with the `ll` suffix. The type of a decimal non-suffixed integer literal is the first integer type from
+the list [`int`, `int64_t`] which can represent the specified literal value. If the value cannot fit, the literal is represented as
+an `uint64_t` and a warning is given. The type of hexadecimal non-suffixed integer literal is the first type from the list
+[`int`, `uint`, `int64_t`, `uint64_t`] that can represent the specified literal value. For more information on 64 bit integer literals see the documentation on [64 bit type support](../64bit-type-support.md).
 
 The following floating-point type are provided:
 
source/compiler-core/slang-lexer.cpp

+9 -1
@@ -673,7 +673,10 @@ static int _readOptionalBase(char const** ioCursor)
 }
 
 
-IntegerLiteralValue getIntegerLiteralValue(Token const& token, UnownedStringSlice* outSuffix)
+IntegerLiteralValue getIntegerLiteralValue(
+    Token const& token,
+    UnownedStringSlice* outSuffix,
+    bool* outIsDecimalBase)
 {
     IntegerLiteralValue value = 0;
 
@@ -698,6 +701,11 @@ IntegerLiteralValue getIntegerLiteralValue(Token const& token, UnownedStringSlic
         *outSuffix = UnownedStringSlice(cursor, end);
     }
 
+    if (outIsDecimalBase)
+    {
+        *outIsDecimalBase = (base == 10);
+    }
+
     return value;
 }
 
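
The lexer change above only adds a way to report whether the literal was written in base 10. As a rough illustration of that idea, here is a simplified, hypothetical stand-in: `ParsedLiteral` and `parseLiteralDigits` are names made up for this sketch and are not part of the Slang code base. It handles only decimal and `0x` hexadecimal, ignores overflow, and returns the flag in a struct rather than through an optional out-parameter.

```cpp
#include <cstdint>

// Hypothetical result type for this sketch (not a Slang type).
struct ParsedLiteral
{
    std::uint64_t value;  // magnitude of the digits that were read
    bool isDecimalBase;   // true when no "0x"/"0X" prefix was present
};

// Reads the digits of an integer literal and stops at the first character
// that is not a digit in the detected base (the start of a suffix such as
// 'u' or "ll"). Assumption: only base 10 and base 16 are recognised here.
constexpr ParsedLiteral parseLiteralDigits(const char* text)
{
    int base = 10;
    if (text[0] == '0' && (text[1] == 'x' || text[1] == 'X'))
    {
        base = 16;
        text += 2;
    }

    std::uint64_t value = 0;
    for (; *text != '\0'; ++text)
    {
        const char c = *text;
        int digit = -1;
        if (c >= '0' && c <= '9')
            digit = c - '0';
        else if (c >= 'a' && c <= 'f')
            digit = 10 + (c - 'a');
        else if (c >= 'A' && c <= 'F')
            digit = 10 + (c - 'A');

        if (digit < 0 || digit >= base)
            break; // not a digit in this base: treat the rest as a suffix

        value = value * static_cast<std::uint64_t>(base) +
                static_cast<std::uint64_t>(digit);
    }
    return {value, base == 10};
}
```

The parser needs the base flag because the decimal and hexadecimal candidate type lists differ, so the same magnitude can resolve to different types depending on how it was written.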

source/compiler-core/slang-lexer.h

+4 -1
@@ -172,7 +172,10 @@ String getFileNameTokenValue(Token const& token);
 typedef int64_t IntegerLiteralValue;
 typedef double FloatingPointLiteralValue;
 
-IntegerLiteralValue getIntegerLiteralValue(Token const& token, UnownedStringSlice* outSuffix = 0);
+IntegerLiteralValue getIntegerLiteralValue(
+    Token const& token,
+    UnownedStringSlice* outSuffix = 0,
+    bool* outIsDecimalBase = 0);
 FloatingPointLiteralValue getFloatingPointLiteralValue(
     Token const& token,
     UnownedStringSlice* outSuffix = 0);

source/slang/slang-diagnostic-defs.h

+6
@@ -1574,6 +1574,12 @@ DIAGNOSTIC(
     Error,
     invalidFloatingPointLiteralSuffix,
     "invalid suffix '$0' on floating-point literal")
+DIAGNOSTIC(
+    39999,
+    Warning,
+    integerLiteralTooLarge,
+    "integer literal is too large to be represented in a signed integer type, interpreting as "
+    "unsigned")
 
 DIAGNOSTIC(
     39999,

source/slang/slang-parser.cpp

+73 -10
@@ -3136,8 +3136,7 @@ static Modifier* ParseSemantic(Parser* parser)
         BitFieldModifier* bitWidthMod = parser->astBuilder->create<BitFieldModifier>();
         parser->FillPosition(bitWidthMod);
         const auto token = parser->tokenReader.advanceToken();
-        UnownedStringSlice suffix;
-        bitWidthMod->width = getIntegerLiteralValue(token, &suffix);
+        bitWidthMod->width = getIntegerLiteralValue(token);
         return bitWidthMod;
     }
     else if (parser->LookAheadToken(TokenType::CompletionRequest))
@@ -6638,6 +6637,64 @@ static IntegerLiteralValue _fixIntegerLiteral(
     return value;
 }
 
+static BaseType _determineNonSuffixedIntegerLiteralType(
+    IntegerLiteralValue value,
+    bool isDecimalBase,
+    Token* token,
+    DiagnosticSink* sink)
+{
+    const uint64_t rawValue = (uint64_t)value;
+
+    /// Non-suffixed integer literal types
+    ///
+    /// The type is the first from the following list in which the value can fit:
+    /// - For decimal bases:
+    /// - `int`
+    /// - `int64_t`
+    /// - For non-decimal bases:
+    /// - `int`
+    /// - `uint`
+    /// - `int64_t`
+    /// - `uint64_t`
+    ///
+    /// The lexer scans the negative(-) part of literal separately, and the value part here
+    /// is always positive hence it is sufficient to only compare with the maximum limits.
+    BaseType baseType;
+    if (rawValue <= INT32_MAX)
+    {
+        baseType = BaseType::Int;
+    }
+    else if ((rawValue <= UINT32_MAX) && !isDecimalBase)
+    {
+        baseType = BaseType::UInt;
+    }
+    else if (rawValue <= INT64_MAX)
+    {
+        baseType = BaseType::Int64;
+    }
+    else
+    {
+        baseType = BaseType::UInt64;
+
+        if (isDecimalBase)
+        {
+            // There is an edge case here where 9223372036854775808 or INT64_MAX + 1
+            // brings us here, but the complete literal is -9223372036854775808 or INT64_MIN and is
+            // valid. Unfortunately because the lexer handles the negative(-) part of the literal
+            // separately it is impossible to know whether the literal has a negative sign or not.
+            // We emit the warning and initially process it as a uint64 anyways, and the negative
+            // sign will be properly parsed and the value will still be properly stored as a
+            // negative INT64_MIN.
+
+            // Decimal integer is too large to be represented as signed.
+            // Output warning that it is represented as unsigned instead.
+            sink->diagnose(*token, Diagnostics::integerLiteralTooLarge);
+        }
+    }
+
+    return baseType;
+}
+
 static bool _isCast(Parser* parser, Expr* expr)
 {
     if (as<PointerTypeExpr>(expr))
@@ -6925,20 +6982,18 @@ static Expr* parseAtomicExpr(Parser* parser)
         constExpr->token = token;
 
         UnownedStringSlice suffix;
-        IntegerLiteralValue value = getIntegerLiteralValue(token, &suffix);
+        bool isDecimalBase;
+        IntegerLiteralValue value = getIntegerLiteralValue(token, &suffix, &isDecimalBase);
 
         // Look at any suffix on the value
         char const* suffixCursor = suffix.begin();
         const char* const suffixEnd = suffix.end();
+        const bool suffixExists = (suffixCursor != suffixEnd);
 
-        // If no suffix is defined go with the default
-        BaseType suffixBaseType = BaseType::Int;
-
-        if (suffixCursor < suffixEnd)
+        // Mark as void, taken as an error
+        BaseType suffixBaseType = BaseType::Void;
+        if (suffixExists)
         {
-            // Mark as void, taken as an error
-            suffixBaseType = BaseType::Void;
-
             int lCount = 0;
             int uCount = 0;
             int zCount = 0;
@@ -7008,6 +7063,14 @@ static Expr* parseAtomicExpr(Parser* parser)
                 suffixBaseType = BaseType::Int;
             }
         }
+        else
+        {
+            suffixBaseType = _determineNonSuffixedIntegerLiteralType(
+                value,
+                isDecimalBase,
+                &token,
+                parser->sink);
+        }
 
         value = _fixIntegerLiteral(suffixBaseType, value, &token, parser->sink);
 
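
The selection rule added in `_determineNonSuffixedIntegerLiteralType` can be tried outside the compiler. The following is a minimal standalone sketch of the same comparison chain, not the Slang implementation: `LiteralType`, `LiteralResolution` and `resolveNonSuffixedLiteral` are hypothetical names used here in place of `BaseType`, the diagnostic sink, and the real function.

```cpp
#include <cstdint>

// Hypothetical stand-in for the four BaseType values the rule can produce.
enum class LiteralType { Int, UInt, Int64, UInt64 };

// Hypothetical result: the chosen type plus whether the
// "too large for a signed type" warning would be raised.
struct LiteralResolution
{
    LiteralType type;
    bool warnTooLarge;
};

// rawValue is the magnitude of the literal; any leading '-' is a separate
// token and is not seen here, so only upper bounds need to be checked.
constexpr LiteralResolution resolveNonSuffixedLiteral(std::uint64_t rawValue, bool isDecimalBase)
{
    if (rawValue <= static_cast<std::uint64_t>(INT32_MAX))
        return {LiteralType::Int, false};

    // `uint` is a candidate only for non-decimal (e.g. hexadecimal) literals.
    if (!isDecimalBase && rawValue <= static_cast<std::uint64_t>(UINT32_MAX))
        return {LiteralType::UInt, false};

    if (rawValue <= static_cast<std::uint64_t>(INT64_MAX))
        return {LiteralType::Int64, false};

    // Falls through to uint64_t; only decimal literals trigger the warning.
    return {LiteralType::UInt64, isDecimalBase};
}
```

With this sketch, `0x80000000` resolves to `UInt` while decimal `2147483648` resolves to `Int64`, which is the difference between the two candidate lists described in the docs change.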

tests/diagnostics/int-literal.slang

+18 -7
@@ -2,8 +2,8 @@
 
 int doSomething(int a)
 {
-    // Warning can't fit
-    int c0 = 0x800000000;
+    // No warning, literal will be interpreted as 64 bit.
+    uint64_t c0 = 0x800000000;
 
     // No warning as top bits are just ignored
     int c1 = -1ll;
@@ -13,19 +13,30 @@ int doSomething(int a)
     // Should sign extend
     int c3 = 0x80000000;
 
-    // Should give a warning (ideally including the preceeding -)
-    // Currently we don't have the -, because the lexer lexes - independently
-    int c4 = -0xfffffffff;
+    // No warning, hex literal will be interpreted as an unsigned 64 integer then signed with negative operator.
+    int64_t c4 = -0xfffffffff;
 
-    //
-    a += c0 + c1 + c2;
+    a += (int)c0 + c1 + c2;
 
     int64_t b = 0;
 
     // Ok
     b += 0x800000000ll;
 
     uint64_t c5 = -2ull;
+
+    // Warning, integer literal is too large for signed 64 bit, must be interpreted as unsigned.
+    uint64_t d0 = 18446744073709551615;
+
+    // Warning, integer literal is too small for signed 64 bit, must be interpreted as unsigned.
+    uint64_t d1 = -9223372036854775809;
+
+    // This is INT64_MIN and valid negative signed integer, but warning will be emitted as negative(-) is scanned
+    // separately in the lexer, and the positive literal portion will emit a warning.
+    // The final value will still be correctly set as INT64_MIN.
+    //
+    // To not have this warning the lexer must scan the negative operator and number together.
+    uint64_t d2 = -9223372036854775808;
 
     return a + int(b);
 }

@@ -1,11 +1,14 @@
 result code = 0
 standard error = {
-tests/diagnostics/int-literal.slang(6): warning 39999: integer literal '0x800000000' too large for type 'int' truncated to '0'
-int c0 = 0x800000000;
-^~~~~~~~~~~
-tests/diagnostics/int-literal.slang(18): warning 39999: integer literal '0xfffffffff' too large for type 'int' truncated to '-1'
-int c4 = -0xfffffffff;
-^~~~~~~~~~~
+tests/diagnostics/int-literal.slang(29): warning 39999: integer literal is too large to be represented in a signed integer type, interpreting as unsigned
+uint64_t d0 = 18446744073709551615;
+^~~~~~~~~~~~~~~~~~~~
+tests/diagnostics/int-literal.slang(32): warning 39999: integer literal is too large to be represented in a signed integer type, interpreting as unsigned
+uint64_t d1 = -9223372036854775809;
+^~~~~~~~~~~~~~~~~~~
+tests/diagnostics/int-literal.slang(39): warning 39999: integer literal is too large to be represented in a signed integer type, interpreting as unsigned
+uint64_t d2 = -9223372036854775808;
+^~~~~~~~~~~~~~~~~~~
 }
 standard output = {
 }
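
In the expected output above, the carets for `d1` and `d2` sit under the digits only: the warning is raised for the positive magnitude, before the leading `-` is applied. The sketch below shows why the `d2` case can still end up as INT64_MIN. It assumes that negating the unsigned 64-bit magnitude wraps modulo 2^64, as unary minus on a `uint64_t` does in C++; `negateMagnitude` is a hypothetical helper for this illustration, not a Slang function.

```cpp
#include <cstdint>

// Negates an unsigned 64-bit magnitude with wraparound. Unsigned arithmetic
// in C++ is defined modulo 2^64, so this never overflows.
constexpr std::uint64_t negateMagnitude(std::uint64_t magnitude)
{
    return std::uint64_t{0} - magnitude;
}

// 9223372036854775808 is INT64_MAX + 1; negating it yields the bit pattern
// 0x8000000000000000, which is the two's complement encoding of INT64_MIN.
static_assert(
    negateMagnitude(9223372036854775808ULL) == static_cast<std::uint64_t>(INT64_MIN),
    "-(INT64_MAX + 1) has the bit pattern of INT64_MIN");
```

Under the same assumption, the magnitude in the `d1` case (9223372036854775809) would wrap to `0x7FFFFFFFFFFFFFFF` instead of a negative value.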
