Skip to content

Conversation

the-horo
Copy link
Contributor

The cmake files already check the triple for musl and that is a more reliable check compared to looking for Alpine's apk.

The cmake files already check the triple for musl and that is a more
reliable check compared to looking for Alpine's apk.

Signed-off-by: Andrei Horodniceanu <a.horodniceanu@proton.me>
Signed-off-by: Andrei Horodniceanu <a.horodniceanu@proton.me>
@the-horo the-horo changed the title [druntime-test]: exceptions: Drop musl auto-detection [druntime-test]: Drop musl auto-detection Aug 18, 2025
@the-horo the-horo marked this pull request as draft August 20, 2025 17:37
@the-horo
Copy link
Contributor Author

Alright, some failures that I've seen locally as well. Let me try to fix them here and then backport upstream

@the-horo the-horo changed the title [druntime-test]: Drop musl auto-detection [druntime-test]: Fix glibc-aarch64 skipped tests Aug 21, 2025
@the-horo the-horo marked this pull request as ready for review August 21, 2025 20:38
@kinke
Copy link
Member

kinke commented Aug 22, 2025

Thx for going down yet another rabbit hole! :) - The posix fixes will need to be upstreamed.

Weird that one CircleCI job fails:

Testing memoryerror_null_read
/home/circleci/build/runtime/druntime-test-exceptions-debug/memoryerror_null_read  2>/home/circleci/build/runtime/druntime-test-exceptions-debug/memoryerror_null_read.stderr || true
Searched for pattern "segmentation fault: null pointer read/write operation", NEGATE = 
Segmentation fault (core dumped)

I think the config is pretty close to the Cirrus Ubuntu rolling x64 shared-libs-only gdmd job.

@@ -110,7 +110,9 @@
#include <linux/io_uring.h>
#include <linux/eventpoll.h>
#endif
#if !(defined(__linux__) && defined(__aarch64__)) // /usr/include/bits/math-vector.h(162): Error: undefined identifier `__Float32x4_t`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you haven't by chance spotted some macro that we could define (e.g., in importc.h) , to avoid these apparent type extensions? Not being able to include math.h on AArch64 is pretty bad IMO.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last time I touched compilable/stdcheaders which suffered from the same issue I didn't find anything. Maybe it's worth for me to take another look, since it's so easy to hit

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a RPi 5 now, and with vanilla Raspbian 12 (incl. gcc-12), math.h can be included just fine. I don't have a /usr/include/bits dir there...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, the include-flow (for me) is like:

// math.h

#include <bits/libc-header-start.h> /* defines __GNUC_PREREQ */
// ...
#include <bits/math-vector.h>
// bits/math-vector.h

#if __GNUC_PREREQ(9, 0)
#  define __ADVSIMD_VEC_MATH_SUPPORTED
typedef __Float32x4_t __f32x4_t;
typedef __Float64x2_t __f64x2_t;
#elif __glibc_clang_prereq(8, 0)
#  define __ADVSIMD_VEC_MATH_SUPPORTED
typedef __attribute__ ((__neon_vector_type__ (4))) float __f32x4_t;
typedef __attribute__ ((__neon_vector_type__ (2))) double __f64x2_t;
#endif

Getting out of this I think is only possible with either:

  1. #undef __GNUC__ or #undef __GNUC_MINOR__ (this would make the __GNUC_PREREQ fail)
  2. define __Float32x4_t

features.h defines __GNUC_PREREQ as:

#if defined __GNUC__ && defined __GNUC_MINOR__
# define __GNUC_PREREQ(maj, min) \
        ((__GNUC__ << 16) + __GNUC_MINOR__ >= ((maj) << 16) + (min))
#else
# define __GNUC_PREREQ(maj, min) 0
#endif

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mhm; in my case, /usr/include/aarch64-linux-gnu/bits/math-vector.h boils down to a:

#include <bits/libm-simd-decl-stubs.h>

A potential avenue to explore would be to define __Float32x4_t and __Float64x2_t in __importc_builtins.di - as aliases to the D vector types. That might work with gcc then; I can't think of how we'd be able to support clang's __attribute__ ((__neon_vector_type__ (4))) float - unless we'd go ahead and try to fake an older gcc/clang version (< v9/v8 apparently) on Linux AArch64 only.

Copy link
Member

@kinke kinke Aug 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay that's pretty good news! So if that clang typedef 'works' for importC, I guess it strips all __attribute__ stuff there and treats it as a scalar float/double. alias __Float32x4_t = __vector(float[4]) (in __importc_builtins.di, either in general or just for AArch64) would probably be correct, and alias __Float64x2_t = __vector(double[2]). No idea about the 3 __SV* types.

Edit: According to godbolt, I guess an empty struct dummy is okay:

<source>:3:1: error: SVE type '__SVBool_t' does not have a fixed size
    3 | __SVBool_t a; 
<source>:4:1: error: SVE type '__SVFloat32_t' does not have a fixed size
    4 | __SVFloat32_t b;
<source>:5:1: error: SVE type '__SVFloat64_t' does not have a fixed size
    5 | __SVFloat64_t c;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The __Float??x? aliases seem to work:

dlang: arm64-desk-s /tmp/x # cat d.d
import importc_math;
import std.math;
void main () {
	__f64x2_t x;
	x[0] = PI * 0;
	x[1] = PI * 0.5;

	auto r = _ZGVnN2v_cos(x);
	import std.stdio;
	writeln(r);
}
dlang: arm64-desk-s /tmp/x # cat importc_math.c
#include <math.h>
dlang: arm64-desk-s /tmp/x # ~/build/bin/ldc2 -run d.d
[1, 6.12323e-17]

The only issue that I hit was that the vector types are not indexable in C:

dlang: arm64-desk-s /tmp/x # cat c.c
#include <math.h>
#include <stdio.h>
int main () {
	__f32x4_t x;
	x[0] = M_PI;

	__f32x4_t r = _ZGVnN4v_cosf(x);

	printf("%f\n", r[0]);
}
dlang: arm64-desk-s /tmp/x # ~/build/bin/ldc2 -o- c.c
c.c(5): Error: can only `*` a pointer, not a `__vector(float[4])`
 x[0] = M_PI;
  ^
c.c(9): Error: can only `*` a pointer, not a `__vector(float[4])`
 printf("%f\n", r[0]);
                 ^

I don't personally have any issue with this though as I only value importC, not compileC.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't personally have any issue with this though as I only value importC, not compileC.

Exactly my thoughts. Just nice that one can resort to C function prototypes with vector types, passing our float4/double2 types directly. And in case someone was to try to use the SVE types (or functions taking thiose etc.), a dummy empty struct should yield compile errors right away - definitely better than something accidentally compiling, but totally failing at runtime.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid that the SVE types are allowed by the compiler and their usage does lead to runtime errors.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah well but I guess noone feeds such a function with a default-initialized SVE type, so should get a compile error when trying to pass in some vector. Alternatively, a forward-declared struct might work better, assuming that math.h still works then.

@the-horo the-horo force-pushed the druntime-test-fix-musl-check branch 2 times, most recently from da14a88 to 2333e27 Compare August 22, 2025 17:24
…vm-20

Since llvm-20, the stack variables are optimized away leading to an
infinite loop. The affected assembly became:
```
Disassembly of section .text._D25memoryerror_stackoverflow1fFKG1024hZv:

0000000000000000 <_D25memoryerror_stackoverflow1fFKG1024hZv>:
   0:	e9 00 00 00 00       	jmp    5 <_D25memoryerror_stackoverflow1fFKG1024hZv+0x5>

Disassembly of section .text._D25memoryerror_stackoverflow1gFKG1024hZv:

0000000000000000 <_D25memoryerror_stackoverflow1gFKG1024hZv>:
   0:	e9 00 00 00 00       	jmp    5 <_D25memoryerror_stackoverflow1gFKG1024hZv+0x5>
```

Compared to the llvm-19 optimized version:
```
Disassembly of section .text._D25memoryerror_stackoverflow1fFKG1024hZv:

0000000000000000 <_D25memoryerror_stackoverflow1fFKG1024hZv>:
   0:	53                   	push   %rbx
   1:	48 81 ec 00 04 00 00 	sub    $0x400,%rsp
   8:	48 89 fe             	mov    %rdi,%rsi
   b:	48 89 e3             	mov    %rsp,%rbx
   e:	ba 00 04 00 00       	mov    $0x400,%edx
  13:	48 89 df             	mov    %rbx,%rdi
  16:	e8 00 00 00 00       	call   1b <_D25memoryerror_stackoverflow1fFKG1024hZv+0x1b>
  1b:	48 89 df             	mov    %rbx,%rdi
  1e:	e8 00 00 00 00       	call   23 <_D25memoryerror_stackoverflow1fFKG1024hZv+0x23>
  23:	48 81 c4 00 04 00 00 	add    $0x400,%rsp
  2a:	5b                   	pop    %rbx
  2b:	c3                   	ret

Disassembly of section .text._D25memoryerror_stackoverflow1gFKG1024hZv:

0000000000000000 <_D25memoryerror_stackoverflow1gFKG1024hZv>:
   0:	53                   	push   %rbx
   1:	48 81 ec 00 04 00 00 	sub    $0x400,%rsp
   8:	48 89 fe             	mov    %rdi,%rsi
   b:	48 89 e3             	mov    %rsp,%rbx
   e:	ba 00 04 00 00       	mov    $0x400,%edx
  13:	48 89 df             	mov    %rbx,%rdi
  16:	e8 00 00 00 00       	call   1b <_D25memoryerror_stackoverflow1gFKG1024hZv+0x1b>
  1b:	48 89 df             	mov    %rbx,%rdi
  1e:	e8 00 00 00 00       	call   23 <_D25memoryerror_stackoverflow1gFKG1024hZv+0x23>
  23:	48 81 c4 00 04 00 00 	add    $0x400,%rsp
  2a:	5b                   	pop    %rbx
  2b:	c3                   	ret
```

Signed-off-by: Andrei Horodniceanu <a.horodniceanu@proton.me>
Check glibc's sysdeps/unix/sysv/linux/**/bits/types/struct_shmid_ds.h
for correctness. Some arches still look like they have an incorrect
declaration but this, at least, fixes aarch64 which is the more
common one.

Signed-off-by: Andrei Horodniceanu <a.horodniceanu@proton.me>
The definitions I've taken from glibc's
sysdeps/unix/sysv/linux/**/bits/typesizes.h.

Signed-off-by: Andrei Horodniceanu <a.horodniceanu@proton.me>
Signed-off-by: Andrei Horodniceanu <a.horodniceanu@proton.me>
These tests seem to fail with a shared druntime (depending on arch as
well), and work with a static one. CI fails with a static druntime
though.

A gdb backtrace with a shared druntime:
```
Reading symbols from ./a...
(No debugging symbols found in ./a)
(gdb) r
Starting program: /root/build/a
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib64/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x0000005555550afc in D main ()
(gdb) c
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0x0000007ff77432f0 in ?? () from /usr/lib/gcc/aarch64-unknown-linux-gnu/15/libgcc_s.so.1
(gdb) bt
\#0  0x0000007ff77432f0 in ?? () from /usr/lib/gcc/aarch64-unknown-linux-gnu/15/libgcc_s.so.1
\ldc-developers#1  0x0000007ff77563b8 in ?? () from /usr/lib/gcc/aarch64-unknown-linux-gnu/15/libgcc_s.so.1
\ldc-developers#2  0x0000007ff7757000 in _Unwind_Backtrace () from /usr/lib/gcc/aarch64-unknown-linux-gnu/15/libgcc_s.so.1
\ldc-developers#3  0x0000007ff78b2c6c in backtrace () from /usr/lib64/libc.so.6
\ldc-developers#4  0x0000007ff7aee058 in core.lifetime.emplace!(core.runtime.DefaultTraceInfo).emplace(core.runtime.DefaultTraceInfo) ()
   from /usr/lib/ldc2/1.41/lib64/libdruntime-ldc-shared.so.111
\ldc-developers#5  0x0000007ff7aed1f0 in core.runtime.defaultTraceHandler(void*) () from /usr/lib/ldc2/1.41/lib64/libdruntime-ldc-shared.so.111
\ldc-developers#6  0x0000007ff7b09614 in _d_createTrace () from /usr/lib/ldc2/1.41/lib64/libdruntime-ldc-shared.so.111
\ldc-developers#7  0x0000007ff7b0a7a0 in _d_throw_exception () from /usr/lib/ldc2/1.41/lib64/libdruntime-ldc-shared.so.111
\ldc-developers#8  0x0000007ff7acb418 in _d_assert_msg () from /usr/lib/ldc2/1.41/lib64/libdruntime-ldc-shared.so.111
\ldc-developers#9  0x0000005555550d18 in etc.linux.memoryerror.registerMemoryAssertHandler!().registerMemoryAssertHandler()._d_handleSignalAssert(int, core.sys.posix.signal.siginfo_t*, void*) ()
\ldc-developers#10 <signal handler called>
\ldc-developers#11 0x0000005555550afc in D main ()
```

And with a static druntime:
```
Reading symbols from ./a...
(No debugging symbols found in ./a)
(gdb) r
Starting program: /tmp/a
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib64/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x000000555556a4bc in D main ()
(gdb) c
Continuing.
core.exception.AssertError@/usr/lib/ldc2/1.41/include/d/etc/linux/memoryerror.d(415): segmentation fault: null pointer read/write operation
----------------
??:? [0x555557fa2f]
??:? [0x555557f1ab]
??:? [0x5555581d17]
??:? [0x5555570893]
??:? [0x555556b2ab]
??:? [0x555556a6d7]
??:? [0x7ff7ffb797]
??:? [0x555556a4bb]
??:? [0x5555570567]
??:? [0x5555570413]
??:? [0x5555570277]
??:? [0x555556a5a3]
??:? [0x7ff7d02053]
??:? __libc_start_main [0x7ff7d02137]
??:? [0x555556a3af]

Program received signal SIGSEGV, Segmentation fault.
0x0000007ffffffdd0 in ?? ()
```

Signed-off-by: Andrei Horodniceanu <a.horodniceanu@proton.me>
Signed-off-by: Andrei Horodniceanu <a.horodniceanu@proton.me>
…ch64

Signed-off-by: Andrei Horodniceanu <a.horodniceanu@proton.me>
$(ROOT)/memoryerror_%: private extra_ldflags.d += -link-defaultlib-shared=false
endif
endif

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the most controversial part. Whenever something only failed for (some) CI systems, I've tried to remove the test on that CI platform only - which is doable for compiler and druntime/Phobos unittests, but the druntime integration tests with their Makefiles are unfortunately a PITA in that regard. Just so that the tests are all included in a normal test run.

It might pay off to investigate further, incl. checking whether we can disable based on the BUILD var (debug/release).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

which is doable for compiler and druntime/Phobos unittests, but the druntime integration tests with their Makefiles are unfortunately a PITA in that regard

Adding something akin to:

TESTS := $(filter-out $(LDC_SKIP_DRUNTIME_TESTS),$(TESTS))

in common.mak, alongside setting that environment variable when ctest -E 'ldc2-unittest|lit-tests|dmd-testsuite' is run should work. I can't test right now though.

It might pay off to investigate further, incl. checking whether we can disable based on the BUILD var (debug/release).

Yeah, I agree that we shouldn't skip tests on all runners simply because 1 misbehaves. Yet debugging in online CI is a pretty miserable experience

Based on dlang/dmd#21779. For LDC, always
define IS_MUSL externally, not just to `1` for musl targets.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants