-
-
Notifications
You must be signed in to change notification settings - Fork 268
[druntime-test]: Fix glibc-aarch64 skipped tests #4971
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
[druntime-test]: Fix glibc-aarch64 skipped tests #4971
Conversation
The cmake files already check the triple for musl and that is a more reliable check compared to looking for Alpine's apk. Signed-off-by: Andrei Horodniceanu <a.horodniceanu@proton.me>
Signed-off-by: Andrei Horodniceanu <a.horodniceanu@proton.me>
Alright, some failures that I've seen locally as well. Let me try to fix them here and then backport upstream |
Thx for going down yet another rabbit hole! :) - The posix fixes will need to be upstreamed. Weird that one CircleCI job fails:
I think the config is pretty close to the Cirrus |
@@ -110,7 +110,9 @@ | |||
#include <linux/io_uring.h> | |||
#include <linux/eventpoll.h> | |||
#endif | |||
#if !(defined(__linux__) && defined(__aarch64__)) // /usr/include/bits/math-vector.h(162): Error: undefined identifier `__Float32x4_t` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess you haven't by chance spotted some macro that we could define (e.g., in importc.h
) , to avoid these apparent type extensions? Not being able to include math.h
on AArch64 is pretty bad IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Last time I touched compilable/stdcheaders which suffered from the same issue I didn't find anything. Maybe it's worth for me to take another look, since it's so easy to hit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a RPi 5 now, and with vanilla Raspbian 12 (incl. gcc-12), math.h
can be included just fine. I don't have a /usr/include/bits
dir there...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, the include-flow (for me) is like:
// math.h
#include <bits/libc-header-start.h> /* defines __GNUC_PREREQ */
// ...
#include <bits/math-vector.h>
// bits/math-vector.h
#if __GNUC_PREREQ(9, 0)
# define __ADVSIMD_VEC_MATH_SUPPORTED
typedef __Float32x4_t __f32x4_t;
typedef __Float64x2_t __f64x2_t;
#elif __glibc_clang_prereq(8, 0)
# define __ADVSIMD_VEC_MATH_SUPPORTED
typedef __attribute__ ((__neon_vector_type__ (4))) float __f32x4_t;
typedef __attribute__ ((__neon_vector_type__ (2))) double __f64x2_t;
#endif
Getting out of this I think is only possible with either:
#undef __GNUC__
or#undef __GNUC_MINOR__
(this would make the__GNUC_PREREQ
fail)- define
__Float32x4_t
features.h
defines __GNUC_PREREQ
as:
#if defined __GNUC__ && defined __GNUC_MINOR__
# define __GNUC_PREREQ(maj, min) \
((__GNUC__ << 16) + __GNUC_MINOR__ >= ((maj) << 16) + (min))
#else
# define __GNUC_PREREQ(maj, min) 0
#endif
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mhm; in my case, /usr/include/aarch64-linux-gnu/bits/math-vector.h
boils down to a:
#include <bits/libm-simd-decl-stubs.h>
A potential avenue to explore would be to define __Float32x4_t
and __Float64x2_t
in __importc_builtins.di
- as aliases to the D vector types. That might work with gcc then; I can't think of how we'd be able to support clang's __attribute__ ((__neon_vector_type__ (4))) float
- unless we'd go ahead and try to fake an older gcc/clang version (< v9/v8 apparently) on Linux AArch64 only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay that's pretty good news! So if that clang typedef 'works' for importC, I guess it strips all __attribute__
stuff there and treats it as a scalar float/double. alias __Float32x4_t = __vector(float[4])
(in __importc_builtins.di
, either in general or just for AArch64) would probably be correct, and alias __Float64x2_t = __vector(double[2])
. No idea about the 3 __SV*
types.
Edit: According to godbolt, I guess an empty struct dummy is okay:
<source>:3:1: error: SVE type '__SVBool_t' does not have a fixed size
3 | __SVBool_t a;
<source>:4:1: error: SVE type '__SVFloat32_t' does not have a fixed size
4 | __SVFloat32_t b;
<source>:5:1: error: SVE type '__SVFloat64_t' does not have a fixed size
5 | __SVFloat64_t c;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The __Float??x?
aliases seem to work:
dlang: arm64-desk-s /tmp/x # cat d.d
import importc_math;
import std.math;
void main () {
__f64x2_t x;
x[0] = PI * 0;
x[1] = PI * 0.5;
auto r = _ZGVnN2v_cos(x);
import std.stdio;
writeln(r);
}
dlang: arm64-desk-s /tmp/x # cat importc_math.c
#include <math.h>
dlang: arm64-desk-s /tmp/x # ~/build/bin/ldc2 -run d.d
[1, 6.12323e-17]
The only issue that I hit was that the vector types are not indexable in C:
dlang: arm64-desk-s /tmp/x # cat c.c
#include <math.h>
#include <stdio.h>
int main () {
__f32x4_t x;
x[0] = M_PI;
__f32x4_t r = _ZGVnN4v_cosf(x);
printf("%f\n", r[0]);
}
dlang: arm64-desk-s /tmp/x # ~/build/bin/ldc2 -o- c.c
c.c(5): Error: can only `*` a pointer, not a `__vector(float[4])`
x[0] = M_PI;
^
c.c(9): Error: can only `*` a pointer, not a `__vector(float[4])`
printf("%f\n", r[0]);
^
I don't personally have any issue with this though as I only value importC, not compileC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't personally have any issue with this though as I only value importC, not compileC.
Exactly my thoughts. Just nice that one can resort to C function prototypes with vector types, passing our float4
/double2
types directly. And in case someone was to try to use the SVE types (or functions taking thiose etc.), a dummy empty struct should yield compile errors right away - definitely better than something accidentally compiling, but totally failing at runtime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm afraid that the SVE types are allowed by the compiler and their usage does lead to runtime errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah well but I guess noone feeds such a function with a default-initialized SVE type, so should get a compile error when trying to pass in some vector. Alternatively, a forward-declared struct might work better, assuming that math.h still works then.
da14a88
to
2333e27
Compare
…vm-20 Since llvm-20, the stack variables are optimized away leading to an infinite loop. The affected assembly became: ``` Disassembly of section .text._D25memoryerror_stackoverflow1fFKG1024hZv: 0000000000000000 <_D25memoryerror_stackoverflow1fFKG1024hZv>: 0: e9 00 00 00 00 jmp 5 <_D25memoryerror_stackoverflow1fFKG1024hZv+0x5> Disassembly of section .text._D25memoryerror_stackoverflow1gFKG1024hZv: 0000000000000000 <_D25memoryerror_stackoverflow1gFKG1024hZv>: 0: e9 00 00 00 00 jmp 5 <_D25memoryerror_stackoverflow1gFKG1024hZv+0x5> ``` Compared to the llvm-19 optimized version: ``` Disassembly of section .text._D25memoryerror_stackoverflow1fFKG1024hZv: 0000000000000000 <_D25memoryerror_stackoverflow1fFKG1024hZv>: 0: 53 push %rbx 1: 48 81 ec 00 04 00 00 sub $0x400,%rsp 8: 48 89 fe mov %rdi,%rsi b: 48 89 e3 mov %rsp,%rbx e: ba 00 04 00 00 mov $0x400,%edx 13: 48 89 df mov %rbx,%rdi 16: e8 00 00 00 00 call 1b <_D25memoryerror_stackoverflow1fFKG1024hZv+0x1b> 1b: 48 89 df mov %rbx,%rdi 1e: e8 00 00 00 00 call 23 <_D25memoryerror_stackoverflow1fFKG1024hZv+0x23> 23: 48 81 c4 00 04 00 00 add $0x400,%rsp 2a: 5b pop %rbx 2b: c3 ret Disassembly of section .text._D25memoryerror_stackoverflow1gFKG1024hZv: 0000000000000000 <_D25memoryerror_stackoverflow1gFKG1024hZv>: 0: 53 push %rbx 1: 48 81 ec 00 04 00 00 sub $0x400,%rsp 8: 48 89 fe mov %rdi,%rsi b: 48 89 e3 mov %rsp,%rbx e: ba 00 04 00 00 mov $0x400,%edx 13: 48 89 df mov %rbx,%rdi 16: e8 00 00 00 00 call 1b <_D25memoryerror_stackoverflow1gFKG1024hZv+0x1b> 1b: 48 89 df mov %rbx,%rdi 1e: e8 00 00 00 00 call 23 <_D25memoryerror_stackoverflow1gFKG1024hZv+0x23> 23: 48 81 c4 00 04 00 00 add $0x400,%rsp 2a: 5b pop %rbx 2b: c3 ret ``` Signed-off-by: Andrei Horodniceanu <a.horodniceanu@proton.me>
Check glibc's sysdeps/unix/sysv/linux/**/bits/types/struct_shmid_ds.h for correctness. Some arches still look like they have an incorrect declaration but this, at least, fixes aarch64 which is the more common one. Signed-off-by: Andrei Horodniceanu <a.horodniceanu@proton.me>
The definitions I've taken from glibc's sysdeps/unix/sysv/linux/**/bits/typesizes.h. Signed-off-by: Andrei Horodniceanu <a.horodniceanu@proton.me>
Signed-off-by: Andrei Horodniceanu <a.horodniceanu@proton.me>
These tests seem to fail with a shared druntime (depending on arch as well), and work with a static one. CI fails with a static druntime though. A gdb backtrace with a shared druntime: ``` Reading symbols from ./a... (No debugging symbols found in ./a) (gdb) r Starting program: /root/build/a [Thread debugging using libthread_db enabled] Using host libthread_db library "/usr/lib64/libthread_db.so.1". Program received signal SIGSEGV, Segmentation fault. 0x0000005555550afc in D main () (gdb) c Continuing. Program received signal SIGSEGV, Segmentation fault. 0x0000007ff77432f0 in ?? () from /usr/lib/gcc/aarch64-unknown-linux-gnu/15/libgcc_s.so.1 (gdb) bt \#0 0x0000007ff77432f0 in ?? () from /usr/lib/gcc/aarch64-unknown-linux-gnu/15/libgcc_s.so.1 \ldc-developers#1 0x0000007ff77563b8 in ?? () from /usr/lib/gcc/aarch64-unknown-linux-gnu/15/libgcc_s.so.1 \ldc-developers#2 0x0000007ff7757000 in _Unwind_Backtrace () from /usr/lib/gcc/aarch64-unknown-linux-gnu/15/libgcc_s.so.1 \ldc-developers#3 0x0000007ff78b2c6c in backtrace () from /usr/lib64/libc.so.6 \ldc-developers#4 0x0000007ff7aee058 in core.lifetime.emplace!(core.runtime.DefaultTraceInfo).emplace(core.runtime.DefaultTraceInfo) () from /usr/lib/ldc2/1.41/lib64/libdruntime-ldc-shared.so.111 \ldc-developers#5 0x0000007ff7aed1f0 in core.runtime.defaultTraceHandler(void*) () from /usr/lib/ldc2/1.41/lib64/libdruntime-ldc-shared.so.111 \ldc-developers#6 0x0000007ff7b09614 in _d_createTrace () from /usr/lib/ldc2/1.41/lib64/libdruntime-ldc-shared.so.111 \ldc-developers#7 0x0000007ff7b0a7a0 in _d_throw_exception () from /usr/lib/ldc2/1.41/lib64/libdruntime-ldc-shared.so.111 \ldc-developers#8 0x0000007ff7acb418 in _d_assert_msg () from /usr/lib/ldc2/1.41/lib64/libdruntime-ldc-shared.so.111 \ldc-developers#9 0x0000005555550d18 in etc.linux.memoryerror.registerMemoryAssertHandler!().registerMemoryAssertHandler()._d_handleSignalAssert(int, core.sys.posix.signal.siginfo_t*, void*) () \ldc-developers#10 <signal handler called> \ldc-developers#11 0x0000005555550afc in D main () ``` And with a static druntime: ``` Reading symbols from ./a... (No debugging symbols found in ./a) (gdb) r Starting program: /tmp/a [Thread debugging using libthread_db enabled] Using host libthread_db library "/usr/lib64/libthread_db.so.1". Program received signal SIGSEGV, Segmentation fault. 0x000000555556a4bc in D main () (gdb) c Continuing. core.exception.AssertError@/usr/lib/ldc2/1.41/include/d/etc/linux/memoryerror.d(415): segmentation fault: null pointer read/write operation ---------------- ??:? [0x555557fa2f] ??:? [0x555557f1ab] ??:? [0x5555581d17] ??:? [0x5555570893] ??:? [0x555556b2ab] ??:? [0x555556a6d7] ??:? [0x7ff7ffb797] ??:? [0x555556a4bb] ??:? [0x5555570567] ??:? [0x5555570413] ??:? [0x5555570277] ??:? [0x555556a5a3] ??:? [0x7ff7d02053] ??:? __libc_start_main [0x7ff7d02137] ??:? [0x555556a3af] Program received signal SIGSEGV, Segmentation fault. 0x0000007ffffffdd0 in ?? () ``` Signed-off-by: Andrei Horodniceanu <a.horodniceanu@proton.me>
Signed-off-by: Andrei Horodniceanu <a.horodniceanu@proton.me>
…ch64 Signed-off-by: Andrei Horodniceanu <a.horodniceanu@proton.me>
2333e27
to
1ae7808
Compare
$(ROOT)/memoryerror_%: private extra_ldflags.d += -link-defaultlib-shared=false | ||
endif | ||
endif | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the most controversial part. Whenever something only failed for (some) CI systems, I've tried to remove the test on that CI platform only - which is doable for compiler and druntime/Phobos unittests, but the druntime integration tests with their Makefiles are unfortunately a PITA in that regard. Just so that the tests are all included in a normal test run.
It might pay off to investigate further, incl. checking whether we can disable based on the BUILD
var (debug/release).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
which is doable for compiler and druntime/Phobos unittests, but the druntime integration tests with their Makefiles are unfortunately a PITA in that regard
Adding something akin to:
TESTS := $(filter-out $(LDC_SKIP_DRUNTIME_TESTS),$(TESTS))
in common.mak
, alongside setting that environment variable when ctest -E 'ldc2-unittest|lit-tests|dmd-testsuite'
is run should work. I can't test right now though.
It might pay off to investigate further, incl. checking whether we can disable based on the BUILD var (debug/release).
Yeah, I agree that we shouldn't skip tests on all runners simply because 1 misbehaves. Yet debugging in online CI is a pretty miserable experience
Based on dlang/dmd#21779. For LDC, always define IS_MUSL externally, not just to `1` for musl targets.
The cmake files already check the triple for musl and that is a more reliable check compared to looking for Alpine's apk.