[Bug] Core Dump in zebra Process #17730

Open
2 tasks done
killaruns opened this issue Dec 28, 2024 · 6 comments
Labels
triage Needs further investigation

Comments

@killaruns

Description

  • The zebra process in FRRouting dumps core on my server. It aborts with signal 6 (ABRT) at exactly 19:00 MSK every day. The server runs 24/7, and the crash recurs at the same time each day. The incident details and stack traces are below.

Version

* Operating System: Debian GNU/Linux 11 (bullseye)
* Version of FRR: frr 8.5.6-0
* Installed Packages:
	frr 8.5.6-0~deb11u1 amd64
	frr-pythontools 8.5.6-0~deb11u1 all

How to reproduce

At 19:00 MSK, the zebra process always crashes.

Expected behavior

Stable operation without emergency shutdown

Actual behavior

Here’s the relevant output from coredumpctl:

           PID: 3587682 (zebra)
           UID: 108 (frr)
           GID: 115 (frr)
        Signal: 6 (ABRT)
     Timestamp: Fri 2024-12-27 19:01:58 MSK (15h ago)
  Command Line: /usr/lib/frr/zebra -d -F traditional -A 127.0.0.1 -s 90000000
    Executable: /usr/lib/frr/zebra
 Control Group: /system.slice/frr.service
          Unit: frr.service
         Slice: system.slice
       Boot ID: 9eb153491b8d4fa4947660bbdd25d8bf
    Machine ID: 37cfcc84286c4c0daf172ebd4d278fa5
      Hostname: debian-accel-srv1
       Storage: /var/lib/systemd/coredump/core.zebra.108.9eb153491b8d4fa4947660bbdd25d8bf.3587682.1735315318000000.zst
       Message: Process 3587682 (zebra) of user 108 dumped core.

                Stack trace of thread 3587682:
                #0  0x00007fbb629b6fe1 raise (libpthread.so.0 + 0x12fe1)
                #1  0x00007fbb62ab006c n/a (libfrr.so.0 + 0xd006c)
                #2  0x00007fbb629b7140 __restore_rt (libpthread.so.0 + 0x13140)
                #3  0x00007fbb62807ce1 raise (libc.so.6 + 0x38ce1)
                #4  0x00007fbb627f1537 abort (libc.so.6 + 0x22537)
                #5  0x00007fbb62adc2f9 _zlog_assert_failed (libfrr.so.0 + 0xfc2f9)
                #6  0x000056403e1eedc6 n/a (zebra + 0xeddc6)
                #7  0x00007fbb62acc208 work_queue_run (libfrr.so.0 + 0xec208)
                #8  0x00007fbb62ac1f8d thread_call (libfrr.so.0 + 0xe1f8d)
                #9  0x00007fbb62a7ad58 frr_run (libfrr.so.0 + 0x9ad58)
                #10 0x000056403e183326 main (zebra + 0x82326)
                #11 0x00007fbb627f2d0a __libc_start_main (libc.so.6 + 0x23d0a)
                #12 0x000056403e183f4a _start (zebra + 0x82f4a)

                Stack trace of thread 3587695:
                #0  0x00007fbb628bfe26 ppoll (libc.so.6 + 0xf0e26)
                #1  0x00007fbb62ac180f thread_fetch (libfrr.so.0 + 0xe180f)
                #2  0x00007fbb62a6d201 n/a (libfrr.so.0 + 0x8d201)
                #3  0x00007fbb629abea7 start_thread (libpthread.so.0 + 0x7ea7)
                #4  0x00007fbb628cba2f __clone (libc.so.6 + 0xfca2f)

                Stack trace of thread 3587692:
                #0  0x00007fbb628bfe26 ppoll (libc.so.6 + 0xf0e26)
                #1  0x00007fbb62ac180f thread_fetch (libfrr.so.0 + 0xe180f)
                #2  0x00007fbb62a6d201 n/a (libfrr.so.0 + 0x8d201)
                #3  0x00007fbb629abea7 start_thread (libpthread.so.0 + 0x7ea7)
                #4  0x00007fbb628cba2f __clone (libc.so.6 + 0xfca2f)

                Stack trace of thread 3587698:
                #0  0x00007fbb628bfe26 ppoll (libc.so.6 + 0xf0e26)
                #1  0x00007fbb62ac180f thread_fetch (libfrr.so.0 + 0xe180f)
                #2  0x00007fbb62a6d201 n/a (libfrr.so.0 + 0x8d201)
                #3  0x00007fbb629abea7 start_thread (libpthread.so.0 + 0x7ea7)
                #4  0x00007fbb628cba2f __clone (libc.so.6 + 0xfca2f)

                Stack trace of thread 3587683:
                #0  0x00007fbb628c52e9 syscall (libc.so.6 + 0xf62e9)
                #1  0x00007fbb62aad767 seqlock_wait (libfrr.so.0 + 0xcd767)
                #2  0x00007fbb62a6c5cf n/a (libfrr.so.0 + 0x8c5cf)
                #3  0x00007fbb629abea7 start_thread (libpthread.so.0 + 0x7ea7)
                #4  0x00007fbb628cba2f __clone (libc.so.6 + 0xfca2f)

                Stack trace of thread 3587699:
                #0  0x00007fbb628bfe26 ppoll (libc.so.6 + 0xf0e26)
                #1  0x00007fbb62ac180f thread_fetch (libfrr.so.0 + 0xe180f)
                #2  0x00007fbb62a6d201 n/a (libfrr.so.0 + 0x8d201)
                #3  0x00007fbb629abea7 start_thread (libpthread.so.0 + 0x7ea7)
                #4  0x00007fbb628cba2f __clone (libc.so.6 + 0xfca2f)

                Stack trace of thread 3587685:
                #0  0x00007fbb628bfe26 ppoll (libc.so.6 + 0xf0e26)
                #1  0x00007fbb62ac180f thread_fetch (libfrr.so.0 + 0xe180f)
                #2  0x00007fbb62a6d201 n/a (libfrr.so.0 + 0x8d201)
                #3  0x00007fbb629abea7 start_thread (libpthread.so.0 + 0x7ea7)
                #4  0x00007fbb628cba2f __clone (libc.so.6 + 0xfca2f)

                Stack trace of thread 3587684:
                #0  0x00007fbb628bfe26 ppoll (libc.so.6 + 0xf0e26)
                #1  0x00007fbb62ac180f thread_fetch (libfrr.so.0 + 0xe180f)
                #2  0x00007fbb62a6d201 n/a (libfrr.so.0 + 0x8d201)
                #3  0x00007fbb629abea7 start_thread (libpthread.so.0 + 0x7ea7)
                #4  0x00007fbb628cba2f __clone (libc.so.6 + 0xfca2f)

GNU gdb (Debian 10.1-1.7) 10.1.90.20210103-git
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/lib/frr/zebra...
(No debugging symbols found in /usr/lib/frr/zebra)
[New LWP 3587682]
[New LWP 3587695]
[New LWP 3587692]
[New LWP 3587698]
[New LWP 3587683]
[New LWP 3587699]
[New LWP 3587685]
[New LWP 3587684]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/frr/zebra -d -F traditional -A 127.0.0.1 -s 90000000'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007fbb629b6fe1 in raise () from /lib/x86_64-linux-gnu/libpthread.so.0
[Current thread is 1 (Thread 0x7fbb624ba600 (LWP 3587682))]
(gdb) bt
#0  0x00007fbb629b6fe1 in raise () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007fbb62ab006c in ?? () from /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0
#2  <signal handler called>
#3  0x00007fbb62807ce1 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#4  0x00007fbb627f1537 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#5  0x00007fbb62adc2f9 in _zlog_assert_failed () from /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0
#6  0x000056403e1eedc6 in ?? ()
#7  0x00007fbb62acc208 in work_queue_run () from /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0
#8  0x00007fbb62ac1f8d in thread_call () from /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0
#9  0x00007fbb62a7ad58 in frr_run () from /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0
#10 0x000056403e183326 in main ()
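
The frame that matters in the traces above is the unresolved one directly above `_zlog_assert_failed`, since that is where the failing assertion was called from. As a minimal sketch (not part of the original report), the following Python parses a few frames copied from the crashing thread, picks out that caller, and checks that both zebra frames imply the same load base. The helper names `parse_frames` and `assert_caller` are hypothetical and exist only for this illustration.

```python
import re

# Frames copied verbatim from thread 3587682 in the coredumpctl output.
TRACE = """\
#4  0x00007fbb627f1537 abort (libc.so.6 + 0x22537)
#5  0x00007fbb62adc2f9 _zlog_assert_failed (libfrr.so.0 + 0xfc2f9)
#6  0x000056403e1eedc6 n/a (zebra + 0xeddc6)
#7  0x00007fbb62acc208 work_queue_run (libfrr.so.0 + 0xec208)
#10 0x000056403e183326 main (zebra + 0x82326)
"""

# Matches lines of the form: "#N  0xADDR symbol (module + 0xOFFSET)"
FRAME_RE = re.compile(
    r"#(?P<idx>\d+)\s+0x(?P<addr>[0-9a-f]+)\s+(?P<sym>\S+)\s+"
    r"\((?P<module>\S+) \+ 0x(?P<off>[0-9a-f]+)\)"
)


def parse_frames(text):
    """Turn coredumpctl stack lines into dicts (hypothetical helper)."""
    frames = []
    for m in FRAME_RE.finditer(text):
        addr = int(m["addr"], 16)
        off = int(m["off"], 16)
        frames.append({
            "idx": int(m["idx"]),
            "symbol": m["sym"],
            "module": m["module"],
            "offset": off,
            "base": addr - off,  # load address of the module in this process
        })
    return frames


def assert_caller(frames):
    """Return the frame listed right after _zlog_assert_failed, if any."""
    for i, frame in enumerate(frames[:-1]):
        if frame["symbol"] == "_zlog_assert_failed":
            return frames[i + 1]
    return None


frames = parse_frames(TRACE)
caller = assert_caller(frames)
print(caller["module"], hex(caller["offset"]))  # zebra 0xeddc6
```

The offset `0xeddc6` is relative to the start of the zebra binary, so it stays meaningful across runs even though the absolute address changes with each load.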

Log: journalctl.txt (attached)

dmesg: contains no messages from the time of the crash.

Additional context

No response

Checklist

  • I have searched the open issues for this bug.
  • I have not included sensitive information in this report.
@killaruns killaruns added the triage Needs further investigation label Dec 28, 2024
@ton31337
Member

Please check latest versions.

@killaruns
Author

killaruns commented Dec 28, 2024

> Please check latest versions.

Do you mean the latest stable version, frr 10.2-0~deb11u1?

@ton31337
Member

ton31337 commented Dec 28, 2024

> > Please check latest versions.
>
> The latest stable version? frr 10.2-0~deb11u1

Yes. What is your configuration? Are you using flowspec?

@donaldsharp
Member

Can you load zebra in gdb and tell us the output of this command:
l *(zebra + 0xeddc6)
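
For readers without gdb experience, the same offset can also be looked up with binutils' `addr2line`. This is an alternative, not the command requested above. The sketch below is a hedged illustration only: the helper names `addr2line_argv` and `resolve` are hypothetical, the binary path comes from the coredumpctl `Executable` field, and it assumes the offset reported by coredumpctl can be passed to `addr2line` unchanged, which is typical for position-independent executables. It only yields a file and line if debug symbols matching the installed zebra are present. The gdb session in the report says no debugging symbols were found, and on such a binary `addr2line` prints `??`.

```python
import os
import shutil
import subprocess

ZEBRA = "/usr/lib/frr/zebra"  # "Executable" field in the coredumpctl output
OFFSET = 0xeddc6              # frame #6: zebra + 0xeddc6


def addr2line_argv(binary, offset):
    """Build the addr2line command line (hypothetical helper).

    -f prints the function name, -C demangles, -i expands inlined
    callers, -e selects the executable to read debug info from.
    """
    return ["addr2line", "-f", "-C", "-i", "-e", binary, hex(offset)]


def resolve(binary, offset):
    """Run addr2line if it and the binary are available, else return None."""
    if shutil.which("addr2line") is None or not os.path.exists(binary):
        return None
    result = subprocess.run(
        addr2line_argv(binary, offset),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(resolve(ZEBRA, OFFSET))
```

The result is either the function name and source location of the failing assertion, or `??` when the debug information is missing.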

@killaruns
Author

> > > Please check latest versions.
> >
> > The latest stable version? frr 10.2-0~deb11u1
>
> Yes. What is your configuration? Are you using flowspec?

We don't use flowspec. How can I attach the configuration here?
We have a similar server with all the same parameters, and it doesn't have these problems.

@killaruns
Author

> can you load up zebra in gdb and tell us this output: l *(zebra + 0xeddc6)

I'm sorry, we have no experience with gdb. We attached the output above only because similar discussions asked for such data when diagnosing FRR crashes.

Can you write out step by step what we need to do?

3 participants