BFD session is not established after interface recreation #17845

Open
2 tasks done
ne-vlezay80 opened this issue Jan 12, 2025 · 7 comments
Labels
bfd bgp triage Needs further investigation

Comments

@ne-vlezay80
Contributor

Description

If the interface is recreated, the BGP session is not established.

Version

10.2.1

How to reproduce

  1. Configure BGP with:
 neighbor 172.30.255.24 remote-as 4200220005
 neighbor 172.30.255.24 bfd
 neighbor 172.30.255.24 update-source qt-swep0
 neighbor 172.30.255.24 timers 1 4
  2. Remove the network interface
  3. Create the network interface again
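
The remove/create steps above could be sketched without qemu as follows. This is only an assumption-laden sketch: it uses the interface name and address from this report, recreates the tap device with `ip tuntap` instead of restarting qemu, and whether that triggers the same behavior is untested. `DRY_RUN` and the `run` helper are hypothetical names introduced here so the commands can be printed without root.

```shell
#!/bin/sh
# Sketch: remove and recreate a tap interface, then re-add its address.
# IFNAME and ADDR come from the report; DRY_RUN is a hypothetical switch
# (default 1) that prints the commands instead of executing them.
IFNAME="${IFNAME:-qt-swep0}"
ADDR="${ADDR:-172.30.255.25/31}"

run() {
    # Print the command in dry-run mode, otherwise execute it.
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

recreate_tap() {
    run ip tuntap del dev "$IFNAME" mode tap   # remove the interface
    run ip tuntap add dev "$IFNAME" mode tap   # create it again
    run ip addr add "$ADDR" dev "$IFNAME"      # same address as /etc/qemu/ifup
    run ip link set dev "$IFNAME" up
}

recreate_tap
```

Running it with `DRY_RUN=0` as root would execute the commands. A tap device created this way has no carrier until a process attaches to it, so it may not behave exactly like the qemu-managed interface.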

Expected behavior

The BFD session is not established while the network interface or the IP address used by update-source is removed.

Actual behavior

The BFD session is not re-established after the interface is recreated.

Additional context

  1. Debug log:
2025/01/12 10:10:26 BGP: [GPPQK-HK3ZM] bfd_get_peer_info: Can't find interface by ifindex: 18 
2025/01/12 10:10:27 ZEBRA: [HSYZM-HV7HF] Extended Error: Invalid device index
2025/01/12 10:10:27 ZEBRA: [WVJCK-PPMGD][EC 4043309093] netlink-dp (NS 0) error: Invalid argument, type=RTM_NEWNEXTHOP(104), seq=843, pid=4199702981
2025/01/12 10:10:27 ZEBRA: [HSYZM-HV7HF] Extended Error: Invalid nexthop id
2025/01/12 10:10:27 ZEBRA: [WVJCK-PPMGD][EC 4043309093] netlink-dp (NS 0) error: Invalid argument, type=RTM_NEWNEXTHOP(104), seq=844, pid=4199702981
2025/01/12 10:10:27 ZEBRA: [HSYZM-HV7HF] Extended Error: Invalid device index
2025/01/12 10:10:27 ZEBRA: [WVJCK-PPMGD][EC 4043309093] netlink-dp (NS 0) error: Invalid argument, type=RTM_NEWNEXTHOP(104), seq=847, pid=4199702981
2025/01/12 10:10:27 ZEBRA: [HSYZM-HV7HF] Extended Error: Invalid nexthop id
2025/01/12 10:10:27 ZEBRA: [WVJCK-PPMGD][EC 4043309093] netlink-dp (NS 0) error: Invalid argument, type=RTM_NEWNEXTHOP(104), seq=848, pid=4199702981
2025/01/12 10:10:27 ZEBRA: [X5XE1-RS0SW][EC 4043309074] Failed to install Nexthop (280[]) into the kernel
2025/01/12 10:10:27 ZEBRA: [X5XE1-RS0SW][EC 4043309074] Failed to install Nexthop (279[226/280]) into the kernel
2025/01/12 10:10:27 ZEBRA: [X5XE1-RS0SW][EC 4043309074] Failed to install Nexthop (282[]) into the kernel
2025/01/12 10:10:27 ZEBRA: [X5XE1-RS0SW][EC 4043309074] Failed to install Nexthop (281[228/282]) into the kernel
2025/01/12 10:10:27 BFD: [GCWEX-N0BBE] zclient: add interface qt-swep0 (VRF default(0))
2025/01/12 10:10:27 BFD: [S5HNB-1XW3Z] ipv4-new: failed to bind port: Address not available
2025/01/12 10:10:27 BFD: [SSYGJ-9ZAE0] zclient: add local address 172.30.255.25/31 (VRF 0)
2025/01/12 10:10:27 BFD: [S5HNB-1XW3Z] ipv4-new: failed to bind port: Address not available
2025/01/12 10:10:27 BFD: [YA0Q5-C0BPV] control-packet: 'remote discriminator' is zero, not overridden [mhop:no peer:172.30.255.24 local:172.30.255.25 port:19]
2025/01/12 10:10:28 BGP: [TXY0T-CYY6F][EC 100663299] Can't get remote address and port: Socket not connected
2025/01/12 10:10:28 BGP: [H4B4J-DCW2R][EC 33554455] 172.30.255.24 [Error] bgp_read_packet error: Connection reset by peer
2025/01/12 10:10:28 BFD: [YA0Q5-C0BPV] control-packet: 'remote discriminator' is zero, not overridden [mhop:no peer:172.30.255.24 local:172.30.255.25 port:19]
2025/01/12 10:10:28 BFD: [SSYGJ-9ZAE0] zclient: add local address fe80::4c9:7fff:fe4e:fae2/64 (VRF 0)
2025/01/12 10:10:28 BFD: [SSYGJ-9ZAE0] zclient: add local address fec0:112:acab::5/127 (VRF 0)
2025/01/12 10:10:29 BFD: [YA0Q5-C0BPV] control-packet: 'remote discriminator' is zero, not overridden [mhop:no peer:172.30.255.24 local:172.30.255.25 port:19]
2025/01/12 10:10:30 BFD: [YA0Q5-C0BPV] control-packet: 'remote discriminator' is zero, not overridden [mhop:no peer:172.30.255.24 local:172.30.255.25 port:19]
2025/01/12 10:10:31 BFD: [YA0Q5-C0BPV] control-packet: 'remote discriminator' is zero, not overridden [mhop:no peer:172.30.255.24 local:172.30.255.25 port:19]
  2. show bfd peer:
neigh# show bfd peer
BFD Peers:
        peer 172.30.255.24 local-address 172.30.255.25 vrf default interface qt-swep0
                ID: 2613937113
                Remote ID: 2632334212
                Active mode
                Status: init
                Diagnostics: path down
                Remote diagnostics: control detection time expired
                Peer Type: dynamic
                RTT min/avg/max: 0/0/0 usec
                Local timers:
                        Detect-multiplier: 3
                        Receive interval: 300ms
                        Transmission interval: 300ms
                        Echo receive interval: 50ms
                        Echo transmission interval: disabled
                Remote timers:
                        Detect-multiplier: 3
                        Receive interval: 300ms
                        Transmission interval: 300ms
                        Echo receive interval: 50ms
  3. tcpdump:
~ # tcpdump -i qt-swep0 -ne
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on qt-swep0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
10:14:22.584911 2a:9a:e0:00:14:40 > 06:c9:7f:4e:fa:e2, ethertype IPv4 (0x0800), length 66: 172.30.255.24.65263 > 172.30.255.25.3784: BFDv1, Control, State Down, Flags: [none], length: 24
10:14:23.335449 2a:9a:e0:00:14:40 > 06:c9:7f:4e:fa:e2, ethertype IPv4 (0x0800), length 66: 172.30.255.24.65263 > 172.30.255.25.3784: BFDv1, Control, State Down, Flags: [none], length: 24
^C
2 packets captured
2 packets received by filter
0 packets dropped by kernel

Checklist

  • I have searched the open issues for this bug.
  • I have not included sensitive information in this report.
@ne-vlezay80 ne-vlezay80 added the triage Needs further investigation label Jan 12, 2025
@ton31337
Member

What do you mean "Remove network interface"? no neighbor 172.30.255.24 update-source qt-swep0?

@ne-vlezay80
Contributor Author

What do you mean "Remove network interface"? no neighbor 172.30.255.24 update-source qt-swep0?

qt-swep0 is a qemu tap interface.

The interface is removed when the qemu process stops.
Example:

/usr/bin/qemu-system-x86_64 -nodefaults -display none -netdev tap,ifname=qt-swep0,id=netdev0,script=/etc/qemu/ifup,downscript=/etc/qemu/ifdown -netdev socket,listen=192.168.252.5:42000,id=socket0 -netdev hubport,hubid=0,netdev=netdev0,id=hubport1 -netdev hubport,hubid=0,netdev=socket0,id=hubport

/etc/qemu/ifup:

#!/bin/sh

if [[ $1 == "qt-swep0" ]]; then
 ip addr add 172.30.255.25/31 dev qt-swep0
fi

qt - interface type [qemu tunnel]
swep - node name
[numbers] - interface number

@ton31337
Member

Is this a regression, or is it the same with 10.0 or 10.1?

@ne-vlezay80
Contributor Author

Is this a regression, or is it the same with 10.0 or 10.1?

I suppose so...
Not tested...

@donaldsharp
Member

I would be surprised if bfd handles the interface id # change appropriately once it is assigned. I suspect we would need to check a bunch of daemons' behaviors. Not something that we are testing or attempting to do.
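
One way to check whether the interface id actually changes across a recreation is to read the kernel ifindex from sysfs before and after. The sketch below assumes the standard `/sys/class/net/<name>/ifindex` file; `ifindex_of` and the `SYSFS_NET` override are hypothetical names introduced here for illustration.

```shell
#!/bin/sh
# Sketch: print the kernel ifindex of an interface from sysfs.
# SYSFS_NET is a hypothetical override (default /sys/class/net) so the
# function can be pointed at another directory.
ifindex_of() {
    cat "${SYSFS_NET:-/sys/class/net}/$1/ifindex" 2>/dev/null
}

# Usage idea: record the value before and after restarting qemu.
#   before=$(ifindex_of qt-swep0)
#   ... stop and start the qemu process ...
#   after=$(ifindex_of qt-swep0)
#   [ "$before" = "$after" ] || echo "ifindex changed: $before -> $after"
```

If the two values differ, that would be consistent with the "Can't find interface by ifindex: 18" line in the debug log, although it would not by itself show which daemon keeps the stale value.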

@ne-vlezay80
Contributor Author

  1. The bug does not appear with a gretap tunnel. It appears only with a tuntap tunnel...
  2. The bug may appear on non-systemd systems...

@ne-vlezay80
Contributor Author

ne-vlezay80 commented Jan 17, 2025

My lab:
FRR (10.2.1) [debian bookworm] - does not appear
FRR (10.1) [debian bookworm] - does not appear
FRR (10.0) [debian bookworm] - does not appear
Production [my vps]:
alpine 3.21 (frr 10.2.1) - appears.

Host in both lab and production - debian bookworm.
