Enhancement: Use separate ICMP identifier and sequence to remove 65k ping limit #385

Open
greenlms opened this issue Apr 4, 2025 · 4 comments

Comments

@greenlms

greenlms commented Apr 4, 2025

Enhancement: Support for extended ICMP (identifier + sequence) tracking to overcome 65k sequence limit

Currently, fping uses a single global 16-bit sequence map (FPING_SEQMAP_SIZE) to track ICMP responses. This imposes a hard limit of approximately 65,535 total hosts × count combinations per run.

For example, it is possible to monitor up to 65,535 hosts with -C1, but increasing to -C2 immediately exceeds this limit.


However, the ICMP Echo protocol defines two separate 16-bit fields:

  • identifier (typically per host)
  • sequence (typically per probe)

By assigning a unique identifier per target host and using the sequence field per probe (e.g., for -C count), it becomes possible to scale host × count combinations into the range of 2³², fully within the ICMP specification.


This change would allow monitoring up to 65,535 hosts with options like -C60 or higher — a major benefit for ISPs and large-scale network monitoring systems.

Suggested implementation:

  • Assign a unique ICMP identifier (16-bit) to each target host.
  • For each repeated probe (e.g. with -C), increment only the sequence field (also 16-bit).
  • Internally, match ICMP responses using a composite key made from both fields, for example: (identifier << 16) | sequence.

This replaces the current single 16-bit sequence-based tracking, and enables proper handling of 65,000+ targets even with repeated pings (e.g. -C60).
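
The composite key from the list above can be sketched in C as follows. The helper names are hypothetical and are not part of fping; the sketch only shows how the two 16-bit ICMP Echo fields could be packed into one 32-bit lookup key and unpacked again.

```c
#include <stdint.h>

/* Hypothetical helpers, not fping code: build a 32-bit lookup key
 * from the 16-bit ICMP identifier and the 16-bit sequence number. */
static inline uint32_t icmp_key(uint16_t id, uint16_t seq)
{
    /* Widen before shifting: a uint16_t promotes to int, and
     * shifting a value >= 0x8000 left by 16 would overflow it. */
    return ((uint32_t)id << 16) | seq;
}

/* Recover the identifier (upper 16 bits) from a key. */
static inline uint16_t icmp_key_id(uint32_t key)
{
    return (uint16_t)(key >> 16);
}

/* Recover the sequence number (lower 16 bits) from a key. */
static inline uint16_t icmp_key_seq(uint32_t key)
{
    return (uint16_t)(key & 0xFFFFu);
}
```

With such a key, a reply would be matched by computing `icmp_key(reply_id, reply_seq)` and looking it up in whatever table tracks outstanding probes.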

I'd be happy to help test this in a high-scale environment or submit a prototype patch.

@greenlms
Author

greenlms commented Apr 4, 2025

Practical test cases (current limits)

The following simple tests demonstrate the current 65k sequence limit in fping:

fping -4 -r0 -i0 --period=0 --timeout=1 -C65535 127.0.0.1             # ✅ works OK
fping -4 -r0 -i0 --period=0 --timeout=1 -C65535 127.0.0.1 127.0.0.2   # ❌ fails

fping -4 -r0 -i0 --period=0 --timeout=1 -C32767 127.0.0.1 127.0.0.2   # ✅ works OK
fping -4 -r0 -i0 --period=0 --timeout=1 -C32768 127.0.0.1 127.0.0.2   # ❌ fails

This shows that the total number of pings (number of hosts × -C count) must remain ≤ 65535.

Real-world need

In a real monitoring scenario, I need to probe about 30,000 hosts with multiple repetitions, for example using -C60:

fping -4 -r0 -i0 --period=999 --timeout=999 -C60 [about 30,000 hosts]   # ❌ fails
fping -4 -r0 -i0 --period=999 --timeout=999 -C1  [about 30,000 hosts]   # ⚠️ works OK, but insufficient

This confirms that the current sequence ID limitation prevents using -C60 with large host sets, even though it would be valid under the ICMP protocol if identifier and sequence were tracked separately.


I kindly ask the maintainers to consider this enhancement, as it would significantly improve fping's scalability and usefulness in large-scale monitoring environments.

@auerswal
Collaborator

auerswal commented Apr 4, 2025

Thanks for this feature request!

Have you considered using many fping instances in parallel? Each instance has this 64K limit of concurrently active sequence numbers, but usually each instance has its own identification value. You could, e.g., try to use 30 fping instances with about 1000 targets each. (But see issue #206 for potential problems.)
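
As a rough sizing aid for the parallel-instance approach, the following C sketch estimates how many instances would be needed. It assumes the worst case in which every probe of every target is active at the same time, so each instance can carry at most 65535 / count targets. The function names are hypothetical.

```c
#include <stdint.h>

#define MAX_ACTIVE_SEQ 65535u  /* per-instance limit discussed in this thread */

/* Targets one instance can carry if all `count` probes per target
 * may be active at once. Returns 0 if count is 0 or above the limit. */
static uint32_t hosts_per_instance(uint32_t count)
{
    return count == 0 ? 0 : MAX_ACTIVE_SEQ / count;
}

/* Number of instances needed for `hosts` targets, rounded up.
 * Returns 0 if a single target already exceeds the limit. */
static uint32_t instances_needed(uint32_t hosts, uint32_t count)
{
    uint32_t per = hosts_per_instance(count);
    if (per == 0)
        return 0;
    return (hosts + per - 1) / per;
}
```

For 30,000 targets with -C60 this gives 1,092 targets per instance and 28 instances, so 30 instances with about 1,000 targets each stays within the limit.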

Currently, fping uses its process ID for ICMP identification. This allows many fping processes in parallel, e.g., on a multi-user system. It is also currently possible to have more than 64K targets, as long as no more than 64K active sequence numbers are needed at the same time. I'd say this is useful default behavior that I'd prefer to keep.

It may be possible to use an option to select a change similar to your proposal.

The active sequence number count is limited to SEQMAP_MAXSEQ (65535). Any used sequence number stays active until it is older than SEQMAP_TIMEOUT_IN_NS (10000000000, i.e., 10 seconds). We cannot just increase the number of active sequence numbers, because then we would allow duplicate sequence numbers. We could possibly add an option that allows decreasing the timeout (I'd assume that would be simpler than adding per target ICMP IDs).
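
The interaction between the sequence number limit and the timeout can be illustrated with a simplified model in C. This is a sketch, not fping's actual data structure: `seq_slot`, `seq_alloc`, and the macro names are made up for illustration, and only the two constants are taken from the paragraph above.

```c
#include <stdint.h>

#define MAXSEQ     65535u                /* SEQMAP_MAXSEQ as quoted above        */
#define TIMEOUT_NS INT64_C(10000000000)  /* SEQMAP_TIMEOUT_IN_NS: 10 seconds     */

/* Hypothetical model of one sequence number's bookkeeping. */
struct seq_slot {
    int64_t sent_ns;  /* time the sequence number was last handed out */
    int     used;     /* nonzero once the slot has been used          */
};

static struct seq_slot slots[MAXSEQ];
static uint32_t next_seq;

/* Hand out the next sequence number in round-robin order.
 * Returns -1 if that number was used less than TIMEOUT_NS ago,
 * i.e. it is still active and reusing it would create a duplicate. */
static int32_t seq_alloc(int64_t now_ns)
{
    struct seq_slot *s = &slots[next_seq];
    if (s->used && now_ns - s->sent_ns < TIMEOUT_NS)
        return -1;
    s->used = 1;
    s->sent_ns = now_ns;
    int32_t seq = (int32_t)next_seq;
    next_seq = (next_seq + 1) % MAXSEQ;
    return seq;
}
```

In this model, sending 65,535 probes back to back succeeds, the next one fails, and allocation works again once the oldest entry is 10 seconds old. A shorter timeout would free slots sooner.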

If we want to have 30000 or more targets with different ICMP IDs, we would probably need to just assign them sequentially. We might also need one seqmap_map per target. That would need quite a lot of RAM. But, in this mode, there could only be one fping per host (or possibly container) anyway, and the host could have sufficient RAM for this. I'd say this also suggests making such behavior opt-in (i.e., requiring the use of some option).
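
To put a number on "quite a lot of RAM", here is a back-of-the-envelope estimate in C. The entry layout is an assumption made for illustration (fping's real structure may differ), and the estimate assumes a full-size map of 65,535 entries per target, which is an upper bound.

```c
#include <stdint.h>

/* Assumed per-sequence-number entry; the real layout may differ. */
struct seq_entry {
    uint32_t host_nr;     /* index of the target the probe was sent to */
    uint32_t ping_count;  /* which probe of that target this was       */
    int64_t  sent_ns;     /* send timestamp in nanoseconds             */
};

/* Bytes needed if every target gets its own map of 65535 entries. */
static uint64_t per_target_maps_bytes(uint64_t targets)
{
    return targets * 65535u * sizeof(struct seq_entry);
}
```

With a 16-byte entry, each map takes about 1 MiB, and 30,000 targets come to roughly 31 GB.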

@greenlms
Author

greenlms commented Apr 4, 2025

Thanks for the detailed explanation — really helpful! Here's what I took from it, along with a few thoughts:

  1. First of all — I now fully understand the limitations related to SEQMAP_MAXSEQ (65535, due to the 16-bit ICMP sequence field), and the need for the PID to stay within 2 bytes (used as the ICMP ID). Setting kernel.pid_max=65535 is a good trick — thanks!
  2. It’s also much clearer now how SEQMAP_TIMEOUT_IN_NS works — I didn’t realize that sequence numbers stay active for 10 seconds after use. That explains a lot about how fping controls flow internally.
  3. Your idea to make SEQMAP_TIMEOUT_IN_NS configurable (via a new --expire-timeout option, or by tying it to --timeout) sounds great. I think that alone would solve the main limitation in many scenarios.
  4. Of course, reducing this timeout too much (e.g. to 1s or 500ms) could cause delayed replies from earlier pings to be misinterpreted as replies to a new request — possibly showing up as duplicates. I can't explain the behavior precisely, but I think you know what I mean 😅 I typically run with --retry=0, so this wouldn't be a major issue for me.
  5. As for parallel fping instances — I’d prefer to avoid this if possible. I know it works, but I’d rather not deal with multiple processes unless absolutely necessary.
  6. Lastly — your mention of using a separate seqmap_map per target really caught my attention. That would be an amazing improvement, providing total independence between targets. I understand it would use more RAM, but honestly — modern machines (or routers) often have hundreds of GB of memory. It wouldn’t be a blocker at all.

@auerswal
Collaborator

auerswal commented Apr 6, 2025

After thinking about this a bit more, I am leaning towards adding an option to specify the sequence number timeout as the simplest possible solution. Combining that with --check-source to avoid spurious results due to sequence number reuse would probably solve your issue. The current (arbitrary) default of 10s works sufficiently well to keep as default, even though any fixed value cannot be correct for all circumstances (I have seen RTTs in excess of 4 minutes in mobile networks).
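
Why a source check helps with sequence number reuse can be shown with a small C sketch. It assumes IPv4 addresses held as 32-bit values and a hypothetical function name; it is not how fping implements --check-source, only the idea behind it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: accept a reply only if it comes from the address
 * of the target that currently owns the reply's sequence number.
 * A late reply to an earlier probe whose sequence number has since been
 * reassigned to another target fails this test and is dropped. */
static bool accept_reply(uint32_t reply_src_addr, uint32_t seq_owner_addr)
{
    return reply_src_addr == seq_owner_addr;
}
```

The check cannot tell apart an old and a new probe to the same target, so it reduces spurious results but does not remove them entirely.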

Automatically adjusting the sequence number timeout to the maximum probe timeout might also work, but I am not sure about possible side effects and regressions. GitHub issue #32 shows how problematic such changes can be. Since the timeout depends on several options, and by default involves exponential back-off, this is also more complex to implement than allowing the user to provide a number.

For now, I do not intend to look into using a different ICMP identifier for different targets. One ICMP ID per target would limit fping to 64K targets, while it currently works with many more, as long as there are no more than 64K active sequence numbers (-i1 suffices). Adding this additional operating mode would be a lot of work for little gain, moving complexity that can be handled outside of fping (by using multiple fping processes) into fping.
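
The remark that -i1 suffices follows from simple arithmetic, sketched here in C under the assumption that one probe is sent every `interval_ms` milliseconds and each sequence number stays active for `timeout_ms` milliseconds. The function name is hypothetical.

```c
#include <stdint.h>

/* Approximate upper bound on simultaneously active sequence numbers
 * when probes are spaced interval_ms apart and each one stays active
 * for timeout_ms. An interval of 0 means no spacing, so no bound. */
static uint64_t max_active_seqs(uint64_t timeout_ms, uint64_t interval_ms)
{
    if (interval_ms == 0)
        return UINT64_MAX;
    return (timeout_ms + interval_ms - 1) / interval_ms;  /* round up */
}
```

With the 10 second timeout and -i1, at most about 10,000 sequence numbers are active at once, well below 65,535, regardless of the number of targets. With -i0 the bound disappears, which matches the failing test cases earlier in the thread.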
