Configurable NAK Delay at the Consumer Level #6311

Open
elettrico opened this issue Dec 30, 2024 · 6 comments
Labels
proposal Enhancement idea or proposal

Comments

@elettrico

Proposed change

Currently, when a NAK is sent for a message, the only way to avoid immediate redelivery is to use nakDelay (nakWithDelay in my case, as I'm using Java). This works, but the delay has to be specified programmatically each time a NAK is issued.
There is no way to configure a NAK delay at the consumer level. I think this is a limitation, especially when multiple consumers need a consistent NAK delay.
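A minimal sketch of the client-side workaround available today: keep the delay in one place and route every NAK through a small helper. `Nakable` and `FixedDelayNaker` are hypothetical names introduced only for illustration. `Nakable` stands in for the client's message type (the Java client's `Message` is assumed here to expose `nakWithDelay(Duration)`, matching the method named above). This only centralizes the value inside one application; it does not give the server-side, per-consumer setting the proposal asks for.

```java
import java.time.Duration;

// Hypothetical stand-in for the single method of the client message type
// that this sketch relies on.
interface Nakable {
    void nakWithDelay(Duration delay);
}

// Hypothetical helper: holds one delay so call sites do not repeat it.
final class FixedDelayNaker {
    private final Duration delay;

    FixedDelayNaker(Duration delay) {
        if (delay == null || delay.isNegative()) {
            throw new IllegalArgumentException("delay must be non-null and non-negative");
        }
        this.delay = delay;
    }

    // Every NAK issued through this helper uses the same configured delay.
    void nak(Nakable msg) {
        msg.nakWithDelay(delay);
    }
}
```

With a real client, the call site would pass a lambda or method reference that forwards to the message's own `nakWithDelay`.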

Proposal:
Introduce a configuration option at the consumer level to define a default NAK delay. For example, similar to how backoff can be configured for timeouts, there could be a backoffNak option, and a defaultNakDelay parameter could let users specify a fixed delay for all NAKs issued against the consumer. This would slightly simplify NAK-handling code and, more importantly, ensure consistent behavior across consumers.

Use case

Consider a consumer that processes messages where the client(s) occasionally need to send a NAK because of transient issues (e.g., external service unavailability). By configuring a NAK delay at the consumer level, the delay can be managed centrally and adjusted in response to external conditions, such as errors or timeouts from external services. For example, we could normally use a 1 second NAK delay, but if we know that an external service is intermittently unavailable or down for several minutes, the delay could be raised to 60 seconds or more, reducing network traffic and avoiding unnecessary retries every second.
Additionally, this approach allows operators to modify the delay (e.g., making it incremental or adjusting it to changing requirements) directly at the server configuration level without requiring redeployment of client applications. This provides greater flexibility and reduces operational overhead in dynamic environments.

Contribution

No response

@elettrico elettrico added the proposal Enhancement idea or proposal label Dec 30, 2024
@ripienaar
Contributor

Would it work to have the existing backoff configuration also apply to NAK?

@elettrico
Author

Maybe, but the existing backoff configuration relies on MaxDelivery, and there's (of course) no such thing for NAK.
It could be a solution, but in that case we would be assuming that MaxDelivery for NAK equals the length of the backoff array for timeouts, and then treating NAKs as if they were timeouts, which mixes two different scenarios.
We would also need a configuration option to enable or disable this behavior, since we might want to use backoff only for timeouts and not for NAKs, as it is now.
In my opinion, given the effort this approach requires and the drawbacks above, a separate configuration for NAK delays might be the better option.

@ripienaar
Contributor

A NAK is a delivery attempt so MaxDelivery counts on NAKs.

Adding a new setting is a huge undertaking, especially one that's quite close in behaviour to something else. We have received feedback, a lot of it, that NAKs should be subject to backoff also, so that seems a more likely approach.

@elettrico
Author

You are absolutely right about MaxDelivery applying to NAKs (my mistake reading the docs).
I understand your point about adding a new setting, and in general I think it's OK to use the same backoff for both timeouts and NAKs.
Still, please keep in mind that this might create issues for systems that rely on the current behavior or need different handling for timeouts and NAKs. For instance, a timeout could indicate an error that makes the client temporarily unavailable (an unhandled error or a network issue, for example), while a NAK might simply mean "I can't process this right now", whether because of an external issue unrelated to the client itself or because the client is temporarily overwhelmed (e.g., by high load) but may recover and be ready to process the message within a second. There are scenarios where distinguishing between these situations is useful, so this differentiation could be worth considering for the future.

@ripienaar
Contributor

As we have NAK with a custom delay, you could assume that the backoff won't apply if a delay is given. So if you want a NAK that isn't managed by the default backoff, just give a delay of your own.

Anyway, just a thought that backoff might work
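The rule suggested in this comment can be sketched as a small resolution function: an explicit NAK delay wins, and otherwise the consumer's backoff list is consulted by delivery count. This is a sketch of the idea under discussion, not server behavior. `NakDelayResolver` is a hypothetical name, and both the 1-based indexing and the reuse of the last backoff entry once the list is exhausted are assumptions made for illustration.

```java
import java.time.Duration;
import java.util.List;
import java.util.Optional;

// Hypothetical resolver for the redelivery delay after a NAK.
final class NakDelayResolver {
    private NakDelayResolver() {}

    // explicitDelay: the delay passed with the NAK, if any.
    // backoff: the consumer's backoff list (may be empty).
    // deliveryCount: 1 for the first delivery attempt (assumed convention).
    static Duration resolve(Optional<Duration> explicitDelay,
                            List<Duration> backoff,
                            int deliveryCount) {
        // A caller-supplied delay always takes precedence over backoff.
        if (explicitDelay.isPresent()) {
            return explicitDelay.get();
        }
        // No backoff configured: redeliver immediately, as plain NAK does today.
        if (backoff.isEmpty()) {
            return Duration.ZERO;
        }
        // Pick the entry for this attempt, clamping to the last entry.
        int idx = Math.min(Math.max(deliveryCount, 1), backoff.size()) - 1;
        return backoff.get(idx);
    }
}
```

Under this rule, a client that wants immediate redelivery despite a configured backoff would pass an explicit zero delay.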

@elettrico
Author

Yes, I am thinking about cases where no custom delay is passed and immediate redelivery is expected. Also, since my scenario depends on external services, I was considering that it might be useful to differentiate the backoffs between timeouts (internal issues) and NAKs (external issues). But it's probably not a big deal and I'm just overthinking it. You're probably correct.
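The differentiation mentioned here could look roughly like the following: one backoff list per redelivery cause, so that ack timeouts and NAKs back off independently. Everything in this sketch (`RedeliveryCause`, `PerCauseBackoff`, the clamping to the last entry) is hypothetical and only illustrates the shape of the idea; no such option exists in the thread.

```java
import java.time.Duration;
import java.util.List;
import java.util.Map;

// Hypothetical: why a message is being redelivered.
enum RedeliveryCause { ACK_TIMEOUT, NAK }

// Hypothetical: a separate backoff schedule for each cause.
final class PerCauseBackoff {
    private final Map<RedeliveryCause, List<Duration>> schedules;

    PerCauseBackoff(Map<RedeliveryCause, List<Duration>> schedules) {
        this.schedules = Map.copyOf(schedules);
    }

    // deliveryCount is 1 for the first delivery attempt (assumed convention).
    Duration delayFor(RedeliveryCause cause, int deliveryCount) {
        List<Duration> steps = schedules.getOrDefault(cause, List.of());
        // A cause without a schedule falls back to immediate redelivery.
        if (steps.isEmpty()) {
            return Duration.ZERO;
        }
        // Clamp to the last entry once the schedule is exhausted.
        int idx = Math.min(Math.max(deliveryCount, 1), steps.size()) - 1;
        return steps.get(idx);
    }
}
```

For example, a short single-step schedule for NAKs next to a longer one for ack timeouts would keep "can't process this right now" retries fast while still backing off when a client appears to be down.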

2 participants