[Question] Configuring Hangfire to avoid logging errors on transient database timeouts #2493
Comments
Hm, all logging levels, starting from Any chance you are logging
Thanks for responding so quickly, and apologies for the delayed reply. I've double-checked and can confirm that the logs are occurring at the As above, the error message states that a retry will be performed in 4s. Ideally, no error log would be emitted until a few retries have occurred. To clarify, the logging mechanism works as documented when errors occur within our user code (e.g. when a background task fails to read from the DB, no error logs are emitted until the retry count has been exceeded). The problem seems to arise when Hangfire itself encounters a transient database error while trying to read job metadata from the database (outside the scope of our user code). Hopefully that helps. Please let me know if there's any other info I can provide!
Can you post some more of the error messages that occur too often, ideally with the stack traces of the corresponding exceptions? I need to understand where this problem happens in order to see the possible workarounds.
The frequency itself isn't the problem. The issue is that the error is caused by an expected error state (a connectivity blip) which we expect to resolve itself on retry. Unfortunately, I can't provide any other examples. As of writing, this is the only exception we are getting that appears to be raised before a retry has been attempted.
Here is a stack trace of the exception which is being surfaced:
As indicated by the trace, the exception seems to be surfaced by
Edit: apologies for the duplicate posts.
I am using Hangfire and Serilog (backed by SQL Server storage) for background tasks in my .NET application, with monitoring set up to notify us when any unexpected exceptions occur.
Because the application is hosted in a cloud environment, transient database errors occur fairly regularly. This is fine for the jobs themselves, because the retry policies are configured to log only once all retries are exhausted. The problem arises when a transient database connectivity blip causes Hangfire itself to fail to read job metadata from our SQL database. Whenever this occurs, an exception is thrown and an error is logged, e.g.
Execution Worker is in the Failed state now due to an exception, execution will be retried in no more than 00:00:04
In almost all cases, the database recovers on retry. Is there a way to configure Hangfire to log these errors only after a certain number of retries have occurred?
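For illustration, the behaviour being asked for could be approximated at the Serilog layer rather than inside Hangfire. The following is only a sketch under assumptions, not a Hangfire feature: `TransientRetryFilter` is a hypothetical helper, the match relies on the message text quoted above staying the same, and a time window is assumed as the way to decide when a run of failures has ended. It uses Serilog's `ILogEventFilter` interface and `Filter.With`; the console sink assumes the `Serilog.Sinks.Console` package is installed.

```csharp
using System;
using Serilog;
using Serilog.Core;
using Serilog.Events;

// Hypothetical helper (not part of Hangfire or Serilog): drops the first
// few matching error events in a streak and lets later ones through.
public sealed class TransientRetryFilter : ILogEventFilter
{
    // Assumption: text taken from the error message quoted in this issue.
    private const string Marker = "execution will be retried in no more than";

    private readonly int _threshold;
    private readonly TimeSpan _window;
    private readonly object _gate = new object();
    private int _count;
    private DateTimeOffset _last = DateTimeOffset.MinValue;

    public TransientRetryFilter(int threshold, TimeSpan window)
    {
        _threshold = threshold;
        _window = window;
    }

    // Serilog keeps the event when this returns true and drops it otherwise.
    public bool IsEnabled(LogEvent logEvent)
    {
        // Events that are not the retry error pass through untouched.
        if (logEvent.Level < LogEventLevel.Error ||
            !logEvent.RenderMessage().Contains(Marker))
        {
            return true;
        }

        lock (_gate)
        {
            // Start a new streak if the previous match is older than the window.
            if (logEvent.Timestamp - _last > _window)
            {
                _count = 0;
            }

            _last = logEvent.Timestamp;
            _count++;

            // Suppress the first `_threshold` matches of each streak.
            return _count > _threshold;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        Log.Logger = new LoggerConfiguration()
            // Example values: emit only from the 4th match within a minute.
            .Filter.With(new TransientRetryFilter(3, TimeSpan.FromMinutes(1)))
            .WriteTo.Console()
            .CreateLogger();
    }
}
```

Note that the counter in this sketch is shared by every worker that logs the message, so several workers failing at once would reach the threshold sooner than a single worker retrying repeatedly.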