-
Notifications
You must be signed in to change notification settings - Fork 782
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: timeout issues for list.peerings #2042
base: main
Are you sure you want to change the base?
Conversation
dependency/consul_peering.go
Outdated
// list peering is a blocking API, so making sure the ctx passed while calling it | ||
// times out after the default wait time. | ||
ctx, cancel := context.WithTimeout(context.Background(), DefaultContextTimeout) | ||
defer cancel() | ||
// The first result needs to be "immediate", in which case we use the default timeout settings. | ||
// For all other requests, we use a 'blocking' query, therefore setting the appropriate timeout. | ||
// We add one second to allow for the Consul client to finish sending the response before | ||
// declaring it an exceeded deadline. | ||
var ctx context.Context | ||
if opts.WaitTime != 0 { | ||
ctx, cancel := context.WithTimeout(context.Background(), opts.WaitTime + time.Second) | ||
defer cancel() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure either the original or this patch work the way I'd expect to see with blocking queries, with the caveat that I work on Nomad and not Consul. But their blocking queries behavior should be similar.
When we make a blocking query, we wait for an API response to come back. If there is no change before the wait
time expires, we get a non-error response back with the current value of the query, whether or not it has changed. If we look at how the Peerings.List
API works, the context on this request does not change the wait
parameter! It sets the request timeout from the client. So that means when we hit this context deadline, the server is probably still waiting to send any update and will be sad because we're exiting early.
So if you're making a blocking query, the only reason you want to set a context other than context.Background()
here (as we did for consul_partitions.go
), is if you want the client side to bail out early. As we just saw, waiting until the deadline isn't an error in our use case here. But if you use a context, we'll get a context deadline error and return the error to the caller, instead of saying "ok, no change, nothing to do". Which is the behavior we expect of consul-template.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand your comment and the TODO-comments to the Peerings APIs correctly, it seems like we do not need any context at all, other than perhaps context.Background()
. Not in this case at least. I like that - it simplifies the code.
Thank you for taking the time!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
The previously used `DefaultContextTimeout` does not work well with the user-definable `WaitTime`, which can easily exceed this global timeout. By removing the use of a context deadline, we follow the pattern other dependencies (like consul_peerings) use. This made `DefaultContextTimeout` unused and it was therefore removed. Fixes hashicorp#2041
The previously used
DefaultContextTimeout
does not work well with the user-definableWaitTime
, which can easily exceed this global timeout.Fixes #2041