-
Notifications
You must be signed in to change notification settings - Fork 4.6k
credentials: implement file-based JWT Call Credentials (part 1 for A97) #8431
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #8431 +/- ##
==========================================
- Coverage 82.27% 80.65% -1.62%
==========================================
Files 414 415 +1
Lines 40424 40969 +545
==========================================
- Hits 33259 33044 -215
- Misses 5795 6244 +449
- Partials 1370 1681 +311
🚀 New features to boost your workflow:
|
@dfawley hey 👋 Given you approved A97, would you mind having a cursory look at the PR to confirm if at least at a high level the approach looks good? |
I will take a look at this , I need to go through the gRFC first. |
Sorry for the delay here. @easwars would you be able to review this change? I think you have more background into some of the things than I do, like the bootstrap integration. Thank you! |
Thank you for your contribution @dimpavloff. Yes, it would be nice if you can split this into smaller PRs. I will continue to use this PR to review the JWT call credentials implementation. If you can move the xDS implementation out to one or more PRs, I would greatly appreciate that and would be happy to review them as well. |
|
||
// Verify cached expiration is 30 seconds before actual token expiration | ||
impl := creds.(*jwtTokenFileCallCreds) | ||
impl.mu.RLock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please only test the API surface. Relying on implementation internals in tests makes them brittle and would result in test changes when any changes to implementation is made.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume you are referring to using a private field rather than obtaining mu specifically.
In general I agree -- white box tests may get fragile and break during a refactor. However, this test and the next couple of ones are about the caching behaviour -- it is meant to be transparent to the external API. If I don't make assertions about the private fields, the tests may pass trivially and become more flaky (e.g. when testing the backoff in the next test).
One alternative could be factoring out these behaviours out into a separate private struct with "public" functions which expose the same information. Given that it would require shifting the majority of the implementation into that struct, I'm not sure it is an improvement from the current approach.
Please do let me know your thoughts and if you have other suggestions.
@arjan-bal ping |
credentials/jwt/jwt_file_reader.go
Outdated
} | ||
|
||
// jWTFileReader handles reading and parsing JWT tokens from files. | ||
type jWTFileReader struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: This struct should be named jwtFileReader
. See https://google.github.io/styleguide/go/decisions#initialisms
credentials/jwt/jwt_file_reader.go
Outdated
func (r *jWTFileReader) extractExpiration(token string) (time.Time, error) { | ||
parts := strings.Split(token, ".") | ||
if len(parts) != 3 { | ||
return time.Time{}, fmt.Errorf("expected 3 parts, got %d: %w", len(parts), errJWTValidation) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method seems to be wrapping errJWTValidation
in every error it returns. I think we should move the wrapping to the calling function to avoid redundancy.
If there are reasons to perform the wrapping in this method, we should mention that the cases in which the returned error is an instance of errJWTValidation
in the godoc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point 👍
credentials/jwt/jwt_file_reader.go
Outdated
} | ||
|
||
// extractExpiration parses the JWT token to extract the expiration time. | ||
func (r *jWTFileReader) extractExpiration(token string) (time.Time, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
JWT parsing seems like a fairly complex task. Even though we aren't verifying the token, my opinion is that we should use a popular library like golang-jwt. I would like to get others' opinions on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I scanned through that library. Looks like it has a lot of functionality. In the past, we have decided to implement things ourselves when we only need a small amount of functionality from a library. Also, we already have this functionality in the existing GCP JWT token call creds. So, I'm leaning against not taking a dependency on an external library (even though it seems to be a 1.x). Want to hear Doug's thoughts on this though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This functionality is preset in the go std lib.
import "golang.org/x/oauth2/jws"
func (r *jwtFileReader) extractExpiration(token string) (time.Time, error) {
cs, err := jws.Decode(token)
if err != nil {
return time.Time{}, err
}
expTime := time.Unix(cs.Exp, 0)
if expTime.Before(time.Now()) {
return time.Time{}, fmt.Errorf("expired token: %w", errJWTValidation)
}
return expTime, nil
}
This is what the Google oauth2 implementation uses internally. The jws package is marked as deprecated with the following message:
Deprecated: this package is not intended for public use and might be removed in the future. It exists for internal use only. Please switch to another JWS package or copy this package into your own source tree.
I think we can take inspiration from their implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can take inspiration from their implementation.
Their implementation's jws.Decode
:
- Parses the token by splitting on the
.
character https://cs.opensource.google/go/x/oauth2/+/refs/tags/v0.31.0:jws/jws.go;drc=696f7b31289a98558822be146698b7834e477e63;l=182 - Calls
base64.RawURLEncoding.DecodeString(claims)
- Decodes the JWT Claims
json.NewDecoder(bytes.NewBuffer(decoded)).Decode(c)
This is exactly what the current function does, barring minor detail of what exact functions are called to achieve these steps. So I am struggling to understand what your suggestion is here – copy paste the code verbatim so that it's exactly the same as in golang.org/x/oauth2/jws
? If yes, why is that preferred? Or ignore golang.org/x/oauth2/jws
's deprecation notice and do what you have in the snippet?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies for the confusion, this thread is not actionable now. I left it open for other reviewers to see. I left comments about replacing strings.Split
for efficiency and using RawURLEncoding
below after going through the std lib implementation.
* | ||
*/ | ||
|
||
// Package jwt implements gRPC credentials using JWT tokens from files. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only a single file in the package should have the package level godoc. Since it's there in doc.go, we should remove it from here.
if needsPreemptiveRefresh { | ||
// Start refresh if not pending (handling the prior RPC may have | ||
// just spawned a goroutine). | ||
if !c.pendingRefresh { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Can we combine this with the outer check to reduce one level of indentation?
if needsPreemptiveRefresh && !c.pendingRefresh {
|
||
// refreshToken reads the token from file and updates the cached data. | ||
func (c *jwtTokenFileCallCreds) refreshToken() { | ||
// Deliberately not locking c.mu here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the mutex lock omitted here to avoid doing file IO while holding the lock? If that's the case, we should state this in the comment for the benefit of readers.
I think this is safe only as long fileReader
can handle concurrent calls. It can presently, since it doesn't have any mutable state. We should probably this requirement in the godoc of fileReader
to serve as a good warning in case state is ever added to fileReader. Something like "It is safe to call methods on this type concurrently" should work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the mutex lock omitted here to avoid doing file IO while holding the lock?
Yes, exactly.
We should probably this requirement in the godoc of fileReader to serve as a good warning in case state is ever added to fileReader.
Good idea! 👍
c.mu.Lock() | ||
defer c.mu.Unlock() | ||
c.updateCacheLocked(token, expiry, err) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Please omit empty line here, it doesn't seem to improve readability here.
c.nextRetryTime = time.Time{} | ||
|
||
c.cachedAuthHeader = "Bearer " + token | ||
// Per RFC A97: consider token invalid if it expires within the next 30 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: This should be Per gRFC
instead of Per RFC
.
func (c *jwtTokenFileCallCreds) GetRequestMetadata(ctx context.Context, _ ...string) (map[string]string, error) { | ||
ri, _ := credentials.RequestInfoFromContext(ctx) | ||
if err := credentials.CheckSecurityLevel(ri.AuthInfo, credentials.PrivacyAndIntegrity); err != nil { | ||
return nil, fmt.Errorf("cannot send secure credentials on an insecure connection: %w", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Use %v
instead of %w
to format the embedded error here. Using %w
can leak the type of the embedded error and make it part of the API which people may start depending on. See https://google.github.io/styleguide/go/best-practices#adding-information-to-errors
It is best to avoid using %w unless you also document (and have tests that validate) the underlying errors that you expose. If you do not expect your caller to call errors.Unwrap, errors.Is and so on, don’t bother with %w.
impl.mu.Unlock() | ||
|
||
if !nextRetryTime2.Equal(nextRetryTime) { | ||
t.Errorf("nextRetryTime should not change due to backoff. Got: %v, Want: %v", nextRetryTime2, nextRetryTime) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check seems to rely on the fact that time b/w the two GetRequestMetadata
calls is always <= 0.8s. I think we should try to avoid relying on time like this because overloaded CI infra can cause test execution to slow down occasionally. Can we change the nextRetryTime
to something larger than defaultTestTimeout
before this verification?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed; I've removed the reliance on the backoff strategy implementation entirely and now we control the nextRetryTime
whenever we want to retry / ensure no retries are made.
// Third call should retry but still fail with UNAVAILABLE. | ||
_, err3 := creds.GetRequestMetadata(ctx) | ||
if err3 == nil { | ||
t.Fatal("Expected cached error") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: Use the word want
instead of expect
, e.g.:
t.Fatalf("creds.GetRequestMetadata() = %v, want non-nil". err)
See https://google.github.io/styleguide/go/decisions#got-before-want
impl.mu.Unlock() | ||
|
||
// Third call should retry but still fail with UNAVAILABLE. | ||
_, err3 := creds.GetRequestMetadata(ctx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Since we're not using the old err
variable again, we can re-use the err
names instead of adding numbers as suffixes.
t.Errorf("nextRetryTime should not change due to backoff. Got: %v, Want: %v", nextRetryTime4, nextRetryTime3) | ||
} | ||
if retryAttempt4 != retryAttempt3 { | ||
t.Error("retry attempt should not change due to backoff") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: Log message should being with a capital letter, like regular sentences.
t.Errorf("after creating valid token file, GetRequestMetadata() should eventually succeed, but got: %v", err5) | ||
t.Error("backoff should expire and trigger new attempt on next RPC") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: Can we combine these two into a single t.Fatalf
call? This should also allow us to un-indent the else
block below. It is fine if the log message takes up more horizontal space.
See https://google.github.io/styleguide/go/guide#line-length
credentials/jwt/jwt_file_reader.go
Outdated
} | ||
|
||
// extractExpiration parses the JWT token to extract the expiration time. | ||
func (r *jWTFileReader) extractExpiration(token string) (time.Time, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This functionality is preset in the go std lib.
import "golang.org/x/oauth2/jws"
func (r *jwtFileReader) extractExpiration(token string) (time.Time, error) {
cs, err := jws.Decode(token)
if err != nil {
return time.Time{}, err
}
expTime := time.Unix(cs.Exp, 0)
if expTime.Before(time.Now()) {
return time.Time{}, fmt.Errorf("expired token: %w", errJWTValidation)
}
return expTime, nil
}
This is what the Google oauth2 implementation uses internally. The jws package is marked as deprecated with the following message:
Deprecated: this package is not intended for public use and might be removed in the future. It exists for internal use only. Please switch to another JWS package or copy this package into your own source tree.
I think we can take inspiration from their implementation.
credentials/jwt/jwt_file_reader.go
Outdated
// Add padding if necessary for base64 decoding. | ||
if m := len(payload) % 4; m != 0 { | ||
payload += strings.Repeat("=", 4-m) | ||
} | ||
|
||
payloadBytes, err := base64.URLEncoding.DecodeString(payload) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can use base64.RawURLEncoding
to handle padding for us. See the std lib implementation.
credentials/jwt/jwt_file_reader.go
Outdated
parts := strings.Split(token, ".") | ||
if len(parts) != 3 { | ||
return time.Time{}, fmt.Errorf("expected 3 parts, got %d: %w", len(parts), errJWTValidation) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should use the peel-off style used by the std lib implementation as its more efficient. It avoids extra allocations only re-slices the existing string a couple of times.
…hed value; also explain preemptive refresh test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review @arjan-bal . Could you please take another look?
|
||
// refreshToken reads the token from file and updates the cached data. | ||
func (c *jwtTokenFileCallCreds) refreshToken() { | ||
// Deliberately not locking c.mu here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the mutex lock omitted here to avoid doing file IO while holding the lock?
Yes, exactly.
We should probably this requirement in the godoc of fileReader to serve as a good warning in case state is ever added to fileReader.
Good idea! 👍
grpctest.Tester | ||
} | ||
|
||
func TestTokenFileCallCreds(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see, nicely spotted. Thanks
name: "insufficient_security_level", | ||
tokenContent: createTestJWT(t, now.Add(time.Hour)), | ||
authInfo: &testAuthInfo{secLevel: credentials.NoSecurity}, | ||
wantCode: codes.Unknown, // http2Client.getCallAuthData actually transforms such errors into into Unauthenticated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was referring to a few lines above --
grpc-go/internal/transport/http2_client.go
Line 691 in c122250
return nil, status.Error(codes.Unauthenticated, "transport: cannot send secure credentials on an insecure connection") |
{ | ||
name: "unreachable_token_file", | ||
invalidTokenPath: true, | ||
tokenContent: "", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree but because I am varying the value across the test cases, I thought it would be clearer to the reader what the intent of each test is as it's easier to compare visually.
The rule you quoted supports this: Zero-value fields may be omitted from struct literals when clarity is not lost as a result.
My preference would be to keep it as is but happy to remove if you insist.
t.Fatal("token should be cached after successful GetRequestMetadata") | ||
} | ||
|
||
if !shouldTriggerRefresh { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It just explicitly confirms the precondition necessary for the rest of the test -- the test assumes that the subsequent call (the second call) will trigger a refresh. So this test will fail sooner if this doesn't happen.
I've updated the failure message to something more useful but happy to remove it, too, as it's not strictly needed.
impl.mu.Unlock() | ||
|
||
if !nextRetryTime2.Equal(nextRetryTime) { | ||
t.Errorf("nextRetryTime should not change due to backoff. Got: %v, Want: %v", nextRetryTime2, nextRetryTime) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed; I've removed the reliance on the backoff strategy implementation entirely and now we control the nextRetryTime
whenever we want to retry / ensure no retries are made.
if retryAttempt != 1 { | ||
t.Errorf("Expected retry attempt to be 1, got %d", retryAttempt) | ||
} | ||
if nextRetryTime.IsZero() || nextRetryTime.Before(time.Now()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, doesn't this make the test a bit more noisy as it will have to be aware of the backoff details, which is another implementation detail, calculate the above, and explain it?
I've attempted to improve the test flow a bit. Please lmk what you think.
Part one for grpc/proposal#492 (A97).
This is done in a new
credentials/jwt
package to provide file-based PerRPCCallCredentials. It can be used beyond XDS. The package handles token reloading, caching, and validation as per A97 .There will be a separate PR which uses it in
xds/bootstrap
.Whilst implementing the above, I considered
credentials/oauth
andcredentials/xds
packages instead of creating a new one. The former package hasNewJWTAccessFromKey
andjwtAccess
which seem very relevant at first. However, I think thejwtAccess
behaviour seems more tailored towards Google services. Also, the refresh, caching, and error behaviour for A97 is quite different than what's already there and therefore a separate implementation would have still made sense.WRT
credentials/xds
, it could have been extended to both handle transport and call credentials. However, this is a bit at odds with A97 which says that the implementation should be non-XDS specific and, from reading between the lines, usable beyond XDS.I think the current approach makes review easier but because of the similarities with the other two packages, it is a bit confusing to navigate. Please let me know whether the structure should change.
Relates to istio/istio#53532
RELEASE NOTES:
credentials/jwt
package providing file-based JWT PerRPCCredentials (A97)