Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enhancement(http sink): improve config UX #21857

Conversation

KowalczykBartek
Copy link
Contributor

Fix for #21583

@pront pront changed the title Use existing Auth struct enhancement(http provider): improve config UX Nov 21, 2024
@pront pront added the no-changelog Changes in this PR do not need user-facing explanations in the release changelog label Nov 21, 2024
@pront pront changed the title enhancement(http provider): improve config UX enhancement(http sink): improve config UX Nov 21, 2024
@KowalczykBartek
Copy link
Contributor Author

KowalczykBartek commented Nov 21, 2024

@pront what do you think about this bc3bc28 ? this tls makes sense only for oauth2

Problem is that, I had to remove those eq and partial eq (otherwise how to implement it for Option), I am not sure how important it is, the only thing it breaks is

    fn test_parse(input: &str, expected_uri: &'static str, expected_auth: Option<(&str, &str)>) {
        let UriSerde { uri, auth } = input.parse().unwrap();
        assert_eq!(uri, Uri::from_static(expected_uri));
        assert_eq!(
            auth,
            expected_auth.map(|(user, password)| {
                Auth::Basic {
                    user: user.to_owned(),
                    password: password.to_owned().into(),
                }
            })
        );
    }

@pront
Copy link
Member

pront commented Nov 21, 2024

@pront what do you think about this bc3bc28 ? this tls makes sense only for oauth2

Hi @KowalczykBartek, I appreciate you jumping on this so fast!

I am going to push a few commits to this PR and then I would really appreciate if you can run some manual tests again and share the config you used in this PR.

@pront pront requested review from a team as code owners November 21, 2024 01:06
@github-actions github-actions bot added the domain: external docs Anything related to Vector's external, public documentation label Nov 21, 2024
@KowalczykBartek
Copy link
Contributor Author

KowalczykBartek commented Nov 21, 2024

@pront you removed

pub tls: Option<TlsConfig>,

now, there is no option to pass certs for mtls scenario - that was the idea behind this ticket, your http rest client can use different tls settings than cert/key used to acquire bearer token :/ this is the minimal working solution bc3bc28 (I didn't put this tls there by accident, definitely not a duplicate f599b37#diff-0820c0d03ae75b35f0b5bbd8dbc8e4d7ca048d19d60351bef069ac00c236910dL670)

share the config you used in this PR.

#21583 (comment)

@pront
Copy link
Member

pront commented Nov 21, 2024

@pront you removed

pub tls: Option<TlsConfig>,

now, there is no option to pass certs for mtls scenario - that was the idea behind this ticket, your http rest client can use different tls settings than cert/key used to acquire bearer token

Ah yes indeed, feel free to add this back. Please just push commits directly to this PR and I will take a look.

If you add that back, you can use make generate-component-docs to regenerate the docs.

You can also do cd website && make run-production-site-locally to see the result after the CUE files are updated. But that might need Hugo version 0.84.0.

share the config you used in this PR.

#21583 (comment)

Apologies if I wasn't clear, I meant that after this PR is finalized the config schema will change e.g. the http_client_authorization_strategy field should be removed right?

@KowalczykBartek
Copy link
Contributor Author

 feel free to add this back

but now there is a problem I wrote already : #21857 (comment)
I don't know how to fix this.

@pront
Copy link
Member

pront commented Nov 21, 2024

 feel free to add this back

but now there is a problem I wrote already : #21857 (comment) I don't know how to fix this.

I see, it's because TlsConfig doesn't derive PartialEq and Eq. But you can just add them and everything should just work:

/// TLS configuration.
#[configurable_component]
#[configurable(metadata(docs::advanced))]
#[derive(Clone, Debug, Default, PartialEq, Eq)]
#[serde(deny_unknown_fields)]
pub struct TlsConfig {

@github-actions github-actions bot added the domain: core Anything related to core crates i.e. vector-core, core-common, etc label Nov 21, 2024
@KowalczykBartek
Copy link
Contributor Author

KowalczykBartek commented Nov 21, 2024

Ok, i think we need to revert this from master. This will just not work - da06b02

sinks:
  my_sink_id:
    auth:
      strategy: "o_auth2"
      client_id: "xoxoxoxoxoxox"
      grace_period: 43100
      token_endpoint: "https://xoxoxoxoxox/oauth/token"
      tls:
        crt_path: xoxoxoxoxox/cert.pem
        key_file: xoxoxoxoxoxo/key.pem
    type: http
    encoding:
      codec: raw_message
    inputs:
      - my_source_id
    uri: http://localhost:9091/output

this will also kicks a

    pub fn apply_headers_map(&self, map: &mut HeaderMap) {
        match &self {
            Auth::Basic { user, password } => {
                let auth = Authorization::basic(user.as_str(), password.inner());
                map.typed_insert(auth);
            }
            Auth::Bearer { token } => match Authorization::bearer(token.inner()) {
                Ok(auth) => map.typed_insert(auth),
                Err(error) => error!(message = "Invalid bearer token.", token = %token, %error),
            },
            Auth::OAuth2 { .. } => panic!("OAuth2 authentication is not supported currently"),
        }
    }

which just crash vector, that's why I created separate property, to distinguish that from the beginning

But I fixed a major refactoring, so at least its again rfc compliant :)

@pront
Copy link
Member

pront commented Nov 21, 2024

Ok, i think we need to revert this from master. This will just not work - da06b02

sinks:
  my_sink_id:
    auth:
      strategy: "o_auth2"
      client_id: "xoxoxoxoxoxox"
      grace_period: 43100
      token_endpoint: "https://xoxoxoxoxox/oauth/token"
      tls:
        crt_path: xoxoxoxoxox/cert.pem
        key_file: xoxoxoxoxoxo/key.pem
    type: http
    encoding:
      codec: raw_message
    inputs:
      - my_source_id
    uri: http://localhost:9091/output

The config format looks good 👍
However, I cannot follow all the above. Are you suggesting to revert the original PR?

this will also kicks a

    pub fn apply_headers_map(&self, map: &mut HeaderMap) {
        match &self {
            Auth::Basic { user, password } => {
                let auth = Authorization::basic(user.as_str(), password.inner());
                map.typed_insert(auth);
            }
            Auth::Bearer { token } => match Authorization::bearer(token.inner()) {
                Ok(auth) => map.typed_insert(auth),
                Err(error) => error!(message = "Invalid bearer token.", token = %token, %error),
            },
            Auth::OAuth2 { .. } => panic!("OAuth2 authentication is not supported currently"),
        }
    }

which just crash vector, that's why I created separate property, to distinguish that from the beginning

But I fixed a major refactoring, so at least its again rfc compliant :)

Does the RFC specify mandatory headers? If not, then we can replace the panic! with a () (noop).

@KowalczykBartek
Copy link
Contributor Author

KowalczykBartek commented Nov 21, 2024

Does the RFC specify mandatory headers? If not, then we can replace the panic! with a () (noop).

no no, both properties are now used by two different mechanisms, this conceptually is wrong, there needs to be different property than auth, cuz auth is already used, so, this 66e619c is minimal what can be done, but its redundant so... I have no more ideas

@pront
Copy link
Member

pront commented Nov 21, 2024

Does the RFC specify mandatory headers? If not, then we can replace the panic! with a () (noop).

no no, both properties are now used by two different mechanisms, this conceptually is wrong, there needs to be different property than auth, cuz auth is already used, so, this 66e619c is minimal what can be done, but its redundant so... I have no more ideas

To summarize, I understand why you want TLS settings directly on your OAuth variant and on HttpSinkConfig.

Are you saying there is a need to have both auth and authorization_config on HttpSinkConfig? Is there a scenario where you would need to specify both? E.g.

sinks:
  my_sink_id:
    type: http
    encoding:
      codec: raw_message
    inputs:
      - my_source_id
    uri: http://localhost:9091/output

    # first
    auth:
      strategy: bearer
      token: xxx

   # second
   authorization_config:
      strategy: "o_auth2"
      client_id: "xoxoxoxoxoxox"
      grace_period: 43100
      token_endpoint: "https://xoxoxoxoxox/oauth/token"
      tls:
        crt_path: xoxoxoxoxox/cert.pem
        key_file: xoxoxoxoxoxo/key.pem

It would be a shame to drop all this work. If you can explain a bit better what are the challenges are here, I am confident we can find a solution.

@jszwedko jszwedko added this to the Vector v0.43.0 milestone Nov 21, 2024
@KowalczykBartek
Copy link
Contributor Author

KowalczykBartek commented Nov 21, 2024

@pront I propose to run this locally, with this config #21857 (comment) (any oauth without tls atm is ok).
then you will see the panic and why I am writing about two mechanism kicking in
and what have to be done is explained here #21583 (comment)

@pront
Copy link
Member

pront commented Nov 21, 2024

@pront I propose to run this locally, with this config #21857 (comment) (any oauth without tls atm is ok). then you will see the panic and why I am writing about two mechanism kicking in and what have to be done is explained here #21583 (comment)

ACK. I doubt I can get to this before the release since I am occupied with some other work. Worst case, I will revert the original PR and then we can return to this without any time pressure.

@KowalczykBartek
Copy link
Contributor Author

In the worst case this PR will be a signpost for next brave hero :)

@KowalczykBartek
Copy link
Contributor Author

KowalczykBartek commented Nov 22, 2024

Ok, I got drunk yesterday (it's because of the weather, charms of eastern Europe..), and I had trouble reading, I went throught this discussion once again, so, lets summarize:

First, I think we were not on the same track, since this commit bc3bc28#diff-0820c0d03ae75b35f0b5bbd8dbc8e4d7ca048d19d60351bef069ac00c236910dL579 , @pront, you was referring to following state (here #21857 (comment)) :

    # first
    auth:
      strategy: bearer
      token: xxx

   # second
   authorization_config:
      strategy: "o_auth2"
      client_id: "xoxoxoxoxoxox"
      grace_period: 43100
      token_endpoint: "https://xoxoxoxoxox/oauth/token"
      tls:
        crt_path: xoxoxoxoxox/cert.pem
        key_file: xoxoxoxoxoxo/key.pem

I was already assuming, there is need to remove authorization_config and rely only on

    auth:
      strategy: bearer
      token: xxx

(and, because of that, both mechanisms were kicked in when this auth was set, because after this commit, both relied on the same property, this lead to panic inside the vector)

I bring back this property bf8f0f4#diff-0820c0d03ae75b35f0b5bbd8dbc8e4d7ca048d19d60351bef069ac00c236910dR575 so, right now, config looks like this

sinks:
  my_sink_id:
    authorization_config:
      strategy:
        strategy: "o_auth2"
        client_id: "myawesomeclientid"
        client_secret: "myawesomesecret"
        token_endpoint: "https://myawesomeoauthprovider/oauth/token"
        grace_period: 123
      tls:
        crt_path: /cert.pem
        key_file: /key.pem

and, there is already existing Auth enum used (as requested here #21583 (comment)), one problem (or not a problem, that's how it works) is:

      strategy:
        strategy: "o_auth2"

there is this duplicate strategy, but, I cannot make this existing auth struct untagged, because its already tagged, it would influence existing configuration

#[configurable_component]
#[derive(Clone, Debug, Eq, PartialEq)]
#[serde(deny_unknown_fields, rename_all = "snake_case", tag = "strategy")]
#[configurable(metadata(docs::enum_tag_description = "The authentication strategy to use."))]
pub enum Auth {

(this commit include all steps from #21583 (comment) )

I also moved tls from auth's oauth2 variant to

/// Configuration for HTTP client providing an authentication mechanism.
#[configurable_component]
#[configurable(metadata(docs::advanced))]
#[derive(Clone, Debug)]
#[serde(deny_unknown_fields)]
pub struct AuthorizationConfig {
    /// Define how to authorize against an upstream.
    #[configurable]
    strategy: Auth,

    /// The TLS settings for the http client's connection.
    ///
    /// Optional, constrains TLS settings for this http client.
    #[configurable(derived)]
    tls: Option<TlsConfig>,
}

if someone will need to have both basic and mtls, so, it gives more flexibility

P.S I test this on real env, it works ( I can share a small Java app, which can be used as local auth provider, expecting certs)

P.S.2 I have question, how can I build locally a dev docker image, I need to check something on k8s, but I am unable to find any guide how to do this, there are multiple scripts, but cannot make it working, also, I am on arm (m2), so, not sure its supported currently

@pront
Copy link
Member

pront commented Nov 22, 2024

how can I build locally a dev docker image, I need to check something on k8s

I wonder if this scripts/build-docker.sh works in your environment.
Otherwise, you can build locally and move the binary to a running docker instance.


Regarding the config UX, I will probably have to checkout this PR again and see if I can change the UX to just this:

    auth:
      strategy: "o_auth2"
      client_id: "xoxoxoxoxoxox"
      grace_period: 43100
      token_endpoint: "https://xoxoxoxoxox/oauth/token"
      tls:
        crt_path: xoxoxoxoxox/cert.pem
        key_file: xoxoxoxoxoxo/key.pem

From your message above it sounds like we don't need two auth configs at the same time but there are some details that get in the way. But bear with me, I am busy with other things so it might take a while.

@KowalczykBartek
Copy link
Contributor Author

KowalczykBartek commented Nov 22, 2024

like we don't need two auth configs at the same time

no, definitely. I would even say, that my solution is a replacement for existing(it does the same but its implemented in correct place, in the rest client), this method with apply_headers is very limited (location is also strange).

Regarding the config UX, I will probably have to checkout this PR again and see if I can change the UX to just this:

I don't think that's doable

diff --git a/src/http.rs b/src/http.rs
index 4c849f1ef..dfa228fd6 100644
--- a/src/http.rs
+++ b/src/http.rs
@@ -580,6 +580,7 @@ impl<B> fmt::Debug for HttpClient<B> {
 pub struct AuthorizationConfig {
     /// Define how to authorize against an upstream.
     #[configurable]
+    #[serde(flatten)]
     strategy: Auth,
 
     /// The TLS settings for the http client's connection.
@@ -595,7 +596,7 @@ pub struct AuthorizationConfig {
 /// HTTP header without any additional encryption beyond what is provided by the transport itself.
 #[configurable_component]
 #[derive(Clone, Debug, Eq, PartialEq)]
-#[serde(deny_unknown_fields, rename_all = "snake_case", tag = "strategy")]
+#[serde(untagged, deny_unknown_fields)]
 #[configurable(metadata(docs::enum_tag_description = "The authentication strategy to use."))]
 pub enum Auth {
     /// Basic authentication.

but it doesn't work

vector::cli: Configuration error. error=data did not match any variant of untagged enum Auth

with following approach

@@ -595,7 +596,7 @@ pub struct AuthorizationConfig {
 /// HTTP header without any additional encryption beyond what is provided by the transport itself.
 #[configurable_component]
 #[derive(Clone, Debug, Eq, PartialEq)]
-#[serde(deny_unknown_fields, rename_all = "snake_case", tag = "strategy")]
+#[serde(tag = "strategy", deny_unknown_fields)]
 #[configurable(metadata(docs::enum_tag_description = "The authentication strategy to use."))]
 pub enum Auth {
     /// Basic authentication.

also, doesn't work

Configuration error. error=unknown field `client_id`

Regarding docker,

./scripts/build-docker.sh     

what parameters should I pass there ?

+ SUPPORTED_PLATFORMS=([debian]="linux/amd64,linux/arm/v6,linux/arm/v7,linux/arm64/v8" [alpine]="linux/amd64,linux/arm/v6,linux/arm/v7,linux/arm64/v8" [distroless-static]="linux/amd64,linux/arm/v7,linux/arm64/v8" [distroless-libc]="linux/amd64,linux/arm/v7,linux/arm64/v8")
./scripts/build-docker.sh: line 21: debian: unbound variable

pront and others added 3 commits November 25, 2024 14:12
Co-authored-by: Esther Kim <esthercodes2019@gmail.com>
Co-authored-by: Esther Kim <esthercodes2019@gmail.com>
Co-authored-by: Esther Kim <esthercodes2019@gmail.com>
@pront
Copy link
Member

pront commented Nov 25, 2024

Ok one last effort to sync on this. Otherwise, I am going to revert #21583 to unblock the next release. And hopefully we can tackle this in the future.


my solution is a replacement for existing(it does the same but its implemented in correct place, in the rest client)

So we have:

  1. basic (static user/pass creds)
  2. brearer (static string)
  3. token provided by endpoint (dynamic)

Right?

Can you explain to me why we need to modify the existing code when we have (1) or (2)?

I would expect that when you use (3) you can do whatever you want.

I don't see why we cannot add the 3rd one to this struct.


this method with apply_headers is very limited (location is also strange).

This made me think....

I would expect that each HTTP request sets the OAuth token (as provided by the new endpoint.
Example:

POST /api/resource HTTP/1.1
Host: api.example.com
Authorization: Bearer abc123xyz
Content-Type: application/json

{
  "data": "example"
}

Yes/no?

@KowalczykBartek
Copy link
Contributor Author

KowalczykBartek commented Nov 25, 2024

Can you explain to me why we need to modify the existing code when we have (1) or (2)?

where I modified existing codebase ? I provided implementation strictly for oauth token acquisition, additionally I added for user+password (so, my implementation, can be drop-in-replacement for non async apply_headers method).

I would expect that each HTTP request sets the OAuth token (as provided by the new endpoint.
Example:

how to add dynamic async stuff to non async method that has no state... I don't get honestly questions around this implementation :/

And hopefully we can tackle this in the future.

there is nothing to add in this pr, if thats not enought I cannot do more

@pront
Copy link
Member

pront commented Nov 25, 2024

additionally I added for user+password (so, my implementation, can be drop-in-replacement for non async apply_headers method).

Gotcha, that's where the confusion was. Since this is a critical component I am afraid we are doing way too much at once. I am open to replace the existing authentication method but as a two-phase process. So basically, a potential path forward here is:

  • keep (1) and (2) as is (important)
    • could be deprecated in favor of the new one
    • but let's not advertise as deprecated yet
  • also keep the new AuthorizationConfig as a new beta feature
    • document that it is a replacement for the existing auth field
    • Introduce a Auth2 enum to make it easier to maintain
    • rename strategy: Auth, to auth: Auth2
      • let me come back to you, I will check locally if we can flatten it ;)
  • add config validation to throw an error if both are defined (important)
    • this basically prohibits setting both

Does this make sense to you?

If not, I will go ahead and revert and maybe someone from the team can take a crack at it in the future.

@KowalczykBartek
Copy link
Contributor Author

i will go ahead and revert

pls go ahead, unfortunately i cannot spend more time on this

@KowalczykBartek
Copy link
Contributor Author

KowalczykBartek commented Nov 25, 2024

@pront

but we really want to avoid breaking existing users.

break what ?? I provided 100% backward compatible implementation

sinks:
  my_sink_id:
    authorization_config:
      strategy:
        strategy: "o_auth2"
        client_id: "myawesomeclientid"
        client_secret: "myawesomesecret"
        token_endpoint: "https://myawesomeoauthprovider/oauth/token"
        grace_period: 123

this code is started with separate flag, old is old, new is new - I started with new structres, but there were comments to make them common, ok I did it - but where is breaking change ?

@pront
Copy link
Member

pront commented Nov 25, 2024

pls go ahead, unfortunately i cannot spend more time on this

Thank you for your time. In the future, we will revive #21583 and address the UX issue as outlined here.

@KowalczykBartek KowalczykBartek closed this by deleting the head repository Nov 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain: core Anything related to core crates i.e. vector-core, core-common, etc domain: external docs Anything related to Vector's external, public documentation domain: sinks Anything related to the Vector's sinks no-changelog Changes in this PR do not need user-facing explanations in the release changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants