Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/gen 2030 group sources to destination #2112

Open
wants to merge 282 commits into
base: main
Choose a base branch
from

Conversation

alonkeyval
Copy link
Collaborator

No description provided.

damemi and others added 30 commits December 23, 2024 13:41
Create & Delete Source CRD via UI
…yncWorkloadsInNamespace to handle source labeling
@alonkeyval alonkeyval changed the base branch from feature/source-crd to main January 13, 2025 10:26
Copy link
Contributor

@damemi damemi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add an e2e test for this? You should be able to add it to tests/e2e/source to show:

  • Workload Source with odigos.io/group- label works with grouping
  • Namespace Source with odigos.io/group- label includes all the workloads in the namespace
  • Workload Source with disableInstrumentation=true excludes that Source from the grouping (? not sure what the behavior should be here)

if len(dest.Spec.SourceSelector.Groups) > 0 {
matchedSources := fetchSourcesByGroups(ctx, kubeClient, dest.Spec.SourceSelector.Groups, logger)
for _, source := range matchedSources {
key := fmt.Sprintf("%s/%s/%s", source.Spec.Workload.Namespace, source.Spec.Workload.Name, source.Spec.Workload.Kind)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how will this work with Namespace Sources? You can create one Source to instrument an entire namespace, so won't you need to do some sort of linking between the workload and the namespace source?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently we rely on the source crd to represent a group. What do you think about creating the source crd if it doesn't exist (because the source is represented by source crd from namespace kind) once the UI creates a group?

in this case we will have the source crd that represent the workload and one more for the namespace

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should minimize how often we create Sources on behalf of the user, since it's supposed to be a user-created object. Also, even the UI doesn't create Namespace Sources so if there is one it means the user likely intentionally created it instead of using Workload Sources.

I think it would be better here to handle a Namespace Source as a special case, where you loop through all the workloads in the namespace and check if they should inherit the group from the namespace (like we do in the Instrumentor).

The next question is what if a workload Source already exists with a different group than the namespace Source? Should the workload be part of both groups? Or should the workload Source take priority? I think to begin with, it should take priority over the namespace.

So, when you see a Namespace Source:

  1. List all other Sources in that Namespace to check for different groups or excluded workloads
  2. Remember any workloads with a different group or excluded
  3. List through all the deployments/statefulsets/daemonsets in the namespace
  4. If the workload isn't in the list from earlier, add it to the group in the Namespace Source

The pseudo-go would look like this:

matchedSources := fetchSourcesByGroups(ctx, kubeClient, dest.Spec.SourceSelector.Groups, logger)
for _, source := range matchedSources {
  if source.Spec.Workload.Kind == workload.WorkloadKindNamespace {
    excludedWorkloads := make(map[string]struct{}{})
    // exclude disabled Sources or Sources with a different group
    nsSources := &odigosv1.SourceList{}
    err := kubeClient.List(ctx, sourceList, client.InNamespace(source.Spec.Workload.Namespace))
    if err != nil {
      continue
    }
    for _, nsSource := range nsSources.Items {
      val, exists := nsSource.Labels["odigos.io/source-group"]
      // if the nsSource has a group label that's different from the original matchedSource, or the Source is disabled, exclude it
      if (exists && val != source.Labels["odigos.io/source-group"]) || !nsSource.Spec.DisableInstrumentation {
        excludedWorkloads[fmt.Sprintf("%s/%s/%s", source.Spec.Workload.Namespace, source.Spec.Workload.Name, source.Spec.Workload.Kind] = struct{}{}
      }

    // get workload keys for non-excluded workloads in the namespace
    deployments := &v1.DeploymentList{}
    err := kubeClient.List(ctx, deployments, client.InNamespace(source.Spec.Workload.Namespace))
      if err != nil {
        continue
      } 
    }
    
    for _, dep := range deployments.Items {
      key := fmt.Sprintf("%s/%s/%s", deployment.GetNamespace(), deployment.GetName(), workload.WorkloadKindDeployment)
      if _, exists := excludedWorkloads[key]; !exists {
        matchConditions = append(matchConditions, key)
      }
    }
    // repeat for statefulsets and daemonsets

  } else { //not a namespace source, follow the flow you have now
    key := fmt.Sprintf("%s/%s/%s", source.Spec.Workload.Namespace, source.Spec.Workload.Name, source.Spec.Workload.Kind)
    matchConditions = append(matchConditions, key)
  }
}

This should be a test case too.

Technically, you probably don't even need to handle any workload Sources with disableInstrumentation: true, because they won't be sending instrumentation anyway. But I guess it doesn't hurt here.

Copy link
Collaborator

@tamirdavid1 tamirdavid1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general I think that we should do it with connector as discussed

return routingProcessors
}

func fetchSourcesByNamespaces(ctx context.Context, kubeClient client.Client, namespaces []string, logger logr.Logger) []odigosv1.Source {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think because we will have lots of logic around the sources better to move it under a dedicated pkg under k8sutils e.g sources

Copy link
Collaborator

@blumamir blumamir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good 👍
Added few comments most are nit

api/odigos/v1alpha1/destination_types.go Outdated Show resolved Hide resolved
api/odigos/v1alpha1/destination_types.go Outdated Show resolved Hide resolved
api/odigos/v1alpha1/destination_types.go Outdated Show resolved Hide resolved
@@ -111,14 +108,26 @@ func CalculateWithBase(currentConfig *Config, prefixProcessors []string, dests [
status.Destination[dest.GetID()] = fmt.Errorf("no configer for %s", dest.GetType())
continue
}
sanitizedProcessorName := fmt.Sprintf("odigosroutingfilterprocessor/%s", strings.ReplaceAll(dest.GetID(), ".", "-"))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this needed?

common/config/root.go Outdated Show resolved Hide resolved
Comment on lines +37 to +45
for j := 0; j < ilSpans.Len(); j++ {
scopeSpan := ilSpans.At(j)
spans := scopeSpan.Spans()

spans.RemoveIf(func(span ptrace.Span) bool {
namespace, name, kind := extractResourceDetails(resourceAttributes)
return !fp.matches(name, namespace, kind)
})
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since you make the decision based on the resource attributes only, you can remove the spans (metrics/logs) iterations and simply remove or keep the entire scopeSpans


func extractResourceDetails(attributes pcommon.Map) (namespace, name, kind string) {
namespace = getAttribute(attributes, k8sNamespaceNameAttr)
if namespace == "" {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prefer to use a boolean value to indicate if the attribute is found, instead of relying on empty value

}

name, kind = getDynamicNameAndKind(attributes)
if name == "" || kind == "" {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here, return "found" boolean to indicate this

if len(dest.Spec.SourceSelector.Groups) > 0 {
matchedSources := fetchSourcesByGroups(ctx, kubeClient, dest.Spec.SourceSelector.Groups, logger)
for _, source := range matchedSources {
key := fmt.Sprintf("%s/%s/%s", source.Spec.Workload.Namespace, source.Spec.Workload.Name, source.Spec.Workload.Kind)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we usually write kind before name. e.g. deployment-foo and not foo-deployment

Suggested change
key := fmt.Sprintf("%s/%s/%s", source.Spec.Workload.Namespace, source.Spec.Workload.Name, source.Spec.Workload.Kind)
key := fmt.Sprintf("%s/%s/%s", source.Spec.Workload.Namespace, source.Spec.Workload.Kind, source.Spec.Workload.Name)


type Config struct {
MatchConditions []string `mapstructure:"match_conditions"`
MatchMap map[string]struct{}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not part of the config. I think it should go directly as a member of the filterProcessor

@odigos-io odigos-io deleted a comment from blumamir Jan 16, 2025
func fetchSourcesByGroups(ctx context.Context, kubeClient client.Client, groups []string, logger logr.Logger) []odigosv1.Source {
sourceMap := make(map[string]odigosv1.Source)
for _, group := range groups {
labelSelector := labels.Set{fmt.Sprintf("odigos.io/group-%s", group): "true"}.AsSelector()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about making the label name constant, and the group is the value of the label? So instead of

odigos.io/group-foo: true

it's

odigos.io/source-group: foo

Labels are meant to hold variable string values

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants