-
Notifications
You must be signed in to change notification settings - Fork 118
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Validate fields based on their mappings and the dynamic templates set #2285
base: main
Are you sure you want to change the base?
Conversation
multi_fields are now compared directly with dynamic templates or ECS fields. It was required to know whether or not that field was a multi_field or not to apply some specific validations.
test integrations |
Created or updated PR in integrations repository to test this version. Check elastic/integrations#12164 |
…ing for other dynamic templates if parameters do not match
test integrations |
Created or updated PR in integrations repository to test this version. Check elastic/integrations#12164 |
As a reference, this is the previous errors found without support of dynamic templates: #2214 (comment) Looking at the errors in this build: https://buildkite.com/elastic/integrations/builds/19707 There are some packages that are failing due to missing parameters:
There are packages that fail due to different types comparing with ECS:
Packages with issues related to multi_fields:
|
@@ -448,7 +450,7 @@ func appendECSMappingMultifields(schema []FieldDefinition, prefix string) []Fiel | |||
{ | |||
Name: "text", | |||
Type: "match_only_text", | |||
External: "ecs", | |||
External: externalFieldAppendedTag, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Workaround to be able to filter multifieds generated here during the validation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be a better approach for this ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have many failures if we don't do this? Maybe we can leave this for a separate change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As talked offline, we decided to ignore the validation of these multi-fields.
Added a comment about the reasoning behind this in https://github.com/elastic/elastic-package/pull/2285/files#diff-67b718cebdeb840d99361c07166f75df079827a75df20850ef6cada861304521R318-R322
internal/fields/mappings.go
Outdated
var matches bool | ||
if fullRegex { | ||
matches, err = stringMatchesPatterns(values, name) | ||
} else { | ||
matches, err = stringMatchesWildcards(values, name) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does match_pattern
applies also for unmatch
?
According to this, maybe it just applies for match
.
https://www.elastic.co/guide/en/elasticsearch/reference/current/dynamic-templates.html#match-unmatch
WDYT ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the docs seem to indicate that it only applies to match
, but it would be weird that it is not possible to set this option for unmatch
. Maybe match_pattern
applies to both?
In any case, do we generate templates with these options?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the docs seem to indicate that it only applies to match, but it would be weird that it is not possible to set this option for unmatch. Maybe match_pattern applies to both?
According to the tests it looks like it applies to both parameters
https://github.com/elastic/elasticsearch/blob/fe3a3cb70e3dd18b3e5ddaf6a63389b4cb16b776/server/src/test/java/org/elasticsearch/index/mapper/DynamicTemplatesTests.java#L2391-L2397
And also it looks like it could apply to path_match
and path_unmatch
, lookign at this test
https://github.com/elastic/elasticsearch/blob/75bf27c8b69b7c32fb200afd5a244e7b1f8aea32/x-pack/plugin/core/template-resources/src/main/resources/watch-history.json#L23
Currently, I think that math_pattern
parameter is not used. Fleet does not generate that and currently the ECS component template does not use it neither.
internal/fields/mappings.go
Outdated
// TODO: Should we remove this code to load external and local fields? | ||
fieldsDir := filepath.Join(fieldsParentDir, "fields") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code to load the external and local fields maybe can be removed. That would mean that just the schema set (optionally) via options is taken into account.
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, sounds good.
internal/fields/mappings.go
Outdated
case "match_mapping_type", "unmatch_mapping_type": | ||
// Do nothing | ||
// These comparisons are done with the original data, and it's likely that the | ||
// resulting mapping does not have the same type since it could change by the `mapping` field | ||
// case "match_mapping_type": | ||
// logger.Debugf("> Check match_mapping_type: %q (type %s)", currentPath, fieldType) | ||
// values, err := parseSetting(value) | ||
// if err != nil { | ||
// return fmt.Errorf("failed to check match_mapping_type setting: %w", err) | ||
// } | ||
// logger.Debugf(">> Comparing to values: %s", values) | ||
// if slices.Contains(values, "*") { | ||
// continue | ||
// } | ||
// if !slices.Contains(values, fieldType) { | ||
// logger.Debugf(">> Issue: not matches") | ||
// matched = false | ||
// } | ||
// case "unmatch_mapping_type": | ||
// logger.Debugf("> Check unmatch_mapping_type: %q (type %s)", currentPath, fieldType) | ||
// values, err := parseSetting(value) | ||
// if err != nil { | ||
// return fmt.Errorf("failed to check unmatch_mapping_type setting: %w", err) | ||
// } | ||
// logger.Debugf(">> Comparing to values: %s", values) | ||
// if slices.Contains(values, fieldType) { | ||
// logger.Debugf(">> Issue: matches") | ||
// matched = false | ||
// } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like that these options would be better to skip them, since they would require to have the original type before the documents are stored into Elasticsearch.
If there is no objection, this whole case could be left as empty (with a comment).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think we can ignore this, but we should still check that the mapping type in the template matches with the actual mapping.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, then I'll remove the code and add a comment there about this.
Comparing and validating the type would be achieved via the validateObjectMappingAndParameters
function once it is found a dynamic template that matches the given parameters.
return fmt.Errorf("missing type parameter for field: %q", currentPath) | ||
} | ||
|
||
for _, template := range dynamicTemplates { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit. This loop mixes parsing of the dynamic templates and its execution. Consider unmarshalling the dynamic templates first, so this loop becomes something like:
for _, template := dynamicTemplates {
matches, err := template.Matches(currentPath, definition)
if err != nil {
return err
}
if matches {
return nil
}
}
The parsing of the dynamic templates could be done once, in validateMappingsNotInPreview
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Separated that logic in two stages. First one parses the dynamic templates from map[string]any
into a specific struct.
Once the dynamic templates are parsed, the required mappings are validated with those dynamic templates.
internal/fields/mappings.go
Outdated
var matches bool | ||
if fullRegex { | ||
matches, err = stringMatchesPatterns(values, name) | ||
} else { | ||
matches, err = stringMatchesWildcards(values, name) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the docs seem to indicate that it only applies to match
, but it would be weird that it is not possible to set this option for unmatch
. Maybe match_pattern
applies to both?
In any case, do we generate templates with these options?
internal/fields/mappings.go
Outdated
case "match_mapping_type", "unmatch_mapping_type": | ||
// Do nothing | ||
// These comparisons are done with the original data, and it's likely that the | ||
// resulting mapping does not have the same type since it could change by the `mapping` field | ||
// case "match_mapping_type": | ||
// logger.Debugf("> Check match_mapping_type: %q (type %s)", currentPath, fieldType) | ||
// values, err := parseSetting(value) | ||
// if err != nil { | ||
// return fmt.Errorf("failed to check match_mapping_type setting: %w", err) | ||
// } | ||
// logger.Debugf(">> Comparing to values: %s", values) | ||
// if slices.Contains(values, "*") { | ||
// continue | ||
// } | ||
// if !slices.Contains(values, fieldType) { | ||
// logger.Debugf(">> Issue: not matches") | ||
// matched = false | ||
// } | ||
// case "unmatch_mapping_type": | ||
// logger.Debugf("> Check unmatch_mapping_type: %q (type %s)", currentPath, fieldType) | ||
// values, err := parseSetting(value) | ||
// if err != nil { | ||
// return fmt.Errorf("failed to check unmatch_mapping_type setting: %w", err) | ||
// } | ||
// logger.Debugf(">> Comparing to values: %s", values) | ||
// if slices.Contains(values, fieldType) { | ||
// logger.Debugf(">> Issue: matches") | ||
// matched = false | ||
// } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think we can ignore this, but we should still check that the mapping type in the template matches with the actual mapping.
@@ -448,7 +450,7 @@ func appendECSMappingMultifields(schema []FieldDefinition, prefix string) []Fiel | |||
{ | |||
Name: "text", | |||
Type: "match_only_text", | |||
External: "ecs", | |||
External: externalFieldAppendedTag, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have many failures if we don't do this? Maybe we can leave this for a separate change.
test integrations |
Created or updated PR in integrations repository to test this version. Check elastic/integrations#12164 |
@@ -1528,12 +1528,10 @@ func (r *tester) validateTestScenario(ctx context.Context, result *testrunner.Re | |||
logger.Warn("Validate mappings found (technical preview)") | |||
exceptionFields := listExceptionFields(scenario.docs, fieldsValidator) | |||
|
|||
mappingsValidator, err := fields.CreateValidatorForMappings(r.dataStreamPath, r.esClient, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the process to load the fields from the package and ECS has been removed, these parameters are not needed anymore.
The schema to be used to validate fields with ECS is the one set (optionally) via the method WithMappingValidatorFallbackSchema
test integrations |
Created or updated PR in integrations repository to test this version. Check elastic/integrations#12164 |
test integrations |
Created or updated PR in integrations repository to test this version. Check elastic/integrations#12164 |
test integrations |
Created or updated PR in integrations repository to test this version. Check elastic/integrations#12164 |
Follow-up of the above comment #2285 (comment) Summary of the errors found in the build: https://buildkite.com/elastic/integrations/builds/20226 There are some packages that are failing due to missing parameters/definitions:
There are packages that fail due to different types comparing with ECS:
Packages that do not have any test failing now:
While inspecting the mappings and dynamic templates created for the
- name: core.*.pct
type: scaled_float
format: percent
unit: percent
metric_type: gauge
description: |
Percentage of CPU time in this core.
{
"docker.cpu.core.*.pct": {
"path_match": "docker.cpu.core.*.pct",
"match_mapping_type": "*",
"mapping": {
"scaling_factor": 1000,
"type": "scaled_float"
}
}
}, cc @jsoriano |
Among the errors mentioned above:
One option to solve this error in Currently, the embedded dynamic template created by elastic-package when {
"_embedded_ecs-url_original_to_multifield": {
"path_match": "*.url.original",
"mapping": {
"fields": {
"text": {
"type": "match_only_text"
}
},
"type": "wildcard"
}
}
}, This dynamic template could be updated as in the {
"ecs_path_match_wildcard_and_match_only_text": {
"path_match": [
"*.body.content",
"*url.full",
"*url.original"
],
"unmatch_mapping_type": "object",
"mapping": {
"fields": {
"text": {
"type": "match_only_text"
}
},
"type": "wildcard"
}
}
}, The inconvenient is that it would be needed to do a new WDYT about updating the embedded dynamic templates (in a follow-up PR)? Should it be created a new "empty" release in EDIT:
|
Why is this issue not detected with current validation? 🤔
We probably need to do that in any case.
We will need this to apply the new validation in integrations, we can wait till then. |
Looking at the current logic to validate the fields (without mappings), IIUC elastic-package/internal/fields/validate.go Lines 1249 to 1252 in a7d863a
As this requires the ECS templates file to be updated in |
💚 Build Succeeded
History
cc @mrodm |
Created this PR elastic/integrations#12300 to start testing different approaches/workarounds that could be followed in the failing packages found in the integrations repository. As a note, that integrations PR is just for testing purposes, and it is not going to be merged as it is. |
Relates #2207
Add support to validate mappings after ingesting documents based on dynamic templates.
It also updates how mappings not found in the preview are compared/validated with ECS fields.
Assumptions:
import_mappings
ecs@mappings
component templateappendECSMappingMultifields
function.Author's Checklist