processor: add conditional processing support for logs #10142

niedbalski · 2025-03-28T09:14:03Z

Processor conditional processing for logs

This PR adds conditional processing support for logs in processors. This feature allows processors to conditionally process logs based on field values.

Example configurations

Simple

service:
  log_level: info
pipeline:
  inputs:
    - name: dummy
      dummy: '{"request": {"method": "GET", "path": "/api/v1/resource", "headers": {"Authorization": "Bearer valid-token"}, "access": "granted"}}'
      tag: error.msg
      processors:
        logs:
          - name: content_modifier
            action: insert
            key: modified_if_post
            value: true
            condition:
              op: and
              rules:
                - field: "$request['method']"
                  op: eq
                  value: "POST"

          - name: content_modifier
            action: insert
            key: modified_if_get
            value: true
            condition:
              op: and
              rules:
                - field: "$request['method']"
                  op: eq
                  value: "GET"

  outputs:
    - name: stdout
      match: '*'

When this configuration runs, the processor only adds the field modified_if_get to records where the request method is GET:

[0] error.msg: [[1743160978.853138000, {}], {"request"=>{"method"=>"GET", "path"=>"/api/v1/resource", "headers"=>{"Authorization"=>"Bearer valid-token"}, "access"=>"granted"}, "modified_if_get"=>"true"}]
[0] error.msg: [[1743160979.853940000, {}], {"request"=>{"method"=>"GET", "path"=>"/api/v1/resource", "headers"=>{"Authorization"=>"Bearer valid-token"}, "access"=>"granted"}, "modified_if_get"=>"true"}]

More complex

service:
  log_level: info
pipeline:
  inputs:
    - name: dummy
      dummy: '{"request": {"method": "GET", "path": "/api/v1/resource", "headers": {"Authorization": "Bearer valid-token", "Content-Type": "application/json"}, "status_code": 200, "response_time": 150}}'
      tag: request.log
      processors:
        logs:
          - name: content_modifier
            action: insert
            key: high_priority_method
            value: true
            condition:
              op: and
              rules:
                - field: "$request['method']"
                  op: in
                  value: ["POST", "PUT", "DELETE"]

          - name: content_modifier
            action: insert
            key: requires_performance_check
            value: true
            condition:
              op: or
              rules:
                - field: "$request['response_time']"
                  op: gt
                  value: 100
                - field: "$request['status_code']"
                  op: gte
                  value: 400

  outputs:
    - name: stdout
      match: '*'

Output:

[0] request.log: [[1743163088.501242000, {}], {"request"=>{"method"=>"GET", "path"=>"/api/v1/resource", "headers"=>{"Authorization"=>"Bearer valid-token", "Content-Type"=>"application/json"}, "status_code"=>200, "response_time"=>150}, "requires_performance_check"=>"true"}]
[0] request.log: [[1743163089.501156000, {}], {"request"=>{"method"=>"GET", "path"=>"/api/v1/resource", "headers"=>{"Authorization"=>"Bearer valid-token", "Content-Type"=>"application/json"}, "status_code"=>200, "response_time"=>150}, "requires_performance_check"=>"true"}]

The high_priority_method field was not added to the records because the input has "method": "GET", which is not in the array ["POST", "PUT", "DELETE"] given in the first condition.
The requires_performance_check field was added with value "true" because the second condition is met:
- The condition combines two rules with or, so one matching rule is enough
- The first rule checks response_time > 100, and our value is 150, so it matches
- The second rule checks status_code >= 400; our value is 200, so it does not match, but that does not change the result
- This is the expected behavior: a response time above 100 flags the request for a performance check
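The evaluation walked through above can be sketched in a few lines of Python. This is only an illustration of and/or semantics over a list of rules, not the Fluent Bit implementation: the names RULE_OPS and evaluate are hypothetical, only four of the rule operators are modeled, fields are looked up as plain keys under "request" instead of through record accessors, and treating a missing field as a non-match is an assumption.

```python
import operator

# Hypothetical operator table for this sketch; only a subset of the rule
# operators named in this PR is modeled here.
RULE_OPS = {
    "eq": operator.eq,
    "gt": operator.gt,
    "gte": operator.ge,
    "in": lambda actual, allowed: actual in allowed,
}


def evaluate(condition, record):
    """Return True when `record` satisfies `condition`.

    Fields are plain keys of record["request"] in this sketch; real
    configurations use record accessor syntax such as $request['method'].
    """
    results = []
    for rule in condition["rules"]:
        actual = record["request"].get(rule["field"])
        # Assumption: a rule on a missing field does not match.
        matched = actual is not None and RULE_OPS[rule["op"]](actual, rule["value"])
        results.append(matched)
    # "and" needs every rule to match, "or" needs at least one.
    combine = all if condition["op"] == "and" else any
    return combine(results)


record = {"request": {"method": "GET", "status_code": 200, "response_time": 150}}

high_priority = {
    "op": "and",
    "rules": [{"field": "method", "op": "in", "value": ["POST", "PUT", "DELETE"]}],
}
performance_check = {
    "op": "or",
    "rules": [
        {"field": "response_time", "op": "gt", "value": 100},
        {"field": "status_code", "op": "gte", "value": 400},
    ],
}

print(evaluate(high_priority, record))      # False: GET is not in the list
print(evaluate(performance_check, record))  # True: 150 > 100
```

The two calls mirror the two processors in the configuration: the first condition fails, so high_priority_method is not inserted, and the second succeeds on its first rule alone.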

Added tests

Internal unit tests

Added test suite for processor condition validation in tests/internal/processor_conditional.c
Created tests for condition operators (and, or)
Added tests for rule operators (eq, neq, gt, lt, gte, lte, regex, not_regex, in, not_in)
Included tests for fields with $ prefix and record accessors
Added tests for error and edge cases:
- Invalid operator validation
- Empty rules arrays
- Multiple rules handling
- Context metadata handling
- Deeply nested field access
- Overwriting existing conditions
- Missing fields/operators/values
- Invalid rule structures
- Invalid regex patterns
- Array values for numeric operators
- String values for 'in' operator
- Complex nested conditions
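To make the cases above concrete, here is a rough Python sketch of the kind of validation such tests would exercise. It is not the code in tests/internal/processor_conditional.c: validate_condition and its error strings are hypothetical, and it assumes that empty rules arrays, array values for numeric operators, and non-array values for in/not_in are all rejected. Python's re module also stands in for the regex engine Fluent Bit actually uses, so pattern validity may differ.

```python
import re

CONDITION_OPS = {"and", "or"}
RULE_OPS = {"eq", "neq", "gt", "lt", "gte", "lte", "regex", "not_regex", "in", "not_in"}
NUMERIC_OPS = {"gt", "lt", "gte", "lte"}
LIST_OPS = {"in", "not_in"}
REGEX_OPS = {"regex", "not_regex"}


def validate_condition(condition):
    """Return a list of problems found; an empty list means the condition is valid."""
    errors = []
    if condition.get("op") not in CONDITION_OPS:
        errors.append("invalid condition operator")
    rules = condition.get("rules")
    if not isinstance(rules, list) or not rules:
        # Assumption: a condition without rules is rejected.
        errors.append("rules must be a non-empty array")
        return errors
    for index, rule in enumerate(rules):
        if not isinstance(rule, dict):
            errors.append(f"rule {index}: invalid rule structure")
            continue
        missing = [key for key in ("field", "op", "value") if key not in rule]
        if missing:
            errors.append(f"rule {index}: missing {', '.join(missing)}")
            continue
        op, value = rule["op"], rule["value"]
        if op not in RULE_OPS:
            errors.append(f"rule {index}: invalid operator {op!r}")
        elif op in NUMERIC_OPS and isinstance(value, list):
            errors.append(f"rule {index}: {op} does not accept an array value")
        elif op in LIST_OPS and not isinstance(value, list):
            errors.append(f"rule {index}: {op} requires an array value")
        elif op in REGEX_OPS:
            try:
                re.compile(value)
            except (re.error, TypeError):
                errors.append(f"rule {index}: invalid regex pattern")
    return errors


valid = {"op": "and", "rules": [{"field": "$request['method']", "op": "eq", "value": "GET"}]}
print(validate_condition(valid))  # []
print(validate_condition({"op": "and", "rules": [{"field": "$a", "op": "gt", "value": [1, 2]}]}))
```

Collecting every problem in a list, rather than stopping at the first, is a choice made for this sketch so that a single call reports all invalid rules at once.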

Runtime shell tests

Added runtime shell tests in tests/runtime_shell/processor_conditional.sh
Tests processor condition functionality with actual Fluent Bit instances

Fixed memory leak in flb_processor_unit_set_condition when handling array values for the 'in' and 'not_in' operators. The issue was that the cleanup code relied on rule_val still pointing to the array when it might have been reassigned during the context check. Added a flag to track array allocations and fixed all test cleanup to properly free or ignore the condition based on ownership.

niedbalski added 7 commits March 28, 2025 10:02

processor: fix $ prefix field handling in conditions

49adbb2

Do not strip $ prefix from field names when parsing conditions. This ensures that field names with $ prefix are correctly passed to the condition evaluator.
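One way to see why the prefix matters: in record accessor syntax the leading $ is what marks a field as a path into the record. The Python sketch below is a simplified stand-in for that lookup, not Fluent Bit's record accessor; resolve, PATH and SEGMENT are hypothetical names, and only the $key['sub']['sub'] form used in this PR's examples is handled.

```python
import re

# Matches one ['key'] segment of a record accessor path.
SEGMENT = re.compile(r"\['([^']+)'\]")
# Matches a whole path such as $request['headers']['Authorization'].
PATH = re.compile(r"\$(\w+)((?:\['[^']+'\])*)")


def resolve(field, record):
    """Look up `field` in `record`; return None when the path is absent."""
    match = PATH.fullmatch(field)
    if match is None:
        # Without the $ prefix the field is not recognized as a path.
        raise ValueError(f"not a record accessor path: {field!r}")
    keys = [match.group(1)] + SEGMENT.findall(match.group(2))
    current = record
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


record = {
    "request": {
        "method": "GET",
        "headers": {"Authorization": "Bearer valid-token"},
    }
}

print(resolve("$request['method']", record))                    # GET
print(resolve("$request['headers']['Authorization']", record))  # Bearer valid-token
print(resolve("$request['missing']", record))                   # None
```

In this sketch, passing "request['method']" with the prefix stripped raises ValueError, which is the kind of failure the commit avoids by handing the field name to the evaluator unchanged.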

conditionals: change debug statements to trace level

a61665f

Convert all debug statements to trace level in the conditional evaluation code to reduce log noise while allowing detailed tracing when needed.

config: add condition as special property

ab949c1

Add 'condition' to the list of special properties that are handled by the processor and bypass plugin-specific validation.

mp: change debug statements to trace level for conditions

c76792d

Convert all debug statements related to condition evaluation to trace level in the MessagePack processing code.

mp: add condition field to chunk object

73e7de9

Add support for conditional filtering in MP chunk objects to allow processors to selectively process records that match a condition.
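The selective behavior can be pictured with a small Python generator. This is a conceptual sketch only and says nothing about how the MP chunk object is implemented in C: apply_conditionally, is_get and mark_get are hypothetical names, and the pass-through of non-matching records is inferred from the example output in this PR, where records still reach stdout without high_priority_method.

```python
def apply_conditionally(records, condition, process):
    """Yield each record, transformed by `process` only if `condition` matches."""
    for record in records:
        if condition(record):
            yield process(record)
        else:
            # Non-matching records pass through unchanged.
            yield record


def is_get(record):
    return record["request"]["method"] == "GET"


def mark_get(record):
    # Return a copy so the input records are left untouched.
    return {**record, "modified_if_get": "true"}


records = [
    {"request": {"method": "GET"}},
    {"request": {"method": "POST"}},
]

for out in apply_conditionally(records, is_get, mark_get):
    print(out)
# {'request': {'method': 'GET'}, 'modified_if_get': 'true'}
# {'request': {'method': 'POST'}}
```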

processor: add condition field to processor unit

ee7fa97

Add condition field to processor unit structure to support conditional processing of records based on their content.

ra: fix string length calculation in record accessor

5694276

Use the actual string size from the record accessor value instead of calling flb_sds_len() to prevent potential buffer overflows or crashes.

niedbalski requested review from edsiper, leonardo-albertovich, fujimotos and koleini as code owners March 28, 2025 09:14

github-actions bot added the docs-required label Mar 28, 2025

processor: change condition logging from info to debug level

21cf3f2

Change log messages about condition processing from info to debug level to reduce noise in the logs.

niedbalski removed request for leonardo-albertovich, koleini and fujimotos March 28, 2025 09:19

niedbalski self-assigned this Mar 28, 2025

processor: fix C89 compatibility issues

794e388

- Declare variables at the beginning of the function
- Change for loop declarations to C89 style
- Fix format specifier for uint64_t

tests: internal: fix C89 compatibility in processor_conditional.c

e5e93ba

Declare loop variables at the beginning of functions rather than within the for loops to ensure compatibility with C89 standard required for CentOS 7 builds.

tests: internal: fix memory leak in processor_conditional tests

186cd33

tests: fix memory leaks in processor conditional tests

adb3637

edsiper merged commit fa7492b into master Mar 28, 2025
52 checks passed

edsiper deleted the fix/10127-conditional-fields branch March 28, 2025 23:42

edsiper added this to the Fluent Bit v4.0.0 milestone Mar 28, 2025

BrewTestBot mentioned this pull request Apr 1, 2025

fluent-bit 4.0.0 Homebrew/homebrew-core#217556

Merged
