Clarify how aggregation affects partition key

PettitWesley · PettitWesley · commit 58916f8cc930 · 2022-02-04T10:02:52.000-08:00
Signed-off-by: Wesley Pettit <wppttt@amazon.com>
diff --git a/README.md b/README.md
@@ -22,7 +22,7 @@ If you think you’ve found a potential security issue, please do not post it in
 * `time_key_format`: [strftime](http://man7.org/linux/man-pages/man3/strftime.3.html) compliant format string for the timestamp; for example, `%Y-%m-%dT%H:%M:%S%z`. This option is used with `time_key`. You can also use `%L` for milliseconds and `%f` for microseconds. If you are using ECS FireLens, make sure you are running Amazon ECS Container Agent v1.42.0 or later, otherwise the timestamps associated with your container logs will only have second precision.
 * `experimental_concurrency`: Specify a limit of concurrent go routines for flushing records to kinesis.  By default `experimental_concurrency` is set to 0 and records are flushed in Fluent Bit's single thread. This means that requests to Kinesis will block the execution of Fluent Bit.  If this value is set to `4` for example then calls to Flush records from fluentbit will spawn concurrent go routines until the limit of `4` concurrent go routines are running.  Once the `experimental_concurrency` limit is reached calls to Flush will return a retry code.  The upper limit of the `experimental_concurrency` option is `10`.  WARNING:  Enabling `experimental_concurrency` can lead to data loss if the retry count is reached.  Enabling concurrency will increase resource usage (memory and CPU).
 * `experimental_concurrency_retries`: Specify a limit to the number of retries concurrent goroutines will attempt.  By default `4` retries will be attempted before records are dropped.
-* `aggregation`: Setting `aggregation` to `true` will enable KPL aggregation of records sent to Kinesis.  This feature isn't compatible with the `partition_key` feature.  See the KPL aggregation section below for more details.
+* `aggregation`: Setting `aggregation` to `true` will enable KPL aggregation of records sent to Kinesis.  This feature changes the behavior of the `partition_key` feature.  See the KPL aggregation section below for more details.
 * `compression`: Specify an algorithm for compression of each record. Supported compression algorithms are `zlib` and `gzip`. By default this feature is disabled and records are not compressed.
 * `replace_dots`: Replace dot characters in key names with the value of this option. For example, if you add `replace_dots _` in your config then all occurrences of `.` will be replaced with an underscore. By default, dots will not be replaced.
 
@@ -121,7 +121,7 @@ The advantages of enabling KPL aggregation are:
 The disadvantages are:
  - The flush time (or buffer size) will need to be tuned to take advantage of aggregation (more on that below).
  - You must use the KCL library to read data from kinesis to de-aggregate the protobuf serialization (if Firehose isn't the consumer).
- - The `partition_key` feature isn't compatible with aggregation given multiple records are in each PutRecord structure.  The `partition_key` value of the first record in the batch will be used to route the entire batch to a given shard.  Given this limitation, using both `partition_key` and `aggregation` simultaneously isn't recommended.
+ - The `partition_key` feature isn't fully compatible with aggregation given multiple records are in each PutRecord structure.  The `partition_key` value of the first record in the batch will be used to route the entire batch to a given shard.  Given this limitation, using both `partition_key` and `aggregation` simultaneously requires careful consideration. In most container log use cases, all logs from a single container/pod are sent in the same stream, thus if you use the pod/container as the partition key, it should still work as expected since all records in an aggregated batch can use the same partition key. In other use cases, aggregation will cause records that should have had different partition keys to have the same partition key.
 
 KPL Aggregated Record Reference:  https://github.com/awslabs/amazon-kinesis-producer/blob/master/aggregation-format.md
 
diff --git a/fluent-bit-kinesis.go b/fluent-bit-kinesis.go
@@ -113,7 +113,7 @@ func newKinesisOutput(ctx unsafe.Pointer, pluginID int) (*kinesis.OutputPlugin,
 	}
 
 	if isAggregate && partitionKey != "" {
-		logrus.Errorf("[kinesis %d]  WARNING: The options 'aggregation' and  'partition_key' should not be used simultaneously", pluginID)
+		logrus.Warnf("[kinesis %d] 'partition_key' has different behavior when 'aggregation' enabled. All aggregated records will use a partition key sourced from the first record in the batch", pluginID)
 	}
 
 	var concurrencyInt, concurrencyRetriesInt int
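
The behavior described in the new warning can be illustrated with a minimal sketch. It is not the plugin's implementation: the `record` and `aggregatedBatch` types and the `aggregate` and `mismatched` helpers are hypothetical, and batching by record count is a simplifying assumption (the README ties real aggregation to flush time and buffer size). It only shows that each aggregated batch is routed by the partition key of its first record, so records with other keys land on that key's shard.

```go
package main

import "fmt"

// record is a hypothetical stand-in for a log record and the
// partition key that would be extracted from it.
type record struct {
	PartitionKey string
	Data         string
}

// aggregatedBatch models one aggregated record: many user records,
// but only one partition key used for shard routing.
type aggregatedBatch struct {
	PartitionKey string
	Records      []record
}

// aggregate groups records into batches of at most maxPerBatch and
// routes each batch by the partition key of its first record.
// Batching by count is an assumption made to keep the sketch short.
func aggregate(records []record, maxPerBatch int) []aggregatedBatch {
	if maxPerBatch < 1 {
		maxPerBatch = 1
	}
	var batches []aggregatedBatch
	for start := 0; start < len(records); start += maxPerBatch {
		end := start + maxPerBatch
		if end > len(records) {
			end = len(records)
		}
		chunk := records[start:end]
		batches = append(batches, aggregatedBatch{
			PartitionKey: chunk[0].PartitionKey, // first record wins
			Records:      chunk,
		})
	}
	return batches
}

// mismatched counts records whose own key differs from the key the
// batch is actually routed with.
func mismatched(b aggregatedBatch) int {
	n := 0
	for _, r := range b.Records {
		if r.PartitionKey != b.PartitionKey {
			n++
		}
	}
	return n
}

func main() {
	records := []record{
		{PartitionKey: "pod-a", Data: "log 1"},
		{PartitionKey: "pod-b", Data: "log 2"},
		{PartitionKey: "pod-a", Data: "log 3"},
	}
	for i, b := range aggregate(records, 2) {
		fmt.Printf("batch %d routed by %q, %d record(s) with a different key\n",
			i, b.PartitionKey, mismatched(b))
	}
}
```

When every record in a batch shares one key, as in the single container/pod case the README describes, `mismatched` returns 0 and routing matches the non-aggregated behavior.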
