# Kafka Archiver

## Features
- Uses the Sarama Kafka client with the Cluster extension
- Stores offsets in Kafka consumer groups (0.9+)
- Includes Datadog (statsd) monitoring
- Uses a parallel GZIP implementation
- Multiple partitioners:
  - DefaultPartitioner (topic, partition)
  - TimeFieldPartitioner (JSON field, Unix timestamp in milliseconds)
  - IsoDateFieldPartitioner (JSON field, RFC3339Nano)
- Uses the S3 multipart uploader
- Graceful shutdown (with timeout)

## Usage
```
Usage of kafka-archiver:
  -alsologtostderr
        log to standard error as well as files
  -brokers value
        Kafka brokers to connect to, as a comma-separated list. (required)
  -buffer-interval value
        The duration after which the buffer should close the file. (default "15m")
  -buffer-location string
        The base folder where temporary files are stored. (default "$TMPDIR")
  -buffer-mem string
        Amount of memory a buffer can use. (default "8KB")
  -buffer-queue-length int
        The size of the queue that holds the files to be uploaded. (default 32)
  -buffer-size string
        The size in bytes at which the buffer should close the file. (default "100MiB")
  -consumer-auto-offset-reset string
        Kafka consumer group default offset. (default "oldest")
  -consumer-delay-start value
        Number of seconds to wait before starting to consume. (default "0s")
  -consumer-group string
        Name of your Kafka consumer group. (required)
  -datadog-host string
        The host where the Datadog agent listens. (default "localhost:2585")
  -log_backtrace_at value
        when logging hits line file:N, emit a stack trace
  -log_dir string
        If non-empty, write log files in this directory
  -logtostderr
        log to standard error instead of files
  -partitioner string
        The name of the partitioner to use. (default "DefaultPartitioner")
  -partitioner-key string
        Name of the JSON field to parse.
  -partitioner-path-folder string
        The top-level folder to prepend to the path used when partitioning files. (default "backup")
  -partitioner-path-topic-prefix string
        A prefix to prepend to the path used when partitioning files. (default "topic=")
  -s3
        Enable the S3 uploader.
  -s3-bucket string
        S3 bucket where files are uploaded.
  -s3-client-debug
        Enable debug logging on the S3 client.
  -s3-concurrency int
        S3 uploader concurrency. (default 5)
  -s3-endpoint string
        S3 bucket endpoint to use for the client.
  -s3-force-path-style
        Force the request to use path-style addressing on S3.
  -s3-notification-topic string
        Kafka topic used to store uploaded S3 files.
  -s3-part-size string
        S3 uploader part size. (default "5MiB")
  -s3-region string
        S3 bucket region.
  -statsd-prefix string
        The name prefix for statsd metrics. (default "kafka-archiver")
  -stderrthreshold value
        logs at or above this threshold go to stderr
  -topic-blacklist string
        An additional blacklist of topics; takes precedence over the whitelist.
  -topic-whitelist string
        An additional whitelist of topics to subscribe to.
  -topics value
        Kafka topics to subscribe to.
  -v value
        log level for V logs
  -vmodule value
        comma-separated list of pattern=N settings for file-filtered logging
```
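
The `-partitioner-path-folder`, `-partitioner-path-topic-prefix`, and partitioner flags together determine the upload path. As a rough sketch (not the project's actual code; the `partitionPath` helper and its signature are illustrative), a time-field partitioner could derive the hourly path from a Unix-millisecond timestamp like this:

```go
package main

import (
	"fmt"
	"time"
)

// partitionPath sketches how a time-based partitioner might build an S3 key
// prefix from a topic name and a Unix-millisecond timestamp. The folder and
// topicPrefix arguments mirror the -partitioner-path-folder and
// -partitioner-path-topic-prefix defaults shown above.
func partitionPath(folder, topicPrefix, topic string, unixMillis int64) string {
	t := time.Unix(unixMillis/1000, 0).UTC()
	return fmt.Sprintf("%s/%s%s/year=%04d/month=%02d/day=%02d/hour=%02d",
		folder, topicPrefix, topic, t.Year(), int(t.Month()), t.Day(), t.Hour())
}

func main() {
	// 2017-03-24T23:32:21Z expressed in milliseconds.
	fmt.Println(partitionPath("backup", "topic=", "events", 1490398341000))
	// → backup/topic=events/year=2017/month=03/day=24/hour=23
}
```

With the default flag values, a record timestamped `1490398341000` would land under `backup/topic=events/year=2017/month=03/day=24/hour=23`, matching the key layout in the notification example below.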


## Topic Notification

Kafka Archiver can write a message to a topic upon each successful upload. Set the `-s3-notification-topic` flag to an existing Kafka topic.

The payload of a message looks like the example below.

```json
{
    "provider": "s3",
    "region": "us-west-1",
    "bucket": "test",
    "key": "backup/topic=events/year=2017/month=03/day=24/hour=23/0042.176939634.22dd3d64-2761-4be6-be91-0e70f252dec8.gz",
    "topic": "events",
    "partition": 42,
    "opened": 1490398341,
    "closed": 1490398348,
    "age": "6.532102267s",
    "bytes": 1309890,
    "writes": 2486,
    "timestamp": 1490398350
}
```
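
A downstream consumer can decode this payload into a struct. The sketch below (the `Notification` struct and `decode` helper are illustrative; field names are inferred from the example payload above, not taken from the source) shows one way to do it:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Notification mirrors the upload message shown above. The JSON tags are
// inferred from the example payload and may not match the source exactly.
type Notification struct {
	Provider  string `json:"provider"`
	Region    string `json:"region"`
	Bucket    string `json:"bucket"`
	Key       string `json:"key"`
	Topic     string `json:"topic"`
	Partition int32  `json:"partition"`
	Opened    int64  `json:"opened"`
	Closed    int64  `json:"closed"`
	Age       string `json:"age"`
	Bytes     int64  `json:"bytes"`
	Writes    int64  `json:"writes"`
	Timestamp int64  `json:"timestamp"`
}

// decode parses a notification message body into a Notification.
func decode(payload []byte) (Notification, error) {
	var n Notification
	err := json.Unmarshal(payload, &n)
	return n, err
}

func main() {
	n, err := decode([]byte(`{"provider":"s3","bucket":"test","topic":"events","partition":42,"bytes":1309890}`))
	if err != nil {
		panic(err)
	}
	fmt.Printf("uploaded %d bytes from %s/%d to s3://%s\n", n.Bytes, n.Topic, n.Partition, n.Bucket)
}
```

Such a consumer could, for example, trigger downstream processing of each archived file as soon as its upload completes.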