Detailed metrics for a queue are only reported from the node hosting the leader replica of that queue #12022
-
Describe the bug

We are running a RabbitMQ cluster in Kubernetes, deployed using the RabbitMQ Kubernetes Operator. We are working on a project that is supposed to scrape Prometheus metrics from the cluster. I've noticed that many of my requests for detailed metrics, curl -sS 'http://:15692/metrics/detailed?family=queue_coarse_metrics&family=queue_metrics', do not include some of the queues. After investigating and querying each RabbitMQ pod individually, I realized that only the pod hosting the leader of a queue reports metrics about that queue. Since the scraping applications will point at the service, which load-balances requests between the 3 nodes, this means that 66% of the time the metrics do not include information for a particular queue. I don't know if this is a configuration issue of some kind, but it seems like a bug to me. The application consuming the metrics from a cluster should not need to know which node of the cluster to query for which queue. Perhaps I'm missing something. Any help would be appreciated. Thank you.

Reproduction steps
Expected behavior

I would expect to get the metrics about a queue regardless of which node of the cluster the request for metrics lands on.

Additional context

No response
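(For reference, a minimal sketch of the per-pod check described above. The pod names, headless service name, namespace, and queue name are assumptions based on the Cluster Operator's defaults, not values from the original report; adjust them to your deployment.)

```bash
# Query each RabbitMQ pod directly (bypassing the load-balanced service) to see
# which node actually reports per-queue metrics for a given queue.
# "my-rabbit-server-N", "my-rabbit-nodes", "default" and "my-queue" are placeholders.
for i in 0 1 2; do
  echo "== my-rabbit-server-$i =="
  curl -sS "http://my-rabbit-server-$i.my-rabbit-nodes.default.svc:15692/metrics/detailed?family=queue_coarse_metrics&family=queue_metrics" \
    | grep 'queue="my-queue"' \
    || echo "(no per-queue metrics for my-queue on this node)"
done
```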
-
@kvasserman "repeatedly query the K8S service" is not a reproduction step. We will not guess how exactly you "query the service"; we do not guess in this community. In fact, you haven't even provided the RabbitMQ version used or any details on how the queue was declared.

If you use Prometheus to gather metrics, then every node will return only its own local metrics, by design. They are supposed to be aggregated, and since every queue and stream has a leader, aggregating the data from all nodes gives you a complete data set. This is different from the RabbitMQ HTTP API, where a request to any node is routed to the node hosting the queue leader and the result is cached on the node handling the HTTP API request. That approach works for the HTTP API, but it is not how the Prometheus plugin works, again, by design.
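(To illustrate the distinction drawn above, a hedged sketch contrasting the two endpoints; the hostname and credentials are placeholders, and the exact metrics returned depend on the requested families.)

```bash
# Management HTTP API: a request to ANY node returns data for every queue,
# because the node handling the request gathers it from the queue leaders.
curl -sS -u guest:guest 'http://any-node:15672/api/queues'

# Prometheus detailed endpoint: each node reports only the queues whose leader
# it hosts, so a complete picture requires scraping every node and aggregating.
curl -sS 'http://any-node:15692/metrics/detailed?family=queue_coarse_metrics&family=queue_metrics'
```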
-
Given a cluster of three nodes started with bazel run start-cluster NODES=3 RABBITMQ_ENABLED_PLUGINS=rabbitmq_management,rabbitmq_prometheus and three quorum queues:

1. Publish one message in a way that it will be routed to one of the queues.
2. Query the Prometheus per-object metrics endpoint on every node.

Node 1, port 15692

Node 2, port 15693

Node 3, port 15694

Node 3's results are omitted for brevity, but it consistently reports more per-queue metrics, for the queues whose leaders it hosts.

Observations

These results are expected from the Prometheus plugin. Aggregating the above metrics the way our Grafana dashboards do will provide a complete data set. In addition, the Prometheus plugin never has to cache any data from other nodes.
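(A sketch of the last step against the local cluster above; Prometheus listens on ports 15692-15694 for nodes 1-3. The queue name qq.1 is a placeholder, since the original queue names were not preserved in this comment.)

```bash
# Count how many per-queue samples each node of the local 3-node cluster
# returns for one quorum queue; only the node hosting its leader reports any.
for port in 15692 15693 15694; do
  echo "== node listening on port $port =="
  curl -sS "http://localhost:$port/metrics/detailed?family=queue_coarse_metrics&family=queue_metrics" \
    | grep -c 'queue="qq.1"' || true
done
```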
-
This is a configuration issue on your side. There is no bug. Prometheus metrics must be scraped from each pod.
From https://observability.thomasriley.co.uk/prometheus/configuring-prometheus/using-service-monitors/ :
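(A minimal shell sketch of what "scrape each pod" means, under the same assumed pod/service/queue names as above; in a real deployment a Prometheus ServiceMonitor or PodMonitor, as described in the linked article, performs this per-pod discovery for you.)

```bash
# "Scrape every pod" by hand: fetch the detailed metrics from each pod and
# concatenate them. Together they form the complete per-queue data set.
# Pod, service, namespace and queue names are placeholders.
for i in 0 1 2; do
  curl -sS "http://my-rabbit-server-$i.my-rabbit-nodes.default.svc:15692/metrics/detailed?family=queue_coarse_metrics"
done > all-nodes.prom

# Every queue now appears in the combined output, reported by its leader node.
grep 'queue="my-queue"' all-nodes.prom
```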
-
Note that in #12022 (comment) each node emits Raft metrics (because each node hosts a Raft member), but queue metrics are only reported by the leader (even though each node also hosts a queue replica). This makes sense: conceptually, a queue is a single entity whether it is replicated or not, but as soon as you look at Raft metrics, where the different members are becomes important, e.g. for detecting replication lag.
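(A hedged way to observe that difference on a node that hosts only a follower replica; the hostname and queue name are placeholders, and the rabbitmq_raft_* prefix is an assumption about the plugin's Raft metric naming on the aggregated endpoint.)

```bash
# On a node hosting only a FOLLOWER replica of the queue:
# Raft member metrics are still emitted, because a local Raft member exists...
curl -sS 'http://follower-node:15692/metrics' | grep '^rabbitmq_raft_'

# ...but per-queue metrics for that queue are reported by the leader's node only.
curl -sS 'http://follower-node:15692/metrics/detailed?family=queue_coarse_metrics' \
  | grep 'queue="my-queue"'   # expected: no output here
```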
-
Thank you, everyone, for the valuable information. I was able to adjust my scraping code to fetch and aggregate the metrics from each RabbitMQ server pod, and everything works fine now. For general metrics our company doesn't use Prometheus; we use DataDog. The DataDog scraper only seems to allow a single endpoint setting for collecting RabbitMQ Prometheus metrics. This was part of the confusion: I figured it scrapes everything from the RabbitMQ service (which load-balances requests between the nodes). I wonder if the DataDog folks realize that they are missing per-queue metrics in 2 out of 3 requests. But perhaps they do something behind the scenes to discover and scrape metrics per pod; it's unclear to me. Anyway, thank you.
-
This is a configuration issue on your side; there is no bug. Prometheus metrics must be scraped from each pod. See https://www.rabbitmq.com/blog/2021/05/03/alerting.