@cl-bvl cl-bvl commented Aug 7, 2025

Hello.

I'm proposing to add several helper functions to make working with timers more convenient.
I haven’t brought this up on the mailing list yet, but I’d like to hear your feedback.
The code is still a draft; if you like the approach, I’ll add more detailed comments, tests, and documentation.

Here are a few examples of how the new timers could be used:

package prometheus

import (
	"database/sql"
	"time"
)

type Worker struct {
	db                     *sql.DB
	workerChan             chan *Job
	metricDBQueryTime      *TimerObserverVec
	metricWorkingTime      *TimerCounterVec
	metricProcessingTime   *TimerObserver
	metricWaitTime         *TimerCounter
	metricWaitCount        Gauge
	metricModuleUptime     *TimerCounter
	metricModuleUptimeStop func()
}

type Job struct {
	Type          string
	Payload       string // Some Job payload, for example
	waitTimerStop func() // Stops the wait timer. While the queue is blocked (for example, by long-running jobs), the wait-time metric keeps growing live, showing that jobs are still waiting to be processed.
}

func NewWorker(db *sql.DB, workerCnt int) *Worker {
	w := &Worker{
		db:                   db,
		workerChan:           make(chan *Job, workerCnt),
		metricDBQueryTime:    NewTimerObserverVec(NewHistogramVec(HistogramOpts{Namespace: "test_module", Subsystem: "database", Name: "query_time_seconds"}, []string{"query_type"})),
		metricWorkingTime:    NewTimerCounterVec(NewCounterVec(CounterOpts{Namespace: "test_module", Subsystem: "worker", Name: "working_time_seconds"}, []string{"job_type"}), 0),
		metricProcessingTime: NewTimerObserver(NewSummary(SummaryOpts{Namespace: "test_module", Subsystem: "worker", Name: "processing_time_seconds"})),
		metricWaitTime:       NewTimerCounter(NewCounter(CounterOpts{Namespace: "test_module", Subsystem: "worker", Name: "wait_time_seconds"}), time.Second),
		metricWaitCount:      NewGauge(GaugeOpts{Namespace: "test_module", Subsystem: "worker", Name: "wait_count"}),
		metricModuleUptime:   NewTimerCounter(NewCounter(CounterOpts{Namespace: "test_module", Subsystem: "worker", Name: "uptime"}), time.Second),
	}

	w.metricModuleUptimeStop = w.metricModuleUptime.Observe()

	for i := 0; i < workerCnt; i++ {
		go w.run()
	}

	return w
}

func (w *Worker) Destroy() {
	close(w.workerChan)
	w.metricModuleUptimeStop()
}

func (w *Worker) Describe(ch chan<- *Desc) {
	DescribeByCollect(w, ch)
}

func (w *Worker) Collect(ch chan<- Metric) {
	w.metricDBQueryTime.Collect(ch)
	w.metricWorkingTime.Collect(ch)
	w.metricProcessingTime.Collect(ch)
	w.metricWaitTime.Collect(ch)
	w.metricWaitCount.Collect(ch)
	w.metricModuleUptime.Collect(ch)
}

func (w *Worker) run() {
	for job := range w.workerChan {
		job.waitTimerStop()
		w.metricWaitCount.Dec()
		func() {
			defer w.metricWorkingTime.ObserveLabelValues(job.Type)()

			var selectResult sql.Result
			var err error
			w.metricDBQueryTime.WrapLabelValues([]string{"SELECT"}, func() {
				selectResult, err = w.db.Exec("SELECT * FROM my_table WHERE payload = ?", job.Payload)
			})
			if err != nil {
				return
			}

			// Process the result.
			stop := w.metricProcessingTime.Observe()
			result := doSomeWork(selectResult)
			stop()

			w.metricDBQueryTime.WrapLabelValues([]string{"INSERT"}, func() {
				selectResult, err = w.db.Exec("INSERT INTO results (payload, result) VALUES (?, ?)", job.Payload, result)
			})
		}()
	}
}

func (w *Worker) AddJob(job *Job) {
	job.waitTimerStop = w.metricWaitTime.Observe()
	w.metricWaitCount.Inc()
	w.workerChan <- job
}

func doSomeWork(selectResult sql.Result) string {
	return ""
}

cl-bvl added 2 commits August 8, 2025 02:40
Signed-off-by: Vladimir Buyanov <b.vladimir@clickadu.com>
Signed-off-by: Vladimir Buyanov <b.vladimir@clickadu.com>
@cl-bvl cl-bvl force-pushed the feature/timer_helpers branch from 1777cf3 to eb20df1 Compare August 7, 2025 23:40
@bwplotka
Member

Thanks for this proposal!

BTW: Here or Slack is likely a better place for these discussions than the mailing list.

I have some questions about the benefits vs. the costs (more code to maintain, two ways of doing the same thing).

Correct me if I'm wrong, but I see you proposed:

1) Wrap methods

So instead of:

timer := NewTimer(myHistogram)
func() {
  defer timer.ObserveDuration()
  // Do actual work.
}()

one could do:

timer := NewTimerHistogram(myHistogramOpts)
timer.Wrap(func() {
  // Do actual work.
})

Is it really that beneficial? 🤔

2) TimerHistogram type

So instead of:

timer := NewTimer(myHistogram)
defer timer.ObserveDuration()
// Do actual work.

one could do:

timer := NewTimerHistogram(myHistogramOpts)
defer timer.ObserveDuration()
// Do actual work.

What does this type give us that's not possible now?

3) TimerHistogramVec type

So (conceptually) instead of:

timer := NewTimer(myHistogram.WithLabelValues("200"))
timerErr := NewTimer(myHistogram.WithLabelValues("500"))

defer func() {
  if err != nil {
    timerErr.ObserveDuration()
  } else {
    timer.ObserveDuration()
  }
}()
// Do actual work, return err

one could do:

timer := NewTimerHistogram(myHistogramOpts, "code")
defer func() {
  if err != nil {
    timer.WithLabelValues("500").ObserveDuration()
  } else {
    timer.WithLabelValues("200").ObserveDuration()
  }
}()
// Do actual work, return err

This might indeed have some benefits; I think I like it.

I only wonder if some generic function would not work better, because we already have a "generic" timer that should work for both summaries and histograms 🤔

type ObserverVec interface {
	*HistogramVec | *SummaryVec
}

func NewTimerVec[T ObserverVec](o T) *TimerVec {
	return &TimerVec{
		begin: time.Now(),
		o:     o,
	}
}

or

func NewTimerVec(o any) *TimerVec {
	return &TimerVec{
		begin: time.Now(),
		o:     o,
	}
}

4) TimerCounter type

This is interesting, I guess it might make sense to count total time or even gauge the last duration.

I would not call the method Observe though, but rather AddDuration and SetDuration for gauge.

5) TimerContinuous type

That's a fun one! For this one I think one could write a separate function, because you can easily use TimerCounter for this without client_golang changes, no? At least in the beginning I would write an example how to write something like that before considering adding this to v1.

Also, I believe there might be better ways to implement it than a timer. Why not use CounterFunc to update the time on scrape?

Summary

Generally:

  • I don't see Wrap helping that much (1).
  • If we add typed functions, then (2), (3), and (4) make sense, but we should likely follow
    func NewCounterFunc(opts CounterOpts, function func() float64) CounterFunc {
    so NewCounterTimer etc. The question is: should we add typed versions OR try to write generic ones (Timer and TimerVec would work only for observers, or perhaps even for all metrics if we use ObserveDuration for the adding and setting logic)?
  • I would use *Func for (5) and start with an example; if the example is heavily used, we can consider adding it as a helper.

If we want to make NewTimer work for counters/gauges, it could be possible without breaking changes by using any, so

func NewTimer(o Observer) *Timer {

into

func NewTimer(o any) *Timer {

.. but maybe it's too much -- maybe it's ok to have per metric type functions with corresponding opts, similar to *Func types

@cl-bvl
Author

cl-bvl commented Aug 20, 2025

Hello.
Thanks for the feedback)

1) Wrap methods.

Yes, it does exactly that. But with a few caveats.

  1. If you use the existing timer, you need to store the histogram separately and create a new timer each time. This makes the code a little less readable.
type Handler struct {
	reqDuration Histogram
}

func (h *Handler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	timer := NewTimer(h.reqDuration)
	defer timer.ObserveDuration()
}

VS

type Handler2 struct {
	reqDuration *TimerCounter
}

func (h *Handler2) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	defer h.reqDuration.Observe()()
}

I understand that the difference is not very big, but it is there.
Another important nuance is that a histogram is not always required for a timer. Histograms are quite expensive to store. A common case for us is the need to store only a counter of the total elapsed time.

  2. Using a separate Wrap function makes the code a little clearer.
    With the closure approach, someone changing the code may want to remove the seemingly useless wrapping function, which will break the measurement:
timer := NewTimer(myHistogram)
func() {
  defer timer.ObserveDuration()
  work()
}()

Can be mistakenly converted to:

timer := NewTimer(myHistogram)
defer timer.ObserveDuration()
work()

Although this is 100% the fault of the programmer who did it, it seems to me that a good API should protect against errors rather than push users towards them.
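
A minimal sketch of the pitfall described above, assuming a hypothetical fakeClock and a plain observe callback in place of a real Histogram (wrap, hoistedDefer, and measure are illustrative names, not part of client_golang). It shows that a Wrap-style helper measures only the wrapped function, while the hoisted defer keeps measuring until the enclosing function returns:

```go
package main

import (
	"fmt"
	"time"
)

// fakeClock is a hypothetical manual clock so the example is deterministic.
type fakeClock struct{ now time.Time }

func (c *fakeClock) Now() time.Time          { return c.now }
func (c *fakeClock) Advance(d time.Duration) { c.now = c.now.Add(d) }

// wrap times fn only: the observation fires when fn returns,
// regardless of what the caller does afterwards.
func wrap(c *fakeClock, observe func(seconds float64), fn func()) {
	start := c.Now()
	defer func() { observe(c.Now().Sub(start).Seconds()) }()
	fn()
}

// hoistedDefer mimics the mistaken refactoring: the inner closure is gone,
// so the deferred observation fires only when the whole function returns.
func hoistedDefer(c *fakeClock, observe func(seconds float64), work, rest func()) {
	start := c.Now()
	defer func() { observe(c.Now().Sub(start).Seconds()) }()
	work()
	rest() // unrelated work that is now measured by accident
}

// measure runs both variants with 2s of work followed by 5s of other code.
func measure() (wrapped, hoisted float64) {
	c := &fakeClock{now: time.Unix(0, 0)}
	work := func() { c.Advance(2 * time.Second) }
	rest := func() { c.Advance(5 * time.Second) }

	wrap(c, func(s float64) { wrapped = s }, work)
	rest()

	hoistedDefer(c, func(s float64) { hoisted = s }, work, rest)
	return wrapped, hoisted
}

func main() {
	wrapped, hoisted := measure()
	fmt.Println(wrapped, hoisted) // prints: 2 7
}
```

The second number includes the 5 seconds of unrelated code, which is the silent measurement error the Wrap argument is about.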

2) TimerHistogram type

You are right, this type is made to maintain a uniform API and does not provide new functionality.
Small bonuses - it's a bit more intuitive to create a HistogramTimer right away, rather than first a Histogram and then a timer on top of it. And it simplifies the code quite a bit.

3) TimerHistogramVec type

Yes, we can do it like this:

type TimerVec struct {
	ObserverVec
}

func NewTimerVec[T HistogramOpts | SummaryOpts](opts T, labelNames []string) *TimerVec {
	t := &TimerVec{}
	switch o := any(opts).(type) {
	case HistogramOpts:
		t.ObserverVec = NewHistogramVec(o, labelNames)
	case SummaryOpts:
		t.ObserverVec = NewSummaryVec(o, labelNames)
	default:
		panic("invalid TimerVec options type")
	}

	return t
}

......

Do you like this?

4) TimerCounter type

Yes, we can call the method AddDuration instead of Observe.
I didn't implement a Gauge variant; it seems like a poor fit for the Prometheus model and for timers.
Counter and increase() are great for understanding how much time has been spent since the last measurement.

5) TimerContinuous type

I poked around for 10 minutes and couldn't figure out how to use CounterFunc to implement a ticking timer. I'll think about it some more, but if you have an idea of what it might look like, that would help.
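
One possible shape for the CounterFunc idea, as a sketch only: keep the start times of in-flight measurements and compute the total when the callback runs, so the value grows between scrapes without a ticker. liveTimer, Start, and Seconds are hypothetical names that do not exist in client_golang, and the clock is injected so the example is deterministic. In real code, Seconds has the func() float64 shape that could be passed to NewCounterFunc.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// liveTimer is a hypothetical helper, not part of client_golang.
type liveTimer struct {
	mu       sync.Mutex
	now      func() time.Time     // injectable clock; time.Now in production
	finished time.Duration        // total of completed measurements
	running  map[uint64]time.Time // start times of in-flight measurements
	nextID   uint64
}

func newLiveTimer(now func() time.Time) *liveTimer {
	return &liveTimer{now: now, running: make(map[uint64]time.Time)}
}

// Start begins a measurement and returns the function that ends it.
func (t *liveTimer) Start() (stop func()) {
	t.mu.Lock()
	id := t.nextID
	t.nextID++
	t.running[id] = t.now()
	t.mu.Unlock()

	return func() {
		t.mu.Lock()
		defer t.mu.Unlock()
		if start, ok := t.running[id]; ok { // calling stop twice is a no-op
			t.finished += t.now().Sub(start)
			delete(t.running, id)
		}
	}
}

// Seconds returns completed time plus the time in-flight measurements have
// accumulated so far. It never decreases as long as the clock does not go back.
func (t *liveTimer) Seconds() float64 {
	t.mu.Lock()
	defer t.mu.Unlock()
	total := t.finished
	now := t.now()
	for _, start := range t.running {
		total += now.Sub(start)
	}
	return total.Seconds()
}

// demo simulates a scrape during a measurement and another one after it.
func demo() []float64 {
	clock := time.Unix(0, 0)
	t := newLiveTimer(func() time.Time { return clock })

	stop := t.Start()
	clock = clock.Add(3 * time.Second)
	during := t.Seconds() // scrape while the job is still waiting
	clock = clock.Add(2 * time.Second)
	stop()
	clock = clock.Add(10 * time.Second)
	after := t.Seconds() // idle time after stop is not counted
	return []float64{during, after}
}

func main() {
	fmt.Println(demo()) // prints: [3 5]
}
```

Compared with updating a counter on a fixed interval, this trades a background goroutine for a little bookkeeping per in-flight measurement; whether that is worth it for very large numbers of concurrent measurements is an open question.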

Signed-off-by: Vladimir Buyanov <b.vladimir@clickadu.com>
@cl-bvl cl-bvl force-pushed the feature/timer_helpers branch from 9287210 to 7afe36d Compare August 21, 2025 11:30
@cl-bvl
Author

cl-bvl commented Aug 21, 2025

Hello.
Small changes in timers.

  1. Removed the separate type for Continuous timers. Now they are combined with Counter / CounterVec.
    When passing updateInterval = 0 they work as normal counters. If not zero - they are updated during operation.
  2. Histogram and HistogramVec have been replaced with generic ones. They can be used with Histogram, Summary or your own implementations.
  3. Added Add methods that allow you to directly add duration.

Regarding changing the current Timer. There are several objections why we don't want to do this.

  1. We can't change it much. It is necessary to maintain backward compatibility so as not to break the code of people who already use it.
  2. Currently, this timer measures the time between its creation and stopping. It can take one measurement. I find this quite inconvenient. Usually, the metrics themselves are created in one place (in the constructor), and are used in methods.

Therefore, I would leave it in its current state and add alternative types.

@bwplotka
Member

bwplotka commented Aug 22, 2025

Great discussion!

Just curious, who is "we" you are referring to in this PR? (:

type Handler2 struct {
	reqDuration *TimerCounter
}

func (h *Handler2) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	defer h.reqDuration.Observe()
}

Is the above example really correct? What do you want to measure here? It looks like in this example this would measure the time since the creation of the handler (program start) to the last request, no?

A more realistic example would probably be:

func (h *Handler2) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	reqDuration := NewTimer(h.reqDurationHistogram) // or NewTimer(h.reqDurationCounter)
	defer reqDuration.ObserveDuration()
}

...which proves that NewTimer(<metric>) is more useful than NewTimer(<metricOpts>), wdyt? Is there even a use for NewTimer(counterOpts/histogramOpts)?

We can't change it much. It is necessary to maintain backward compatibility so as not to break the code of people who already use it.

I believe there are ways to not break compatibility if we use type casting and extend the interface on timer to any and reuse ObserveDuration method.
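
To make the trade-off concrete, here is a rough sketch of the type-switch approach, using small stand-in interfaces rather than the real client_golang types (observer, setter, adder, newTimer, and the toy metrics are all hypothetical). It assumes a timer that observes into histogram-like metrics, sets gauge-like metrics to the last duration, and adds to counter-like metrics:

```go
package main

import (
	"fmt"
	"time"
)

// Minimal stand-ins for the method sets that matter here.
type observer interface{ Observe(float64) }
type setter interface{ Set(float64) }
type adder interface{ Add(float64) }

type timer struct {
	begin  time.Time
	record func(seconds float64)
}

// newTimer accepts any metric and decides how a duration is recorded.
// Unsupported types are only detected at run time.
func newTimer(m any, now time.Time) (*timer, error) {
	t := &timer{begin: now}
	switch v := m.(type) {
	case observer:
		t.record = v.Observe
	case setter: // checked before adder: a gauge has both Set and Add
		t.record = v.Set
	case adder:
		t.record = v.Add
	default:
		return nil, fmt.Errorf("unsupported metric type %T", m)
	}
	return t, nil
}

func (t *timer) observeDuration(now time.Time) time.Duration {
	d := now.Sub(t.begin)
	t.record(d.Seconds())
	return d
}

// Toy metrics used only to exercise the switch.
type histogram struct{ samples []float64 }

func (h *histogram) Observe(v float64) { h.samples = append(h.samples, v) }

type counter struct{ total float64 }

func (c *counter) Add(v float64) { c.total += v }

type gauge struct{ value float64 }

func (g *gauge) Set(v float64) { g.value = v }
func (g *gauge) Add(v float64) { g.value += v }

// demo records a 2s and a 3s duration into each toy metric.
func demo() (int, float64, float64, error) {
	start := time.Unix(0, 0)
	h, c, g := &histogram{}, &counter{}, &gauge{}
	for _, elapsed := range []time.Duration{2 * time.Second, 3 * time.Second} {
		for _, m := range []any{h, c, g} {
			t, err := newTimer(m, start)
			if err != nil {
				return 0, 0, 0, err
			}
			t.observeDuration(start.Add(elapsed))
		}
	}
	_, err := newTimer("not a metric", start)
	return len(h.samples), c.total, g.value, err
}

func main() {
	fmt.Println(demo())
}
```

The histogram ends up with two samples, the counter with the sum of both durations, and the gauge with the last one. The default branch shows the cost of this design: a wrong argument compiles fine and fails only when the timer is constructed.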

Currently, this timer measures the time between its creation and stopping. It can take one measurement. I find this quite inconvenient. Usually, the metrics themselves are created in one place (in the constructor), and are used in methods.

Not sure I understand this argument. What do you mean by "take one measurement"? Do you mean one start time? (You can observe the duration as many times as you want.) If that's what you meant, then I don't see how your structures are different here. In fact, making the constructor take opts, NewTimer*(opts), makes it even more single-use (and sort of wasteful), because I cannot reuse the same metric (e.g. a histogram) for multiple timers. In the current API I can.

To sum up.... I think we need to make a few high level decisions first:

  1. Per type vs. generic:

A) NewTimer(any) and NewTimerVec(any) (the second one can also use generics)
B) Per-type structures.

  2. Taking a metric vs. taking Opts and always creating the metric:

A) NewHistogramVecTimer(*HistogramVec) etc... similar to promhttp handlers:

func InstrumentHandlerDuration(obs prometheus.ObserverVec, next http.Handler, opts ...Option) http.HandlerFunc {

B) NewHistogramVecTimer(Opts, labels) etc... similar to CounterFunc (nothing like that for histograms):
func NewCounterFunc(opts CounterOpts, function func() float64) CounterFunc {

Seems like you propose 1B and 2B. I think 1B is fair, given what we have so far in client_golang.. I would be keen on adding this.

For 2B... as I wrote above, I don't see the usefulness of 2B; 2A is more useful, given that you can reuse the same metric for multiple timer uses (it's not cheap to create and register a new metric, but it's cheap to create a new start time variable). For 2B one could add a Reset() method or let ObserveDuration reset the start time or something, but that might be confusing; plus, typically (even in your HTTP request example) you need a per-request/per-goroutine start time.

@bwplotka
Member

Also for your HTTP request example, there are better ways than timer, promhttp gives you robust handlers.

Then if we choose 2A, I don't see the need for *Vec flavor too 🙈

Ideally, to make some decisions, we need more examples of practical cases where the existing timer is not enough (the counter is one fair example).

@cl-bvl
Author

cl-bvl commented Aug 22, 2025

Hello.

Just curious, who is "we" you are referring to in this PR? (:

I mean us - me, you, other contributors ;)


Is the above example really correct? What do you want to measure here? It looks like in this example this would measure the time since the creation of the handler (program start) to the last request, no?
Probably more realistic example would be:

Yes, example is correct.
The main difference between my timers and the existing one is that the existing one makes only a single measurement: the time from its creation to its termination.
And mine measure the time from the function start to its termination. Therefore, they can perform multiple measurements, not just one. You can call one timer concurrently from many goroutines.
In the example below, we want to measure the sum of the time spent on executing an http request.
The HTTP handler is just an example, it could be some similar component.
We do not need to create a new timer each time, we reuse the already created one.


which proves that NewTimer(<metric>) is more useful than NewTimer(<metricOpts>), wdyt?

Yes, that's exactly how it works in the current version.
But whereas the existing timer requires storing the histogram separately and passing it in for each measurement, the proposed timers encapsulate the metric that is used as the "backend" metric.


Not sure I understand this argument. What do you mean take one measurement?

// With the existing timer we need to create a new instance for each measurement
measure1 := NewTimer(h.reqDurationHistogram) // or NewTimer(h.reqDurationCounter)
work()
measure1.ObserveDuration()

measure2 := NewTimer(h.reqDurationHistogram) // or NewTimer(h.reqDurationCounter)
work()
measure2.ObserveDuration()

measure3 := NewTimer(h.reqDurationHistogram) // or NewTimer(h.reqDurationCounter)
work()
measure3.ObserveDuration()

// With new timer we can create timer one time and reuse it.
timer := NewTimerObserver(NewHistogram(.....))

stop := timer.Observe()
work()
stop()

stop = timer.Observe()
work()
stop()

stop = timer.Observe()
work()
stop()

We can create the timer in the module constructor and use it in methods, like:

type Module struct {
  workingTime *TimerCounter
}

func NewModule() *Module {
  return &Module{workingTime: NewTimerCounter(.....)}
}

func (m *Module) DoWork(....) {
  defer m.workingTime.Observe()()
}

In this example, we measure all the time spent executing the DoWork() function. It can be called from many goroutines concurrently.
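
A self-contained sketch of why one shared timer can serve many goroutines, under the assumption that each Observe() call keeps its start time in the closure it returns rather than in the timer itself. timerCounter here is a stand-in built on a mutex-protected float64, not the type from this PR, and runConcurrently and demo are illustrative helpers:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// timerCounter is a stand-in for the proposed TimerCounter; a real one
// would add to a prometheus Counter instead of a plain float64.
type timerCounter struct {
	mu    sync.Mutex
	total float64 // accumulated seconds
}

// Observe starts one measurement. The start time is captured by the returned
// closure, so concurrent measurements do not share any per-measurement state.
func (t *timerCounter) Observe() (stop func()) {
	start := time.Now()
	return func() {
		elapsed := time.Since(start).Seconds()
		t.mu.Lock()
		t.total += elapsed
		t.mu.Unlock()
	}
}

func (t *timerCounter) Total() float64 {
	t.mu.Lock()
	defer t.mu.Unlock()
	return t.total
}

// doWork uses the double-call form: Observe() runs immediately when the defer
// statement is evaluated, and the returned stop function runs on return.
func doWork(t *timerCounter, d time.Duration) {
	defer t.Observe()()
	time.Sleep(d)
}

// runConcurrently shares one timer between several goroutines.
func runConcurrently(workers int, d time.Duration) float64 {
	t := &timerCounter{}
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			doWork(t, d)
		}()
	}
	wg.Wait()
	return t.Total()
}

func demo() float64 {
	return runConcurrently(4, 20*time.Millisecond)
}

func main() {
	// Four overlapping 20ms calls add up to at least 0.08s of working time,
	// even though the wall-clock time is closer to 20ms.
	fmt.Printf("total working time: %.3fs\n", demo())
}
```

Note that writing defer t.Observe() with a single pair of parentheses would defer the start of the measurement and never call stop, so nothing would be recorded.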


To sum up.... I think we need to make a few high level decisions first:

I don't think we should use something like NewTimer(any)
Counter, Histogram and HistogramVec have completely different APIs. For *Vec for example we need to pass labels, while for the other types we don't. Counter timers can be updated during operation, while the other types can't.
Trying to combine all these features into one API would make it rather inconvenient and untyped, which would increase the probability of errors and complicate working with it.

In the current version, I propose to use 4 types of timers:

  • Counter
  • CounterVec
  • Observer (Histogram or Summary)
  • ObserverVec (HistogramVec or SummaryVec)

To unify the API, they now accept a metric, something like this

func NewTimerCounterVec(cnt *CounterVec, updateInterval time.Duration) *TimerCounterVec
// Example:
NewTimerCounterVec(NewCounterVec(CounterOpts{Name: "test"}, []string{"label1", "label2"}), 0)


func NewTimerObserverVec(obs ObserverVec) *TimerObserverVec
// Example:
NewTimerObserverVec(NewSummaryVec(SummaryOpts{Name: "test2"}, []string{"label3", "label4"}))

Seems like you propose 1B and 2B. I think 1B is fair, given what we have so far in client_golang.. I would be keen on adding this.
For 2B... as I wrote above, I don't see the usefulness of 2B, 2A is more useful, given you can reuse the same metric (it's not cheap to create and register new metric, it's cheap to create a new start time variable) for multiple timers uses.

The current version offers 2A.
Neither the current version nor the previous one assumed recreating the metric for each measurement. That would perform terribly and simply makes no sense))

Reset() method won't help. Timer should be able to work with multiple threads.


I updated my first post with a "prod like" example of using the new counters.
This is still a contrived example, but it should be similar to typical use cases.

@cl-bvl cl-bvl mentioned this pull request Aug 23, 2025