Problem with Aggregation of CPU, maybe even Utilization in general. #44

stiesssh · 2024-07-03T07:39:48Z

Current Behaviour

Probe: at any given point in time, the CPU utilization probe emits the value $1$ if jobs are being processed or are waiting, and $0$ otherwise.
Aggregation: Our aggregation combines all values while disregarding how long each value was valid. That is perfectly fine for operation response time and other probes whose values apply to a single point in time only. For utilization, however, we must also consider the duration of each value. Strictly speaking, we already do consider the duration (that is what the timefactor is for), but the result is still broken: when aggregating, we do not (and at this point cannot) differentiate which resource emitted a value. As a result, a resource utilization like the one on the left would yield a very high average utilization, and one like the one on the right a very low average utilization, whereas both should actually be around 50%.

Extreme cases as in the examples are unlikely, but I did in fact have a simulation run with two replicas of a resource where, for some seconds, only one of them received requests while the other was idle.
With the current EMA calculation, that idle time was disregarded, because the $1$ reported by the other resource "ended" the idle time of the first resource.
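The effect can be reproduced with a minimal sketch. It is a simplified model, not the actual aggregator: it uses a plain time-weighted mean instead of the EMA, and the function name, event lists and window end are made up for illustration. Each event is a `(time, value)` pair, and a value is assumed to hold until the next event in the stream.

```python
def time_weighted_average(events, end):
    """Average of a step function given as (time, value) pairs sorted by time.

    Each value is assumed to hold until the next event in the list,
    and the last value holds until `end`.
    """
    total = 0.0
    for (t, value), (t_next, _) in zip(events, events[1:] + [(end, None)]):
        total += value * (t_next - t)
    return total / (end - events[0][0])


END = 10.0
# Hypothetical replica that is busy all the time and keeps emitting 1.
busy = [(0.0, 1), (2.0, 1), (4.0, 1), (6.0, 1), (8.0, 1)]
# Hypothetical replica that reports "idle" once and then stays silent.
idle = [(1.0, 0)]

# Aggregating the merged stream: the 1 emitted by the busy replica at t=2
# cuts the idle interval of the other replica short.
merged = sorted(busy + idle, key=lambda event: event[0])
merged_average = time_weighted_average(merged, END)

# Aggregating per resource first, then across resources.
per_resource_average = (
    time_weighted_average(busy, END) + time_weighted_average(idle, END)
) / 2

print(merged_average, per_resource_average)
```

In this sketch the merged stream reports 90% utilization, while averaging the two per-resource results gives the expected 50%.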

Desired Behaviour

Correct Aggregation results for all metrics.

Possible Solutions

Hacky fix (I tried it, it works)

See the fix-Utilisation branches (still local) in the SPD-Interpreter and Monitoring repositories.

Idea: The probe already takes the duration into account when emitting utilisation values $\rightarrow$ an emitted utilisation value now means "utilisation between two successive events". The aggregator must be adapted accordingly.
Problems:
- This works with average, and probably also with sum, but will probably break with other aggregations.
- The emitted measurement values are not consistent with Palladio's usual understanding of utilization.
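One possible reading of this idea, sketched under assumptions: each measurement is a `(duration, utilisation)` pair describing the interval between two successive events of the same resource, and the adapted aggregator computes a duration-weighted mean. The function name and the data are hypothetical, and this is not necessarily what the fix-Utilisation branches do.

```python
def duration_weighted_mean(measurements):
    """Mean of interval utilisations, weighted by the interval length.

    `measurements` is a list of (duration, utilisation) pairs.
    """
    total_time = sum(duration for duration, _ in measurements)
    busy_time = sum(duration * value for duration, value in measurements)
    return busy_time / total_time


# Hypothetical measurements: each pair already carries its own duration,
# so the order in which values from different resources arrive is irrelevant.
cpu_a = [(4.0, 1.0), (6.0, 1.0)]  # busy for 10 s in total
cpu_b = [(10.0, 0.0)]             # idle for 10 s

mixed = cpu_a + cpu_b
weighted = duration_weighted_mean(mixed)

# A plain average over the values ignores the durations and is skewed
# towards the resource that emitted more measurements.
unweighted = sum(value for _, value in mixed) / len(mixed)

print(weighted, unweighted)
```

Under these assumptions the weighted mean is 50%, while the unweighted average of the same values is about 67%, which is why the aggregator has to change together with the probe.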

Better Solution

Enforce Aggregation for all Utilization Monitors (as SimuLizar does)
- $\rightarrow$ requires check for defined Monitors
Aggregate across multiple CPUs by aggregating the values that are already aggregated-per-resource
- $\rightarrow$ requires filter chain to listen to MeasurementUpdated instead of MeasurementMade
- $\rightarrow$ first make sure that aggregation-per-resource is correct (should be though, because it relies on aggregator classes from org.palladiosimulator.monitorrepository.statisticalcharacterization)
Problems:
- The windows of the aggregation-per-CPU must all start and end at the same points in time, or else it gets weird.
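The two-stage aggregation can be sketched as follows. This only illustrates the arithmetic with aligned windows; it does not use the MeasurementUpdated filter chain or the aggregator classes from the monitor repository, and all names and data are hypothetical. The per-resource stage is modelled as the busy fraction of a window, which is an assumption.

```python
def window_utilisation(events, start, end, initial=0):
    """Fraction of the window [start, end) in which one resource was busy.

    `events` are (time, value) pairs sorted by time; a value holds until
    the next event of the same resource.
    """
    busy_time = 0.0
    value, t_prev = initial, start
    for t, v in events:
        if t <= start:
            value = v  # state carried into the window
        elif t < end:
            busy_time += value * (t - t_prev)
            value, t_prev = v, t
        else:
            break
    busy_time += value * (end - t_prev)
    return busy_time / (end - start)


def aggregate_across_resources(events_by_resource, window_edges):
    """Average the per-resource results, using the same windows for all."""
    results = []
    for start, end in zip(window_edges, window_edges[1:]):
        per_resource = [
            window_utilisation(events, start, end)
            for events in events_by_resource.values()
        ]
        results.append(sum(per_resource) / len(per_resource))
    return results


events_by_resource = {
    "cpu_a": [(0.0, 1), (5.0, 0)],            # busy 0-5 s, then idle
    "cpu_b": [(0.0, 0), (5.0, 1), (7.5, 0)],  # idle 0-5 s, busy 5-7.5 s
}
utilisation = aggregate_across_resources(events_by_resource, [0.0, 5.0, 10.0])
print(utilisation)
```

Because every resource is evaluated over the same window edges, each per-resource value covers the same time span and the cross-resource average is well defined.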
