How are samples drawn from conformal prediction model / quantiles?

Hi, I'm using regression models together with a `ConformalNaiveModel` to generate probabilistic forecasts. I'm experimenting with two approaches:

1. Generating historical quantile forecasts using `predict_likelihood_parameters=True`.
2. Generating sampled historical forecasts using `num_samples=1000`.

However, I'm confused about how the samples are being generated. For the same prediction date, I get the following quantiles:

```
q0.05 = 359.11175744  
q0.25 = 380.32542817  
q0.50 = 407.34136384  
q0.75 = 434.35729951  
q0.95 = 455.57097025
```

But when I inspect the sampled distribution, I get the following stats:

```
count    1000.000000  
mean      409.045300  
std        30.632939  
min       359.111757  
25%       382.522803  
50%       409.312385  
75%       436.266100  
max       455.570970
```

The 0.05 and 0.95 quantiles match the minimum and maximum of the sampled values, and their frequencies are unusually high:

```
455.570970    59  
359.111757    43  
(other values) ~1 each
```

This leads to a histogram with spikes at the ends of the distribution.

![Image](https://github.com/user-attachments/assets/bfbd8d92-553c-4288-9257-a46888e3bbde)
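One mechanism that would produce exactly this pattern is inverse-transform sampling with linear interpolation between the known quantiles, where any uniform draw below 0.05 or above 0.95 is clamped to the outermost quantile value. This is only my guess, not something I have verified against the darts source; the function `sample_from_quantiles` below is hypothetical and written purely to illustrate the idea with the quantile values from above.

```python
import bisect
import random


def sample_from_quantiles(levels, values, num_samples, seed=0):
    """Hypothetical sampler (NOT the darts implementation).

    Draws u ~ Uniform(0, 1) and linearly interpolates between the two
    neighbouring quantile values. Draws outside the outermost quantile
    levels are clamped to the lowest / highest quantile value.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(num_samples):
        u = rng.random()
        if u <= levels[0]:
            # Below the lowest known quantile level: clamp.
            samples.append(values[0])
        elif u >= levels[-1]:
            # Above the highest known quantile level: clamp.
            samples.append(values[-1])
        else:
            # Interpolate between the two surrounding quantiles.
            hi = bisect.bisect_right(levels, u)
            lo = hi - 1
            weight = (u - levels[lo]) / (levels[hi] - levels[lo])
            samples.append(values[lo] + weight * (values[hi] - values[lo]))
    return samples


levels = [0.05, 0.25, 0.50, 0.75, 0.95]
values = [359.11175744, 380.32542817, 407.34136384, 434.35729951, 455.57097025]

samples = sample_from_quantiles(levels, values, num_samples=1000)
n_low = samples.count(values[0])    # draws clamped to q0.05
n_high = samples.count(values[-1])  # draws clamped to q0.95
print(min(samples), max(samples), n_low, n_high)
```

Under this assumed scheme, roughly 5% of the draws land on each extreme quantile, so about 50 out of 1000 at each end. That is the same order of magnitude as the 43 and 59 I observe, which is why I suspect something like this is happening.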

**My questions:**

* How are the samples generated from the quantiles?
* Is it expected that the extreme quantiles are overrepresented like this in the sample?
* Shouldn’t the samples follow a smoother distribution? I expected the interval between the 0.05 and 0.95 quantiles to cover 90% of the sampled distribution, but here it always covers 100%.
* Am I doing something wrong here?
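To make the coverage question concrete, here is the arithmetic using only the value counts reported above. Since the minimum and maximum of the samples equal q0.05 and q0.95, every sample that is not one of the two repeated values lies strictly between them.

```python
num_samples = 1000
at_lower = 43   # samples exactly equal to q0.05 (from the value counts)
at_upper = 59   # samples exactly equal to q0.95 (from the value counts)

# Samples strictly between the two extreme quantiles.
strictly_inside = num_samples - at_lower - at_upper

# Closed interval [q0.05, q0.95]: every sample is included.
closed_coverage = num_samples / num_samples
# Open interval (q0.05, q0.95): the repeated end values are excluded.
open_coverage = strictly_inside / num_samples

print(strictly_inside, closed_coverage, open_coverage)
```

So 898 of the 1000 samples (89.8%) fall strictly inside the interval, which is close to the nominal 90%, while the remaining 102 sit exactly on the two boundaries. The 100% coverage only appears because the boundary values themselves are counted as covered.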

**Reproducible Example**

```python
import pandas as pd

from darts import concatenate
from darts.datasets import AirPassengersDataset
from darts.models import ConformalNaiveModel, LinearRegressionModel

series = AirPassengersDataset().load()

train_start = pd.Timestamp("1949-01-01")
cal_start = pd.Timestamp("1957-01-01")
test_start = pd.Timestamp("1959-01-01")
test_end = pd.Timestamp("1960-12-01")

train = series[train_start : cal_start - series.freq]
cal = series[cal_start : test_start - series.freq]
test = series[test_start:test_end]
cal_test = concatenate([cal, test])

multi_horizon = 3
quantiles = [0.05, 0.25, 0.50, 0.75, 0.95]
input_length = 10

model = LinearRegressionModel(
    lags=input_length, 
    output_chunk_length=multi_horizon, 
    use_static_covariates=False
)

model.fit(train)

cp_model = ConformalNaiveModel(
    model=model, 
    quantiles=quantiles
)

conformal_samples = cp_model.historical_forecasts(
    series=cal_test,
    start=test_start,
    forecast_horizon=multi_horizon,
    retrain=False,
    num_samples=1000,
    predict_likelihood_parameters=False,
    last_points_only=True,
)

conformal_quantiles = cp_model.historical_forecasts(
    series=cal_test,
    start=test_start,
    forecast_horizon=multi_horizon,
    retrain=False,
    num_samples=1,
    predict_likelihood_parameters=True,
    last_points_only=True,    
)
```

Thanks!

How are samples drawn from conformal prediction model / quantiles? #2830