The purpose of `adjust_anomaly_scores` #4

severous · 2021-07-22T04:48:13Z

Thanks for your brilliant work. I would like to know the purpose of `adjust_anomaly_scores`. Thanks for your time.

    # Remove errors for time steps when transitioning to a new channel (as this will be impossible for the model to predict)
    if dataset.upper() not in ['SMAP', 'MSL']:
        return scores

    adjusted_scores = scores.copy()
    if is_train:
        md = pd.read_csv(f'./datasets/data/{dataset.lower()}_train_md.csv')
    else:
        md = pd.read_csv('./datasets/data/labeled_anomalies.csv')
        md = md[md['spacecraft'] == dataset.upper()]

    md = md[md['chan_id'] != 'P-2']

    # Sort values by channel
    md = md.sort_values(by=['chan_id'])

    # Getting the cumulative start index for each channel
    sep_cuma = np.cumsum(md['num_values'].values) - lookback
......................


axeloh · 2021-07-24T19:32:49Z

Hi, thanks for your question.

Recall that SMAP and MSL actually consist of multiple individual time-series (A-1, C-2, etc.), each of which has 24/54 (SMAP/MSL) one-hot encoded features and 1 continuous feature. Each of these time-series is very short (typically 1-3k timesteps), so most implementations (including ours) concatenate all these time-series in the time direction, creating one large time-series.
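To make the concatenation concrete, here is a minimal sketch with toy data. The function name `concat_channels` and the channel shapes are assumptions for illustration only and are not taken from the repository; the sketch only shows how per-channel arrays are stacked in the time direction and where the channel boundaries end up.

```python
import numpy as np

def concat_channels(channels):
    """Concatenate per-channel arrays of shape (n_i, n_features) along time.

    Returns the concatenated array and the time index at which each
    channel after the first one begins (the channel boundaries).
    """
    lengths = np.array([len(c) for c in channels])
    data = np.concatenate(channels, axis=0)
    # Cumulative lengths give the start index of every following channel
    boundaries = np.cumsum(lengths)[:-1]
    return data, boundaries

# Toy example: three hypothetical channels with different lengths and levels
rng = np.random.default_rng(0)
channels = [
    rng.normal(loc=mu, size=(n, 2))
    for mu, n in [(0.0, 5), (10.0, 3), (-4.0, 4)]
]
data, boundaries = concat_channels(channels)
print(data.shape, boundaries)
```

At each index in `boundaries` the values switch from one channel's level to the next, which is the "jump" discussed below.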

So, the data will "jump" up or down whenever it transitions to a new channel. This affects the model's forecasts and reconstructions in two ways:

1. It is "impossible" for the model to predict correct values while the data is transitioning to a new channel, so the error at these timestamps will be high. Because the errors are used to fit the threshold, we set the errors at these timestamps to zero so that they do not affect the thresholding.
2. Different channels have different ranges (min-max values) and will therefore typically yield errors with different ranges. We therefore normalize the errors of each channel separately before they are used to fit the threshold.

These two steps are performed in `adjust_anomaly_scores` and are only applied when the dataset is MSL or SMAP.
`labeled_anomalies.csv` is used to get the length of each channel in the test set, while `smap_train_md.csv` / `msl_train_md.csv` serves the same purpose for the train set.
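The two steps described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the repository's implementation: the function name `adjust_scores_sketch`, the `buffer` parameter, and the choice of min-max normalization are all assumptions, and channel lengths are passed in directly instead of being read from the metadata CSV files.

```python
import numpy as np

def adjust_scores_sketch(scores, channel_lengths, lookback, buffer=0):
    """Zero out scores at channel transitions, then normalize per channel.

    scores:          1-D anomaly scores of the concatenated series
    channel_lengths: number of time steps in each channel, in order
    lookback:        window size; the first `lookback` steps have no score
    buffer:          (assumed) number of extra steps to zero on each side
    """
    adjusted = np.asarray(scores, dtype=float).copy()
    n = len(adjusted)

    # End of each channel in score coordinates (shifted by the lookback)
    ends = np.clip(np.cumsum(channel_lengths) - lookback, 0, n)

    # Step 1: set scores around every transition to zero
    for b in ends[:-1]:
        lo = max(int(b) - buffer, 0)
        hi = min(int(b) + buffer + 1, n)
        adjusted[lo:hi] = 0.0

    # Step 2: min-max normalize the scores of each channel separately
    starts = np.concatenate(([0], ends[:-1]))
    for start, end in zip(starts, ends):
        seg = adjusted[start:end]
        if seg.size == 0:
            continue
        span = seg.max() - seg.min()
        if span > 0:  # skip constant segments to avoid division by zero
            adjusted[start:end] = (seg - seg.min()) / span
    return adjusted

# Toy example: two channels of length 6 and 4, lookback of 2,
# with one very large score at the transition (index 4)
scores = np.array([1.0, 2.0, 3.0, 5.0, 900.0, 10.0, 20.0, 30.0])
adjusted = adjust_scores_sketch(scores, channel_lengths=[6, 4], lookback=2)
print(adjusted)
```

In the toy example the large transition score no longer dominates, and both channels end up on the same 0 to 1 scale, so a single threshold fitted on `adjusted` treats them comparably.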


axeloh closed this as completed Aug 19, 2021

axeloh pinned this issue Sep 18, 2021

axeloh self-assigned this Sep 18, 2021

JinYang88 pushed a commit to JinYang88/mtad-gat-pytorch that referenced this issue Dec 17, 2023

Merge pull request ML4ITS#4 from ML4ITS/experiments

d8ed039

Experiments
