The purpose of adjust_anomaly_scores #4

Closed
severous opened this issue Jul 22, 2021 · 1 comment

@severous

Thanks for your brilliant work. I would like to know the purpose of adjust_anomaly_scores. Thanks for your time.

    # Remove errors for time steps at a transition to a new channel (these are impossible for the model to predict)
    if dataset.upper() not in ['SMAP', 'MSL']:
        return scores

    adjusted_scores = scores.copy()
    if is_train:
        md = pd.read_csv(f'./datasets/data/{dataset.lower()}_train_md.csv')
    else:
        md = pd.read_csv('./datasets/data/labeled_anomalies.csv')
        md = md[md['spacecraft'] == dataset.upper()]

    md = md[md['chan_id'] != 'P-2']

    # Sort values by channel
    md = md.sort_values(by=['chan_id'])

    # Getting the cumulative start index for each channel
    sep_cuma = np.cumsum(md['num_values'].values) - lookback
......................
axeloh (Collaborator) commented Jul 24, 2021

Hi, thanks for your question.

Recall that SMAP and MSL each consist of multiple individual time series (A-1, C-2, etc.). Each of these has 24 (SMAP) or 54 (MSL) one-hot encoded features and 1 continuous feature. Each series is very short (typically 1-3k timesteps), so most implementations (including ours) concatenate them along the time axis, creating one large time series.
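
To make the concatenation concrete, here is a minimal sketch with toy data. The channel lengths, the `lookback` value, and all variable names are hypothetical. Subtracting `lookback` mirrors the `np.cumsum(...) - lookback` line in the quoted snippet; the reading that the score array starts `lookback` steps into the series is an assumption, not something confirmed in this thread.

```python
import numpy as np

# Hypothetical per-channel lengths (the `num_values` column), sorted by channel id.
num_values = np.array([5, 3, 4])
lookback = 2  # hypothetical window size used by the model

# Toy channels with different value levels, concatenated along the time axis.
channels = [np.full(n, fill_value=float(i)) for i, n in enumerate(num_values)]
series = np.concatenate(channels)

# Index in the concatenated series at which each channel ends.
boundaries = np.cumsum(num_values)

# Assumption: scores exist only from timestep `lookback` onward,
# so boundary positions in the score array are shifted by `lookback`.
boundaries_in_scores = boundaries - lookback
```

The values in `series` change abruptly at each entry of `boundaries`, which is where the transitions between channels fall.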

So, the data will "jump" whenever it transitions to a new channel. This affects the model's forecasts and reconstructions in two ways:

  1. It will be "impossible" for the model to predict correct values when the data transitions to a new channel, so the error at these timestamps will be high. Because the errors are used to fit the threshold, we set the errors at these timestamps to zero so that they do not affect the thresholding.
  2. Different channels have different ranges (min-max values) and will therefore typically yield errors with different ranges. For this reason, we normalize the errors of each channel separately before they are used to fit the threshold.

These two steps are performed in adjust_anomaly_scores and are only applied when the dataset is MSL or SMAP.
labeled_anomalies.csv is used to get the length of each channel in the test set, while smap_train_md.csv / msl_train_md.csv serves the same purpose for the train set.
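
The two steps can be sketched roughly as follows. This is an illustration under stated assumptions, not the repository's actual implementation: the function name `adjust_scores_sketch`, the `buffer` parameter, and the choice of min-max normalization are all hypothetical.

```python
import numpy as np

def adjust_scores_sketch(scores, num_values, lookback, buffer=1):
    """Zero scores around channel transitions, then min-max normalize per channel."""
    adjusted = np.asarray(scores, dtype=float).copy()

    # Start index of every channel after the first, in score coordinates.
    # The last cumulative value marks the end of the series, so it is dropped.
    seps = (np.cumsum(num_values) - lookback)[:-1]

    # Step 1: set scores within `buffer` steps of each transition to zero.
    for s in seps:
        lo = max(s - buffer, 0)
        hi = min(s + buffer + 1, len(adjusted))
        adjusted[lo:hi] = 0.0

    # Step 2: normalize each channel's segment independently.
    starts = np.concatenate(([0], seps))
    ends = np.concatenate((seps, [len(adjusted)]))
    for a, b in zip(starts, ends):
        seg = adjusted[a:b]
        if seg.size == 0:
            continue
        value_range = seg.max() - seg.min()
        if value_range > 0:  # skip constant segments to avoid dividing by zero
            adjusted[a:b] = (seg - seg.min()) / value_range
    return adjusted

# Two toy channels of length 5 and 4 with lookback 2 give 7 scores;
# the spike of 100 sits at the transition (index 3).
out = adjust_scores_sketch([1, 2, 3, 100, 10, 20, 30], [5, 4], lookback=2, buffer=1)
```

In this toy run the spike at the transition no longer dominates, and both segments end up on a common 0-1 scale before any threshold would be fitted.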

axeloh closed this as completed Aug 19, 2021
axeloh pinned this issue Sep 18, 2021
axeloh self-assigned this Sep 18, 2021
JinYang88 pushed a commit to JinYang88/mtad-gat-pytorch that referenced this issue Dec 17, 2023