Replies: 9 comments 2 replies
-
@notecola Thank you for the question. Let me take a look!
-
Thanks for your time
-
@notecola Thanks again for the reproducible example! Hopefully, my response here will come across as "welcoming an open discussion" rather than as an "authoritative voice" (I certainly don't know everything!), so please feel free to challenge me; I will not take offense. I think that there are a few important points to make here.
Firstly, we need to keep in mind that matrix profiles are computed from z-normalized distances between subsequences, which means that magnitude is often lost as a result of the normalization. Hand-wavingly, z-normalizing a subsequence removes its offset and scale (loosely, think de-trending), so what gets compared is the shape of one subsequence against another and not the raw magnitude. This can be a blessing or a curse depending on the problem you are trying to solve but, generally, it means that z-normalized distances are not well suited for "point anomalies" (i.e., individual points that are, say, two standard deviations away from the full time series mean). Instead, z-normalized distance comparisons are best used for comparing the relative shapes of subsequences (i.e., contiguous sets of neighboring data points). In fact, z-normalized distances are wonderful for finding "shape patterns" or, in the case of discords, the lack of a "shape pattern". See this related issue to get a sense of what I mean as it relates to removing shifts in magnitude (an increasing trend) while still recognizing shape patterns.
Having said that, it is important to note that "discord" is not exactly synonymous with "anomaly". I would consider a discord a great "candidate" for a potentially anomalous shape, so please keep this in mind. In general, I don't think anybody really "knows" how to define an anomaly...
So, in your toy example, what you have are essentially point anomalies, which z-normalized distances handle poorly but which, again, are trivial to detect with a simple standard deviation threshold.
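To make the "magnitude is lost" point concrete, here is a minimal sketch using only the Python standard library. It is not STUMPY code; the helper names (`z_normalize`, `euclidean`, `z_norm_distance`) and the two toy subsequences are assumptions made up for illustration. It compares a small bump with a copy of the same bump that has been shifted and scaled.

```python
import math
import statistics


def z_normalize(subseq):
    """Rescale a subsequence to zero mean and unit (population) standard deviation."""
    mu = statistics.fmean(subseq)
    sigma = statistics.pstdev(subseq)
    if sigma == 0:
        # A constant subsequence has no shape; map it to all zeros.
        return [0.0] * len(subseq)
    return [(x - mu) / sigma for x in subseq]


def euclidean(a, b):
    """Plain (non-normalized) Euclidean distance between equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def z_norm_distance(a, b):
    """Euclidean distance after z-normalizing each subsequence separately."""
    return euclidean(z_normalize(a), z_normalize(b))


# Hypothetical toy subsequences: same shape, very different magnitude.
small = [0.0, 1.0, 0.0, -1.0, 0.0]
large = [100.0 + 50.0 * x for x in small]  # shifted by 100, scaled by 50

raw_dist = euclidean(small, large)
znorm_dist = z_norm_distance(small, large)

print(f"raw Euclidean distance:  {raw_dist:.2f}")
print(f"z-normalized distance:   {znorm_dist:.2e}")
```

The raw distance is large because the two subsequences sit at very different levels, while the z-normalized distance is essentially zero because their shapes are identical. That is the sense in which a z-normalized comparison cannot "see" a change in magnitude on its own.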
However, we recently added an (unsupported) implementation that uses unnormalized Euclidean distance (i.e., we don't z-normalize and just compute the straight Euclidean distance) to compute matrix profiles based on this published
Then you can clearly see that the
Let me know if this helps!
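For anyone who wants to experiment without installing anything, here is a rough brute-force sketch of a non-normalized (plain Euclidean) matrix profile. To be clear about the assumptions: this is not the AAMP algorithm and not STUMPY's implementation, just an O(n²·m) illustration; `naive_profile`, the exclusion zone of `ceil(m / 4)`, and the synthetic 61-point series with a single spike are all made up for this example and are not the data from the original post.

```python
import math


def naive_profile(T, m):
    """Brute-force, non-normalized matrix profile.

    For every length-m subsequence of T, return the plain Euclidean
    distance to its nearest neighbor outside a small exclusion zone.
    """
    n_sub = len(T) - m + 1
    excl = math.ceil(m / 4)  # assumed exclusion zone to skip trivial self-matches
    profile = []
    for i in range(n_sub):
        best = math.inf
        for j in range(n_sub):
            if abs(i - j) <= excl:
                continue  # too close to i: a trivial match
            d = math.sqrt(sum((T[i + k] - T[j + k]) ** 2 for k in range(m)))
            best = min(best, d)
        profile.append(best)
    return profile


# Hypothetical stand-in series: a low-amplitude wiggle with one spike at index 30.
T = [0.1 * math.sin(0.5 * t) for t in range(61)]
T[30] += 5.0

m = 5
P = naive_profile(T, m)

# The discord candidate is the subsequence with the largest profile value.
discord = max(range(len(P)), key=P.__getitem__)
print(f"discord starts at index {discord}, profile value {P[discord]:.2f}")
```

Because nothing is normalized here, every window that overlaps the spike ends up far from all other windows, so the largest profile values line up with the spike. For real data you would want the library implementation rather than this loop, which is only meant to show the idea.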
-
@seanlaw Great answer with a lot of useful information, thank you! The AAMP is a great addition, wonderful! I am still wondering whether it really makes sense to try to interpret the z-normalized matrix profile in the case of non-periodic time series?
-
Awesome! I'm glad you found it helpful. These discussions also help me think through my understanding as well.
There are currently no plans but, if you need it, I would recommend creating a new GitHub issue requesting this feature. We can't promise anything as we have a ton of other priorities that we'd like to get working on but a Having said that, we certainly welcome contributions from the community and I would be happy to work with you on it if you would consider contributing! Absolutely no pressure though.
This is a good question. Ultimately, there is no silver bullet. Each analysis method that you choose will come with trade-offs/assumptions. I think matrix profiles are great for exploratory data analysis because they can tell you whether or not a periodic event has occurred without assuming, a priori, that one exists. The same goes for discords. Additionally, there are a ton of other "nice" features of the matrix profile where, once you've computed it, many other analyses come essentially "for free". At the end of the day, all of these tools should be seen as complementary and not "the one to rule them all". The great thing is that matrix profiles are fairly inexpensive to compute (now) and they don't require you to perform any mental gymnastics to interpret them. I'd love to learn more about your use case(s)
-
I have to check this out too!
-
Just chiming in to +1 this conversation. I also had this use case and really appreciate the thorough analysis, Sean.
-
I've added this as a new feature request in issue #232
-
@notecola CUDA-based AAMP is now available in the latest STUMPY release, version 1.5.0 (along with a ton of new
-
We have a problem understanding the matrix profile for a very simple case.
The 1D data has 61 points and there clearly is an anomaly.
However, the MP plots for various window sizes seem unrelated to the data and to each other.
Example code:
CSV zip attached
data.zip
contents are: