-
@ywy9876 Thank you for your question. Our implementation of multi-dimensional STOMP (MSTUMP) produces identical results to the original author's open source reference implementation that can be found here.
To answer your question, each row of the returned matrix profile corresponds to using a different number of time series, from one up to all D of them. So, if you have three time series, then the first row of the matrix profile uses the one time series (out of three) that produces the smallest distance value for that window. The second row uses the two time series (out of three) that produce the smallest average distance value for that window. Finally, the third row gives the average distance value when all three time series are used. The matrix profile indices tell you where along the time series you'd find the matching subsequence. However, they don't tell you which of the time series are chosen as you increase k, the number of dimensions to include.
We have not implemented FLOSS, but I suspect that you can do this fairly easily, since the hard part is having an efficient way to compute the full matrix profile indices (which STUMPY does for you). Let me know if this answers your question.
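To make the row-by-row meaning concrete, here is a minimal brute-force sketch in NumPy. It is not STUMPY's implementation and is far slower; the function name `naive_mstump` is hypothetical, and the exclusion zone of `ceil(m / 4)` around the trivial match is an assumption. For each query window it computes one z-normalized distance profile per dimension, sorts the distances across dimensions at every candidate position, and takes the running average, so that row `k - 1` holds the average over the best `k` dimensions.

```python
import numpy as np


def naive_mstump(T, m):
    """Brute-force multi-dimensional matrix profile (illustrative sketch only).

    T has shape (d, n). Returns P and I, each of shape (d, n - m + 1), where
    row k - 1 uses the k dimensions with the smallest distances.
    Assumes no window is constant (the standard deviation must be non-zero).
    """
    T = np.asarray(T, dtype=float)
    d, n = T.shape
    l = n - m + 1
    excl = int(np.ceil(m / 4))  # assumed exclusion zone around the query

    # Z-normalize every length-m window in every dimension: shape (d, l, m)
    windows = np.lib.stride_tricks.sliding_window_view(T, m, axis=1)
    Z = (windows - windows.mean(axis=2, keepdims=True)) / windows.std(axis=2, keepdims=True)

    ks = np.arange(1, d + 1)[:, None]
    P = np.full((d, l), np.inf)
    I = np.full((d, l), -1, dtype=np.int64)
    for i in range(l):
        # Per-dimension distance profiles for query window i: shape (d, l)
        D = np.linalg.norm(Z - Z[:, i:i + 1, :], axis=2)
        # At each candidate position, sort distances across dimensions and
        # average the k smallest ones for k = 1..d
        D = np.cumsum(np.sort(D, axis=0), axis=0) / ks
        # Ignore trivial matches close to the query itself
        D[:, max(0, i - excl):min(l, i + excl + 1)] = np.inf
        nn = D.argmin(axis=1)
        I[:, i] = nn
        P[:, i] = D[np.arange(d), nn]
    return P, I


rng = np.random.default_rng(0)
T = rng.standard_normal((3, 60))  # three synthetic time series
P, I = naive_mstump(T, m=8)
print(P.shape, I.shape)
```

In this sketch `I[2][4]` is the start of the nearest neighbour of the window starting at position 4 when all three dimensions are averaged, while `I[0][4]` may point somewhere else because only the single best dimension is used there.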
-
@seanlaw,
If I have the following results:
My interpretation is that, for the subsequence starting at position 4 of the original time series, the nearest neighbour is the subsequence starting at position 867 when all the dimensions are taken into account (according to matrix_profile_indices[2][4]). Is that right?
I'm not sure what you meant, though: what is the purpose of knowing which of the time series is chosen? Looking forward to hearing from you, and thank you again.
-
This is correct. However, you must also take into account the matrix profile values for k=1 to k=3 (where k is the number of dimensions included). The matrix profile value may be lowest for k=2 (not k=1 or k=3). The original paper explains very clearly that one is rarely interested in using all of the dimensions. Instead, you want to choose the smallest subset of dimensions, k, that produces the smallest mean matrix profile value.
Consistent with what I said above, if k=2 produces the smallest mean matrix profile value, then the matrix profile index only tells you where along the three original time series to look. However, for k=2, neither the matrix profile nor the matrix profile index tells you which two of the three time series contain the subsequences that were used in the matrix profile calculation. Now, imagine that you have D=20 time series and, for a given subsequence, k=5. You now have no idea which 5 of the 20 time series are important, and the matrix profile index doesn't give you this information either. Fortunately, this is something that can be tracked (though it is more complicated for MSTUMPED), and we would welcome a pull request for that. Let me know if this explanation helps and then I can close this issue.
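One way the missing information could be recovered after the fact is sketched below. It assumes you already know the start `i` of a subsequence, the start `j` of its nearest neighbour (taken from the matrix profile indices), and the number of dimensions `k`. The helpers `znorm` and `chosen_dimensions` are hypothetical names, not part of STUMPY: the idea is simply to recompute the per-dimension z-normalized distances for that one pair and keep the `k` smallest.

```python
import numpy as np


def znorm(x):
    """Z-normalize a 1-D window (assumes the window is not constant)."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()


def chosen_dimensions(T, m, i, j, k):
    """Return the k dimensions whose length-m windows at i and j are closest.

    T has shape (d, n). Also returns the per-dimension distances so that the
    ranking can be inspected.
    """
    T = np.asarray(T, dtype=float)
    dists = np.array(
        [np.linalg.norm(znorm(row[i:i + m]) - znorm(row[j:j + m])) for row in T]
    )
    order = np.argsort(dists, kind="stable")  # smallest distance first
    return order[:k], dists


# Synthetic example: dimensions 0 and 2 share a repeated pattern, dimension 1 is noise
rng = np.random.default_rng(0)
m = 8
T = rng.standard_normal((3, 50))
motif = np.sin(np.linspace(0, 2 * np.pi, m))
T[0, 5:5 + m] = motif
T[0, 30:30 + m] = motif
T[2, 5:5 + m] = 2 * motif + 1  # scaling and offset vanish after z-normalization
T[2, 30:30 + m] = motif

dims, dists = chosen_dimensions(T, m, i=5, j=30, k=2)
print(sorted(dims.tolist()), dists)
```

Here the pair `(i, j)` is given by hand; in practice it would come from row `k - 1` of the matrix profile indices.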
-
Thanks for the clarification. In my particular case, the data has only 2 dimensions (the u and v components of wind), so I think this shouldn't be an issue for me.
Interesting work! I will see if I can manage it.
Thanks again for the explanation. I think the issue can be closed.
-
That’s great. We also lack a tutorial for MSTUMP, so that is something else you could contribute if you have a nice data set that you can share.
-
@ywy9876 FLUSS and FLOSS have been implemented for 1-dimensional data. You might want to take a look at Tutorial 3.
-
I was wondering whether FLUSS or FLOSS can be used to segment multi-dimensional time series. Any ideas on how to achieve this?
-
@rnjv Unfortunately, it isn't clear how one would do this.
-
Could we port ESPRESSO (https://github.com/cruiseresearchgroup/ESPRESSO)? It depends on https://github.com/cruiseresearchgroup/IGTS-python.
-
@rnjv Would you mind opening a separate issue for this request and clearly describing the problem that you are trying to solve? ESPRESSO may be beyond the scope of STUMPY.
-
Hello,
I was looking at the example of multi-dimensional time series data with MSTUMP. I ran the code and found that the returned values (matrix_profile, matrix_profile_indices) both have three rows (one for each time series, I guess). My question is: when the indices for a segment i point to different nearest segments (say 847, 237, 847), how do we decide which one is the nearest neighbour? A document by Prof. Eamonn says that we simply add the distance profiles together and find the minimum, but with your implementation I'm not sure how to do that.
I also wonder whether FLOSS, or a similar algorithm for semantic segmentation, is already implemented.
Thank you in advance.