cpu: aarch64: hot fix for segfault in cached winograd #2151

taoye9 · 2024-10-04T18:09:37Z

Description

Same change as #2149, backported to the v3.6 release branch.

Checklist

General

Do all unit and benchdnn tests (make test and make test_benchdnn_*) pass locally for each commit?
Have you formatted the code using clang-format?

theComputeKid

Even at its worst, it is better than having a segfault and the wrong answer.

mgouicem · 2024-10-04T19:37:20Z

src/cpu/aarch64/acl_winograd_convolution.hpp

@@ -59,7 +59,7 @@ struct acl_wino_convolution_fwd_t : public primitive_t {
 private:
    status_t execute_forward(const exec_ctx_t &ctx) const;
    const pd_t *pd() const { return (const pd_t *)primitive_t::pd().get(); }
-    std::unique_ptr<acl_obj_t<Op>> acl_obj_;
+    mutable std::unique_ptr<acl_obj_t<Op>> acl_obj_;


Why not declare acl_obj_ directly in execute_forward?
IIUC, this patch would still break in case of concurrent execution.

Hi @mgouicem, I've added the mutex following your comment. Would you like to take another look?

Thanks. I am still confused by the use of the mutex. Why not make the object local to the execute function call? Each execute call would have its own object, which removes any need for synchronization.

Furthermore, as currently implemented, I believe you might still have an issue in case of concurrent execution as only the initialization is mutex protected. For example if you have

t0: lock -> acl_obj init *, dtor previous -> unlock -> acl_obj.get() -> ... -> using handle
t1: lock -> acl_obj init, dtor previous *-> unlock -> acl_obj.get() -> ...

In the above example, t0 obtains a pointer to the object from its acl_obj.get() call, but the underlying object is then destroyed by t1's acl_obj init, so t0 ends up using a dangling pointer.
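A minimal sketch of the alternative suggested above, where each execute call owns its state. All names here (`op_state_t`, `toy_primitive_t`, `run_concurrently`) are hypothetical stand-ins for illustration, not the actual oneDNN or ACL classes:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical stand-in for acl_obj_t<Op>: per-execution state that must not
// be shared between concurrent calls.
struct op_state_t {
    std::vector<int> workspace;
};

// Hypothetical primitive; not the actual oneDNN class.
struct toy_primitive_t {
    // const-qualified and free of member writes: each call builds its own
    // state, so concurrent calls on one cached primitive cannot interfere.
    int execute(int n) const {
        assert(n >= 0);
        auto state = std::make_unique<op_state_t>();
        state->workspace.resize(static_cast<std::size_t>(n));
        // Fill the workspace with 1..n and sum it.
        std::iota(state->workspace.begin(), state->workspace.end(), 1);
        return std::accumulate(
                state->workspace.begin(), state->workspace.end(), 0);
    }
};

// Calls execute() from several threads on the same primitive instance and
// returns each thread's result.
std::vector<int> run_concurrently(
        const toy_primitive_t &prim, int n_threads, int n) {
    std::vector<int> results(static_cast<std::size_t>(n_threads), 0);
    std::vector<std::thread> threads;
    for (int t = 0; t < n_threads; ++t) {
        threads.emplace_back([&prim, &results, t, n] {
            // Each thread writes only its own slot.
            results[static_cast<std::size_t>(t)] = prim.execute(n);
        });
    }
    for (auto &th : threads) th.join();
    return results;
}
```

Because `execute` never writes to a member, no lock is needed: there is no shared object for another thread to destroy while a handle to it is still in use.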

Hi @mgouicem, thanks for your help. I've changed acl_obj to a local variable.

FYI, I'm somewhat confused about what kind of synchronisation oneDNN primitives are expected to provide and how the primitive cache works. Could you please share some docs/info?

Regarding synchronization, the primitive.execute function should be const-qualified. Because it is not supposed to change any state of the primitive object, no synchronization should be necessary during execution.

We do have synchronization on primitive cache accesses (single writer, multiple readers). This should be transparent to primitive implementations.
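As an illustration of the single-writer, multiple-reader pattern mentioned above, here is a minimal C++17 sketch. It is not oneDNN's actual cache implementation; `toy_impl_t`, `toy_cache_t`, and `get_or_create` are hypothetical names:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <unordered_map>

// Hypothetical cached object; stands in for a primitive implementation.
struct toy_impl_t {
    std::string key;
};

// Minimal single-writer / multiple-reader cache.
class toy_cache_t {
public:
    std::shared_ptr<const toy_impl_t> get_or_create(const std::string &key) {
        assert(!key.empty());
        {
            // Readers take a shared lock, so lookups can run in parallel.
            std::shared_lock<std::shared_mutex> read_lock(mutex_);
            auto it = map_.find(key);
            if (it != map_.end()) return it->second;
        }
        // A miss takes the exclusive lock; re-check because another thread
        // may have inserted the entry between the two locks.
        std::unique_lock<std::shared_mutex> write_lock(mutex_);
        auto it = map_.find(key);
        if (it != map_.end()) return it->second;
        auto impl = std::make_shared<const toy_impl_t>(toy_impl_t {key});
        map_.emplace(key, impl);
        return impl;
    }

    std::size_t size() const {
        std::shared_lock<std::shared_mutex> read_lock(mutex_);
        return map_.size();
    }

private:
    mutable std::shared_mutex mutex_;
    std::unordered_map<std::string, std::shared_ptr<const toy_impl_t>> map_;
};
```

The cache hands out pointers to const objects, so callers that share a cached entry can only read it. That is why any mutable state reachable from the cached object has to be avoided or made per-call.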

Regarding the scratchpad: when user-managed, the scratchpad is just another output from the primitive's perspective (so it is the user's responsibility to sync when needed). When library-managed, it is trickier, as it is not thread-safe by default. You can find more info on the scratchpad page of the developer guide.

thanks mgouicem, that's really helpful!

src/cpu/aarch64/acl_winograd_convolution.cpp

theComputeKid

Need to make it thread-safe.

…v and winograd conv without lock. Signed-off-by: Ye Tao <ye.tao@arm.com> Change-Id: Ifb30292a8bfc5219c44515eb4d29b277a0f0b24a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/oncpuml/oneDNN/+/680780 Tested-by: svc_mongoosetron <svc_mongoosetron@arm.com> Reviewed-by: Hamza Butt <hamza.butt@arm.com> IP-Review: Hamza Butt <hamza.butt@arm.com>

cpu: aarch64: hot fix for segfault in cached winograd gradient convol…

bd9f64b

…ution primitive Signed-off-by: Ye Tao <ye.tao@arm.com>

taoye9 requested a review from a team as a code owner October 4, 2024 18:09

theComputeKid changed the title ~~cpu: aarch64: hot fix for segfault in cached winograd gradient convol…~~ cpu: aarch64: hot fix for segfault in cached winograd Oct 4, 2024

theComputeKid approved these changes Oct 4, 2024

mgouicem reviewed Oct 4, 2024

src/cpu/aarch64/acl_winograd_convolution.cpp (outdated, resolved)

theComputeKid requested changes Oct 5, 2024

cpu: aarch64: hot fix for aux tensor management of stateless gemm and…

395994a

… winograd conv Signed-off-by: Ye Tao <ye.tao@arm.com>

theComputeKid self-requested a review October 7, 2024 14:47

github-actions bot added platform:cpu-aarch64 Codeowner: @oneapi-src/onednn-cpu-aarch64 backport labels Oct 8, 2024

mgouicem approved these changes Oct 9, 2024

theComputeKid approved these changes Oct 9, 2024

avmanerikar merged commit 3b8f5cd into uxlfoundation:rls-v3.6 Oct 9, 2024
23 checks passed

vpirogov added this to the v3.6 milestone Dec 11, 2024

theComputeKid left a comment

mgouicem Oct 4, 2024

taoye9 Oct 7, 2024 (edited)

mgouicem Oct 7, 2024

taoye9 Oct 8, 2024

mgouicem Oct 9, 2024

taoye9 Oct 10, 2024

theComputeKid left a comment
