[RFCs] Add Flow Graph try_put_and_wait RFC #1513
Conversation
Co-authored-by: Aleksei Fedotov <aleksei.fedotov@intel.com>
Co-authored-by: Alexandra <alexandra.epanchinzeva@intel.com>
Made suggested wording changes. Co-authored-by: Alexandra <alexandra.epanchinzeva@intel.com> Co-authored-by: Aleksei Fedotov <aleksei.fedotov@intel.com>
Otherwise, if the concurrency limit of the node is reached, both the message and the associated metainformation would be rejected, and the predecessor that called ``try_put_task`` is responsible for buffering both of them.
If the predecessor is not a buffering node, both the message and the metainfo would be lost.
You should say what "lost" means in this context. How does it affect waiting? Does a lost message/metainfo properly decrement the count?
Updated the description
* Multi-output nodes support should be described and implemented
* Feedback from the customers should be received
* More multithreaded tests should be implemented for the existing functionality
* The corresponding oneAPI specification update should be done
Should we add a `clear()` function to all buffering nodes?
Added to the list of open questions, together with the safety guarantees for this method.
Co-authored-by: Alexey Kukanov <alexey.kukanov@intel.com> Co-authored-by: Mike Voss <michaelj.voss@intel.com>
Each input port of the `join_node` should support a queue for both values and the associated metainformation. Once all of the input ports contain a value, the values should be combined into a single tuple output, and the metainformation objects should be combined into a single metainfo using `metainfo1.merge(metainfo2)`, associated with the tuple, and submitted to successors.
Since the merged metainformation is associated with a tuple of messages, couldn't we additionally store the indexes of the input ports? That would allow split nodes to be smarter and split the metainfo as well. Or would it be too risky to assume that nodes between join and split do not reshuffle the tuple? Tagging @vossmjp as well.
Yes, as far as I remember the risk of changing the tuple between join and split was the main reason why we did not make any splitting of the metainfo.
* Should a ``clear()`` member function be added to the non-rejecting ``join_node`` to handle the case when we don't have the input present on each input port?
* Concurrent safety guarantees for ``clear()`` should be defined (e.g. is it safe to clear the buffer while another thread tries to insert an item?)
* Feedback from the customers should be received
* More multithreaded tests should be implemented for the existing functionality
An idea for an example/test: a version of dining philosophers where an external thread would use try_put_and_wait()
for one of them (perhaps the fastest thinker :))
I think it is also a good example that shows the necessity of some trait for buffering nodes that would allow ignoring the metainformation. If we consider the implementation from our examples but without using the `multifunction_node`, the chopstick is always present in the graph. If we want to use `try_put_and_wait` for one of the philosophers, at the point when both chopsticks are acquired, we have the tuple of the signal and two chopsticks, associated with the metainfo from the signal.
When the chopstick is returned to its buffer, it would carry the metainfo from the signal, and the chopstick would be considered part of the computations needed by `try_put_and_wait`.
In this case, the buffer for chopsticks can be marked with a special trait to ignore the metainformation.
I think the same problem occurs in every graph that uses token passing.
Alternatively, it can be handled by the `multifunction_node` by explicitly passing the metainfo only to the "keep thinking" port but not to the "return chopstick" ports.
The idea to keep a tuple index in the metainfo (#1513 (comment)) could help as well.
* The item associated with the computations is taken from the buffering node or from the internal buffer.
* When the desired number of signals from the predecessors has been received by the ``continue_node``. This case is equivalent to retrieving the set of buffered ``message_metainfo`` objects received previously from each predecessor.
For these cases, how is it ensured that the reference counters are not decreased prematurely?
`continue_node` creates a copy of the stored metainfo merged from each of the predecessors and creates a task for executing the body that holds an additional reference on the counter (similarly to any task with stored metainfo). Only after spawning the task are the reference counters on the copy decreased.
I have added a note that the reference counter is decreased after spawning a task.
It may buffer both ``t`` and ``metainfo`` or broadcast the result and the ``info`` to the successors of the node.
The existing API ``try_put_task(const T& t)`` can reuse the new one with an empty metainfo object if an empty metainfo is preserved as a lightweight structure.
I would instead say "if that does not incur a noticeable overhead for the normal use of flow graphs."
Applied
The difference is that for lightweight nodes, no tasks are created and spawned in most cases, and the node body is executed by the calling thread. Since there are no tasks, the calling thread broadcasts the output and the metainformation to the successors after completing the function.

### ``continue_node``
I am afraid that the overhead of metainfo for dependency graphs can be big with a straightforward implementation. Imagine a 2D wavefront graph started with `try_put_and_wait`. Each node that receives the metainfo will send two copies of it, one for each successor. Each node in the middle of the graph will receive two piles of metainfo from its predecessors, merge them (doubling the size of the pile), and send the result to its two successors. This is an exponential growth pattern.
The benefit of single-message wait for continue nodes is very doubtful, as dependency graphs are not supposed to handle several independent message flows.
I would reconsider the implementation and/or applicability of this feature for continue nodes.
Added a separate section for this issue.