Add checkpoint_during arg #4169
nfcampos commented Apr 3, 2025
- This adds a new mode of execution, checkpoint_during=False, in which only the last checkpoint is saved
- The last checkpoint is saved no matter how the agent run terminates (success, error, interrupt, etc.)
- This cuts down on CPU time spent on checkpointing without losing any resilience benefits, since individual task writes are still saved
- If an error occurs and the run is retried, any tasks that previously completed successfully are skipped (as they are currently)
- checkpoint_during=True is useful when you want to time-travel to inner steps of a run
- The default value remains the current behavior, i.e. checkpoint_during=True
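The behavior described in the list above can be illustrated with a toy model. This is a sketch only, not LangGraph's implementation: `ToyRunner`, `task_writes`, `executed`, and the step functions are hypothetical names invented for illustration, and the real checkpointer stores considerably more than a plain dict. It assumes that per-task writes are recorded regardless of the flag and that the final checkpoint is written on any exit path.

```python
from typing import Callable

Step = Callable[[dict], dict]


class ToyRunner:
    """Toy model of the two checkpointing modes (hypothetical, not LangGraph code)."""

    def __init__(self) -> None:
        self.checkpoints: list[dict] = []        # full-state snapshots
        self.task_writes: dict[str, dict] = {}   # per-task results, always saved
        self.executed: list[str] = []            # tasks that actually ran

    def run(self, steps: list[tuple[str, Step]], state: dict,
            checkpoint_during: bool = True) -> dict:
        latest = dict(state)
        try:
            for name, step in steps:
                if name in self.task_writes:
                    # Completed in an earlier attempt: reuse the saved write.
                    update = self.task_writes[name]
                else:
                    update = step(dict(latest))
                    self.task_writes[name] = update
                    self.executed.append(name)
                latest = {**latest, **update}
                if checkpoint_during:
                    # Snapshot after every step (allows time-travel to inner steps).
                    self.checkpoints.append(dict(latest))
        finally:
            if not checkpoint_during:
                # Single snapshot on any exit path: success or error.
                self.checkpoints.append(dict(latest))
        return latest


calls = {"step_2": 0}


def step_1(state: dict) -> dict:
    return {"a": 1}


def step_2(state: dict) -> dict:
    # Fails on the first attempt only, to simulate a transient error.
    calls["step_2"] += 1
    if calls["step_2"] == 1:
        raise RuntimeError("transient failure")
    return {"b": state["a"] + 1}


steps = [("step_1", step_1), ("step_2", step_2)]

runner = ToyRunner()
try:
    runner.run(steps, {}, checkpoint_during=False)
except RuntimeError:
    pass  # the last checkpoint was still saved in the finally block

# Retry: step_1 is skipped because its write was saved during the first attempt.
result = runner.run(steps, {}, checkpoint_during=False)
```

In this toy, each attempt with `checkpoint_during=False` leaves exactly one snapshot, while `checkpoint_during=True` leaves one per completed step. That is the trade-off the description refers to: fewer checkpoint writes versus the ability to inspect intermediate steps.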
assert (await graph.aget_state(thread)).next == ("step_2",)

await graph.ainvoke(None, thread, debug=True)
ooc: why no longer set debug=True?
Because that's just a holdover from someone debugging this test and forgetting to remove it before
Do we have a good test that's run with checkpoint_during=False for the following?
- Resuming from an interrupt within a subgraph?
- A Send to a node that is then interrupted?
@@ -5348,6 +5433,131 @@ async def side(state: State):
assert times_called == 1


@pytest.mark.parametrize("checkpoint_during", [True, False])
these tests existed only for sync case before, figured was worth adding here