Add checkpoint_during arg #4169

Merged
merged 13 commits into from
Apr 8, 2025

Contributor

@nfcampos nfcampos commented Apr 3, 2025

  • This adds a new execution mode, checkpoint_during=False, in which only the last checkpoint of a run is saved
  • The last checkpoint is saved no matter how the agent run terminates (success, error, interrupt, etc.)
  • This cuts down on CPU time spent on checkpointing without losing any resilience benefits, since individual task writes are still saved
  • If an error occurs and the run is retried, any tasks that completed successfully before are skipped (as they are today)
  • checkpoint_during=True is useful when you want to time-travel to inner steps of a run
  • The default value remains the current behavior, i.e. checkpoint_during=True
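
To make the semantics described above concrete, here is a minimal toy model in plain Python. It is only a sketch and does not use LangGraph: `ToySaver`, `run`, `step_1`, and `step_2` are hypothetical names invented for illustration, and the real implementation in this PR may differ in every detail. It assumes a run is a simple sequence of named steps, and shows the two behaviors the description claims: task writes are saved in both modes, and with checkpoint_during=False a single checkpoint is written when the run exits, even on error.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ToySaver:
    """Hypothetical in-memory store (not LangGraph's checkpointer API)."""
    checkpoints: list = field(default_factory=list)  # full state snapshots
    writes: dict = field(default_factory=dict)       # per-task results


def run(steps: list, saver: ToySaver, checkpoint_during: bool = True) -> dict:
    """Run (name, fn) steps in order, skipping tasks whose write is already saved."""
    state: dict = {}
    try:
        for name, fn in steps:
            if name in saver.writes:
                # Task completed in an earlier attempt: reuse its saved write.
                state[name] = saver.writes[name]
            else:
                result = fn(state)
                saver.writes[name] = result  # task writes are saved in both modes
                state[name] = result
            if checkpoint_during:
                # Assumed "current behavior": snapshot after every step.
                saver.checkpoints.append(dict(state))
    finally:
        if not checkpoint_during:
            # New mode: one snapshot on exit, however the run terminated.
            saver.checkpoints.append(dict(state))
    return state


calls = {"step_1": 0, "step_2": 0}


def step_1(state: dict) -> int:
    calls["step_1"] += 1
    return 1


def step_2(state: dict) -> int:
    calls["step_2"] += 1
    if calls["step_2"] == 1:
        raise RuntimeError("transient failure")  # fails on the first attempt only
    return state["step_1"] + 1


saver = ToySaver()
steps = [("step_1", step_1), ("step_2", step_2)]

try:
    run(steps, saver, checkpoint_during=False)
except RuntimeError:
    pass  # first attempt fails inside step_2; a checkpoint is still saved

# Retry: step_1 is skipped because its write was saved, step_2 runs again.
final = run(steps, saver, checkpoint_during=False)
```

In this toy, the retry calls `step_1` zero additional times, and `saver.checkpoints` ends up holding one snapshot per attempt rather than one per step. The intermediate per-step snapshots that time travel relies on exist only when checkpoint_during=True.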


assert (await graph.aget_state(thread)).next == ("step_2",)

await graph.ainvoke(None, thread, debug=True)
Contributor

ooc: why is debug=True no longer set?

Contributor Author

Because that's just a holdover from someone debugging this test and forgetting to remove it afterwards

@nfcampos nfcampos marked this pull request as ready for review April 7, 2025 23:48
Contributor

@hinthornw hinthornw left a comment

Do we have a good test that runs with checkpoint_during=False for the following?

  • Resuming from an interrupt within a subgraph?
  • Or one where you Send to a node that is interrupted?

@@ -5348,6 +5433,131 @@ async def side(state: State):
assert times_called == 1


@pytest.mark.parametrize("checkpoint_during", [True, False])
Contributor Author

These tests existed only for the sync case before; figured it was worth adding them here

@nfcampos nfcampos merged commit 4fb2aea into main Apr 8, 2025
61 checks passed
@nfcampos nfcampos deleted the nc/3apr/checkpoint-during branch April 8, 2025 22:03