[SPIKE] Persisted Job Queue or Retry Logic #49

Open
frrist opened this issue Mar 13, 2025 · 1 comment · May be fixed by #51

frrist commented Mar 13, 2025

Problem Statement:

When aggregating pieces or submitting aggregations, failures may occur due to software-related issues or unexpected storage node restarts.

Expected Behavior:

Jobs in the "job queue" must be retried until they succeed. If a job fails, it should either:

  • Be retrieved from its respective store and reprocessed, or
  • Persist in the queue (e.g., saved to disk) to ensure it remains available for retrying.

At present, when a job fails or the node restarts, the contents of the queue are lost.

Ideas

One approach is to implement or adopt a job queuing library with persistence (e.g. https://github.com/maragudk/goqite).
Another option is to persist jobs to a store (we already have one) in a pending state and ensure they are completed across restarts and failures.
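
To make the two ideas concrete, here is a minimal sketch of a queue that keeps each job in a SQLite table until it is explicitly acknowledged. It is written in Python with only the standard library for brevity and is not the project's implementation. The names `PersistedQueue`, `run_once`, the `jobs` table layout, and the `visibility_timeout` parameter are all assumptions made for illustration, loosely following the visibility-timeout pattern that SQLite-backed queues commonly use. A job that fails, or whose worker dies mid-run, stays in the table and becomes receivable again once its timeout has passed.

```python
import json
import sqlite3
import time


class PersistedQueue:
    """Hypothetical SQLite-backed job queue; all names are illustrative."""

    def __init__(self, path, visibility_timeout=30.0):
        self.db = sqlite3.connect(path)
        self.visibility_timeout = visibility_timeout
        self.db.execute(
            """CREATE TABLE IF NOT EXISTS jobs (
                   id INTEGER PRIMARY KEY AUTOINCREMENT,
                   body TEXT NOT NULL,
                   visible_at REAL NOT NULL,
                   attempts INTEGER NOT NULL DEFAULT 0
               )"""
        )
        self.db.commit()

    def enqueue(self, payload):
        # The job is on disk once this transaction commits.
        with self.db:
            self.db.execute(
                "INSERT INTO jobs (body, visible_at) VALUES (?, ?)",
                (json.dumps(payload), time.time()),
            )

    def receive(self):
        # Claim the oldest visible job by hiding it for the timeout window.
        # Assumes a single consumer process; this claim is not safe across
        # several concurrent processes.
        now = time.time()
        with self.db:
            row = self.db.execute(
                "SELECT id, body FROM jobs WHERE visible_at <= ? "
                "ORDER BY id LIMIT 1",
                (now,),
            ).fetchone()
            if row is None:
                return None
            self.db.execute(
                "UPDATE jobs SET visible_at = ?, attempts = attempts + 1 "
                "WHERE id = ?",
                (now + self.visibility_timeout, row[0]),
            )
        return row[0], json.loads(row[1])

    def ack(self, job_id):
        # Only a successful job is deleted from the table.
        with self.db:
            self.db.execute("DELETE FROM jobs WHERE id = ?", (job_id,))

    def pending(self):
        return self.db.execute("SELECT COUNT(*) FROM jobs").fetchone()[0]


def run_once(queue, handler):
    """Try one job; return True only if a job was handled and acknowledged."""
    job = queue.receive()
    if job is None:
        return False
    job_id, payload = job
    try:
        handler(payload)
    except Exception:
        # No ack: the row stays in the table and is retried after the timeout.
        return False
    queue.ack(job_id)
    return True
```

A worker would call `run_once` in a loop. Because rows are removed only by `ack`, reopening the same database file after a restart picks up whatever was still pending. The second idea in the list above (a pending state in the existing store) amounts to the same shape, with the store's own records playing the role of the `jobs` table.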

frrist self-assigned this Mar 13, 2025
frrist moved this to Backlog in Storacha Project Planning Mar 13, 2025
frrist added a commit that referenced this issue Mar 15, 2025
- closes #49
- thank you OS contributor: https://github.com/maragudk/goqite/
  - Licence and credit at top of respective files
  - modified from source to allow generics
frrist linked a pull request Mar 15, 2025 that will close this issue
frrist moved this from Backlog to In Progress in Storacha Project Planning Mar 17, 2025

travis commented Mar 17, 2025

@hannahhoward needs to review

frrist added a commit that referenced this issue Mar 18, 2025
- closes #49
- thank you OS contributor: https://github.com/maragudk/goqite/
  - Licence and credit at top of respective files
  - modified from source to allow generics
frrist added a commit that referenced this issue Apr 9, 2025
- closes #49
- thank you OS contributor: https://github.com/maragudk/goqite/
  - Licence and credit at top of respective files
  - modified from source to allow generics