Use itertools batch to get long job lists #3815
base: master
Conversation
parsl/providers/slurm/slurm.py
cmd_timeout : int (Default = 10)
    Number of seconds to wait for slurm commands to finish. For schedulers with many
    jobs this may need to be increased to wait longer for scheduler information.
status_batch_size: ine (Default = 50)
int (the type here should be int, not ine)
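For context, this is roughly how these two provider options would be set from a config, assuming the status_batch_size keyword this PR proposes; the partition name and values below are illustrative, not taken from the PR:

```python
from parsl.providers import SlurmProvider

# cmd_timeout is an existing SlurmProvider option; status_batch_size is the
# option proposed in this PR, so it is only available on the PR branch.
provider = SlurmProvider(
    partition="debug",       # illustrative partition name
    cmd_timeout=120,         # give a busy scheduler more time to respond
    status_batch_size=50,    # query job status in groups of 50 job IDs
)
```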
don't worry about the globus compute tests not passing here - they're intended to be broken (!)
it would be better if you do code reformatting in a separate PR that claims to not change behaviour -- the diff for this PR is mostly that and nothing to do with the PR topic, and it hurts me to go look at this stuff in
@AreWeDreaming has been testing this batching with Slurm on Perlmutter and it is working, so I'm marking this as ready for review.
Description
When there are many jobs being tracked by Parsl, sacct can become unresponsive and time out. A user's workflow was experiencing this while running through Globus Compute on Perlmutter.
Changed Behaviour
Status calls to Slurm should now be issued in batches of jobs rather than as a single call listing every tracked job.
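A minimal sketch of the batching idea, not the PR's actual code: the batched helper stands in for itertools.batched (Python 3.12+ only), and fetch_job_states with its sacct command line is illustrative, whereas the real provider builds and runs its scheduler commands through its own execution machinery.

```python
import subprocess
from itertools import islice


def batched(iterable, n):
    """Yield tuples of at most n items (stand-in for itertools.batched on older Pythons)."""
    it = iter(iterable)
    while chunk := tuple(islice(it, n)):
        yield chunk


def fetch_job_states(job_ids, status_batch_size=50, cmd_timeout=10):
    """Query sacct in batches so no single call has to list every tracked job.

    job_ids is an iterable of Slurm job IDs as strings.
    """
    states = {}
    for batch in batched(job_ids, status_batch_size):
        cmd = ["sacct", "-X", "--noheader", "--format=JobID,State",
               "--jobs", ",".join(batch)]
        out = subprocess.run(cmd, capture_output=True, text=True,
                             timeout=cmd_timeout, check=True).stdout
        for line in out.splitlines():
            parts = line.split()
            if len(parts) >= 2:
                states[parts[0]] = parts[1]
    return states
```

Batching keeps each sacct invocation's argument list and output small, so a busy scheduler is less likely to hit cmd_timeout.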
Fixes
Fixes #3814
Type of change
Choose which options apply, and delete the ones which do not apply.