[BUG] offload optimizer states in zero3 #6833

Closed
Hanqer opened this issue Dec 7, 2024 · 2 comments
Labels
bug Something isn't working training

Hanqer commented Dec 7, 2024

Describe the bug
https://github.com/microsoft/DeepSpeed/blob/master/deepspeed/runtime/engine.py#L3736 In the ZeRO-3 engine's state-offloading code, `self.zero_offload_param()` returns a config object, not a bool, so why does the assertion test it with `not`?
`assert not self.zero_offload_param()` will always raise an error when ds_config is set as:

"offload_optimizer": {
    "device": "none"
}
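
The truthiness problem described above can be reproduced without DeepSpeed. The sketch below uses a hypothetical `OffloadConfig` dataclass as a stand-in for the config object (it is not DeepSpeed's class) and a hypothetical `naive_check` helper that mirrors the `not ...` test. It assumes the config object defines no custom `__bool__` or `__len__`, in which case any instance is truthy in Python regardless of its `device` value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OffloadConfig:
    """Hypothetical stand-in for an offload config section (not DeepSpeed's class)."""
    device: str = "none"


def naive_check(cfg: Optional[OffloadConfig]) -> bool:
    """Mirror `not cfg`: True means the check treats offloading as disabled."""
    return not cfg


# No config section at all: None is falsy, so the check passes.
print(naive_check(None))

# Section present with device "none": the object is still truthy,
# so an `assert not cfg` style check would fail here.
print(naive_check(OffloadConfig(device="none")))
```

Under these assumptions, the second call reports offloading as enabled even though the device is `"none"`, which matches the behavior reported here.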

Contributor

tohtana commented Dec 11, 2024

Good catch, thank you @Hanqer!
As far as I can tell from the DeepSpeed engine's behavior, it returns None when offload_param is not given. We also need to cover the case where the offload device is set to none. The same goes for offloading of optimizer states.

I submitted #6855 to address this issue. I would appreciate it if you could give us any feedback.
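
For illustration, one way to cover both cases mentioned above (config missing, or device set to none) is to test the device explicitly instead of relying on the object's truthiness. This is a minimal sketch under assumptions, not the change made in #6855: `OffloadConfig` and `offload_enabled` are hypothetical names, and the device is modeled as a plain string.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OffloadConfig:
    """Hypothetical stand-in for an offload config section (not DeepSpeed's class)."""
    device: str = "none"


def offload_enabled(cfg: Optional[OffloadConfig]) -> bool:
    """Return True only if a config exists and names a real offload device."""
    return cfg is not None and cfg.device != "none"


# Both "section missing" and "device is none" count as offloading disabled.
assert not offload_enabled(None)
assert not offload_enabled(OffloadConfig(device="none"))

# A concrete device such as "cpu" counts as offloading enabled.
assert offload_enabled(OffloadConfig(device="cpu"))
```

An assertion written as `assert not offload_enabled(cfg)` would then pass for the config shown in the report and fail only when offloading is actually configured.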

loadams added a commit that referenced this issue Dec 16, 2024
This PR fixes the assertions in the `offload_states` method mentioned in #6833.

Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Contributor

tohtana commented Dec 20, 2024

Closing as #6855 resolved this.

tohtana closed this as completed Dec 20, 2024