fix auto_gptq layer error device #2134

Draft
wants to merge 1 commit into
base: main
from

ZX-ModelCloud
Contributor

Fixes the device error mentioned by MekkCyber: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0!

test_quantization
RUN_SLOW=1 pytest tests/gptq/test_quantization.py

  • cpu tests
  • cuda tests

@Qubitium
Contributor

@ZX-ModelCloud Please move this PR to draft; it may not be needed. The failure is actually caused by a deficiency in AutoGPTQ: it cannot pass the CPU tests because optimum force-moves the layer to the GPU.

GPTQModel has no such restriction. We may be able to bypass this by disabling the CPU-only tests for AutoGPTQ.
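
For illustration, disabling the CPU-only tests for one backend could look roughly like the sketch below. This is not the change made in this PR and not the layout of tests/gptq/test_quantization.py: the `CPU_CAPABLE` table, the `requires_cpu_support` decorator, and both test classes are hypothetical names, and the test bodies are placeholders. It uses only the standard `unittest` skip mechanism.

```python
import unittest

# Hypothetical table: which quantization backends can run the CPU-only tests.
# Per the discussion above, AutoGPTQ cannot, while GPTQModel can.
CPU_CAPABLE = {"auto_gptq": False, "gptqmodel": True}


def requires_cpu_support(backend: str):
    """Skip the decorated test when `backend` cannot run on CPU only."""
    return unittest.skipUnless(
        CPU_CAPABLE.get(backend, False),
        f"{backend} cannot run CPU-only quantization tests",
    )


class AutoGPTQCpuTest(unittest.TestCase):
    @requires_cpu_support("auto_gptq")
    def test_quantize_on_cpu(self):
        # Never reached: the decorator skips this test for AutoGPTQ.
        self.fail("should be skipped for AutoGPTQ")


class GPTQModelCpuTest(unittest.TestCase):
    @requires_cpu_support("gptqmodel")
    def test_quantize_on_cpu(self):
        # Placeholder body; a real test would quantize a model on CPU.
        self.assertTrue(True)


def run_suite() -> unittest.TestResult:
    """Run both hypothetical test cases and return the collected result."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    suite.addTests(loader.loadTestsFromTestCase(AutoGPTQCpuTest))
    suite.addTests(loader.loadTestsFromTestCase(GPTQModelCpuTest))
    result = unittest.TestResult()
    suite.run(result)
    return result
```

With this shape, the AutoGPTQ CPU test is reported as skipped rather than failing, while the GPTQModel CPU test still runs.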

@ZX-ModelCloud ZX-ModelCloud marked this pull request as draft December 21, 2024 07:35
This PR has been marked as stale because it has been open for 90 days with no activity. This thread will be automatically closed in 30 days if no further activity occurs.

@github-actions github-actions bot added the Stale label Mar 22, 2025
2 participants