
[PT FE] Add ipex to GPTQ supported quantization types #30042


Open
notsyncing wants to merge 1 commit into master

Conversation

notsyncing

Details:

Hello, I'm trying to convert a Qwen2.5 model loaded with gptqmodel, and it complains:

INFO  ENV: Auto setting PYTORCH_CUDA_ALLOC_CONF='expandable_segments:True' for memory saving.                                                       
INFO  ENV: Auto setting CUDA_DEVICE_ORDER=PCI_BUS_ID for correctness.                                                                               
`low_cpu_mem_usage` was None, now default to True since model is quantized.
Sliding Window Attention is enabled but not implemented for `sdpa`; unexpected results may be encountered.
INFO   Kernel: Auto-selection: adding candidate `IPEXQuantLinear`                                                                                   
`loss_type=None` was set in the config but it is unrecognised.Using the default loss: `ForCausalLMLoss`.
2025-04-09 20:31:41,266 - openvino.frontend.pytorch.ts_decoder - WARNING - Failed patching of AutoGPTQ model. Error message:
Tracing of the model will likely be unsuccessful or incorrect
Traceback (most recent call last):
  File "/var/home/sfc/Projects/azarrot-py312/.venv/lib/python3.12/site-packages/openvino/frontend/pytorch/ts_decoder.py", line 150, in _get_scripted_model
    quantized.patch_quantized(pt_module)
  File "/var/home/sfc/Projects/azarrot-py312/.venv/lib/python3.12/site-packages/openvino/frontend/pytorch/quantized.py", line 57, in patch_quantized
    gptq.patch_model(model)  # type: ignore
    ^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/home/sfc/Projects/azarrot-py312/.venv/lib/python3.12/site-packages/openvino/frontend/pytorch/gptq.py", line 120, in patch_model
    raise ValueError(f'Unsupported QUANT_TYPE == {m.QUANT_TYPE} is discovered for '
ValueError: Unsupported QUANT_TYPE == ipex is discovered for AutoGPTQ model, only the following types are supported: ['triton', 'exllama', 'exllamav2', 'cuda-old']
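For context, a minimal reproducer sketch of the conversion path that hits this check. The model id, loading call, and example input are assumptions for illustration, not taken from this PR:

```python
# Hypothetical reproducer (sketch): load a GPTQ-quantized Qwen2.5 checkpoint with
# transformers (gptqmodel backend) and pass it to openvino.convert_model, which goes
# through the PyTorch frontend's ts_decoder and gptq patching shown in the traceback.
import torch
import openvino as ov
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-7B-Instruct-GPTQ-Int4"  # assumed checkpoint, for illustration only
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float32)
tokenizer = AutoTokenizer.from_pretrained(model_id)

example = tokenizer("Hello", return_tensors="pt")
# When the quantized linears report QUANT_TYPE == "ipex", gptq.patch_model raises
# the ValueError above, which ts_decoder reports as the "Failed patching" warning.
ov_model = ov.convert_model(model, example_input={"input_ids": example["input_ids"]})
```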

It turns out that gptqmodel uses IPEXQuantLinear, which has QUANT_TYPE = "ipex", so the frontend rejects it.
I added "ipex" to supported_quant_types in src/bindings/python/src/openvino/frontend/pytorch/gptq.py, and it works.
Tested on a Qwen 2.5 model.
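For clarity, a sketch of the change described above; the exact surrounding code in gptq.py may differ slightly:

```python
# src/bindings/python/src/openvino/frontend/pytorch/gptq.py (sketch):
# patch_model() compares each quantized module's QUANT_TYPE against this list and
# raises the ValueError quoted above otherwise; "ipex" is appended so gptqmodel's
# IPEXQuantLinear modules are patched instead of rejected.
supported_quant_types = ["triton", "exllama", "exllamav2", "cuda-old", "ipex"]
```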

Tickets:

None

notsyncing requested a review from a team as a code owner on April 9, 2025 12:44
github-actions bot added the category: Python API (OpenVINO Python bindings) and category: PyTorch FE (OpenVINO PyTorch Frontend) labels on Apr 9, 2025
sys-openvino-ci added the ExternalPR (External contributor) label on Apr 9, 2025
mlukasze requested review from a team, rkazants and cavusmustafa, and removed the request for a team, on April 9, 2025 13:08
mvafin (Contributor) commented on Apr 9, 2025

@notsyncing Did you verify the accuracy of the model?

notsyncing (Author) replied:

> @notsyncing Did you verify the accuracy of the model?

Not yet, just some conversations with it. Is there any guide to do this?
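One illustrative way to sanity-check this, assuming `model`, `tokenizer`, and `ov_model` from the reproducer sketch above; this is not an official OpenVINO accuracy-validation procedure:

```python
# Smoke test: compare logits of the original PyTorch model and the converted
# OpenVINO model on the same prompt. Quantization plus conversion will not match
# bit-exactly, so only look for a reasonably small difference.
import numpy as np
import torch
import openvino as ov

compiled = ov.compile_model(ov_model, "CPU")

inputs = tokenizer("The quick brown fox", return_tensors="pt")
with torch.no_grad():
    ref_logits = model(input_ids=inputs["input_ids"]).logits.numpy()

# Output index 0 is assumed to be the logits tensor of the converted model.
ov_logits = compiled({"input_ids": inputs["input_ids"].numpy()})[0]
print("max abs logit diff:", np.abs(ref_logits - ov_logits).max())
```

For a more thorough check, a perplexity or downstream-task comparison between the original and converted models (e.g. with lm-evaluation-harness) would be more informative.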

Labels
category: Python API (OpenVINO Python bindings), category: PyTorch FE (OpenVINO PyTorch Frontend), ExternalPR (External contributor)
3 participants