Are there any examples of using the offload feature in GPT/BLOOM/OPT inference? #209

YJHMITWEB · 2023-03-17T18:21:41Z

Hi, currently the only example of offload is a naive one in the linear example; the other projects, such as OPT, BLOOM, and GPT, have no offload option.
How can offload be applied to large-model inference, and are there any examples?

binmakeswell · 2023-03-20T06:00:40Z

Hi @YJHMITWEB, this is technically feasible, but it would cause a sharp decline in inference speed. Its practical significance is therefore limited, and we do not currently consider it a high priority.
You are welcome to submit a proposal or PR to contribute. Thanks.
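
For a rough sense of why offload slows inference so sharply, here is a back-of-the-envelope sketch. It is not taken from this project. It assumes that offloaded weights must be copied from CPU memory to the GPU on every decoding step and that the copy does not overlap with compute. The function name and every number below (model size, link bandwidth, compute time) are hypothetical, chosen only for illustration.

```python
def per_token_latency_s(param_bytes, link_bytes_per_s, compute_s, offload_fraction):
    """Estimate the latency of one decoding step under a simple cost model.

    Assumption: the offloaded share of the weights is transferred once per
    step over the host-to-device link, with no overlap between transfer
    and compute.
    """
    if not 0.0 <= offload_fraction <= 1.0:
        raise ValueError("offload_fraction must be between 0 and 1")
    transfer_s = param_bytes * offload_fraction / link_bytes_per_s
    return compute_s + transfer_s


# Hypothetical values, for illustration only.
PARAM_BYTES = 20e9        # e.g. 10B parameters stored in fp16
LINK_BYTES_PER_S = 20e9   # assumed effective host-to-device bandwidth
COMPUTE_S = 0.05          # assumed GPU compute time per token

baseline = per_token_latency_s(PARAM_BYTES, LINK_BYTES_PER_S, COMPUTE_S, 0.0)
offloaded = per_token_latency_s(PARAM_BYTES, LINK_BYTES_PER_S, COMPUTE_S, 1.0)
slowdown = offloaded / baseline

print(f"no offload:   {baseline:.2f} s/token")
print(f"full offload: {offloaded:.2f} s/token ({slowdown:.0f}x slower)")
```

Under these assumed numbers the transfer time dominates the step. Real systems can hide part of it by prefetching the next layer during compute, so treat this as a simplified upper bound on the penalty rather than a measurement.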
