The model to consider.
The model is patrickjohncyh/fashion-clip. It is based on the CLIP architecture, which is popular for image-text tasks.
The closest model vllm already supports.
The Transformers CLIPModel implementation.
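For reference, the embeddings in question can already be produced with the Transformers implementation of the same checkpoint; below is a minimal sketch using standard CLIPModel/CLIPProcessor usage (not vLLM-specific), with the model ID and image path taken from the code further down.

import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Load the Transformers implementation of the checkpoint.
model = CLIPModel.from_pretrained("patrickjohncyh/fashion-clip")
processor = CLIPProcessor.from_pretrained("patrickjohncyh/fashion-clip")

# Preprocess one image and compute its projected embedding.
image = Image.open("/Users/pkumari/Downloads/monitor.png")
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    image_embeds = model.get_image_features(**inputs)  # shape: (1, projection_dim)

print(image_embeds.shape)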
What's your difficulty of supporting the model you want?
When I tried evaluating the model with vLLM, I got the error below:
Code
from vllm import LLM
import torch
from PIL import Image
from transformers import CLIPProcessor
# Load the model
llm = LLM(model="patrickjohncyh/fashion-clip")
# Load and preprocess the image
image = Image.open("/Users/pkumari/Downloads/monitor.png")
processor = CLIPProcessor.from_pretrained("patrickjohncyh/fashion-clip")
inputs = processor(images=image, return_tensors="pt")
# Generate embeddings
outputs = llm.generate({
"prompt": "<image>",
"multi_modal_data": {"image": inputs['pixel_values']},
})
# Print the embeddings
print(outputs)
Error
[rank0]: ValueError: CLIPModel has no vLLM implementation and the Transformers implementation is not compatible with vLLM. Try setting VLLM_USE_V1=0.
(search-eval) pkumari@GF4MX6XF00 search-eval % pip list
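The error itself points at one possible workaround: disabling the V1 engine with VLLM_USE_V1=0 so that vLLM falls back to the V0 code path. A minimal sketch of trying that suggestion is below; whether V0 can actually load this model is not verified here, and the variable is safest to set before vllm is imported.

import os
os.environ["VLLM_USE_V1"] = "0"  # equivalently, run as: VLLM_USE_V1=0 python script.py

from vllm import LLM
llm = LLM(model="patrickjohncyh/fashion-clip")

Separately, llm.generate() is vLLM's text-generation entry point and would not return embeddings even for a supported model; pooling/embedding models are queried through llm.encode(). A purely hypothetical sketch of what the requested support might look like from the user side follows; the task="embed" argument, the "<image>" placeholder, and the output field are assumptions, since no CLIP embedding implementation exists in vLLM yet.

from vllm import LLM
from PIL import Image

# Hypothetical: assumes a CLIP embedding implementation were added to vLLM.
llm = LLM(model="patrickjohncyh/fashion-clip", task="embed")
image = Image.open("/Users/pkumari/Downloads/monitor.png")

outputs = llm.encode({
    "prompt": "<image>",
    "multi_modal_data": {"image": image},  # PIL image, as with generate()
})
print(outputs[0].outputs.data)  # pooled embedding; exact field name may vary by version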