LLaVa_mistral models are unrecognized #37087

Open · 2 of 4 tasks

darshpatel1052 opened this issue Mar 28, 2025 · 4 comments

@darshpatel1052
System Info

Issue Title: Support for llava_mistral Model Architecture


Environment Information

  • Transformers Version: 4.51.0.dev0
  • Python Version: 3.10.16
  • OS: Ubuntu 20.04.02
  • PyTorch Version: 2.5.1
  • CUDA Version: 11.8
  • GPU: A4000

Describe the Bug

Reference Code:

# Load model directly
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("microsoft/llava-med-v1.5-mistral-7b")

I am trying to load the microsoft/llava-med-v1.5-mistral-7b model using AutoModelForCausalLM.from_pretrained, but I encounter the following error:

Traceback (most recent call last):
  File "/home/darsh/DC/llava-med-model/train.py", line 3, in <module>
    model = AutoModelForCausalLM.from_pretrained("microsoft/llava-med-v1.5-mistral-7b")
  File "/home/darsh/anaconda3/envs/llava-med/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 531, in from_pretrained
    config, kwargs = AutoConfig.from_pretrained(
  File "/home/darsh/anaconda3/envs/llava-med/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 1115, in from_pretrained
    raise ValueError(
ValueError: The checkpoint you are trying to load has model type `llava_mistral` but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.

You can update Transformers with the command `pip install --upgrade transformers`. If this does not work, and the checkpoint is very new, then there may not be a release version that supports this model yet. In this case, you can get the most up-to-date code by installing Transformers from source with the command `pip install git+https://github.com/huggingface/transformers.git`

Expected Behavior

The model should load successfully using AutoModelForCausalLM.from_pretrained.


Additional Context

  • I have tried upgrading transformers to the latest version using:

    pip install --upgrade transformers
  • I also tried installing the development version of transformers from the source:

    pip install git+https://github.com/huggingface/transformers.git

    However, the issue persists.

  • The model type llava_mistral seems to be unsupported by the current version of transformers. If this architecture is not yet supported, could you provide guidance on when it might be added or how I can manually add support for this model?
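
For reference, transformers exposes a registration API for out-of-tree architectures. A hypothetical sketch, assuming the LlavaMistralConfig and LlavaMistralForCausalLM classes from the LLaVA-Med repository (the import path and class names are assumptions and may differ):

from transformers import AutoConfig, AutoModelForCausalLM

# Hypothetical import: these classes would come from the LLaVA-Med repo;
# the exact module path and class names are assumptions.
from llava.model.language_model.llava_mistral import (
    LlavaMistralConfig,
    LlavaMistralForCausalLM,
)

# Teach the Auto classes about the out-of-tree `llava_mistral` model type.
AutoConfig.register("llava_mistral", LlavaMistralConfig)
AutoModelForCausalLM.register(LlavaMistralConfig, LlavaMistralForCausalLM)

# With the registration in place, the original call resolves.
model = AutoModelForCausalLM.from_pretrained("microsoft/llava-med-v1.5-mistral-7b")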


Request

Please add support for the llava_mistral model architecture in the transformers library or provide instructions on how to proceed with loading this model.


Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

  1. Install the required libraries:
    pip install transformers
  2. Run the following Python script:
    from transformers import AutoModelForCausalLM
    
    model = AutoModelForCausalLM.from_pretrained("microsoft/llava-med-v1.5-mistral-7b")
  3. Observe the error.


@zucchini-nlp
Member

@darshpatel1052 hey!

Yes, the model is not supported by transformers, but you can still use the LLaVA-Med repo for inference.

It seems to me that LLaVA-Med has the same architecture as the llava model we already have in transformers, so supporting it might be as easy as converting the weights. I do not have much bandwidth to work on adding more models, but if you or anyone else wants to work on it, feel free. To add the model, you can start by spotting the diffs between LLaVA-Med and the plain llava model, as I'd prefer to re-use the existing LlavaForConditionalGeneration if there are no changes to the model architecture.
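
For reference, once converted, the checkpoint would load like any other llava checkpoint. A minimal sketch, using the existing converted llava-1.5 checkpoint only to illustrate the target format:

from transformers import AutoProcessor, LlavaForConditionalGeneration

# llava-hf/llava-1.5-7b-hf is the already-converted LLaVA 1.5 checkpoint;
# a converted LLaVA-Med checkpoint would follow the same layout.
model = LlavaForConditionalGeneration.from_pretrained("llava-hf/llava-1.5-7b-hf")
processor = AutoProcessor.from_pretrained("llava-hf/llava-1.5-7b-hf")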

@NielsRogge
Contributor

Yes, see my comment here on how to convert it to the Transformers format.
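
At its core, the conversion is a state-dict key remapping onto the LlavaForConditionalGeneration layout. A rough sketch; the mapping entries below are assumptions modeled on the in-tree llava conversion script and are not verified against LLaVA-Med:

# Hedged sketch: remap original LLaVA-style state-dict keys onto the
# transformers LlavaForConditionalGeneration layout. The entries are
# assumptions, not a verified LLaVA-Med mapping.
KEY_MAPPING = {
    "model.vision_tower.": "",
    "model.mm_projector": "multi_modal_projector",
    "lm_head": "language_model.lm_head",
    "model.model": "language_model.model",
}

def convert_state_dict(old_state_dict):
    new_state_dict = {}
    for key, value in old_state_dict.items():
        # Rewrite each key prefix according to the mapping above.
        for old, new in KEY_MAPPING.items():
            if old in key:
                key = key.replace(old, new)
        new_state_dict[key] = value
    return new_state_dict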

@darshpatel1052
Author

darshpatel1052 commented Mar 30, 2025

@zucchini-nlp Thank you for your response!
Sure, I would be more than happy to help. This would be my first time contributing to open source, so I might need your guidance. I'd also like to ask: if the configs for LLaVA-Med and llava match, I would simply need to add loading functionality for the LLaVA-Med checkpoints, right? And if they do not match, how can I help by adding the supporting functionality?

@zucchini-nlp
Member

Great, thanks! From what I see in the LLaVA-Med repo, they do not match. The llava config should consist of separate text and vision configs, and the model will load the corresponding backbones from the config. The conversion script handles adapting configs, but AFAIR it is only for LLaVA. So if LLaVA-Med has different text/vision backbones, you'll need to adapt it. The same goes for the checkpoint key conversion.
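
For illustration, a llava config in transformers is composed from separate backbone configs. A minimal sketch; the backbone classes and defaults below are illustrative, not LLaVA-Med's actual hyperparameters:

from transformers import CLIPVisionConfig, LlavaConfig, MistralConfig

# Illustrative backbones only; LLaVA-Med's actual vision tower and LM
# hyperparameters would need to be read from its original config.
vision_config = CLIPVisionConfig()
text_config = MistralConfig()
config = LlavaConfig(vision_config=vision_config, text_config=text_config)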

Also, please check whether it is plain llava or a llava-next style model with patches. We'll need to use a different class if the model does patching for images.
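
One rough way to check is sketched below; the image_grid_pinpoints field name is an assumption based on the llava-next config format:

import json

# Heuristic sketch: llava-next style configs carry patch-grid info such
# as `image_grid_pinpoints`; plain llava configs do not.
with open("config.json") as f:
    cfg = json.load(f)

if "image_grid_pinpoints" in cfg:
    print("llava-next style (image patching): use LlavaNextForConditionalGeneration")
else:
    print("plain llava: use LlavaForConditionalGeneration")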
