LLaVa_mistral models are unrecognized #37087

Open · 2 of 4 tasks

darshpatel1052 opened this issue Mar 28, 2025 · 4 comments

@darshpatel1052
System Info

Issue Title: Support for llava_mistral Model Architecture


Environment Information

  • Transformers Version: 4.51.0.dev0
  • Python Version: 3.10.16
  • OS: Ubuntu 20.04.02
  • PyTorch Version: 2.5.1
  • CUDA Version: 11.8
  • GPU: A4000

Describe the Bug

Reference Code:

# Load model directly
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("microsoft/llava-med-v1.5-mistral-7b")

I am trying to load the microsoft/llava-med-v1.5-mistral-7b model using AutoModelForCausalLM.from_pretrained, but I encounter the following error:

Traceback (most recent call last):
  File "/home/darsh/DC/llava-med-model/train.py", line 3, in <module>
    model = AutoModelForCausalLM.from_pretrained("microsoft/llava-med-v1.5-mistral-7b")
  File "/home/darsh/anaconda3/envs/llava-med/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 531, in from_pretrained
    config, kwargs = AutoConfig.from_pretrained(
  File "/home/darsh/anaconda3/envs/llava-med/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 1115, in from_pretrained
    raise ValueError(
ValueError: The checkpoint you are trying to load has model type `llava_mistral` but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.

You can update Transformers with the command `pip install --upgrade transformers`. If this does not work, and the checkpoint is very new, then there may not be a release version that supports this model yet. In this case, you can get the most up-to-date code by installing Transformers from source with the command `pip install git+https://github.com/huggingface/transformers.git`

Expected Behavior

The model should load successfully using AutoModelForCausalLM.from_pretrained.


Additional Context

  • I have tried upgrading transformers to the latest version using:

    pip install --upgrade transformers
  • I also tried installing the development version of transformers from the source:

    pip install git+https://github.com/huggingface/transformers.git

    However, the issue persists.

  • The model type llava_mistral seems to be unsupported by the current version of transformers. If this architecture is not yet supported, could you provide guidance on when it might be added or how I can manually add support for this model?
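
For reference, transformers exposes a registration API for out-of-tree architectures. A hypothetical sketch, assuming the LlavaMistralConfig and LlavaMistralForCausalLM classes from the LLaVA-Med repository (the import path and class names are assumptions and may differ):

from transformers import AutoConfig, AutoModelForCausalLM

# Hypothetical import: these classes would come from the LLaVA-Med repo;
# the exact module path and class names are assumptions.
from llava.model.language_model.llava_mistral import (
    LlavaMistralConfig,
    LlavaMistralForCausalLM,
)

# Teach the Auto classes about the out-of-tree `llava_mistral` model type.
AutoConfig.register("llava_mistral", LlavaMistralConfig)
AutoModelForCausalLM.register(LlavaMistralConfig, LlavaMistralForCausalLM)

# With the registration in place, the original call resolves.
model = AutoModelForCausalLM.from_pretrained("microsoft/llava-med-v1.5-mistral-7b")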


Request

Please add support for the llava_mistral model architecture in the transformers library or provide instructions on how to proceed with loading this model.


Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

  1. Install the required libraries:
    pip install transformers
  2. Run the following Python script:
    from transformers import AutoModelForCausalLM
    
    model = AutoModelForCausalLM.from_pretrained("microsoft/llava-med-v1.5-mistral-7b")
  3. Observe the error.


@zucchini-nlp
Member

@darshpatel1052 hey!

Yes, the model is not supported by transformers, but you can still use the LLaVA-Med repo for inference.

It seems to me that LLaVA-Med has the same architecture as the llava model we already have in transformers, so supporting it might be as easy as converting the weights. I do not have much bandwidth to work on adding more models, but if you or anyone else wants to work on it, feel free. To add the model, you can start by spotting the diffs between LLaVA-Med and the plain llava model, as I'd prefer to re-use the existing LlavaForConditionalGeneration if there are no changes to the model architecture.
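
For reference, once converted, the checkpoint would load like any other llava checkpoint. A minimal sketch, using the existing converted llava-1.5 checkpoint only to illustrate the target format:

from transformers import AutoProcessor, LlavaForConditionalGeneration

# llava-hf/llava-1.5-7b-hf is the already-converted LLaVA 1.5 checkpoint;
# a converted LLaVA-Med checkpoint would follow the same layout.
model = LlavaForConditionalGeneration.from_pretrained("llava-hf/llava-1.5-7b-hf")
processor = AutoProcessor.from_pretrained("llava-hf/llava-1.5-7b-hf")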

@NielsRogge
Contributor

Yes, see my comment here on how to convert it to the Transformers format.
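
At its core, the conversion is a state-dict key remapping onto the LlavaForConditionalGeneration layout. A rough sketch; the mapping entries below are assumptions modeled on the in-tree llava conversion script and are not verified against LLaVA-Med:

# Hedged sketch: remap original LLaVA-style state-dict keys onto the
# transformers LlavaForConditionalGeneration layout. The entries are
# assumptions, not a verified LLaVA-Med mapping.
KEY_MAPPING = {
    "model.vision_tower.": "",
    "model.mm_projector": "multi_modal_projector",
    "lm_head": "language_model.lm_head",
    "model.model": "language_model.model",
}

def convert_state_dict(old_state_dict):
    new_state_dict = {}
    for key, value in old_state_dict.items():
        # Rewrite each key prefix according to the mapping above.
        for old, new in KEY_MAPPING.items():
            if old in key:
                key = key.replace(old, new)
        new_state_dict[key] = value
    return new_state_dict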

@darshpatel1052
Author

darshpatel1052 commented Mar 30, 2025

@zucchini-nlp Thank you for your response!
Sure, I would be more than happy to help. This would be my first time contributing to open source, so I might need your guidance. I'd also like to ask: if the configs for LLaVA-Med and llava match, I would simply need to add loading functionality for the LLaVA-Med checkpoints, right? And if they do not match, how can I help by adding the supporting functionality?

@zucchini-nlp
Member

Great, thanks! From what I see in the LLaVA-Med repo, they do not match. The llava config should consist of separate text and vision configs, and the model will load the corresponding backbones from the config. The conversion script handles adapting configs, but AFAIR it is only for LLaVA. So if LLaVA-Med has different text/vision backbones, you'll need to adapt it. The same goes for the checkpoint key conversion.
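
For illustration, a llava config in transformers is composed from separate backbone configs. A minimal sketch; the backbone classes and defaults below are illustrative, not LLaVA-Med's actual hyperparameters:

from transformers import CLIPVisionConfig, LlavaConfig, MistralConfig

# Illustrative backbones only; LLaVA-Med's actual vision tower and LM
# hyperparameters would need to be read from its original config.
vision_config = CLIPVisionConfig()
text_config = MistralConfig()
config = LlavaConfig(vision_config=vision_config, text_config=text_config)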

Also, please check whether it is plain llava or a llava-next style model with patches. We'll need to use a different class if the model does patching for images.
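
One rough way to check is sketched below; the image_grid_pinpoints field name is an assumption based on the llava-next config format:

import json

# Heuristic sketch: llava-next style configs carry patch-grid info such
# as `image_grid_pinpoints`; plain llava configs do not.
with open("config.json") as f:
    cfg = json.load(f)

if "image_grid_pinpoints" in cfg:
    print("llava-next style (image patching): use LlavaNextForConditionalGeneration")
else:
    print("plain llava: use LlavaForConditionalGeneration")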
