
Mistral support through OpenAIChatCompletionClient. #6151

Open · 1 task done
ekzhu opened this issue Mar 31, 2025 · 4 comments
Labels: help wanted, proj-extensions

Comments

@ekzhu
Collaborator

ekzhu commented Mar 31, 2025

Confirmation

  • I confirm that I am a maintainer and so can use this template. If I am not, I understand this issue will be closed and I will be asked to use a different template.

Issue body

Mistral AI models can be added to the list here: https://github.com/microsoft/autogen/blob/fbdd89b46bf3883efe8b0e0838370b9d74561ee7/python/packages/autogen-ext/src/autogen_ext/models/openai/_model_info.py

The base_url should be set automatically when the model matches a Mistral AI model, and the API key should be detected from the environment. Use the existing support for Gemini models as a reference.
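A minimal sketch of what the registration could look like, assuming the file keeps the same dict-of-ModelInfo pattern used for the Gemini models (the dict names, capability values, and the "mistral" family key below are assumptions, not the final API):

```python
# Hypothetical entries mirroring the Gemini pattern in _model_info.py;
# capability values and the "mistral" family key are assumptions.
_MODEL_INFO = {
    "mistral-large-latest": {
        "vision": False,
        "function_calling": True,
        "json_output": True,
        "family": "mistral",
        "structured_output": True,
    },
}

_MODEL_TOKEN_LIMITS = {
    "mistral-large-latest": 128000,
}
```

With entries like these in place, `OpenAIChatCompletionClient(model="mistral-large-latest")` could work without an explicit `base_url`, reading the key from something like a `MISTRAL_API_KEY` environment variable by analogy with how Gemini models are handled.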

PR should address: #6147

@SongChiYoung
Contributor

SongChiYoung commented Mar 31, 2025

Check: #6158

👋 While working on supporting Mistral models (e.g., mistral-large-latest, codestral-latest), I noticed that the current approach for inferring model_family via prefix matching may not work reliably anymore.

This is the current implementation:

```python
def _find_model_family(api: str, model: str) -> str:
    for family in MESSAGE_TRANSFORMERS[api].keys():
        if model.startswith(family):
            return family
    return "default"
```

This logic worked well for cleanly prefixed models like gpt-4 or claude-3, and I originally wrote it when I was still getting familiar with AutoGen’s structure.
However, newer model names like mistral-large-latest and codestral-latest (which belong to the same Mistral family despite having different prefixes) introduce ambiguity, making prefix-based family inference fragile.
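For instance, assuming "mistral" is the registered family key, prefix matching handles one name but silently falls through to the default transformer for the other:

```python
# Hypothetical calls against the registry above; "mistral" as the
# registered family key is an assumption.
_find_model_family("openai", "mistral-large-latest")  # -> "mistral" (prefix matches)
_find_model_family("openai", "codestral-latest")      # -> "default" (silent fallback)
```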

✅ I’m preparing a PR that refactors this logic in the following way (see the sketch after this list):
• Allow model_family to be passed explicitly
• Fall back to _find_model_family() only when the value is "unknown"
• Integrate with the existing ModelInfo and ModelFamily infrastructure for consistency and clarity
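A minimal sketch of the proposed resolution order, assuming the existing MESSAGE_TRANSFORMERS registry keeps a "default" entry per API (the helper name `_get_transformer` is hypothetical):

```python
# Hypothetical helper showing the proposed fallback order; the actual PR
# may structure this differently.
def _get_transformer(api: str, model: str, model_family: str):
    family = model_family
    if family == "unknown":
        # Only infer by prefix when no explicit family was provided.
        family = _find_model_family(api, model)
    transformer_map = MESSAGE_TRANSFORMERS[api]
    return transformer_map.get(family, transformer_map["default"])
```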

Example usage:

```python
oai_messages_nested = [
    to_oai_type(
        m,
        prepend_name=self._add_name_prefixes,
        model=create_args.get("model", "unknown"),
        model_family=self._model_info["family"],
    )
    for m in messages
]
```

This provides more robust and future-proof transformer selection, especially for models with non-standard names like Mistral.

📌 I’ll open a PR soon with this change.

Would it also make sense to open a separate issue to track the broader design consideration around model family inference and registry consistency?
Happy to do that if it would help clarify the motivation and long-term direction.

Feedback welcome! 🙏


@ekzhu
Collaborator Author

ekzhu commented Mar 31, 2025

> Would it also make sense to open a separate issue to track the broader design consideration around model family inference and registry consistency?

Can we use #6011 as the roadmap issue?

@SongChiYoung
Contributor

@ekzhu Cool.
So, how could I do that? I don’t think I have permission to add this issue to #6011.

@ekzhu
Collaborator Author

ekzhu commented Apr 1, 2025

You can comment on that issue and we can create sub issues from your comment.

ekzhu added a commit that referenced this issue Apr 2, 2025
… Mistral (#6158)

This PR improves how model_family is resolved when selecting a
transformer from the registry.
Previously, model families were inferred using a simple prefix-based
match like:
```
if model.startswith(family): ...
```
This works for cleanly prefixed models (e.g., `gpt-4o`, `claude-3`) but
fails for models like `mistral-large-latest`, `codestral-latest`, etc.,
where prefix-based matching is ambiguous or misleading.

To address this:
• model_family can now be passed explicitly (e.g., via ModelInfo)
• _find_model_family() is only used as a fallback when the value is "unknown"
• Transformer lookup is now more robust and predictable
• Example integration in to_oai_type() demonstrates this pattern using self._model_info["family"]

This change is required for safe support of models like Mistral and
other future models that do not follow standard naming conventions.

Linked to discussion in #6151
Related: #6011

---------

Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>