Will multimodal continued pre-training on mixed image-text data be supported in the future? #7581

Open
1 task done
windn0 opened this issue Apr 3, 2025 · 0 comments
Labels
enhancement New feature or request pending This problem is yet to be addressed

Comments

windn0 commented Apr 3, 2025

Reminder

  • I have read the above rules and searched the existing issues.

Description

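Thanks to the authors for their contributions to Llama-Factory!
While reading about the dataset format for pre-training, I noticed that continued pre-training only supports plain-text data. After going through many issues, my impression is that multimodal continued pre-training is not currently supported.
So I would like to ask:
1. Does a multimodal large model need continued pre-training to learn the knowledge of a specific domain? Does that continued pre-training need mixed image-text data?
2. If I want to run continued pre-training on a multimodal large model, is the only option at present to first do continued pre-training on text-only data and then fine-tune on mixed image-text data?
3. Will continued pre-training on mixed image-text data be supported in the future?
4. If I modify the current project myself to support mixed image-text continued pre-training, how difficult would that be? Roughly which parts would need to change?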
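
To make the request concrete, here is a minimal sketch of what one mixed image-text pre-training record could look like and how it might be split into ordered segments before tokenization. Everything in it is an assumption for illustration only: the `InterleavedSample` class, the `text` and `images` fields, the `<image>` placeholder, and both helper functions are hypothetical and are not an existing Llama-Factory format or API.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Assumed marker for where an image sits in the raw text (hypothetical for pre-training data).
IMAGE_PLACEHOLDER = "<image>"


@dataclass
class InterleavedSample:
    """One hypothetical pre-training record: raw text with image placeholders."""
    text: str
    images: List[str] = field(default_factory=list)  # image paths, in order of appearance


def validate(sample: InterleavedSample) -> bool:
    """Return True when the number of placeholders equals the number of image paths."""
    return sample.text.count(IMAGE_PLACEHOLDER) == len(sample.images)


def split_segments(sample: InterleavedSample) -> List[Tuple[str, str]]:
    """Split a record into ordered ("text", str) and ("image", path) segments."""
    if not validate(sample):
        raise ValueError("placeholder count does not match number of images")
    segments: List[Tuple[str, str]] = []
    parts = sample.text.split(IMAGE_PLACEHOLDER)
    for i, part in enumerate(parts):
        if part:  # skip empty text around adjacent placeholders
            segments.append(("text", part))
        if i < len(sample.images):  # an image follows every part except the last
            segments.append(("image", sample.images[i]))
    return segments


if __name__ == "__main__":
    sample = InterleavedSample(
        text="A chest X-ray <image> with a short description.",
        images=["xray.png"],
    )
    for kind, value in split_segments(sample):
        print(kind, repr(value))
```

A record like this would keep the plain-text pre-training style (no instruction/response roles) while still telling the data pipeline where each image belongs.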

Pull Request

No response

@windn0 windn0 added enhancement New feature or request pending This problem is yet to be addressed labels Apr 3, 2025