Why are three special tokens ignored in the labels? #53

Open
Pride-Huang opened this issue Nov 27, 2024 · 1 comment

Comments

@Pride-Huang

image
The labels consist only of image tokens, with the special token <|image start|> ignored. Why is the SFT loss computed this way?

@Masaaki-75

This might partly answer the question: https://github.com/baaivision/Emu3/blob/main/emu3/mllm/processing_emu3.py#L178-L183

During inference, the boi_token (which is "<|image start|>" in string form), the resolution information, and the img_token (which is "<|image token|>") are supplied directly as the prefix that starts generation, so the model presumably never has to predict them itself. (BTW, the mismatch between the string form and the variable naming is confusing and annoying, lol)
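
To make the point concrete, here is a minimal sketch of how such a generation prefix could be assembled. The two token strings are the ones quoted above; the helper name `build_generation_prefix` and the `H*W` resolution format are my assumptions for illustration, so check the linked file for the real template.

```python
# Token strings as quoted in this thread.
BOI_TOKEN = "<|image start|>"
IMG_TOKEN = "<|image token|>"


def build_generation_prefix(height: int, width: int) -> str:
    """Hypothetical helper: join the tokens that are supplied rather than predicted."""
    # Assumption: the resolution is written as "H*W"; the real template may differ.
    resolution = f"{height}*{width}"
    return BOI_TOKEN + resolution + IMG_TOKEN


prefix = build_generation_prefix(90, 90)
print(prefix)
```

If the prefix is always given like this, a loss term on these positions would train something that is never sampled at inference time.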

That said, I am also curious why the authors limited the supervision to ids between the first visual token id and the last visual token id, while ignoring eol_token, eof_token, and eoi_token.
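
For reference, the masking being asked about can be sketched as below. This is not the repository's code: the function name, the toy token ids, and the use of -100 as the ignore value (the usual default for cross-entropy in PyTorch) are all assumptions for illustration.

```python
IGNORE_INDEX = -100  # assumption: conventional "ignore" label for cross-entropy


def mask_non_visual_labels(input_ids, first_visual_id, last_visual_id,
                           ignore_index=IGNORE_INDEX):
    """Keep a label only if its id lies in [first_visual_id, last_visual_id]."""
    return [
        tok if first_visual_id <= tok <= last_visual_id else ignore_index
        for tok in input_ids
    ]


# Toy vocabulary (hypothetical ids): text = 1, 2; boi = 10; eol = 11;
# eof = 12; eoi = 13; visual tokens occupy 100..199.
input_ids = [1, 2, 10, 100, 150, 11, 199, 11, 12, 13]
labels = mask_non_visual_labels(input_ids, first_visual_id=100, last_visual_id=199)
print(labels)
```

With this rule, eol, eof, and eoi fall outside the visual id range, so they get no loss even though the model has to emit them during generation.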
