The official implementation of the paper "A Stitch in Time Saves Nine: Small VLM is a Precise Guidance for accelerating Large VLMs" (2024).
Wangbo Zhao1, Yizeng Han2, Jiasheng Tang2,3, Zhikai Li1, Yibing Song2,3, Kai Wang1, Zhangyang Wang4, Yang You1
1National University of Singapore, 2DAMO Academy, Alibaba Group, 3Hupan Lab, 4The University of Texas at Austin
(a) Small VLM-guided visual token pruning in a large VLM (SGP). We obtain a global attention map by aggregating attention across all layers of a small VLM. This global attention map is used to rank visual tokens and to guide visual token pruning in the large VLM.
(b) Aggregation of attention maps in SGP. We aggregate the attention scores that visual tokens receive from prompt tokens and generated tokens across all heads and layers of the small VLM. Higher scores indicate greater significance.
(c) Inference with Small VLM Early Exiting (SEE). When the early-exiting decision score from the small VLM is sufficiently high, the large VLM is not invoked.
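Below is a minimal PyTorch-style sketch of the token ranking described in (a) and (b). The tensor layout, variable names, and keep ratio are illustrative assumptions, not the repository's actual implementation or hyperparameters.

```python
import torch

def aggregate_visual_attention(attn_maps, visual_idx, query_idx):
    """Aggregate the attention that visual tokens (keys) receive from prompt and
    generated tokens (queries), averaged over all heads and layers of the small VLM.

    attn_maps:  list of per-layer attention tensors, each [num_heads, seq_len, seq_len],
                with rows as queries and columns as keys (assumed layout).
    visual_idx: positions of visual tokens in the sequence.
    query_idx:  positions of prompt and generated tokens.
    Returns a [num_visual_tokens] score; higher means more significant.
    """
    scores = torch.zeros(len(visual_idx))
    for attn in attn_maps:
        # Attention from the selected queries to the visual keys: [heads, Q, V].
        sub = attn[:, query_idx][:, :, visual_idx]
        # Average over heads and queries, then accumulate over layers.
        scores += sub.mean(dim=(0, 1))
    return scores / len(attn_maps)

def select_visual_tokens(scores, keep_ratio=0.5):
    """Rank visual tokens by aggregated score and keep the top fraction,
    which the large VLM then uses as its (pruned) visual input."""
    num_keep = max(1, int(len(scores) * keep_ratio))
    return torch.topk(scores, num_keep).indices
```

In practice, `attn_maps` would come from a forward pass of the small VLM that exposes its self-attention maps (e.g., a Hugging Face-style forward call with `output_attentions=True`, if the model supports it).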
Please refer to the documentation of InternVL to set up the environment and prepare the data for evaluation.
We take 'bash textvqa2B-26B.sh' as an example, which uses InternVL2-2B as the small model to accelerate the large model InternVL2-26B.
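For intuition, here is a rough sketch of the inference flow such a script drives, reusing the two helpers from the sketch above. The methods `generate_with_attention` and `generate`, their returned fields, and the exit threshold are hypothetical placeholders for illustration, not the repository's actual API.

```python
def sgl_inference(small_vlm, large_vlm, image, prompt,
                  exit_threshold=0.8, keep_ratio=0.5):
    """Sketch of small-VLM-guided inference: SEE early exit, otherwise SGP pruning."""
    # 1) Run the small VLM (e.g., InternVL2-2B) once, collecting its answer, attention
    #    maps, token positions, and an early-exiting decision score.
    #    `generate_with_attention` is a hypothetical helper, not the repo's API.
    out = small_vlm.generate_with_attention(image, prompt)

    # 2) SEE: if the decision score is high enough, the large VLM is never invoked.
    if out.exit_score >= exit_threshold:
        return out.answer

    # 3) SGP: rank visual tokens with the small VLM's aggregated attention and let
    #    the large VLM (e.g., InternVL2-26B) attend only to the retained tokens.
    scores = aggregate_visual_attention(out.attn_maps, out.visual_idx, out.query_idx)
    keep = select_visual_tokens(scores, keep_ratio)
    return large_vlm.generate(image, prompt, keep_visual_tokens=keep)  # hypothetical API
```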
If you find our work useful, please consider citing us:
@article{zhao2024stitch,
title={A Stitch in Time Saves Nine: Small VLM is a Precise Guidance for accelerating Large VLMs},
author={Zhao, Wangbo and Han, Yizeng and Tang, Jiasheng and Li, Zhikai and Song, Yibing and Wang, Kai and Wang, Zhangyang and You, Yang},
journal={arXiv preprint arXiv:2412.03324},
year={2024}
}
SGL is built with reference to the code of the following projects: InternVL, FastV, Qwen2-VL, and LLaVA-OneVision.
🔥🔥🔥 If you are interested in this work and would like to collaborate with us, please drop an email to wangbo.zhao96@gmail.com 🔥🔥🔥