$D_{2}O$ : Dynamic Discriminative Operations for Efficient Long-Context Inference of Large Language Models
The code for the ICLR 2025 paper: $D_{2}O$: Dynamic Discriminative Operations for Efficient Long-Context Inference of Large Language Models.
📃 [Paper] • 💻 [Github] • 🤗 [Huggingface]
We recommend using Anaconda to create a new environment and install the required packages:

```bash
conda create -n d2o_v2 python=3.10
conda activate d2o_v2
pip install --upgrade pip  # enable PEP 660 support
pip install -r requirements.txt
```
Run the inference code on the LongBench sample with the following command:
```bash
CUDA_VISIBLE_DEVICES=0 python run_pred_long_bench_sample.py --model_name_or_path meta-llama/Meta-Llama-3-8B \
    --cache_dir /your_hf_home_path \
    --use_d2o True \
    --model_type llama3 \
    --hh_ratio 0.1 \
    --recent_ratio 0.1 \
    --action_name d2o_0.2 \
    --e True
```
Here:
- `cache_dir` stores your model weights.
- `use_d2o` enables the D2O method; `action_name` specifies the execution strategy name.
- `hh_ratio` refers to the proportion of important tokens described in our main paper.
- `recent_ratio` represents the proportion of the window closest to the generated token.
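To make the two ratios concrete, the sketch below shows how a prompt length could be turned into absolute cache budgets. The function name, rounding behavior, and numbers are illustrative assumptions for this README, not the repository's actual implementation.

```python
# Hypothetical sketch: turning hh_ratio / recent_ratio into absolute KV-cache budgets.
# Names and rounding below are illustrative, not the repo's actual API.

def cache_budgets(prompt_len: int, hh_ratio: float = 0.1, recent_ratio: float = 0.1):
    """Return (important_budget, recent_budget) as token counts for a given prompt length."""
    hh_budget = max(1, int(prompt_len * hh_ratio))          # slots for important ("heavy-hitter") tokens
    recent_budget = max(1, int(prompt_len * recent_ratio))  # slots for the most recent window
    return hh_budget, recent_budget

if __name__ == "__main__":
    # With a 4096-token prompt and the 0.1 / 0.1 setting above (~20% of the cache,
    # matching the d2o_0.2 action name), roughly 409 slots go to important tokens
    # and 409 to the recent window.
    print(cache_budgets(4096))  # (409, 409)
```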
Then, evaluate the results:
```bash
python eval_long_bench.py --model Meta-Llama-3-8B_d2o_0.2 --e
```
For tasks evaluated with lm-evaluation-harness (GitHub repository), we recommend using the latest version by running:
```bash
git clone https://github.com/EleutherAI/lm-evaluation-harness.git
```
Then, follow the installation instructions provided in the repository and execute our algorithm accordingly.
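As a reference point, a typical harness invocation after installation looks like the sketch below. The task name and batch size are placeholders, and this baseline command does not apply D2O by itself; wiring in the D2O cache eviction follows the repository's own scripts rather than this command.

```bash
# Install the cloned harness and run a baseline evaluation (no D2O applied here);
# the task name and batch size are just examples.
cd lm-evaluation-harness
pip install -e .
lm_eval --model hf \
    --model_args pretrained=meta-llama/Meta-Llama-3-8B \
    --tasks hellaswag \
    --device cuda:0 \
    --batch_size 8
```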
If you find our work useful, please cite:

```bibtex
@article{wan2024d2o,
  title={{D2O}: Dynamic Discriminative Operations for Efficient Generative Inference of Large Language Models},
  author={Wan, Zhongwei and Wu, Xinjian and Zhang, Yu and Xin, Yi and Tao, Chaofan and Zhu, Zhihong and Wang, Xin and Luo, Siqi and Xiong, Jing and Zhang, Mi},
  journal={arXiv preprint arXiv:2406.13035},
  year={2024}
}
```
or
```bibtex
@inproceedings{
  wan2025textdtexto,
  title={$\text{D}_{2}\text{O}$: Dynamic Discriminative Operations for Efficient Long-Context Inference of Large Language Models},
  author={Zhongwei Wan and Xinjian Wu and Yu Zhang and Yi Xin and Chaofan Tao and Zhihong Zhu and Xin Wang and Siqi Luo and Jing Xiong and Longyue Wang and Mi Zhang},
  booktitle={The Thirteenth International Conference on Learning Representations},
  year={2025},
  url={https://openreview.net/forum?id=HzBfoUdjHt}
}
```