- 🔥🔥🔥 [Apr 18, 2025] We release the parameter translator weights of Llama3.2-1B and Qwen2.5-1.5B (others coming soon) here!
- 🔥🔥🔥 [Mar 31, 2025] Our paper "Better wit than wealth: Dynamic Parametric Retrieval Augmented Generation for Test-time Knowledge Enhancement" is available on arXiv!
- 🔥🔥🔥 [Mar 26, 2025] The code of Dynamic Parametric RAG is open-sourced at DyPRAG!
Overview of Dynamic Parametric RAG:
Dynamic Parametric RAG (DyPRAG) is a novel framework that uses a lightweight parameter translator to efficiently map documents into parameterized knowledge by modeling the underlying function from documents to parameters. It reduces inference, training, and storage costs while enhancing LLMs' knowledge in a plug-and-play manner at test time.
- Extensive experiments on multiple datasets demonstrate DyPRAG's effectiveness and generalization in test-time knowledge enhancement.
- DyPRAG dynamically integrates parameterized knowledge to resolve conflicts between contextual and parametric knowledge, offering a practical solution to mitigate RAG hallucination in real-world applications.
- DyPRAG-Combine is a novel and powerful RAG paradigm that combines contextual knowledge with parametric knowledge, enabling LLMs to better manipulate knowledge and reduce hallucination.
Method | Inference Cost | Training Cost | Storage Cost | Generalization | RAG Hallucination |
---|---|---|---|---|---|
RAG | 🥶 | 🤗 | 🤗 | 🤗 | 🥶 |
PRAG | 🤗 | 🥶 | 🥶 | 🥶 | 😳 |
DyPRAG (ours) | 🤗 | 😳 | 🤗 | 🤗 | 🤗 |
We propose a simple three-stage pipeline to implement DyPRAG (a conceptual sketch of the parameter translator follows the list).
- Stage 1: Collect Doc-Param pairs by offline parameterization.
- Stage 2: Train the parameter translator to mimic the target LoRA behavior.
- Stage 3: Leverage the parameter translator to enhance the LLM's knowledge at test time.
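To make Stage 2 and Stage 3 concrete, here is a minimal, hypothetical sketch of a parameter translator: a small hypernetwork that maps a document embedding to low-rank (LoRA-style) weight updates that can be plugged into a frozen LLM. The class name, dimensions, and structure below are illustrative assumptions, not the repo's actual implementation (see src/train_dyprag.py for that).

```python
import torch
import torch.nn as nn

class ParameterTranslator(nn.Module):
    """Illustrative hypernetwork: document embedding -> LoRA A/B matrices.

    All dimensions here are assumptions for exposition; the real translator
    is implemented in src/train_dyprag.py.
    """

    def __init__(self, doc_dim: int, hidden_dim: int, target_dim: int, rank: int):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(doc_dim, hidden_dim), nn.ReLU())
        # Separate heads predict the LoRA "A" and "B" matrices.
        self.a_head = nn.Linear(hidden_dim, target_dim * rank)
        self.b_head = nn.Linear(hidden_dim, rank * target_dim)
        self.target_dim, self.rank = target_dim, rank

    def forward(self, doc_emb: torch.Tensor):
        h = self.backbone(doc_emb)                                    # (batch, hidden_dim)
        lora_a = self.a_head(h).view(-1, self.target_dim, self.rank)  # (batch, d, r)
        lora_b = self.b_head(h).view(-1, self.rank, self.target_dim)  # (batch, r, d)
        return lora_a, lora_b  # injected into the frozen LLM at test time (Stage 3)

# Usage: translate one document embedding into LoRA parameters on the fly.
translator = ParameterTranslator(doc_dim=768, hidden_dim=32, target_dim=2048, rank=2)
lora_a, lora_b = translator(torch.randn(1, 768))
```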
cd DyPRAG
conda create -n dyprag python=3.10.4
conda activate dyprag
pip install -r requirements.txt
Note: for `data_aug` (test examples in DyPRAG), we use the files provided by PRAG. We also provide our prepared `data_aug` in `data_aug.tar.gz` and `data_aug_projector` (augmented training examples in DyPRAG) in `data_aug_projector.tar.gz`. To extract them, run `tar -xzvf data_aug.tar.gz` and `tar -xzvf data_aug_projector.tar.gz` in your terminal.
If you want to rerun this process, follow the steps below.
We follow PRAG to prepare the data.
Prepare retrieval data: BM25
- Download the Wikipedia dump from the DPR repository using the following command:
mkdir -p data/dpr
wget -O data/dpr/psgs_w100.tsv.gz https://dl.fbaipublicfiles.com/dpr/wikipedia_split/psgs_w100.tsv.gz
pushd data/dpr
gzip -d psgs_w100.tsv.gz
popd
- Use Elasticsearch to index the Wikipedia dump (an optional retrieval sanity check is shown after the commands):
cd data
wget -O elasticsearch-8.15.0.tar.gz https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-8.15.0-linux-x86_64.tar.gz # download Elasticsearch
tar zxvf elasticsearch-8.15.0.tar.gz
rm elasticsearch-8.15.0.tar.gz
cd elasticsearch-8.15.0
nohup bin/elasticsearch > es.log 2>&1 & # run Elasticsearch in background
cd ../..
python prep_elastic.py --data_path data/dpr/psgs_w100.tsv --index_name wiki # build index
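Once Elasticsearch is running and the index is built, you can optionally sanity-check BM25 retrieval from Python. The snippet below is illustrative only: the index name `wiki` comes from the command above, but the field name `text` and the security settings are assumptions; check prep_elastic.py and your Elasticsearch configuration for the actual values.

```python
from elasticsearch import Elasticsearch

# Assumes a local instance with security disabled for testing;
# otherwise pass credentials / CA certs as required by your setup.
es = Elasticsearch("http://localhost:9200")

# "match" queries are ranked with BM25 by default in Elasticsearch.
# NOTE: the "text" field name is an assumption; see prep_elastic.py.
resp = es.search(index="wiki", query={"match": {"text": "Who wrote Hamlet?"}}, size=3)
for hit in resp["hits"]["hits"]:
    print(hit["_score"], str(hit["_source"])[:120])
```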
Prepare datasets
For 2WikiMultihopQA:
Download the 2WikiMultihopQA dataset from its repository https://www.dropbox.com/s/ms2m13252h6xubs/data_ids_april7.zip?e=1. Unzip it and move the folder to data/2wikimultihopqa.
For HotpotQA:
mkdir -p data/hotpotqa
wget -P data/hotpotqa/ http://curtis.ml.cmu.edu/datasets/hotpot/hotpot_dev_distractor_v1.json
For PopQA:
Download the PopQA dataset from its repository https://github.com/AlexTMallen/adaptive-retrieval/blob/main/data/popQA.tsv, and put the file popQA.tsv into folder data/popqa.
mkdir -p data/popqa
wget -P data/popqa https://raw.githubusercontent.com/AlexTMallen/adaptive-retrieval/main/data/popQA.tsv
For ComplexWebQuestions:
Download the ComplexWebQuestions dataset from its repository https://www.dropbox.com/scl/fo/nqujvpg2gc4y0ozkw3wgr/AOzjVEsdUhv2Fx2pamfJlSw?rlkey=746t7xehfqxf1zr867nxiq8aq&e=1, and put the file ComplexWebQuestions_dev.json into folder data/complexwebquestions.
For StrategyQA:
wget -O data/strategyqa_dataset.zip https://storage.googleapis.com/ai2i/strategyqa/data/strategyqa_dataset.zip
mkdir -p data/strategyqa
unzip data/strategyqa_dataset.zip -d data/strategyqa
rm data/strategyqa_dataset.zip
For IIRC:
wget -O data/iirc.tgz https://iirc-dataset.s3.us-west-2.amazonaws.com/iirc_train_dev.tgz
tar -xzvf data/iirc.tgz
mv iirc_train_dev/ data/iirc
rm data/iirc.tgz
For RAGTruth:
Download the RAGTruth dataset from its repository https://github.com/ParticleMedia/RAGTruth/blob/main/dataset/ and put the file source_info.jsonl into the folder data/ragtruth.
mkdir -p data/ragtruth
wget -P data/ragtruth https://raw.githubusercontent.com/ParticleMedia/RAGTruth/main/dataset/source_info.jsonl
We provide detailed commands for the following three stages in the `configs` folder, for both PRAG and DyPRAG.
- Data Augmentation
python src/augment.py \
--model_name llama3.2-1b-instruct \
--dataset 2wikimultihopqa \
--data_path data/2wikimultihopqa/ \
--sample 300 \
--topk 3 \
--output_dir data_aug_projector \
--projector
Parameter | Example/Options |
---|---|
`model_name` | `llama3.2-1b-instruct`, `qwen2.5-1.5b-instruct`, `llama3-8b-instruct` |
`dataset` | `2wikimultihopqa`, `hotpotqa`, `popqa`, `complexwebquestions` |
`data_path` | Folder of the saved data, such as `data/2wikimultihopqa` |
`sample` | Number of questions to run |
`topk` | Number of retrieved passages |
`output_dir` | Folder to save the augmented data |
`projector` | Whether to use the projector |
The results of data augmentation will be stored in `{output_dir}/{dataset}/{data_type}.json`. To reproduce PRAG, set `output_dir` to `data_aug` and omit `--projector`, as in the example command below.
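For example, the PRAG counterpart of the augmentation command above would be:

python src/augment.py \
--model_name llama3.2-1b-instruct \
--dataset 2wikimultihopqa \
--data_path data/2wikimultihopqa/ \
--sample 300 \
--topk 3 \
--output_dir data_aug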
- Document Parameterization
By running the `src/encode.py` script, you will generate a parameterized (LoRA) representation for each augmented document:
python3 src/encode.py \
--model_name=llama3.2-1b-instruct \
--dataset=2wikimultihopqa \
--sample=300 \
--per_device_train_batch_size=1 \
--num_train_epochs=1 \
--learning_rate=0.0003 \
--lora_rank=2 \
--lora_alpha=32 \
--with_cot \
--projector
Parameter | Example/Options |
---|---|
`model_name` | `llama3.2-1b-instruct`, `qwen2.5-1.5b-instruct`, `llama3-8b-instruct` |
`dataset` | `2wikimultihopqa`, `hotpotqa`, `popqa`, `complexwebquestions` |
`data_type` | If not set, the entire dataset is used; otherwise, specify a particular data type |
`with_cot` | If included, generate a CoT |
`sample` | Number of questions to run |
`augment_model` | Model used for data augmentation. If not set, the current model is used |
`per_device_train_batch_size`, `num_train_epochs`, `learning_rate` | Training parameters |
`lora_rank`, `lora_alpha` | LoRA parameters (dropout is set to 0) |
`projector` | Whether to use the projector |
Set `--projector` to encode the data from the `data_aug_projector` folder; for PRAG, omit `--projector` to encode the data from the `data_aug` folder. A conceptual sketch of this offline parameterization step is shown below.
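For intuition, here is a minimal, hypothetical sketch of what offline parameterization of a single document could look like with Hugging Face `peft`: fine-tune a small LoRA adapter on the document text and keep only the adapter weights as the Doc-Param pair. The checkpoint name, data, and training loop below are illustrative assumptions; the repo's actual logic lives in src/encode.py.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

# Illustrative setup only; see src/encode.py for the actual implementation.
model_name = "meta-llama/Llama-3.2-1B-Instruct"   # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Small adapter matching the rank/alpha used in the command above.
lora_config = LoraConfig(r=2, lora_alpha=32, lora_dropout=0.0, task_type="CAUSAL_LM")
model = get_peft_model(model, lora_config)

document = "One retrieved passage for a question..."   # placeholder text
inputs = tokenizer(document, return_tensors="pt")

# A single language-modeling step on the document (Stage 1, offline).
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
model.train()
loss = model(**inputs, labels=inputs["input_ids"]).loss
loss.backward()
optimizer.step()

# Only the LoRA parameters are stored as this document's parameterized knowledge.
model.save_pretrained("offline/example_passage_adapter")
```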
All generated parameters are stored in the `offline` folder.
The specific location of the parameter files is as follows:
offline/
├── {model_name}/
│   └── rank={lora_rank}_alpha={lora_alpha}/
│       ├── base_weight/
│       └── {dataset}/
│           └── lr={learning_rate}_epoch={num_train_epochs}/
│               └── aug_model={augment_model}/
│                   └── {data_type}/
│                       └── data_{did}/
│                           └── passage_{pid}/
│                               └── parameters
- Parameter Translator Training
After Stage 1, train the parameter translator on the collected Doc-Param pairs (a simplified sketch of the training objective follows the parameter table):
python3 -u src/train_dyprag.py \
--model_name=llama3-8b-instruct \
--datasets="2wikimultihopqa,hotpotqa,popqa,complexwebquestions" \
--learning_rate=0.0003 \
--lora_rank=2 \
--lora_alpha=32 \
--max_new_tokens=128 \
--sample_rate=1 \
--dyprag_learning_rate=1e-5 \
--dyprag_train_epochs=1
Parameter | Example/Options |
---|---|
`model_name` | `llama3.2-1b-instruct`, `qwen2.5-1.5b-instruct`, `llama3-8b-instruct` |
`datasets` | Datasets used for training DyPRAG |
`learning_rate` | Learning rate in Stage 1 |
`lora_rank`, `lora_alpha` | LoRA settings in Stage 1 |
`max_new_tokens` | Maximum number of generated tokens in Stage 2 |
`sample_rate` | Sample rate for the alignment datasets |
`dyprag_learning_rate` | Learning rate in Stage 2 |
`dyprag_train_epochs` | Training epochs in Stage 2 |
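To illustrate what "mimicking the target LoRA behavior" amounts to in Stage 2, here is a heavily simplified, hypothetical training step: the translator predicts LoRA parameters from a document embedding and is regressed toward the target parameters collected in Stage 1. The toy network, shapes, and plain MSE loss are assumptions for exposition; the actual DyPRAG objective in src/train_dyprag.py also aligns the behavior of the injected parameters, not just their values.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy translator: document embedding -> flattened LoRA parameters (illustration only).
doc_dim, rank, target_dim = 768, 2, 2048
translator = nn.Sequential(
    nn.Linear(doc_dim, 32),
    nn.ReLU(),
    nn.Linear(32, 2 * rank * target_dim),   # predicts A and B, flattened together
)
optimizer = torch.optim.AdamW(translator.parameters(), lr=1e-5)

# One hypothetical Doc-Param pair produced in Stage 1.
doc_emb = torch.randn(1, doc_dim)                      # document embedding
target_params = torch.randn(1, 2 * rank * target_dim)  # flattened target LoRA weights

# Regress predicted parameters toward the Stage-1 targets.
pred_params = translator(doc_emb)
loss = F.mse_loss(pred_params, target_params)
loss.backward()
optimizer.step()
```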
The well-trained parameter translator is saved in the `projector/{args.model_name}_hidden{args.projector_p}_sample{args.sample_rate}_lr{args.dyprag_learning_rate}` folder.
- DyPRAG Inference
After training the parameter translator, run test-time inference with src/inference_dyprag.py (a DyPRAG-Combine variant is shown after the parameter table):
python3 src/inference_dyprag.py \
--model_name=llama3-8b-instruct \
--dataset=hotpotqa \
--sample=-1 \
--num_train_epochs=1 \
--learning_rate=0.0003 \
--lora_rank=2 \
--lora_alpha=32 \
--max_new_tokens=128 \
--inference_method=dyprag \
--inference_epoch=1 \
--projector_path=projector_path \
--projector_p=32 \
--with_cot
Parameter | Example/Options |
---|---|
`inference_epoch` | Selected epoch checkpoint for inference |
`projector_path` | Path to the trained parameter translator |
`inference_method` | `dyprag` or `dyprag_combine` |
`projector_p` | Intermediate size of the parameter translator |
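To run DyPRAG-Combine, which pairs the translated parametric knowledge with the retrieved context (see the overview above), the same command should work with only the inference method changed:

python3 src/inference_dyprag.py \
--model_name=llama3-8b-instruct \
--dataset=hotpotqa \
--sample=-1 \
--num_train_epochs=1 \
--learning_rate=0.0003 \
--lora_rank=2 \
--lora_alpha=32 \
--max_new_tokens=128 \
--inference_method=dyprag_combine \
--inference_epoch=1 \
--projector_path=projector_path \
--projector_p=32 \
--with_cot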
You can use a similar command to run inference on RAGTruth with `--data_type="QA"`.
We have released the parameter translator weights of Llama3.2-1B and Qwen2.5-1.5B (others coming soon) here; you can download them and run inference directly.
To evaluate hallucination on RAGTruth outputs, run:
python -u ./src/evaluate_ragtruth.py \
--dyprag_path=dyprag_output_path \
--rag_path=rag_output_path \
--output_path=output_path
- Go to transformers.models to find the model you want to use.
- Copy `configuration_xxx.py` and `modeling_xxx.py` to the `models` folder and modify the import statements in `modeling_xxx.py`, similar to our src/models/modeling_qwen2.py.
- Modify the forward function of the MLP module in `modeling_xxx.py`, similar to our src/models/modeling_qwen2.py (Lines 57-69); a rough illustration is sketched below.
- Add a new class to the `get_model_class` function in src/utils.py to load the new type of LLM.
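As a rough illustration of the MLP change (this is not the repo's actual code; see src/models/modeling_qwen2.py, Lines 57-69, for the real modification), the idea is to add the translator-generated low-rank updates on top of the frozen MLP projections. The attribute names and injection mechanism below are assumptions for exposition:

```python
import torch
import torch.nn as nn

class MLPWithInjectedLoRA(nn.Module):
    """Toy MLP in the Llama/Qwen style, with optional injected LoRA deltas.

    The `injected` dictionary of per-projection (A, B) pairs is an assumption
    for illustration; the real mechanism is in src/models/modeling_qwen2.py.
    """

    def __init__(self, hidden_size: int, intermediate_size: int):
        super().__init__()
        self.gate_proj = nn.Linear(hidden_size, intermediate_size, bias=False)
        self.up_proj = nn.Linear(hidden_size, intermediate_size, bias=False)
        self.down_proj = nn.Linear(intermediate_size, hidden_size, bias=False)
        self.act_fn = nn.SiLU()
        # Filled at test time with parameters produced by the translator,
        # e.g. {"up_proj": (A, B), "down_proj": (A, B)}.
        self.injected = {}

    def _apply_proj(self, name: str, layer: nn.Linear, x: torch.Tensor) -> torch.Tensor:
        out = layer(x)
        if name in self.injected:            # add the low-rank update if present
            a, b = self.injected[name]       # a: (in, r), b: (r, out)
            out = out + x @ a @ b
        return out

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        gate = self.act_fn(self._apply_proj("gate_proj", self.gate_proj, x))
        up = self._apply_proj("up_proj", self.up_proj, x)
        return self._apply_proj("down_proj", self.down_proj, gate * up)
```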
If you find our work useful in your research and would like to cite our project, please use the following citation:
@misc{tan2025betterwitwealthdynamic,
title={Better wit than wealth: Dynamic Parametric Retrieval Augmented Generation for Test-time Knowledge Enhancement},
author={Yuqiao Tan and Shizhu He and Huanxuan Liao and Jun Zhao and Kang Liu},
year={2025},
eprint={2503.23895},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2503.23895},
}