VLM-Based-Retrieval-Augmented-Generation

Stanford NLP Project Repo

VLM RAG pipeline based on ColPali.
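For context, a ColPali-style retriever embeds each page image as a set of patch vectors and the query as a set of token vectors, then ranks pages by a late-interaction MaxSim score. A minimal sketch of that scoring in plain PyTorch (shapes, sizes, and variable names here are illustrative assumptions, not this repo's API):

```python
import torch
import torch.nn.functional as F

def maxsim_score(query_emb: torch.Tensor, page_emb: torch.Tensor) -> torch.Tensor:
    """Late-interaction MaxSim: match every query token to its best page patch,
    then sum the matches. Inputs are assumed L2-normalized, so dot products
    are cosine similarities.

    query_emb: (num_query_tokens, dim), page_emb: (num_patches, dim).
    """
    sim = query_emb @ page_emb.T           # (num_query_tokens, num_patches)
    return sim.max(dim=1).values.sum()     # scalar relevance score

# Rank a toy corpus of page embeddings for one query (random stand-ins for
# the embeddings a VLM retriever such as ColPali would produce).
dim = 128
query = F.normalize(torch.randn(12, dim), dim=-1)
pages = [F.normalize(torch.randn(1024, dim), dim=-1) for _ in range(5)]
scores = torch.stack([maxsim_score(query, p) for p in pages])
print(scores.topk(k=3).indices.tolist())   # indices of the best-matching pages
```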

Interpretable MaxSim Mapping:

Query: What is the hand-and-arm signal used for turning right while driving?

Max MaxSim-Score Token: driving

Min MaxSim-Score Token: What
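
Reading off the per-token scores behind the mapping above is straightforward: each query token's MaxSim contribution is its best match over the page's patch embeddings, and the argmax patch is what the heatmaps in interpreted_output visualize. A rough sketch (the tokens and embeddings below are placeholders, not the repo's actual variables):

```python
import torch
import torch.nn.functional as F

# Placeholder query tokens and random embeddings standing in for real model output.
query_tokens = ["What", "is", "the", "hand-and-arm", "signal", "for", "turning",
                "right", "while", "driving"]
query_emb = F.normalize(torch.randn(len(query_tokens), 128), dim=-1)
page_emb = F.normalize(torch.randn(1024, 128), dim=-1)   # one page's patch embeddings

sim = query_emb @ page_emb.T                 # (num_query_tokens, num_patches)
per_token = sim.max(dim=1).values            # each token's MaxSim contribution
best_patch = sim.argmax(dim=1)               # patch each token attends to most

print("max-score token:", query_tokens[per_token.argmax()])  # e.g. "driving"
print("min-score token:", query_tokens[per_token.argmin()])  # e.g. "What"
```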

Project Structure Tree:

VLM RAG/
│
├── benchmark_run_metrics/        # ranking metrics for benchmark
│   └── datasetName/
│       └── metrics.json
│
├── codes/
│   ├── finetune.py               # script for fine-tuning retriever using contrastive learning (see sketch below)
│   ├── run_benchmark.py          # script to run model on benchmark
│   └── utils                     # util functions
│
├── interpreted_output/           # heatmaps visualizing visual attention
│
├── main/                         # main RAG pipeline
│   ├── dbManager.py              # script for article vectorization
│   ├── gen.py                    # script for inference and synthetic question generation
│   ├── preprocessor.py           # script for doc preprocessing
│   ├── get_data.py               # scraper for evaluation set
│   └── pipeline.py               # script for RAG pipeline
│
├── dmv_example.png               # example image used for interpretable similarity mapping
│
└── requirements.txt              # Python dependencies
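
finetune.py is described above as contrastive fine-tuning of the retriever. As a rough illustration of what such an objective can look like, here is a minimal in-batch InfoNCE sketch over MaxSim scores; the names, shapes, and exact loss form are assumptions for illustration, not necessarily what the script implements:

```python
import torch
import torch.nn.functional as F

def in_batch_contrastive_loss(query_embs: torch.Tensor, page_embs: torch.Tensor) -> torch.Tensor:
    """In-batch contrastive (InfoNCE-style) loss over MaxSim scores.

    query_embs: (batch, q_tokens, dim), page_embs: (batch, patches, dim);
    query i's positive is page i, every other page in the batch is a negative.
    """
    # scores[i, j] = MaxSim(query i, page j) for all pairs in the batch.
    sim = torch.einsum("bqd,cpd->bcqp", query_embs, page_embs)   # (B, B, q, p)
    scores = sim.max(dim=-1).values.sum(dim=-1)                  # (B, B)
    targets = torch.arange(scores.size(0))                       # diagonal = positives
    return F.cross_entropy(scores, targets)

# Toy batch of 4 (query, positive page) pairs with random embeddings.
q = torch.randn(4, 12, 128, requires_grad=True)
p = torch.randn(4, 1024, 128, requires_grad=True)
loss = in_batch_contrastive_loss(F.normalize(q, dim=-1), F.normalize(p, dim=-1))
loss.backward()   # in real training, gradients flow back into the retriever
```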

About

VLM-based RAG (without parsing) for enhanced robustness on unstructured documents such as plots, infographics, and slides.
