Issues: vllm-project/vllm
#16058 · [Bug]: [V1][Speculative Decoding] Broadcasting error (ValueError) and socket error (ZMQError) using [ngram] decoding · labels: bug · opened Apr 4, 2025 by lbeisteiner
#16056 · [Usage]: v1 engine on CPU · labels: usage · opened Apr 4, 2025 by harryhan618
#16054 · [Bug]: CI flake - v1/engine/test_async_llm.py::test_abort - assert has_unfinished_requests() · labels: bug, ci/build, v1 · opened Apr 4, 2025 by markmc
#16052 · [RFC]: Extending vLLM towards native support of non-text-generating models · labels: RFC · opened Apr 4, 2025 by christian-pinto
#16051 · [Doc]: Steps to run 2 different models on Kaggle GPUs using vllm · labels: documentation · opened Apr 4, 2025 by furkanbk
#16050 · [Performance]: LLM Offline Inference Slowing Down Over Time · labels: performance · opened Apr 4, 2025 by uyzhang
#16046 · [Bug]: Multiple rounds of dialogue, only inferring for the last round · labels: bug · opened Apr 4, 2025 by missTL
#16037 · [RFC]: Data Parallel Attention and Expert Parallel MoEs · labels: RFC · opened Apr 3, 2025 by tlrmchlsmth
#16030 · [Bug]: xgrammar missing file crashes the server · labels: bug · opened Apr 3, 2025 by servient-ashwin
#16029 · [Feature]: Adding tool_choice: required for lm-format-enforcer · labels: feature request · opened Apr 3, 2025 by ItzAmirreza
#16028 · [Bug]: Two beginning-of-sequence tokens for Llama-3.2-3B-Instruct · labels: bug · opened Apr 3, 2025 by Naqu6
#16021 · [Bug]: Unable to run Phi4 with tensor-parallel-size 4: torch.compile compatibility · labels: bug · opened Apr 3, 2025 by roguetech
#16019 · [New Model]: support for fashion-clip · labels: new model · opened Apr 3, 2025 by priyankaiiit14
#16016 · [RFC]: Cache Salting for Secure and Flexible Prefix Caching in vLLM · labels: RFC · opened Apr 3, 2025 by dr75
#16014 · [Bug]: Null response for Mistral3.1 · labels: bug · opened Apr 3, 2025 by hahmad2008
#16013 · [Bug]: Cannot use FlashAttention-2 backend because the vllm.vllm_flash_attn package is not found. Make sure that vllm_flash_attn was built and installed (on by default). · labels: bug · opened Apr 3, 2025 by GGBond8488
#16012 · [RFC]: Is huggingface-cli[hf_xet] needed for vllm build? · labels: RFC · opened Apr 3, 2025 by Shafi-Hussain
#16011 · [Usage]: Performance Comparison: 1x8 (TP=8) vs 2x4 (TP=4) in vLLM - Why Does 1x8 Outperform 2x4 in Concurrency? · labels: usage · opened Apr 3, 2025 by hwb96
#16009 · [Bug]: TypeError: __init__() missing 1 required positional argument: 'inner_exception' · labels: bug, torch.compile · opened Apr 3, 2025 by Satonio1
#16008 · [Bug]: Tool call auto not working with Qwen models in v0.8.2 · labels: bug · opened Apr 3, 2025 by pivotal-marcela-campo
#16006 · [Bug]: crash during debugging, works OK when run from the CLI · labels: bug, torch.compile · opened Apr 3, 2025 by CharlesJu1
#16004 · [Bug] [Misc]: test_sharded_state_loader run failed · labels: bug · opened Apr 3, 2025 by Accelerator1996
#16003 · [Usage]: Is it possible to run vLLM inside a Jupyter Notebook? · labels: usage · opened Apr 3, 2025 by repodiac