Issues: vllm-project/vllm
#16058 · [Bug]: [V1][Speculative Decoding] Broadcasting error (ValueError) and socket error (ZMQError) using [ngram] decoding · labels: bug · opened Apr 4, 2025 by lbeisteiner
#16056 · [Usage]: v1 engine on CPU · labels: usage · opened Apr 4, 2025 by harryhan618
#16054 · [Bug]: CI flake - v1/engine/test_async_llm.py::test_abort - assert has_unfinished_requests() · labels: bug, ci/build, v1 · opened Apr 4, 2025 by markmc
#16052 · [RFC]: Extending vLLM towards native support of non-text-generating models · labels: RFC · opened Apr 4, 2025 by christian-pinto
#16051 · [Doc]: Steps to run 2 different models on Kaggle GPUs using vllm · labels: documentation · opened Apr 4, 2025 by furkanbk
#16050 · [Performance]: LLM Offline Inference Slowing Down Over Time · labels: performance · opened Apr 4, 2025 by uyzhang
#16046 · [Bug]: Multiple rounds of dialogue, only inferring for the last round · labels: bug · opened Apr 4, 2025 by missTL
#16037 · [RFC]: Data Parallel Attention and Expert Parallel MoEs · labels: RFC · opened Apr 3, 2025 by tlrmchlsmth
#16030 · [Bug]: xgrammar missing file crashes the server · labels: bug · opened Apr 3, 2025 by servient-ashwin
#16029 · [Feature]: Adding tool_choice: required for lm-format-enforcer · labels: feature request · opened Apr 3, 2025 by ItzAmirreza
#16028 · [Bug]: Two beginning-of-sequence tokens for Llama-3.2-3B-Instruct · labels: bug · opened Apr 3, 2025 by Naqu6
#16021 · [Bug]: Unable to run Phi4 with tensor-parallel-size 4: torch.compile compatibility · labels: bug · opened Apr 3, 2025 by roguetech
#16019 · [New Model]: support for fashion-clip · labels: new model · opened Apr 3, 2025 by priyankaiiit14
#16016 · [RFC]: Cache Salting for Secure and Flexible Prefix Caching in vLLM · labels: RFC · opened Apr 3, 2025 by dr75
#16014 · [Bug]: Null response for Mistral3.1 · labels: bug · opened Apr 3, 2025 by hahmad2008
#16013 · [Bug]: Cannot use FlashAttention-2 backend because the vllm.vllm_flash_attn package is not found. Make sure that vllm_flash_attn was built and installed (on by default). · labels: bug · opened Apr 3, 2025 by GGBond8488
#16012 · [RFC]: Is huggingface-cli[hf_xet] needed for vllm build? · labels: RFC · opened Apr 3, 2025 by Shafi-Hussain
#16011 · [Usage]: Performance Comparison: 1x8 (TP=8) vs 2x4 (TP=4) in vLLM - Why Does 1x8 Outperform 2x4 in Concurrency? · labels: usage · opened Apr 3, 2025 by hwb96
#16009 · [Bug]: TypeError: __init__() missing 1 required positional argument: 'inner_exception' · labels: bug, torch.compile · opened Apr 3, 2025 by Satonio1
#16008 · [Bug]: Tool call auto not working with Qwen models in v0.8.2 · labels: bug · opened Apr 3, 2025 by pivotal-marcela-campo
#16006 · [Bug]: crash during debugging, works OK when run from the CLI · labels: bug, torch.compile · opened Apr 3, 2025 by CharlesJu1
#16004 · [Bug] [Misc]: test_sharded_state_loader run failed · labels: bug · opened Apr 3, 2025 by Accelerator1996
#16003 · [Usage]: Is it possible to run vLLM inside a Jupyter Notebook? · labels: usage · opened Apr 3, 2025 by repodiac