HabanaAI / vllm-fork Public

forked from vllm-project/vllm

Fork 81
Star 66

Code
Issues 11
Pull requests 57
Discussions
Actions
Projects
Security
Insights

Pull requests: HabanaAI/vllm-fork

Labels 17 Milestones 0

57 Open 894 Closed

Pull requests list

Added link for example scripts and FP8 tutorial and documentation

#1032 opened Apr 8, 2025 by MohitIntel

[deepseek r1] HPU support for deepseek

#1030 opened Apr 8, 2025 by xuechendi

[aice/v1.20.1] PRC branch migration for v1.20.1

#1029 opened Apr 8, 2025 by ranzhejiang

[1.21 cherry-pick] Fix async callback ordering (#1023)

#1028 opened Apr 8, 2025 by madamczykhabana

[SW-224648] Fix test logs redirection

#1027 opened Apr 8, 2025 by bmyrcha

[SW-224648] Fix test logs redirection

#1026 opened Apr 8, 2025 by bmyrcha

Support Data Parallel MOE on HPU

#1022 opened Apr 8, 2025 by xinyu-intel

[WIP] Bypass detokenize for openai completion

#1021 opened Apr 7, 2025 by tianmu-li • Draft

APC - Remove prompt attn with context and use existing implementation

#1020 opened Apr 7, 2025 by adobrzyn

Michalkuligowski patch update workflows

#1019 opened Apr 7, 2025 by michalkuligowski • Draft

[Deepseek-R1] PR to habana main

#1014 opened Apr 6, 2025 by xuechendi • Draft

Enable alibi_slope with FusedSDPA.

#1013 opened Apr 4, 2025 by libinta

Warmup V1

#1012 opened Apr 4, 2025 by iboiko-habana

Implement Pipeline Parallelism support for HPU.

#1000 opened Apr 2, 2025 by jmaksymczuk

Modify RobertaEmbedding forward as custom op method

#996 opened Apr 1, 2025 by yeonsily

Update gaudi readme about compile mode execution

#993 opened Apr 1, 2025 by afierka-intel • Draft

Remove unnecessary comments and files from deepseek_r1 branch

#991 opened Mar 31, 2025 by kwisniewski98

Use the correct fp8 range for G2

#984 opened Mar 31, 2025 by czhu15

add torch profiler for the LLM engine

#979 opened Mar 28, 2025 by yangulei

Enable torchrun on Gaudi

#974 opened Mar 27, 2025 by czhu15

enable fp32 softmax in flat_pa_mla

#972 opened Mar 27, 2025 by yangulei

workaround cpu fallback when calling max on a bool tensor

#970 opened Mar 26, 2025 by yangw1234

Update linear.py

#964 opened Mar 25, 2025 by michalkuligowski • Draft

Update layers.py

#957 opened Mar 25, 2025 by michalkuligowski • Draft

Perform KV cache update on flat layout

#955 opened Mar 25, 2025 by mswiniarsk • Draft
