-
Notifications
You must be signed in to change notification settings - Fork 385
chore: Make llama.cpp a default engine #1177
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
8514ef5
to
59bcf14
Compare
364c121
to
2ab8eb2
Compare
2da7a52
to
f21d7fc
Compare
f21d7fc
to
ac16354
Compare
ac16354
to
198b9be
Compare
198b9be
to
b091ad4
Compare
It does not need any external libraries.
Also fix two docker lints (undefined variables).
b091ad4
to
6eaa483
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (11)
container/Dockerfile.none (1)
35-35
: Switch Rustup profile todefault
Using--profile default
installs all standard Rust components (e.g.,rustfmt
,clippy
), aligning with other Dockerfiles. Note this increases the image size. To optimize, consider installing only required components after a--profile minimal
install viarustup component add
.docs/guides/dynamo_run.md (2)
308-308
: Add missing comma for clarity
For smoother reading, update the punctuation in this sentence:-[llama.cpp] is built for CPU by default. For an optimized build… +[llama.cpp] is built for CPU by default, for an optimized build…🧰 Tools
🪛 LanguageTool
[uncategorized] ~308-~308: Possible missing comma found.
Context: ...lt for CPU by default. For an optimized build pass the appropriate feature flag (high...(AI_HYDRA_LEO_MISSING_COMMA)
311-311
: Specify code block language
Thecargo build
snippet should declare its language for markdown lint compliance:```shell cargo build --features cuda|metal|vulkan -p dynamo-run</blockquote></details> <details> <summary>container/Dockerfile.vllm (3)</summary><blockquote> `257-257`: **Consistent Rustup profile change to `default`** Switching from `minimal` to `default` ensures full toolchain components are present, but at the cost of larger images. If only core components are needed, trim with `rustup component add`. --- `296-301`: **Consolidated workspace copy in dev stage** Copying the full `/workspace` directory simplifies the stage but may bring in temporary files or build caches. Consider copying only the necessary artifacts (e.g., `dist/`, `target/`) to keep the dev image lean and predictable. --- `305-306`: **Build C bindings in dev stage** This C bindings build mirrors the `ci_minimum` stage. Since multiple Dockerfiles repeat this step, extracting it into a shared script or base layer could reduce duplication and improve maintainability. </blockquote></details> <details> <summary>container/Dockerfile.tensorrt_llm (4)</summary><blockquote> `204-204`: **Switch Rustup profile to `default`** Changing to `--profile default` aligns with other images by installing all standard components, though it increases the image footprint. If disk space is a concern, revert to `minimal` and selectively add needed components via `rustup component add`. --- `223-226`: **Install `genai-perf` inside the virtual environment** The step correctly activates the venv before installing `genai-perf`. For brevity, you might use `uv pip install --python=3.12 genai-perf==$GENAI_PERF_VERSION` without a preceding `.` activation. --- `299-301`: **Simplified artifact copy in dev stage** Copying the complete `/workspace` tree may include unnecessary files. Narrowing to essential build outputs (e.g., `dist/`, `target/`) can speed builds and reduce image size. --- `304-306`: **Consistent C bindings build step** Building the C bindings here maintains parity with other stages. To avoid repeating this block in multiple Dockerfiles, consider centralizing it in a shared snippet or script. </blockquote></details> <details> <summary>container/Dockerfile.sglang (1)</summary><blockquote> `323-330`: **Optimize artifact copying and ensure reproducible C bindings build** - Copying the entire `/workspace` (line 323) can bloat the image—consider copying only the final artifacts (`dist`, `target/release`) and the C headers (`lib/bindings/c/include`). - Add `--locked` to the C-bindings `cargo build --release` (line 329) to honor the lockfile and guarantee reproducible builds. ```diff -COPY --from=wheel_builder /workspace /workspace +COPY --from=wheel_builder /workspace/dist /workspace/dist +COPY --from=wheel_builder /workspace/target/release /workspace/target/release +COPY --from=wheel_builder /workspace/lib/bindings/c/include /workspace/lib/bindings/c/include -RUN cd /workspace/lib/bindings/c && cargo build --release +RUN cd /workspace/lib/bindings/c && cargo build --release --locked
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
container/Dockerfile.none
(1 hunks)container/Dockerfile.sglang
(2 hunks)container/Dockerfile.tensorrt_llm
(4 hunks)container/Dockerfile.vllm
(4 hunks)docs/guides/dynamo_run.md
(1 hunks)launch/dynamo-run/Cargo.toml
(1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/guides/dynamo_run.md
[uncategorized] ~308-~308: Possible missing comma found.
Context: ...lt for CPU by default. For an optimized build pass the appropriate feature flag (high...
(AI_HYDRA_LEO_MISSING_COMMA)
🪛 markdownlint-cli2 (0.17.2)
docs/guides/dynamo_run.md
310-310: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
🔇 Additional comments (5)
launch/dynamo-run/Cargo.toml (1)
17-17
:❓ Verification inconclusive
Default features updated to include
llamacpp
Adding"llamacpp"
to thedefault
feature set ensures CPU-only llama.cpp support is enabled by default. Please confirm that CI jobs, packaging workflows, and downstream documentation/examples (especially those referencing--no-default-features
) have been updated accordingly.You can search for
--no-default-features
usages to catch any outdated references:
🏁 Script executed:
rg --ignore-case "no-default-features"
Length of output: 412
🏁 Script executed:
#!/bin/bash rg --ignore-case "llamacpp" -nLength of output: 4722
🏁 Script executed:
#!/bin/bash # List CI workflow files echo "CI workflow files:" find .github/workflows -type f -name "*.yml" -o -name "*.yaml" # Search for cargo commands in CI workflows echo -e "\nCargo-related commands in workflows:" rg -n "cargo" .github/workflows # Search for feature flags usage in CI workflows echo -e "\nFeature flags in workflows:" rg -n "features" .github/workflowsLength of output: 2971
🏁 Script executed:
#!/bin/bash # Inspect cargo commands in generate-docs.yml echo "Cargo commands in generate-docs.yml:" rg -n "run:.*cargo" .github/workflows/generate-docs.yml || echo "No cargo commands found." # Inspect cargo commands in trigger_ci.yml echo -e "\nCargo commands in trigger_ci.yml:" rg -n "run:.*cargo" .github/workflows/trigger_ci.yml || echo "No cargo commands found."Length of output: 396
Confirm CI, packaging, and docs updated for llama.cpp default
Default features now includellamacpp
, enabling llama.cpp support by default. Please verify that:
- CI workflows (
.github/workflows/pre-merge-rust.yml
, etc.) install any native dependencies required by thellamacpp
engine (e.g.libclang-dev
,git
).- Packaging or publish workflows (if any) have been updated to build with the new default features.
- Downstream documentation (
docs/guides/dynamo_run.md
) reflects that disabling default features (--no-default-features
) now drops bothmistralrs
andllamacpp
(falling back to vllm).You can catch any remaining
--no-default-features
examples with:rg --ignore-case "no-default-features"
container/Dockerfile.vllm (2)
223-226
: Installgenai-perf
within the virtual environment
Movinggenai-perf
installation under theuv venv
is correct. Ensure the virtualenv activation is valid for non-interactive shells (.
works), or simplify with:RUN uv pip install genai-perf==$GENAI_PERF_VERSION
370-374
:❓ Verification inconclusive
Unified workspace Cargo build
Building the entire workspace with--workspace
and theblock-manager
feature streamlines the process. Please verify that only intended crates are produced and no ancillary workspace members are inadvertently compiled.You can test with:
🏁 Script executed:
docker build . --target wheel_builder --progress=plain
Length of output: 153
Unified workspace Cargo build
Building the entire workspace with--workspace
and theblock-manager
feature streamlines the process. Please manually verify that only the intended crates are produced and no ancillary workspace members are compiled:• Run the build target locally:
docker build . --target wheel_builder --progress=plain
• Inspect the output to confirm only your desired crates were built.
container/Dockerfile.tensorrt_llm (1)
277-281
: Workspace-wide Cargo build
Leveraging--workspace
with theblock-manager
feature consolidates builds. Please confirm no extraneous workspace members are compiled.container/Dockerfile.sglang (1)
297-301
: Unified workspace build is correctly simplified
The newcargo build --workspace
replaces multiple package-specific invocations with a single command, making the build step more maintainable and consistent with the other Dockerfiles in this PR.
--exclude file://$PWD/components/http \ | ||
--exclude metrics | ||
# Build C bindings, creates lib/bindings/c/include | ||
RUN cd /workspace/lib/bindings/c && cargo build --release |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CC @nv-anants on single cargo build change - not familiar with the context for the original change that split it into two builds.
@@ -224,6 +220,10 @@ COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/ | |||
RUN mkdir /opt/dynamo && \ | |||
uv venv /opt/dynamo/venv --python 3.12 | |||
|
|||
# Install genai-perf for benchmarking | |||
RUN . /opt/dynamo/venv/bin/activate && \ | |||
uv pip install genai-perf==$GENAI_PERF_VERSION |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI I touched similar part of the file here: #1256
It does not need any external libraries.
Summary by CodeRabbit
Chores
Documentation
llama.cpp
engine, specifying default CPU support and GPU build recommendations.Style
New Features