Skip to content

chore: Make llama.cpp a default engine #1177

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: main
Choose a base branch
from
Open

Conversation

grahamking
Copy link
Contributor

@grahamking grahamking commented May 22, 2025

It does not need any external libraries.

Summary by CodeRabbit

  • Chores

    • Updated Docker build processes to streamline Rust workspace builds, simplify artifact copying, and adjust environment variable handling across multiple container images.
    • Changed Rust toolchain installation to use the default profile for broader component support.
    • Modified installation steps for certain Python packages within container environments.
  • Documentation

    • Clarified build instructions for the llama.cpp engine, specifying default CPU support and GPU build recommendations.
  • Style

    • Removed license header comments from configuration files.
  • New Features

    • Enabled the "llamacpp" feature by default in the package configuration.

@grahamking grahamking force-pushed the gk-llamacpp-default branch from b091ad4 to 6eaa483 Compare May 28, 2025 22:11
@grahamking grahamking marked this pull request as ready for review May 28, 2025 22:44
@ai-dynamo ai-dynamo deleted a comment from coderabbitai bot May 28, 2025
@grahamking grahamking enabled auto-merge (squash) May 28, 2025 22:49
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (11)
container/Dockerfile.none (1)

35-35: Switch Rustup profile to default
Using --profile default installs all standard Rust components (e.g., rustfmt, clippy), aligning with other Dockerfiles. Note this increases the image size. To optimize, consider installing only required components after a --profile minimal install via rustup component add.

docs/guides/dynamo_run.md (2)

308-308: Add missing comma for clarity
For smoother reading, update the punctuation in this sentence:

-[llama.cpp] is built for CPU by default. For an optimized build…
+[llama.cpp] is built for CPU by default, for an optimized build…
🧰 Tools
🪛 LanguageTool

[uncategorized] ~308-~308: Possible missing comma found.
Context: ...lt for CPU by default. For an optimized build pass the appropriate feature flag (high...

(AI_HYDRA_LEO_MISSING_COMMA)


311-311: Specify code block language
The cargo build snippet should declare its language for markdown lint compliance:

```shell
cargo build --features cuda|metal|vulkan -p dynamo-run

</blockquote></details>
<details>
<summary>container/Dockerfile.vllm (3)</summary><blockquote>

`257-257`: **Consistent Rustup profile change to `default`**  
Switching from `minimal` to `default` ensures full toolchain components are present, but at the cost of larger images. If only core components are needed, trim with `rustup component add`.

---

`296-301`: **Consolidated workspace copy in dev stage**  
Copying the full `/workspace` directory simplifies the stage but may bring in temporary files or build caches. Consider copying only the necessary artifacts (e.g., `dist/`, `target/`) to keep the dev image lean and predictable.

---

`305-306`: **Build C bindings in dev stage**  
This C bindings build mirrors the `ci_minimum` stage. Since multiple Dockerfiles repeat this step, extracting it into a shared script or base layer could reduce duplication and improve maintainability.

</blockquote></details>
<details>
<summary>container/Dockerfile.tensorrt_llm (4)</summary><blockquote>

`204-204`: **Switch Rustup profile to `default`**  
Changing to `--profile default` aligns with other images by installing all standard components, though it increases the image footprint. If disk space is a concern, revert to `minimal` and selectively add needed components via `rustup component add`.

---

`223-226`: **Install `genai-perf` inside the virtual environment**  
The step correctly activates the venv before installing `genai-perf`. For brevity, you might use `uv pip install --python=3.12 genai-perf==$GENAI_PERF_VERSION` without a preceding `.` activation.

---

`299-301`: **Simplified artifact copy in dev stage**  
Copying the complete `/workspace` tree may include unnecessary files. Narrowing to essential build outputs (e.g., `dist/`, `target/`) can speed builds and reduce image size.

---

`304-306`: **Consistent C bindings build step**  
Building the C bindings here maintains parity with other stages. To avoid repeating this block in multiple Dockerfiles, consider centralizing it in a shared snippet or script.

</blockquote></details>
<details>
<summary>container/Dockerfile.sglang (1)</summary><blockquote>

`323-330`: **Optimize artifact copying and ensure reproducible C bindings build**  
- Copying the entire `/workspace` (line 323) can bloat the image—consider copying only the final artifacts (`dist`, `target/release`) and the C headers (`lib/bindings/c/include`).  
- Add `--locked` to the C-bindings `cargo build --release` (line 329) to honor the lockfile and guarantee reproducible builds.  


```diff
-COPY --from=wheel_builder /workspace /workspace
+COPY --from=wheel_builder /workspace/dist /workspace/dist
+COPY --from=wheel_builder /workspace/target/release /workspace/target/release
+COPY --from=wheel_builder /workspace/lib/bindings/c/include /workspace/lib/bindings/c/include

-RUN cd /workspace/lib/bindings/c && cargo build --release
+RUN cd /workspace/lib/bindings/c && cargo build --release --locked
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9723d62 and 6eaa483.

📒 Files selected for processing (6)
  • container/Dockerfile.none (1 hunks)
  • container/Dockerfile.sglang (2 hunks)
  • container/Dockerfile.tensorrt_llm (4 hunks)
  • container/Dockerfile.vllm (4 hunks)
  • docs/guides/dynamo_run.md (1 hunks)
  • launch/dynamo-run/Cargo.toml (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/guides/dynamo_run.md

[uncategorized] ~308-~308: Possible missing comma found.
Context: ...lt for CPU by default. For an optimized build pass the appropriate feature flag (high...

(AI_HYDRA_LEO_MISSING_COMMA)

🪛 markdownlint-cli2 (0.17.2)
docs/guides/dynamo_run.md

310-310: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)

🔇 Additional comments (5)
launch/dynamo-run/Cargo.toml (1)

17-17:

❓ Verification inconclusive

Default features updated to include llamacpp
Adding "llamacpp" to the default feature set ensures CPU-only llama.cpp support is enabled by default. Please confirm that CI jobs, packaging workflows, and downstream documentation/examples (especially those referencing --no-default-features) have been updated accordingly.

You can search for --no-default-features usages to catch any outdated references:


🏁 Script executed:

rg --ignore-case "no-default-features"

Length of output: 412


🏁 Script executed:

#!/bin/bash
rg --ignore-case "llamacpp" -n

Length of output: 4722


🏁 Script executed:

#!/bin/bash
# List CI workflow files
echo "CI workflow files:"
find .github/workflows -type f -name "*.yml" -o -name "*.yaml"

# Search for cargo commands in CI workflows
echo -e "\nCargo-related commands in workflows:"
rg -n "cargo" .github/workflows

# Search for feature flags usage in CI workflows
echo -e "\nFeature flags in workflows:"
rg -n "features" .github/workflows

Length of output: 2971


🏁 Script executed:

#!/bin/bash
# Inspect cargo commands in generate-docs.yml
echo "Cargo commands in generate-docs.yml:"
rg -n "run:.*cargo" .github/workflows/generate-docs.yml || echo "No cargo commands found."

# Inspect cargo commands in trigger_ci.yml
echo -e "\nCargo commands in trigger_ci.yml:"
rg -n "run:.*cargo" .github/workflows/trigger_ci.yml || echo "No cargo commands found."

Length of output: 396


Confirm CI, packaging, and docs updated for llama.cpp default
Default features now include llamacpp, enabling llama.cpp support by default. Please verify that:

  • CI workflows (.github/workflows/pre-merge-rust.yml, etc.) install any native dependencies required by the llamacpp engine (e.g. libclang-dev, git).
  • Packaging or publish workflows (if any) have been updated to build with the new default features.
  • Downstream documentation (docs/guides/dynamo_run.md) reflects that disabling default features (--no-default-features) now drops both mistralrs and llamacpp (falling back to vllm).

You can catch any remaining --no-default-features examples with:

rg --ignore-case "no-default-features"
container/Dockerfile.vllm (2)

223-226: Install genai-perf within the virtual environment
Moving genai-perf installation under the uv venv is correct. Ensure the virtualenv activation is valid for non-interactive shells (. works), or simplify with:

RUN uv pip install genai-perf==$GENAI_PERF_VERSION

370-374:

❓ Verification inconclusive

Unified workspace Cargo build
Building the entire workspace with --workspace and the block-manager feature streamlines the process. Please verify that only intended crates are produced and no ancillary workspace members are inadvertently compiled.

You can test with:


🏁 Script executed:

docker build . --target wheel_builder --progress=plain

Length of output: 153


Unified workspace Cargo build
Building the entire workspace with --workspace and the block-manager feature streamlines the process. Please manually verify that only the intended crates are produced and no ancillary workspace members are compiled:

• Run the build target locally:

docker build . --target wheel_builder --progress=plain

• Inspect the output to confirm only your desired crates were built.

container/Dockerfile.tensorrt_llm (1)

277-281: Workspace-wide Cargo build
Leveraging --workspace with the block-manager feature consolidates builds. Please confirm no extraneous workspace members are compiled.

container/Dockerfile.sglang (1)

297-301: Unified workspace build is correctly simplified
The new cargo build --workspace replaces multiple package-specific invocations with a single command, making the build step more maintainable and consistent with the other Dockerfiles in this PR.

--exclude file://$PWD/components/http \
--exclude metrics
# Build C bindings, creates lib/bindings/c/include
RUN cd /workspace/lib/bindings/c && cargo build --release
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CC @nv-anants on single cargo build change - not familiar with the context for the original change that split it into two builds.

@@ -224,6 +220,10 @@ COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/
RUN mkdir /opt/dynamo && \
uv venv /opt/dynamo/venv --python 3.12

# Install genai-perf for benchmarking
RUN . /opt/dynamo/venv/bin/activate && \
uv pip install genai-perf==$GENAI_PERF_VERSION
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI I touched similar part of the file here: #1256

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants