feat: support vLLM nightly builds via wheels.vllm.ai #736

Merged
doringeman merged 2 commits into main from vllm-from-commit
Mar 9, 2026
Conversation

doringeman (Contributor) commented Mar 5, 2026

Install vLLM from https://wheels.vllm.ai/{VLLM_VERSION}/{VLLM_CUDA_VERSION} instead of GitHub Releases, allowing nightly builds to be used via make docker-run-vllm VLLM_VERSION=nightly (or pinned to a specific commit hash for reproducible builds).

vLLM stable releases (0.16.x) do not yet support Qwen3.5 (#731) — support is available on the main branch ahead of 0.17.0. vLLM publishes pre-built wheels for every merged commit at wheels.vllm.ai, which this change allows us to use.

Tested in https://github.com/docker/model-runner/actions/runs/22712921339.

Usage

  • Latest nightly:

        make docker-run-vllm VLLM_VERSION=nightly

    E.g.,

        $ docker model status | grep vllm
        vllm       Running        vllm 0.16.1rc1.dev268+ge2b31243c

  • Pinned to a specific commit (recommended for reproducible builds):

        make docker-run-vllm VLLM_VERSION=e2b31243c092e9f4ade5ffe4bf9a5d5ddae06ca7

    E.g. (intentionally the same commit as nightly):

        $ docker model status | grep vllm
        vllm       Running        vllm 0.16.1rc1.dev268+ge2b31243c

  • Default stable release (unchanged):

        make docker-run-vllm

    E.g.,

        $ docker model status | grep vllm
        vllm       Running        vllm 0.12.0

Also bump vLLM to 0.17.0. Fixes #731.

gemini-code-assist (Bot) left a comment:


Code Review

This pull request introduces support for installing vLLM from wheels.vllm.ai, enabling the use of nightly builds and specific commit hashes. However, a critical command injection vulnerability has been identified in both the Dockerfile and Makefile. This occurs because the user-provided VLLM_VERSION is used in shell commands within double quotes, allowing for shell expansion and potential arbitrary command execution, even affecting the validation check. Additionally, the current implementation has a critical issue where release versions are not correctly prefixed with v (e.g., v0.12.0) as required by wheels.vllm.ai, which will cause default builds to fail. To mitigate the command injection, it is recommended to use single quotes around variables in shell commands to prevent shell expansion.
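The quoting concern can be illustrated with a small standalone script. The variable name mirrors the PR's VLLM_VERSION, but the payload and the echo commands are hypothetical; this is a sketch of the general shell-expansion risk, not the PR's actual Makefile code:

```shell
#!/bin/sh
# Hypothetical payload: a malicious VLLM_VERSION containing a command
# substitution, as it could be passed on the make command line.
VLLM_VERSION='$(id -u)'

# Unsafe: splicing the value into a double-quoted shell string lets the
# inner shell expand $(id -u) before echo ever sees it.
unsafe=$(sh -c "echo \"$VLLM_VERSION\"")

# Safer: single quotes keep the spliced value literal (this still breaks
# if the value itself contains a single quote, so validation is needed too).
safe=$(sh -c "echo '$VLLM_VERSION'")

echo "unsafe: $unsafe"   # a numeric uid - the substituted command ran
echo "safe:   $safe"     # the literal string $(id -u)
```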

Signed-off-by: Dorin Geman <dorin.geman@docker.com>
ericcurtin (Contributor) commented:

CUDA/ROCm and Metal make sense for us to integrate; there's a greatly simplified installation guide here now:

https://vllm.ai/

Signed-off-by: Dorin Geman <dorin.geman@docker.com>
@doringeman doringeman marked this pull request as ready for review March 9, 2026 09:30
sourcery-ai (Bot) left a comment:


Hey - I've found 1 issue and left some high-level feedback:

  • The VLLM_VERSION validation regex only allows plain x.y.z, 'nightly', or a hex commit and will reject pre-releases or other valid upstream tags (e.g., 0.17.0rc1); consider broadening this to avoid breaking future version formats from vLLM.
  • The GitHub release workflow no longer forwards VLLM_CUDA_VERSION and VLLM_PYTHON_TAG as build args, so these are now effectively fixed to the Dockerfile defaults; if different CUDA/Python combos are expected in CI, you may want to reintroduce configurable inputs for those.
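As a sketch of the first point, the validation regex quoted from the diff can be compared against a broadened variant that also accepts PEP 440-style pre-release and dev suffixes. The broadened pattern is only an illustration, not code from the PR:

```shell
#!/bin/sh
# Regex as it appears in the PR's Dockerfile validation step:
strict='^(nightly|[0-9]+\.[0-9]+\.[0-9]+|[0-9a-f]{7,40})$'
# Hypothetical broadened variant allowing suffixes like rc1 or .dev268:
broad='^(nightly|[0-9]+\.[0-9]+\.[0-9]+([a-z]+[0-9]+)?(\.dev[0-9]+)?|[0-9a-f]{7,40})$'

check() { printf '%s' "$2" | grep -qE "$1" && echo accept || echo reject; }

check "$strict" '0.17.0rc1'      # strict rejects a pre-release tag
check "$broad"  '0.17.0rc1'      # the broadened variant accepts it
check "$broad"  'nightly'        # still accepted
check "$broad"  'not-a-version'  # still rejected
```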
## Individual Comments

### Comment 1
<location path="Dockerfile" line_range="109-111" />
<code_context>
-    else \
-    ~/.local/bin/uv pip install --python /opt/vllm-env/bin/python "vllm==${VLLM_VERSION}"; \
-    fi
+    && printf '%s' "${VLLM_VERSION}" | grep -qE '^(nightly|[0-9]+\.[0-9]+\.[0-9]+|[0-9a-f]{7,40})$' \
+            || { echo "Invalid VLLM_VERSION: must be a version (e.g. 0.16.0), 'nightly', or a hex commit hash"; exit 1; } \
+        && ~/.local/bin/uv pip install --python /opt/vllm-env/bin/python vllm \
+            --extra-index-url "https://wheels.vllm.ai/${VLLM_VERSION}/${VLLM_CUDA_VERSION}"

</code_context>
<issue_to_address>
**issue (bug_risk):** vLLM version is no longer pinned in the pip install, which can ignore VLLM_VERSION and pick a different release from PyPI.

With `pip install ... vllm` unpinned, pip will pick the highest version from PyPI and the extra index. If PyPI has a newer `vllm` than `https://wheels.vllm.ai/${VLLM_VERSION}/${VLLM_CUDA_VERSION}`, `VLLM_VERSION` is effectively ignored and a different release may be installed. To preserve reproducibility and ensure nightly/commit-hash flows actually use the intended wheel, please pin the version (e.g. `vllm==${VLLM_VERSION}` or another mapping from `VLLM_VERSION` to the published wheel version) so the build fails instead of silently drifting.
</issue_to_address>
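One way to address this, sketched below without executing any real install, is to derive the requirement spec from VLLM_VERSION so that plain releases stay pinned while nightly and commit builds fall back to the commit-specific wheel index. The variable names mirror the PR; the CUDA tag default and the exact branching are assumptions of this sketch:

```shell
#!/bin/sh
# Sketch only: releases get an exact pin; nightly/commit wheels carry
# dev versions (e.g. 0.16.1rc1.dev268+g...) not known ahead of time,
# so they cannot be pinned the same way.
VLLM_VERSION="${VLLM_VERSION:-0.17.0}"
VLLM_CUDA_VERSION="${VLLM_CUDA_VERSION:-cu129}"  # assumed default tag

if printf '%s' "$VLLM_VERSION" | grep -qE '^[0-9]+\.[0-9]+\.[0-9]+$'; then
    spec="vllm==${VLLM_VERSION}"
else
    spec='vllm'
fi

# Echoed rather than executed in this sketch:
echo uv pip install "$spec" \
    --extra-index-url "https://wheels.vllm.ai/${VLLM_VERSION}/${VLLM_CUDA_VERSION}"
```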



@doringeman doringeman mentioned this pull request Mar 9, 2026
@doringeman doringeman merged commit d494aec into main Mar 9, 2026
11 checks passed
@doringeman doringeman deleted the vllm-from-commit branch March 9, 2026 17:13

Development

Successfully merging this pull request may close these issues.

Qwen3.5 models support

3 participants