Lack of data validation in vLLM
Description
vLLM Vulnerable to Remote DoS via Special-Token Placeholders
Summary
This report explains a Token Injection vulnerability in vLLM’s multimodal processing. Unauthenticated, text-only prompts that spell out special tokens are interpreted as control tokens. When image or video placeholder sequences are supplied without matching data, vLLM indexes into empty grids during input-position computation, raising an unhandled IndexError that terminates the worker or degrades availability. Multimodal paths that rely on image_grid_thw/video_grid_thw are affected. Severity: High (remote DoS). Reproduced on vLLM 0.10.0 with Qwen2.5-VL.
Details
Affected component: multimodal input position computation.
File/functions (paths are indicative):
vllm/model_executor/layers/rotary_embedding.py
get_input_positions_tensor(...)
_vl_get_input_positions_tensor(...)
Failure mechanism:
The code counts detected vision tokens and then indexes video_grid_thw/image_grid_thw accordingly.
When user input carries placeholder tokens but no actual multimodal payload, these grids are empty. The code does not bounds-check before indexing.
Representative snippet (context):
# vllm/model_executor/layers/rotary_embedding.py
@classmethod
def _vl_get_input_positions_tensor(
    cls,
    input_tokens,
    hf_config,
    image_grid_thw,
    video_grid_thw,...
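The failure mechanism described above can be modeled with a small stand-in. This is a simplified sketch, not vLLM's code: the token IDs, `count_vision_items`, `naive_positions`, and `demo` are hypothetical names chosen for illustration, and the real function does considerably more work. It only shows how counting placeholders and then indexing an empty grid list produces the reported exception.

```python
# Simplified model of the failure, NOT vLLM's actual implementation.
# Token IDs are hypothetical values chosen for illustration only.
IMAGE_TOKEN_ID = 1001
VIDEO_TOKEN_ID = 1002


def count_vision_items(input_tokens):
    """Count image and video placeholder tokens seen in the prompt."""
    images = sum(1 for t in input_tokens if t == IMAGE_TOKEN_ID)
    videos = sum(1 for t in input_tokens if t == VIDEO_TOKEN_ID)
    return images, videos


def naive_positions(input_tokens, image_grid_thw, video_grid_thw):
    """Index the grids once per detected placeholder, with no bounds check."""
    images, videos = count_vision_items(input_tokens)
    grids = []
    for i in range(images):
        # Raises IndexError when placeholders exist but no grid was supplied.
        grids.append(image_grid_thw[i])
    for i in range(videos):
        grids.append(video_grid_thw[i])
    return grids


def demo():
    """Text-only prompt whose tokens include an image placeholder."""
    tokens = [5, 6, IMAGE_TOKEN_ID, 7]
    try:
        naive_positions(tokens, image_grid_thw=[], video_grid_thw=[])
    except IndexError as exc:
        return f"IndexError: {exc}"
    return "ok"
```

Calling `demo()` returns the same error text that appears in the server logs, because the prompt carries a placeholder while both grids are empty.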
Abbreviated call path:
OpenAI API request
→ vllm.v1.engine.core: step/execute_model
→ vllm.v1.worker.gpu_model_runner: _update_states/execute_model
→ vllm.model_executor.layers.rotary_embedding: get_input_positions_tensor
→ _vl_get_input_positions_tensor
→ IndexError: list index out of range
PoC
Environment
vLLM: 0.10.0
Model: Qwen/Qwen2.5-VL-3B-Instruct
Launch server:
python -m vllm.entrypoints.openai.api_server \
  --model Qwen/Qwen2.5-VL-3B-Instruct \
  --port 8000
Request (text-only, no image/video data)
cat > request.json <<'JSON'
{
  "model": "Qwen/Qwen2.5-VL-3B-Instruct",
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "text",...
Observed result
HTTP 500; logs show IndexError: list index out of range from _vl_get_input_positions_tensor(...).
In some deployments, the worker exits and capacity remains reduced until manual restart.
Impact
Type: Token Injection leading to Remote Denial of Service (unauthenticated). A single request can trigger the fault.
Scope: Any vLLM deployment that serves VLMs and accepts raw user text via OpenAI-compatible endpoints (self-hosted or proxied/managed fronts).
Effect: Request → unhandled exception in position computation → worker termination / service unavailability.
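Because the scope covers deployments that pass raw user text to the model, one possible stopgap is to screen text at a gateway before it reaches vLLM. The sketch below is an assumption-laden illustration, not part of vLLM: `SPECIAL_TOKEN_STRINGS`, `find_special_tokens`, and `check_messages` are hypothetical names, and the token strings are assumed from the Qwen2-VL family tokenizer and should be checked against the tokenizer actually deployed.

```python
# Hypothetical gateway pre-filter; not part of vLLM.
# Token strings are ASSUMED from the Qwen2-VL family tokenizer.
SPECIAL_TOKEN_STRINGS = (
    "<|vision_start|>",
    "<|vision_end|>",
    "<|image_pad|>",
    "<|video_pad|>",
)


def find_special_tokens(text):
    """Return the special-token strings that appear in the given text."""
    return [tok for tok in SPECIAL_TOKEN_STRINGS if tok in text]


def check_messages(messages):
    """Return (ok, offending_tokens) for an OpenAI-style messages list."""
    offending = []
    for message in messages:
        content = message.get("content")
        if isinstance(content, str):
            offending.extend(find_special_tokens(content))
        elif isinstance(content, list):
            for part in content:
                # Only text parts are screened; image/video parts carry real data.
                if isinstance(part, dict) and part.get("type") == "text":
                    offending.extend(find_special_tokens(part.get("text") or ""))
    return (not offending, offending)
```

A gateway could reject the request with a client error when `ok` is false. This only narrows exposure for known token strings and is no substitute for updating to a patched release.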
Fixes
Changes associated with https://github.com/vllm-project/vllm/issues/32656
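The linked changes are not reproduced here. As a rough illustration of the kind of bounds check the failure mechanism calls for, the sketch below validates placeholder counts against the supplied grids before any indexing takes place. `validate_vision_inputs` and its parameters are hypothetical and do not reflect the actual patch.

```python
def validate_vision_inputs(num_image_items, num_video_items,
                           image_grid_thw, video_grid_thw):
    """Raise ValueError if placeholders outnumber the supplied grid entries.

    Hypothetical helper for illustration; not the actual vLLM fix.
    """
    # Treat a missing grid as empty so the length comparison is well defined.
    image_grid_thw = image_grid_thw if image_grid_thw is not None else []
    video_grid_thw = video_grid_thw if video_grid_thw is not None else []

    if num_image_items > len(image_grid_thw):
        raise ValueError(
            f"prompt contains {num_image_items} image placeholder(s) "
            f"but only {len(image_grid_thw)} image grid entr(ies) were supplied"
        )
    if num_video_items > len(video_grid_thw):
        raise ValueError(
            f"prompt contains {num_video_items} video placeholder(s) "
            f"but only {len(video_grid_thw)} video grid entr(ies) were supplied"
        )
```

Raising a `ValueError` at validation time lets a serving layer turn the problem into a per-request client error instead of an unhandled `IndexError` deep in position computation.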
Credits
Pengyu Ding (Infra Security, Ant Group)
Ziteng Xu (Infra Security, Ant Group)
Mitigation
Update Impact
Minimal update. May introduce new vulnerabilities or breaking changes.
| Ecosystem | Package | Affected version | Patched versions |
|---|---|---|---|
| pypi | vllm | 0.20.0 | |