CVE-2025-46570

vLLM is an inference and serving engine for large language models (LLMs). Prior to version 0.9.0, when a new prompt is processed and the PagedAttention mechanism finds a matching prefix chunk, the prefill process speeds up, which shows up as a lower TTFT (Time to First Token). The timing difference caused by a matching chunk is large enough to be measured and exploited, for example to infer whether a given prompt prefix is already cached. This issue has been patched in version 0.9.0.
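
The mechanism described above can be illustrated with a toy model. The sketch below is not vLLM code: `ToyPrefixCache`, `BLOCK_SIZE`, and the `salt` parameter are hypothetical names chosen for illustration, and the number of blocks that must be computed stands in for TTFT. It shows why a shared prefix cache leaks information (a correct guess of a cached prefix needs less work than a wrong one) and how separating the hash chains per tenant would remove the signal. Whether the actual patch works this way should be checked against the linked commit.

```python
import hashlib

BLOCK_SIZE = 4  # tokens per cache block (illustrative value, an assumption)


class ToyPrefixCache:
    """Toy model of block-level prefix caching; NOT vLLM's implementation."""

    def __init__(self):
        self._blocks = set()  # hashes of blocks whose KV data is "cached"

    def prefill(self, tokens, salt=""):
        """Return how many full blocks had to be computed (a proxy for TTFT).

        Each block key chains over the previous key, so a block only hits
        the cache if the entire prefix before it matches as well. A non-empty
        salt starts a separate chain, so tenants never share blocks.
        """
        computed = 0
        parent = salt
        full_len = len(tokens) - len(tokens) % BLOCK_SIZE
        for start in range(0, full_len, BLOCK_SIZE):
            chunk = " ".join(tokens[start:start + BLOCK_SIZE])
            key = hashlib.sha256(f"{parent}|{chunk}".encode()).hexdigest()
            if key not in self._blocks:
                self._blocks.add(key)  # cache miss: compute and store
                computed += 1
            parent = key
        return computed
```

In this model, after a victim prompt such as `"my api key is abc123 do not share".split()` has been processed, `prefill("my api key is".split())` returns 0 while `prefill("my api key was".split())` returns 1. That difference is the observable discrepancy. If the two callers pass different `salt` values, both guesses cost one block and the signal disappears.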

CVSS v3 2.6 LOW

Vector : AV:N/AC:H/PR:L/UI:R/S:U/C:L/I:N/A:N

Exploitability : 1.2 / Impact : 1.4

Attack Vector : NETWORK

Attack Complexity : HIGH

Privileges Required : LOW

User Interaction : REQUIRED

Confidentiality Impact : LOW

Integrity Impact : NONE

Availability Impact : NONE

Scope : UNCHANGED
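
The 2.6 base score and the 1.2 / 1.4 sub-scores can be reproduced from the metrics listed above. The following is a minimal sketch that assumes the CVSS v3.1 base-score equations and metric weights for the Scope: Unchanged case (the page itself only says "v3"). The function and variable names are illustrative, not part of any library.

```python
import math

# Metric weights from the CVSS v3.1 specification (Scope: Unchanged only).
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
PR = {"N": 0.85, "L": 0.62, "H": 0.27}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}


def roundup(value):
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_scope_unchanged(av, ac, pr, ui, conf, integ, avail):
    """Return (base score, exploitability, impact) for Scope: Unchanged."""
    iss = 1 - (1 - CIA[conf]) * (1 - CIA[integ]) * (1 - CIA[avail])
    impact = 6.42 * iss
    exploitability = 8.22 * AV[av] * AC[ac] * PR[pr] * UI[ui]
    if impact <= 0:
        return 0.0, exploitability, impact
    return roundup(min(impact + exploitability, 10)), exploitability, impact


# Metrics of CVE-2025-46570 as listed on this page.
score, expl, imp = base_score_scope_unchanged("N", "H", "L", "R", "L", "N", "N")
print(score, round(expl, 1), round(imp, 1))
```

With these inputs the unrounded sub-scores are about 1.18 (exploitability) and 1.41 (impact). Their sum, about 2.59, rounds up to the base score of 2.6.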

References

Link	Resource
https://github.com/vllm-project/vllm/commit/77073c77bc2006eb80ea6d5128f076f5e6c6f54f	Patch
https://github.com/vllm-project/vllm/pull/17045	Issue Tracking, Vendor Advisory
https://github.com/vllm-project/vllm/security/advisories/GHSA-4qjh-9fv9-r85r	Vendor Advisory

Configurations

Configuration 1

cpe:2.3:a:vllm:vllm:*:*:*:*:*:*:*:*

History

24 Jun 2025, 18:25

Type	Values Removed	Values Added
References	~~() https://github.com/vllm-project/vllm/commit/77073c77bc2006eb80ea6d5128f076f5e6c6f54f -~~	() https://github.com/vllm-project/vllm/commit/77073c77bc2006eb80ea6d5128f076f5e6c6f54f - Patch
References	~~() https://github.com/vllm-project/vllm/pull/17045 -~~	() https://github.com/vllm-project/vllm/pull/17045 - Issue Tracking, Vendor Advisory
References	~~() https://github.com/vllm-project/vllm/security/advisories/GHSA-4qjh-9fv9-r85r -~~	() https://github.com/vllm-project/vllm/security/advisories/GHSA-4qjh-9fv9-r85r - Vendor Advisory
CWE		CWE-203
First Time		Vllm vllm Vllm
CPE		cpe:2.3:a:vllm:vllm:*:*:*:*:*:*:*:*

30 May 2025, 16:31

Type	Values Removed	Values Added
Summary		(es, translated) vLLM is an inference and serving engine for large language models (LLMs). Prior to version 0.9.0, when a new prompt is processed, if the PageAttention mechanism finds a matching prefix chunk, the prefill process speeds up, which is reflected in the TTFT (Time to First Token). These timing differences caused by matching chunks are significant enough to be detected and exploited. This issue has been fixed in version 0.9.0.

29 May 2025, 17:15

Type	Values Removed	Values Added
New CVE

Information

Published : 2025-05-29 17:15

Updated : 2025-06-24 18:25

NVD link : CVE-2025-46570

Mitre link : CVE-2025-46570

CVE.ORG link : CVE-2025-46570

Products Affected

vllm

vllm

CWE

CWE-208

Observable Timing Discrepancy

CWE-203

Observable Discrepancy
