CVE-2025-49847

llama.cpp is an inference of several LLM models in C/C++. Prior to version b5662, an attacker‐supplied GGUF model vocabulary can trigger a buffer overflow in llama.cpp’s vocabulary‐loading code. Specifically, the helper _try_copy in llama.cpp/src/vocab.cpp: llama_vocab::impl::token_to_piece() casts a very large size_t token length into an int32_t, causing the length check (if (length < (int32_t)size)) to be bypassed. As a result, memcpy is still called with that oversized size, letting a malicious model overwrite memory beyond the intended buffer. This can lead to arbitrary memory corruption and potential code execution. This issue has been patched in version b5662.

CVSS v3 8.8 HIGH

8.8^/10

CVSS v3 : HIGH

V3 Legend

Vector :

Exploitability : 2.8 / Impact : 5.9

Attack Vector NETWORK

Attack Complexity LOW

Privileges Required NONE

User Interaction REQUIRED

Confidentiality Impact HIGH

Integrity Impact HIGH

Availability Impact HIGH

Scope UNCHANGED

References

Link	Resource
https://github.com/ggml-org/llama.cpp/commit/3cfbbdb44e08fd19429fed6cc85b982a91f0efd5	Patch
https://github.com/ggml-org/llama.cpp/security/advisories/GHSA-8wwf-w4qm-gpqr	Mitigation Vendor Advisory

Configurations

Configuration 1 (hide)

cpe:2.3:a:ggml:llama.cpp:*:*:*:*:*:*:*:*

History

27 Aug 2025, 13:48

Type	Values Removed	Values Added
Summary		(es) llama.cpp es una inferencia de varios modelos LLM en C/C++. Antes de la versión b5662, un vocabulario de modelo GGUF proporcionado por un atacante podía provocar un desbordamiento de búfer en el código de carga de vocabulario de llama.cpp. Específicamente, el asistente _try_copy en llama.cpp/src/vocab.cpp: llama_vocab::impl::token_to_piece() convierte una longitud de token size_t muy grande en un int32_t, lo que provoca que se omita la comprobación de longitud (si (length < (int32_t)size)). Como resultado, se sigue llamando a memcpy con ese tamaño excesivo, lo que permite que un modelo malicioso sobrescriba la memoria más allá del búfer previsto. Esto puede provocar corrupción de memoria arbitraria y la posible ejecución de código. Este problema se ha corregido en la versión b5662.
First Time		Ggml Ggml llama.cpp
CPE		cpe:2.3:a:ggml:llama.cpp::::::::
References	~~() https://github.com/ggml-org/llama.cpp/commit/3cfbbdb44e08fd19429fed6cc85b982a91f0efd5 -~~	() https://github.com/ggml-org/llama.cpp/commit/3cfbbdb44e08fd19429fed6cc85b982a91f0efd5 - Patch
References	~~() https://github.com/ggml-org/llama.cpp/security/advisories/GHSA-8wwf-w4qm-gpqr -~~	() https://github.com/ggml-org/llama.cpp/security/advisories/GHSA-8wwf-w4qm-gpqr - Mitigation, Vendor Advisory

17 Jun 2025, 20:15

Type	Values Removed	Values Added
New CVE

Information

Published : 2025-06-17 20:15

Updated : 2025-08-27 13:48

NVD link : CVE-2025-49847

Mitre link : CVE-2025-49847

CVE.ORG link : CVE-2025-49847

JSON object : View

Products Affected

ggml

llama.cpp

CWE

CWE-119

Improper Restriction of Operations within the Bounds of a Memory Buffer

CWE-195

Signed to Unsigned Conversion Error

8.8 /10