CVE-2025-21880

In the Linux kernel, the following vulnerability has been resolved: drm/xe/userptr: fix EFAULT handling Currently we treat EFAULT from hmm_range_fault() as a non-fatal error when called from xe_vm_userptr_pin() with the idea that we want to avoid killing the entire vm and chucking an error, under the assumption that the user just did an unmap or something, and has no intention of actually touching that memory from the GPU. At this point we have already zapped the PTEs so any access should generate a page fault, and if the pin fails there also it will then become fatal. However it looks like it's possible for the userptr vma to still be on the rebind list in preempt_rebind_work_func(), if we had to retry the pin again due to something happening in the caller before we did the rebind step, but in the meantime needing to re-validate the userptr and this time hitting the EFAULT. This explains an internal user report of hitting: [ 191.738349] WARNING: CPU: 1 PID: 157 at drivers/gpu/drm/xe/xe_res_cursor.h:158 xe_pt_stage_bind.constprop.0+0x60a/0x6b0 [xe] [ 191.738551] Workqueue: xe-ordered-wq preempt_rebind_work_func [xe] [ 191.738616] RIP: 0010:xe_pt_stage_bind.constprop.0+0x60a/0x6b0 [xe] [ 191.738690] Call Trace: [ 191.738692] <TASK> [ 191.738694] ? show_regs+0x69/0x80 [ 191.738698] ? __warn+0x93/0x1a0 [ 191.738703] ? xe_pt_stage_bind.constprop.0+0x60a/0x6b0 [xe] [ 191.738759] ? report_bug+0x18f/0x1a0 [ 191.738764] ? handle_bug+0x63/0xa0 [ 191.738767] ? exc_invalid_op+0x19/0x70 [ 191.738770] ? asm_exc_invalid_op+0x1b/0x20 [ 191.738777] ? xe_pt_stage_bind.constprop.0+0x60a/0x6b0 [xe] [ 191.738834] ? ret_from_fork_asm+0x1a/0x30 [ 191.738849] bind_op_prepare+0x105/0x7b0 [xe] [ 191.738906] ? dma_resv_reserve_fences+0x301/0x380 [ 191.738912] xe_pt_update_ops_prepare+0x28c/0x4b0 [xe] [ 191.738966] ? kmemleak_alloc+0x4b/0x80 [ 191.738973] ops_execute+0x188/0x9d0 [xe] [ 191.739036] xe_vm_rebind+0x4ce/0x5a0 [xe] [ 191.739098] ? trace_hardirqs_on+0x4d/0x60 [ 191.739112] preempt_rebind_work_func+0x76f/0xd00 [xe] Followed by NPD, when running some workload, since the sg was never actually populated but the vma is still marked for rebind when it should be skipped for this special EFAULT case. This is confirmed to fix the user report. v2 (MattB): - Move earlier. v3 (MattB): - Update the commit message to make it clear that this indeed fixes the issue. (cherry picked from commit 6b93cb98910c826c2e2004942f8b060311e43618)
Configurations

Configuration 1 (hide)

OR cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:6.14:rc1:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:6.14:rc2:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:6.14:rc3:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:6.14:rc4:*:*:*:*:*:*

History

30 Oct 2025, 15:44

Type Values Removed Values Added
Summary
  • (es) En el kernel de Linux, se ha resuelto la siguiente vulnerabilidad: drm/xe/userptr: corrección de la gestión de EFAULT Actualmente tratamos EFAULT de hmm_range_fault() como un error no fatal cuando se llama desde xe_vm_userptr_pin() con la idea de que queremos evitar matar toda la máquina virtual y arrojar un error, bajo el supuesto de que el usuario solo hizo una desasignación o algo así, y no tiene intención de tocar esa memoria de la GPU. En este punto, ya hemos eliminado los PTE, por lo que cualquier acceso debería generar un fallo de página, y si el pin también falla allí, se volverá fatal. Sin embargo, parece que es posible que la vma userptr aún esté en la lista de revinculación en preempt_rebind_work_func(), si tuviéramos que volver a intentar el pin debido a que algo sucede en el llamador antes de realizar el paso de revinculación, pero mientras tanto necesitamos volver a validar el userptr y esta vez golpeando el EFAULT. Esto explica un informe interno de usuario sobre el resultado: [ 191.738349] ADVERTENCIA: CPU: 1 PID: 157 at drivers/gpu/drm/xe/xe_res_cursor.h:158 xe_pt_stage_bind.constprop.0+0x60a/0x6b0 [xe] [ 191.738551] Workqueue: xe-ordered-wq preempt_rebind_work_func [xe] [ 191.738616] RIP: 0010:xe_pt_stage_bind.constprop.0+0x60a/0x6b0 [xe] [ 191.738690] Call Trace: [ 191.738692] [ 191.738694] ? show_regs+0x69/0x80 [ 191.738698] ? __warn+0x93/0x1a0 [ 191.738703] ? xe_pt_stage_bind.constprop.0+0x60a/0x6b0 [xe] [ 191.738759] ? report_bug+0x18f/0x1a0 [ 191.738764] ? handle_bug+0x63/0xa0 [ 191.738767] ? exc_invalid_op+0x19/0x70 [ 191.738770] ? asm_exc_invalid_op+0x1b/0x20 [ 191.738777] ? xe_pt_stage_bind.constprop.0+0x60a/0x6b0 [xe] [ 191.738834] ? ret_from_fork_asm+0x1a/0x30 [ 191.738849] bind_op_prepare+0x105/0x7b0 [xe] [ 191.738906] ? dma_resv_reserve_fences+0x301/0x380 [ 191.738912] xe_pt_update_ops_prepare+0x28c/0x4b0 [xe] [ 191.738966] ? kmemleak_alloc+0x4b/0x80 [ 191.738973] ops_execute+0x188/0x9d0 [xe] [ 191.739036] xe_vm_rebind+0x4ce/0x5a0 [xe] [ 191.739098] ? trace_hardirqs_on+0x4d/0x60 [ 191.739112] preempt_rebind_work_func+0x76f/0xd00 [xe] Seguido de NPD, al ejecutar alguna carga de trabajo, ya que el grupo de secuencias nunca se rellenó, pero el administrador de máquinas virtuales (VMMA) sigue marcado para revincular cuando debería omitirse para este caso especial de EFAULT. Esto se ha confirmado para corregir el informe del usuario. v2 (MattB): - Se ha movido a una versión anterior. v3 (MattB): - Actualizar el mensaje de confirmación para dejar claro que esto realmente soluciona el problema. (seleccionado del commit 6b93cb98910c826c2e2004942f8b060311e43618)
CWE NVD-CWE-noinfo
First Time Linux linux Kernel
Linux
References () https://git.kernel.org/stable/c/51cc278f8ffacd5f9dc7d13191b81b912829db59 - () https://git.kernel.org/stable/c/51cc278f8ffacd5f9dc7d13191b81b912829db59 - Patch
References () https://git.kernel.org/stable/c/a9f4fa3a7efa65615ff7db13023ac84516e99e21 - () https://git.kernel.org/stable/c/a9f4fa3a7efa65615ff7db13023ac84516e99e21 - Patch
References () https://git.kernel.org/stable/c/daad16d0a538fa938e344fd83927bbcfcd8a66ec - () https://git.kernel.org/stable/c/daad16d0a538fa938e344fd83927bbcfcd8a66ec - Patch
CVSS v2 : unknown
v3 : unknown
v2 : unknown
v3 : 5.5
CPE cpe:2.3:o:linux:linux_kernel:6.14:rc2:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:6.14:rc3:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:6.14:rc1:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:6.14:rc4:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*

27 Mar 2025, 15:15

Type Values Removed Values Added
New CVE

Information

Published : 2025-03-27 15:15

Updated : 2025-10-30 15:44


NVD link : CVE-2025-21880

Mitre link : CVE-2025-21880

CVE.ORG link : CVE-2025-21880


JSON object : View

Products Affected

linux

  • linux_kernel