CVE-2025-21732

In the Linux kernel, the following vulnerability has been resolved: RDMA/mlx5: Fix a race for an ODP MR which leads to CQE with error This patch addresses a race condition for an ODP MR that can result in a CQE with an error on the UMR QP. During the __mlx5_ib_dereg_mr() flow, the following sequence of calls occurs: mlx5_revoke_mr() mlx5r_umr_revoke_mr() mlx5r_umr_post_send_wait() At this point, the lkey is freed from the hardware's perspective. However, concurrently, mlx5_ib_invalidate_range() might be triggered by another task attempting to invalidate a range for the same freed lkey. This task will: - Acquire the umem_odp->umem_mutex lock. - Call mlx5r_umr_update_xlt() on the UMR QP. - Since the lkey has already been freed, this can lead to a CQE error, causing the UMR QP to enter an error state [1]. To resolve this race condition, the umem_odp->umem_mutex lock is now also acquired as part of the mlx5_revoke_mr() scope. Upon successful revoke, we set umem_odp->private which points to that MR to NULL, preventing any further invalidation attempts on its lkey. [1] From dmesg: infiniband rocep8s0f0: dump_cqe:277:(pid 0): WC error: 6, Message: memory bind operation error cqe_dump: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 cqe_dump: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 cqe_dump: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 cqe_dump: 00000030: 00 00 00 00 08 00 78 06 25 00 11 b9 00 0e dd d2 WARNING: CPU: 15 PID: 1506 at drivers/infiniband/hw/mlx5/umr.c:394 mlx5r_umr_post_send_wait+0x15a/0x2b0 [mlx5_ib] Modules linked in: ip6table_mangle ip6table_natip6table_filter ip6_tables iptable_mangle xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_registry overlay rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi rdma_cm iw_cm ib_umad ib_ipoib ib_cm mlx5_ib ib_uverbs ib_core fuse mlx5_core CPU: 15 UID: 0 PID: 1506 Comm: ibv_rc_pingpong Not tainted 6.12.0-rc7+ #1626 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:mlx5r_umr_post_send_wait+0x15a/0x2b0 [mlx5_ib] [..] Call Trace: <TASK> mlx5r_umr_update_xlt+0x23c/0x3e0 [mlx5_ib] mlx5_ib_invalidate_range+0x2e1/0x330 [mlx5_ib] __mmu_notifier_invalidate_range_start+0x1e1/0x240 zap_page_range_single+0xf1/0x1a0 madvise_vma_behavior+0x677/0x6e0 do_madvise+0x1a2/0x4b0 __x64_sys_madvise+0x25/0x30 do_syscall_64+0x6b/0x140 entry_SYSCALL_64_after_hwframe+0x76/0x7e
Configurations

Configuration 1 (hide)

OR cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*

History

28 Oct 2025, 20:41

Type Values Removed Values Added
CVSS v2 : unknown
v3 : unknown
v2 : unknown
v3 : 4.7
CPE cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*
Summary
  • (es) En el kernel de Linux, se ha resuelto la siguiente vulnerabilidad: RDMA/mlx5: corrige una ejecución para un MR de ODP que conduce a un CQE con error Este parche soluciona una condición de ejecución para un MR de ODP que puede resultar en un CQE con un error en el QP de UMR. Durante el flujo __mlx5_ib_dereg_mr(), ocurre la siguiente secuencia de llamadas: mlx5_revoke_mr() mlx5r_umr_revoke_mr() mlx5r_umr_post_send_wait() En este punto, la lkey se libera desde la perspectiva del hardware. Sin embargo, al mismo tiempo, mlx5_ib_invalidate_range() podría ser activado por otra tarea que intente invalidar un rango para la misma lkey liberada. Esta tarea: - Adquirirá el bloqueo umem_odp-&gt;umem_mutex. - Llamará a mlx5r_umr_update_xlt() en el QP de UMR. - Dado que la lkey ya se ha liberado, esto puede provocar un error de CQE, lo que hace que el QP de UMR entre en un estado de error [1]. Para resolver esta condición de ejecución, el bloqueo umem_odp-&gt;umem_mutex ahora también se adquiere como parte del ámbito mlx5_revoke_mr(). Tras una revocación exitosa, configuramos umem_odp-&gt;private que apunta a ese MR en NULL, lo que evita cualquier otro intento de invalidación en su lkey. [1] De dmesg: infiniband rocep8s0f0: dump_cqe:277:(pid 0): Error de WC: 6, Mensaje: error de operación de enlace de memoria cqe_dump: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 cqe_dump: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 cqe_dump: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 cqe_dump: 00000030: 00 00 00 00 08 00 78 06 25 00 11 b9 00 0e dd d2 ADVERTENCIA: CPU: 15 PID: 1506 en drivers/infiniband/hw/mlx5/umr.c:394 mlx5r_umr_post_send_wait+0x15a/0x2b0 [mlx5_ib] Módulos vinculados en: ip6table_mangle ip6table_natip6table_filter ip6_tables iptable_mangle xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcgss superposición de oid_registry rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi rdma_cm iw_cm ib_umad ib_ipoib ib_cm mlx5_ib ib_uverbs ib_core fuse mlx5_core CPU: 15 UID: 0 PID: 1506 Comm: ibv_rc_pingpong No contaminado 6.12.0-rc7+ #1626 Nombre del hardware: PC estándar QEMU (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 01/04/2014 RIP: 0010:mlx5r_umr_post_send_wait+0x15a/0x2b0 [mlx5_ib] [..] Llamada Rastreo: mlx5r_umr_update_xlt+0x23c/0x3e0 [mlx5_ib] mlx5_ib_invalidate_range+0x2e1/0x330 [mlx5_ib] __mmu_notifier_invalidate_range_start+0x1e1/0x240 zap_page_range_single+0xf1/0x1a0 madvise_vma_behavior+0x677/0x6e0 do_madvise+0x1a2/0x4b0 __x64_sys_madvise+0x25/0x30 do_syscall_64+0x6b/0x140 entry_SYSCALL_64_after_hwframe+0x76/0x7e
First Time Linux linux Kernel
Linux
CWE CWE-362
References () https://git.kernel.org/stable/c/5297f5ddffef47b94172ab0d3d62270002a3dcc1 - () https://git.kernel.org/stable/c/5297f5ddffef47b94172ab0d3d62270002a3dcc1 - Patch
References () https://git.kernel.org/stable/c/abb604a1a9c87255c7a6f3b784410a9707baf467 - () https://git.kernel.org/stable/c/abb604a1a9c87255c7a6f3b784410a9707baf467 - Patch
References () https://git.kernel.org/stable/c/b13d32786acabf70a7b04ed24b7468fc3c82977c - () https://git.kernel.org/stable/c/b13d32786acabf70a7b04ed24b7468fc3c82977c - Patch

27 Feb 2025, 03:15

Type Values Removed Values Added
New CVE

Information

Published : 2025-02-27 03:15

Updated : 2025-10-28 20:41


NVD link : CVE-2025-21732

Mitre link : CVE-2025-21732

CVE.ORG link : CVE-2025-21732


JSON object : View

Products Affected

linux

  • linux_kernel
CWE
CWE-362

Concurrent Execution using Shared Resource with Improper Synchronization ('Race Condition')