Commit 2ee9bf3

RDMA/addr: Fix race with netevent_callback()/rdma_addr_cancel()
This three thread race can result in the work being run once the callback
becomes NULL:

           CPU1                 CPU2                   CPU3
    netevent_callback()
                         process_one_req()       rdma_addr_cancel()
                         [..]
     spin_lock_bh()
       set_timeout()
     spin_unlock_bh()

                                                 spin_lock_bh()
                                                 list_del_init(&req->list);
                                                 spin_unlock_bh()

                         req->callback = NULL
                         spin_lock_bh()
                           if (!list_empty(&req->list))
                                // Skipped!
                                // cancel_delayed_work(&req->work);
                         spin_unlock_bh()

                         process_one_req() // again
                             req->callback() // BOOM
                                                 cancel_delayed_work_sync()

The solution is to always cancel the work once it is completed so any
in between set_timeout() does not result in it running again.

Cc: stable@vger.kernel.org
Fixes: 44e7505 ("RDMA/rdma_cm: Make rdma_addr_cancel into a fence")
Link: https://lore.kernel.org/r/20200930072007.1009692-1-leon@kernel.org
Reported-by: Dan Aloni <dan@kernelim.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
1 parent a6f0b08 commit 2ee9bf3

1 file changed

Lines changed: 5 additions & 6 deletions

File tree

drivers/infiniband/core/addr.c

@@ -647,13 +647,12 @@ static void process_one_req(struct work_struct *_work)
 	req->callback = NULL;

 	spin_lock_bh(&lock);
+	/*
+	 * Although the work will normally have been canceled by the workqueue,
+	 * it can still be requeued as long as it is on the req_list.
+	 */
+	cancel_delayed_work(&req->work);
 	if (!list_empty(&req->list)) {
-		/*
-		 * Although the work will normally have been canceled by the
-		 * workqueue, it can still be requeued as long as it is on the
-		 * req_list.
-		 */
-		cancel_delayed_work(&req->work);
 		list_del_init(&req->list);
 		kfree(req);
 	}
