nbd: fix circular lock dependency in nbd_reconnect_socket #1020

Open
blktests-ci[bot] wants to merge 1 commit into linus-master_base from series/1117874=>linus-master

blktests-ci Bot commented Jun 29, 2026

Pull request for series with
subject: nbd: fix circular lock dependency in nbd_reconnect_socket
version: 1
url: https://patchwork.kernel.org/project/linux-block/list/?series=1117874

blktests-ci Bot commented Jun 29, 2026

Upstream branch: 4edcdef
series: https://patchwork.kernel.org/project/linux-block/list/?series=1117874
version: 1

blktests-ci Bot commented Jun 29, 2026

Upstream branch: dc59e4f
series: https://patchwork.kernel.org/project/linux-block/list/?series=1117874
version: 1

Move sk_set_memalloc() out of the tx_lock critical section in
nbd_reconnect_socket() to break a circular lock dependency.

sk_set_memalloc() internally calls static_branch_inc(), which acquires
cpu_hotplug_lock. When called under tx_lock, this creates the dependency:

  tx_lock -> cpu_hotplug_lock

The lockdep splat shows the following circular chain:

  cmd->lock -> tx_lock
    from nbd_queue_rq() in the block I/O dispatch path.

  tx_lock -> cpu_hotplug_lock
    from nbd_reconnect_socket() calling sk_set_memalloc() under tx_lock.

  cpu_hotplug_lock -> fs_reclaim -> q_usage_counter(io)
    from create_worker() during CPU hotplug needing memory allocation,
    which depends on block I/O completion to reclaim memory.

  q_usage_counter(io) -> elevator_lock
    from nbd_start_device() -> blk_mq_update_nr_hw_queues() ->
    elevator_change(), which freezes the queue and then acquires
    elevator_lock.

  elevator_lock -> set->srcu -> cmd->lock
    from elevator_switch() -> blk_mq_quiesce_queue() waiting for srcu,
    which waits for the I/O dispatch path holding cmd->lock.
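
The chain above can be checked mechanically by treating each "A -> B" entry as a directed edge and looking for a cycle. The following is a minimal userspace sketch, not lockdep's actual implementation: the enum, `struct dep_graph`, and every function name are hypothetical and exist only for illustration. It models the edges listed in the splat and shows that dropping the tx_lock -> cpu_hotplug_lock edge removes the cycle.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Lock classes named in the splat. The enum itself is illustrative. */
enum lock_class {
	CMD_LOCK, TX_LOCK, CPU_HOTPLUG_LOCK, FS_RECLAIM,
	Q_USAGE_COUNTER, ELEVATOR_LOCK, SET_SRCU, NR_CLASSES
};

/* edge[a][b] is true when lock b has been acquired while holding a. */
struct dep_graph {
	bool edge[NR_CLASSES][NR_CLASSES];
};

static void add_dep(struct dep_graph *g, enum lock_class from,
		    enum lock_class to)
{
	assert(from < NR_CLASSES && to < NR_CLASSES);
	g->edge[from][to] = true;
}

/* Depth-first search. state: 0 = unvisited, 1 = on stack, 2 = done. */
static bool visit(const struct dep_graph *g, int node, int *state)
{
	state[node] = 1;
	for (int next = 0; next < NR_CLASSES; next++) {
		if (!g->edge[node][next])
			continue;
		if (state[next] == 1)
			return true;	/* back edge: cycle found */
		if (state[next] == 0 && visit(g, next, state))
			return true;
	}
	state[node] = 2;
	return false;
}

static bool has_cycle(const struct dep_graph *g)
{
	int state[NR_CLASSES] = { 0 };

	for (int n = 0; n < NR_CLASSES; n++)
		if (state[n] == 0 && visit(g, n, state))
			return true;
	return false;
}

/* Build the chain from the splat, with or without the edge the fix removes. */
static void build_chain(struct dep_graph *g, bool memalloc_under_tx_lock)
{
	memset(g, 0, sizeof(*g));
	add_dep(g, CMD_LOCK, TX_LOCK);
	if (memalloc_under_tx_lock)
		add_dep(g, TX_LOCK, CPU_HOTPLUG_LOCK);
	add_dep(g, CPU_HOTPLUG_LOCK, FS_RECLAIM);
	add_dep(g, FS_RECLAIM, Q_USAGE_COUNTER);
	add_dep(g, Q_USAGE_COUNTER, ELEVATOR_LOCK);
	add_dep(g, ELEVATOR_LOCK, SET_SRCU);
	add_dep(g, SET_SRCU, CMD_LOCK);
}
```

Calling `build_chain(&g, true)` followed by `has_cycle(&g)` reports a cycle, while `build_chain(&g, false)` does not, which is the property the patch relies on.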

Fix this by moving sk_set_memalloc() and sk_sndtimeo setup before the
tx_lock acquisition. This is safe because the new socket has not yet
been assigned to nsock->sock and is invisible to other code paths. In
the failure path (no dead connection found), sk_clear_memalloc() is
called to undo the setup before releasing the socket.
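
The reordering described above can be illustrated with a userspace analogue. This is a sketch under stated assumptions, not the kernel code: `struct fake_sock`, `struct fake_nsock`, `fake_set_memalloc()`, `fake_clear_memalloc()`, and `reconnect()` are hypothetical stand-ins, a pthread mutex plays the role of tx_lock, and the -1 return value is a placeholder for whatever error the real function returns.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the socket and the per-connection state. */
struct fake_sock {
	bool memalloc;
	long sndtimeo;
};

struct fake_nsock {
	pthread_mutex_t tx_lock;
	struct fake_sock *sock;
	bool dead;
};

/* In the kernel, the real call may take cpu_hotplug_lock internally. */
static void fake_set_memalloc(struct fake_sock *sk)
{
	sk->memalloc = true;
}

static void fake_clear_memalloc(struct fake_sock *sk)
{
	sk->memalloc = false;
}

/* Returns 0 if a dead connection was replaced, -1 otherwise. */
static int reconnect(struct fake_nsock *conns, size_t n,
		     struct fake_sock *new_sk, long timeout)
{
	assert(new_sk != NULL);

	/*
	 * Configure the socket before taking any tx_lock. It is not yet
	 * published through nsock->sock, so no other path can see it.
	 */
	fake_set_memalloc(new_sk);
	new_sk->sndtimeo = timeout;

	for (size_t i = 0; i < n; i++) {
		struct fake_nsock *ns = &conns[i];

		pthread_mutex_lock(&ns->tx_lock);
		if (!ns->dead) {
			pthread_mutex_unlock(&ns->tx_lock);
			continue;
		}
		/* Only the pointer swap happens under the lock. */
		ns->sock = new_sk;
		ns->dead = false;
		pthread_mutex_unlock(&ns->tx_lock);
		return 0;
	}

	/* No dead connection found: undo the setup before giving up. */
	fake_clear_memalloc(new_sk);
	return -1;
}
```

The point of the design is that the critical section shrinks to the pointer assignment, so nothing that might take another lock runs while tx_lock is held.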

Fixes: b7aa3d3 ("nbd: add a reconfigure netlink command")
Reported-by: syzbot+3dbc6142c85cc77eaf04@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=3dbc6142c85cc77eaf04
Signed-off-by: Yun Zhou <yun.zhou@windriver.com>

blktests-ci Bot force-pushed the series/1117874=>linus-master branch from f08ad72 to 94d29d1 on June 29, 2026 17:28