Commit 8823eae

leitao authored and htejun committed
workqueue: Show all busy workers in stall diagnostics
show_cpu_pool_hog() only prints workers whose task is currently running
on the CPU (task_is_running()). This misses workers that are busy
processing a work item but are sleeping or blocked: for example, a
worker that clears PF_WQ_WORKER and enters wait_event_idle(). Such a
worker still occupies a pool slot and prevents progress, yet produces
an empty backtrace section in the watchdog output.

This is happening on real arm64 systems, where toggle_allocation_gate()
IPIs every single CPU in the machine (which lacks NMI), causing
workqueue stalls that show empty backtraces because
toggle_allocation_gate() is sleeping in wait_event_idle().

Remove the task_is_running() filter so every in-flight worker in the
pool's busy_hash is dumped. The busy_hash is protected by pool->lock,
which is already held.

Signed-off-by: Breno Leitao <leitao@debian.org>
Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
1 parent e8e14ac commit 8823eae

1 file changed: kernel/workqueue.c (13 additions, 15 deletions)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -7583,9 +7583,9 @@ MODULE_PARM_DESC(panic_on_stall_time, "Panic if stall exceeds this many seconds
 
 /*
  * Show workers that might prevent the processing of pending work items.
- * The only candidates are CPU-bound workers in the running state.
- * Pending work items should be handled by another idle worker
- * in all other situations.
+ * A busy worker that is not running on the CPU (e.g. sleeping in
+ * wait_event_idle() with PF_WQ_WORKER cleared) can stall the pool just as
+ * effectively as a CPU-bound one, so dump every in-flight worker.
  */
 static void show_cpu_pool_hog(struct worker_pool *pool)
 {
@@ -7596,19 +7596,17 @@ static void show_cpu_pool_hog(struct worker_pool *pool)
 	raw_spin_lock_irqsave(&pool->lock, irq_flags);
 
 	hash_for_each(pool->busy_hash, bkt, worker, hentry) {
-		if (task_is_running(worker->task)) {
-			/*
-			 * Defer printing to avoid deadlocks in console
-			 * drivers that queue work while holding locks
-			 * also taken in their write paths.
-			 */
-			printk_deferred_enter();
+		/*
+		 * Defer printing to avoid deadlocks in console
+		 * drivers that queue work while holding locks
+		 * also taken in their write paths.
+		 */
+		printk_deferred_enter();
 
-			pr_info("pool %d:\n", pool->id);
-			sched_show_task(worker->task);
+		pr_info("pool %d:\n", pool->id);
+		sched_show_task(worker->task);
 
-			printk_deferred_exit();
-		}
+		printk_deferred_exit();
 	}
 
 	raw_spin_unlock_irqrestore(&pool->lock, irq_flags);
@@ -7619,7 +7617,7 @@ static void show_cpu_pools_hogs(void)
 	struct worker_pool *pool;
 	int pi;
 
-	pr_info("Showing backtraces of running workers in stalled CPU-bound worker pools:\n");
+	pr_info("Showing backtraces of busy workers in stalled CPU-bound worker pools:\n");
 
 	rcu_read_lock();
 