
Commit ed016af

Merge tag 'locking-core-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
 "These are the locking updates for v5.10:

   - Add deadlock detection for recursive read-locks. The rationale is
     outlined in commit 224ec48 ("lockdep/Documention: Recursive read
     lock detection reasoning")

     The main deadlock pattern we want to detect is:

	TASK A:			TASK B:

	read_lock(X);
				write_lock(X);
	read_lock_2(X);

   - Add "latch sequence counters" (seqcount_latch_t):

     A sequence counter variant where the counter even/odd value is used
     to switch between two copies of protected data. This allows the
     read path, typically NMIs, to safely interrupt the write side
     critical section.

     We utilize this new variant for sched-clock, and to make x86 TSC
     handling safer.

   - Other seqlock cleanups, fixes and enhancements

   - KCSAN updates

   - LKMM updates

   - Misc updates, cleanups and fixes"

* tag 'locking-core-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (67 commits)
  lockdep: Revert "lockdep: Use raw_cpu_*() for per-cpu variables"
  lockdep: Fix lockdep recursion
  lockdep: Fix usage_traceoverflow
  locking/atomics: Check atomic-arch-fallback.h too
  locking/seqlock: Tweak DEFINE_SEQLOCK() kernel doc
  lockdep: Optimize the memory usage of circular queue
  seqlock: Unbreak lockdep
  seqlock: PREEMPT_RT: Do not starve seqlock_t writers
  seqlock: seqcount_LOCKNAME_t: Introduce PREEMPT_RT support
  seqlock: seqcount_t: Implement all read APIs as statement expressions
  seqlock: Use unique prefix for seqcount_t property accessors
  seqlock: seqcount_LOCKNAME_t: Standardize naming convention
  seqlock: seqcount latch APIs: Only allow seqcount_latch_t
  rbtree_latch: Use seqcount_latch_t
  x86/tsc: Use seqcount_latch_t
  timekeeping: Use seqcount_latch_t
  time/sched_clock: Use seqcount_latch_t
  seqlock: Introduce seqcount_latch_t
  mm/swap: Do not abuse the seqcount_t latching API
  time/sched_clock: Use raw_read_seqcount_latch() during suspend
  ...
2 parents edaa5dd + 2116d70 commit ed016af

38 files changed

Lines changed: 3900 additions & 992 deletions

Documentation/locking/lockdep-design.rst

Lines changed: 258 additions & 0 deletions
@@ -392,3 +392,261 @@ Run the command and save the output, then compare against the output from
a later run of this command to identify the leakers. This same output
can also help you find situations where runtime lock initialization has
been omitted.

Recursive read locks:
---------------------
The rest of this document tries to prove that a certain type of cycle is
equivalent to a deadlock possibility.
There are three types of lockers: writers (i.e. exclusive lockers, like
spin_lock() or write_lock()), non-recursive readers (i.e. shared lockers, like
down_read()) and recursive readers (recursive shared lockers, like
rcu_read_lock()). We use the following notation for these lockers in the rest
of the document:

	W or E:	stands for writers (exclusive lockers).
	r:	stands for non-recursive readers.
	R:	stands for recursive readers.
	S:	stands for all readers (non-recursive + recursive), as both are shared lockers.
	N:	stands for writers and non-recursive readers, as both are not recursive.

Obviously, N is "r or W" and S is "r or R".
Recursive readers, as their name indicates, are lockers allowed to be acquired
even inside the critical section of another reader of the same lock instance;
in other words, they allow nested read-side critical sections of one lock
instance.

Non-recursive readers, on the other hand, will cause a self deadlock if they
try to acquire inside the critical section of another reader of the same lock
instance.

The difference between recursive readers and non-recursive readers is that
recursive readers get blocked only by a write lock *holder*, while
non-recursive readers can get blocked by a write lock *waiter*. Consider the
following example:
	TASK A:			TASK B:

	read_lock(X);
				write_lock(X);
	read_lock_2(X);

Task A gets the reader (no matter whether recursive or non-recursive) on X via
read_lock() first. When task B then tries to acquire the writer on X, it will
block and become a waiter for the writer on X. Now if read_lock_2() is a
recursive reader, task A will make progress, because writer waiters don't
block recursive readers, and there is no deadlock. However, if read_lock_2()
is a non-recursive reader, it will get blocked by writer waiter B, causing a
self deadlock.
Block conditions on readers/writers of the same lock instance:
--------------------------------------------------------------
There are simply four block conditions:

1.	Writers block other writers.
2.	Readers block writers.
3.	Writers block both recursive readers and non-recursive readers.
4.	Readers (recursive or not) don't block other recursive readers but
	may block non-recursive readers (because of the potential co-existing
	writer waiters).
Block condition matrix, Y means the row blocks the column, and N means otherwise.

	    | E | r | R |
	+---+---+---+---+
	  E | Y | Y | Y |
	+---+---+---+---+
	  r | Y | Y | N |
	+---+---+---+---+
	  R | Y | Y | N |

	(E: writers, r: non-recursive readers, R: recursive readers)
Recursive read locks, as their name indicates, are read locks that can be
acquired recursively. Unlike non-recursive read locks, recursive read locks
only get blocked by current write lock *holders*, not by write lock
*waiters*, for example:

	TASK A:			TASK B:

	read_lock(X);

				write_lock(X);

	read_lock(X);

is not a deadlock for recursive read locks, as while task B is waiting for
the lock X, the second read_lock() doesn't need to wait because it's a
recursive read lock. However if the read_lock() is a non-recursive read
lock, then the above case is a deadlock, because even though the
write_lock() in TASK B cannot get the lock, it can still block the second
read_lock() in TASK A.
Note that a lock can be a write lock (exclusive lock), a non-recursive read
lock (non-recursive shared lock) or a recursive read lock (recursive shared
lock), depending on the lock operations used to acquire it (more specifically,
the value of the 'read' parameter for lock_acquire()). In other words, a single
lock instance has three types of acquisition depending on the acquisition
functions: exclusive, non-recursive read, and recursive read.

To be concise, we call write locks and non-recursive read locks
"non-recursive" locks and recursive read locks "recursive" locks.

Recursive locks don't block each other, while non-recursive locks do (this is
even true for two non-recursive read locks). A non-recursive lock can block the
corresponding recursive lock, and vice versa.
A deadlock case with recursive locks involved is as follows:

	TASK A:			TASK B:

	read_lock(X);
				read_lock(Y);
	write_lock(Y);
				write_lock(X);

Task A is waiting for task B to read_unlock() Y and task B is waiting for task
A to read_unlock() X.
Dependency types and strong dependency paths:
---------------------------------------------
Lock dependencies record the order of acquisitions of a pair of locks, and
because there are 3 types of lockers, there are, in theory, 9 types of lock
dependencies, but we can show that 4 types of lock dependencies are enough for
deadlock detection.

For each lock dependency:

	L1 -> L2

, which means lockdep has seen L1 held before L2 held in the same context at
runtime. In deadlock detection, we care whether we could get blocked on L2
with L1 held, IOW, whether there is a locker L3 such that L1 blocks L3 and L2
gets blocked by L3. So we only care about 1) what L1 blocks and 2) what blocks
L2. As a result, we can combine recursive readers and non-recursive readers
for L1 (as they block the same types) and we can combine writers and
non-recursive readers for L2 (as they get blocked by the same types).
With the above combination for simplification, there are 4 types of dependency
edges in the lockdep graph:

1) -(ER)->:
	exclusive writer to recursive reader dependency, "X -(ER)-> Y" means
	X -> Y and X is a writer and Y is a recursive reader.

2) -(EN)->:
	exclusive writer to non-recursive locker dependency, "X -(EN)-> Y" means
	X -> Y and X is a writer and Y is either a writer or non-recursive reader.

3) -(SR)->:
	shared reader to recursive reader dependency, "X -(SR)-> Y" means
	X -> Y and X is a reader (recursive or not) and Y is a recursive reader.

4) -(SN)->:
	shared reader to non-recursive locker dependency, "X -(SN)-> Y" means
	X -> Y and X is a reader (recursive or not) and Y is either a writer or
	non-recursive reader.
Note that given two locks, they may have multiple dependencies between them,
for example:

	TASK A:

	read_lock(X);
	write_lock(Y);
	...

	TASK B:

	write_lock(X);
	write_lock(Y);

, we have both X -(SN)-> Y and X -(EN)-> Y in the dependency graph.

We use -(xN)-> to represent edges that are either -(EN)-> or -(SN)->, and
similarly for -(Ex)->, -(xR)-> and -(Sx)->.
A "path" is a series of conjunct dependency edges in the graph. And we define a
"strong" path, which indicates the strong dependency throughout each dependency
in the path, as a path that doesn't have two conjunct edges (dependencies) of
the forms -(xR)-> and -(Sx)->. In other words, a "strong" path is a path from
one lock to another through the lock dependencies, and if X -> Y -> Z is in the
path (where X, Y, Z are locks), and the walk from X to Y is through a -(SR)-> or
-(ER)-> dependency, then the walk from Y to Z must not be through a -(SN)-> or
-(SR)-> dependency.

We will see why the path is called "strong" in the next section.
Recursive Read Deadlock Detection:
----------------------------------

We now prove two things:

Lemma 1:

If there is a closed strong path (i.e. a strong circle), then there is a
combination of locking sequences that causes deadlock. I.e. a strong circle is
sufficient for deadlock detection.

Lemma 2:

If there is no closed strong path (i.e. no strong circle), then there is no
combination of locking sequences that could cause deadlock. I.e. strong
circles are necessary for deadlock detection.

With these two Lemmas, we can easily say a closed strong path is both
sufficient and necessary for deadlocks, therefore a closed strong path is
equivalent to deadlock possibility. Because a closed strong path stands for a
dependency chain that could cause deadlocks, we call it "strong", considering
there are dependency circles that won't cause deadlocks.
Proof for sufficiency (Lemma 1):

Let's say we have a strong circle:

	L1 -> L2 ... -> Ln -> L1

, which means we have dependencies:

	L1 -> L2
	L2 -> L3
	...
	Ln-1 -> Ln
	Ln -> L1

We can now construct a combination of locking sequences that causes deadlock:

Firstly let's make one CPU/task get the L1 in L1 -> L2, and then another get
the L2 in L2 -> L3, and so on. After this, all of the Lx in Lx -> Lx+1 are
held by different CPU/tasks.

And then because we have L1 -> L2, the holder of L1 is going to acquire L2
in L1 -> L2. However, since L2 is already held by another CPU/task, and L1 ->
L2 and L2 -> L3 are not an -(xR)-> and -(Sx)-> pair (the definition of
strong), either the L2 in L1 -> L2 is a non-recursive locker (blocked by
anyone) or the L2 in L2 -> L3 is a writer (blocking anyone); therefore the
holder of L1 cannot get L2 and has to wait for L2's holder to release it.

Moreover, we can draw a similar conclusion for L2's holder: it has to wait for
L3's holder to release, and so on. We can now prove that Lx's holder has to
wait for Lx+1's holder to release, and note that Ln+1 is L1, so we have a
circular waiting scenario in which nobody can make progress; therefore we
have a deadlock.
Proof for necessity (Lemma 2):

Lemma 2 is equivalent to: if there is a deadlock scenario, then there must be
a strong circle in the dependency graph.

According to Wikipedia[1], if there is a deadlock, then there must be a
circular waiting scenario, meaning there are N CPU/tasks, where CPU/task P1 is
waiting for a lock held by P2, P2 is waiting for a lock held by P3, ... and Pn
is waiting for a lock held by P1. Let's name the lock Px is waiting for Lx;
since P1 is waiting for L1 and holding Ln, we will have Ln -> L1 in the
dependency graph. Similarly, we have L1 -> L2, L2 -> L3, ..., Ln-1 -> Ln in
the dependency graph, which means we have a circle:

	Ln -> L1 -> L2 -> ... -> Ln

, and now let's prove the circle is strong:

For a lock Lx, Px contributes the dependency Lx-1 -> Lx and Px+1 contributes
the dependency Lx -> Lx+1. Since Px is waiting for Px+1 to release Lx, it's
impossible that Lx on Px+1 is a reader and Lx on Px is a recursive reader,
because readers (no matter recursive or not) don't block recursive readers.
Therefore Lx-1 -> Lx and Lx -> Lx+1 cannot be an -(xR)-> -(Sx)-> pair, and
this is true for any lock in the circle; therefore, the circle is strong.
References:
-----------
[1]: https://en.wikipedia.org/wiki/Deadlock
[2]: Shibu, K. (2009). Intro To Embedded Systems (1st ed.). Tata McGraw-Hill

Documentation/locking/seqlock.rst

Lines changed: 18 additions & 0 deletions
@@ -139,6 +139,24 @@ with the associated LOCKTYPE lock acquired.

Read path: same as in :ref:`seqcount_t`.

.. _seqcount_latch_t:

Latch sequence counters (``seqcount_latch_t``)
----------------------------------------------

Latch sequence counters are a multiversion concurrency control mechanism
where the embedded seqcount_t counter even/odd value is used to switch
between two copies of protected data. This allows the sequence counter
read path to safely interrupt its own write side critical section.

Use seqcount_latch_t when the write side sections cannot be protected
from interruption by readers. This is typically the case when the read
side can be invoked from NMI handlers.

Check `raw_write_seqcount_latch()` for more information.
.. _seqlock_t:

Sequential locks (``seqlock_t``)
arch/x86/kernel/tsc.c

Lines changed: 5 additions & 5 deletions
@@ -54,7 +54,7 @@ struct clocksource *art_related_clocksource;

 struct cyc2ns {
 	struct cyc2ns_data data[2];	/*  0 + 2*16 = 32 */
-	seqcount_t	   seq;		/* 32 + 4    = 36 */
+	seqcount_latch_t   seq;		/* 32 + 4    = 36 */
 }; /* fits one cacheline */

@@ -73,14 +73,14 @@ __always_inline void cyc2ns_read_begin(struct cyc2ns_data *data)
 	preempt_disable_notrace();

 	do {
-		seq = this_cpu_read(cyc2ns.seq.sequence);
+		seq = this_cpu_read(cyc2ns.seq.seqcount.sequence);
 		idx = seq & 1;

 		data->cyc2ns_offset = this_cpu_read(cyc2ns.data[idx].cyc2ns_offset);
 		data->cyc2ns_mul    = this_cpu_read(cyc2ns.data[idx].cyc2ns_mul);
 		data->cyc2ns_shift  = this_cpu_read(cyc2ns.data[idx].cyc2ns_shift);

-	} while (unlikely(seq != this_cpu_read(cyc2ns.seq.sequence)));
+	} while (unlikely(seq != this_cpu_read(cyc2ns.seq.seqcount.sequence)));
 }

@@ -186,7 +186,7 @@ static void __init cyc2ns_init_boot_cpu(void)
 {
 	struct cyc2ns *c2n = this_cpu_ptr(&cyc2ns);

-	seqcount_init(&c2n->seq);
+	seqcount_latch_init(&c2n->seq);
 	__set_cyc2ns_scale(tsc_khz, smp_processor_id(), rdtsc());
 }

@@ -203,7 +203,7 @@ static void __init cyc2ns_init_secondary_cpus(void)

 	for_each_possible_cpu(cpu) {
 		if (cpu != this_cpu) {
-			seqcount_init(&c2n->seq);
+			seqcount_latch_init(&c2n->seq);
 			c2n = per_cpu_ptr(&cyc2ns, cpu);
 			c2n->data[0] = data[0];
 			c2n->data[1] = data[1];
