Proving That Non Blocking Algorithms Don/'t Block
Proving That Non-Blocking Algorithms Don’t Block
Alexey Gotsman
Byron Cook
Matthew Parkinson
Viktor Vafeiadis
University of Cambridge
Microsoft Research
University of Cambridge
Microsoft Research
Abstract
Lock-freedom [23]: From any point in a program’s execution,
A concurrent data-structure implementation is considered non-
some thread is guaranteed to complete its operation. Lock-
blocking if it meets one of three following liveness criteria: wait-
freedom ensures the absence of livelock, but not starvation.
freedom, lock-freedom, or obstruction-freedom. Developers of non-
Obstruction-freedom [16]: Every thread is guaranteed to com-
blocking algorithms aim to meet these criteria. However, to date
plete its operation provided it eventually executes in isolation.
their proofs for non-trivial algorithms have been only manual
In other words, if at some point in a program’s execution we
pencil-and-paper semi-formal proofs. This paper proposes the first
suspend all threads except one, then this thread’s operation will
fully automatic tool that allows developers to ensure that their algo-
terminate.
rithms are indeed non-blocking. Our tool uses rely-guarantee rea-
soning while overcoming the technical challenge of sound reason-
The design of a non-blocking algorithm largely depends on which
ing in the presence of interdependent liveness properties.
of the above three criteria it satisfies. Thus, algorithm developers
aim to meet one of these criteria and correspondingly classify the
Categories and Subject Descriptors
D.2.4 [Software Engineer-
algorithms as wait-free, lock-free, or obstruction-free (e.g., [14, 16,
ing]: Software/Program Verification; F.3.1 [Logics and Meanings
25]). To date, proofs of the liveness properties for non-trivial cases
of Programs]: Specifying and Verifying and Reasoning about Pro-
have been only manual pencil-and-paper semi-formal proofs. This
grams
paper proposes the first fully automatic tool that allows developers
General Terms
Languages, Theory, Verification
to ensure that their algorithms are indeed non-blocking.
Reasoning about concurrent programs is difficult because of the
Keywords
Formal Verification, Concurrent Programming, Live-
need to consider all possible interactions between concurrently exe-
ness, Termination
cuting threads. This is especially true for non-blocking algorithms,
in which threads interact in subtle ways through dynamically-
allocated data structures. To combat this difficulty, we based our
1.
Introduction
tool on rely-guarantee reasoning [18, 29], which considers every
Non-blocking synchronisation is a style of multithreaded program-
thread in isolation under some assumptions on its environment and
ming that avoids the blocking inherent to lock-based mutual ex-
thus avoids reasoning about thread interactions directly. Much of
clusion. Instead, alternative synchronisation techniques are used,
rely-guarantee’s power comes from cyclic proof rules for safety;
which aim to provide certain progress guarantees even if some
straightforward generalisations of such proof rules to liveness prop-
threads are delayed for arbitrarily long. These techniques are pri-
erties are unsound [1]. Unfortunately, in our application, we have to
marily employed by concurrent implementations of data structures,
deal with interdependencies among liveness properties of threads in
such as stacks, queues, linked lists, and hash tables (see, for ex-
the program: validity of liveness properties of a thread can depend
ample, the java.util.concurrent library). Non-blocking data
on liveness properties of another thread and vice versa. We resolve
structures are generally much more complex than their lock-based
this apparent circularity by showing that (at least for all of the al-
counterparts, but can provide better performance in the presence of
gorithms that we have examined) proofs can found that layer non-
high contention between threads [38].
circular liveness reasoning on top of weak circular reasoning about
An algorithm implementing operations on a concurrent data
safety. We propose a method for performing such proofs by repeat-
structure is considered non-blocking if it meets one of three com-
edly strengthening threads’ guarantees using non-circular reason-
monly accepted liveness criteria that ensure termination of the op-
ing until they imply the required liveness property (Section 2). We
erations under various conditions:
develop a logic that allows us to easily express these layered proofs
for heap-manipulating programs (Sections 3 and 4) and prove it
Wait-freedom [15]: Every running thread is guaranteed to com-
sound with respect to an interleaving semantics (Section 6).
plete its operation, regardless of the execution speeds of the
In addition, we have found that the rely and guarantee condi-
other threads. Wait-freedom ensures the absence of livelock and
tions needed for proving algorithms non-blocking can be of a re-
starvation.
stricted form: they need only require that certain events do not hap-
pen infinitely often. This allows us to automate proving the liveness
properties by a procedure that systematically searches for proofs in
our logic with relies and guarantees of this form (Section 5).
Permission to make digital or hard copies of all or part of this work for personal or
Using our tool, we have automatically proved a number of the
classroom use is granted without fee provided that copies are not made or distributed
published algorithms to be formally non-blocking, including chal-
for profit or commercial advantage and that copies bear this notice and the full citation
on the first page. To copy otherwise, to republish, to post on servers or to redistribute
lenging examples such as the HSY stack [14] and Michael’s linked
to lists, requires prior specific permission and/or a fee.
list algorithm [25]. Proofs for some of the verified algorithms re-
POPL’09,
January 18–24, 2009, Savannah, Georgia, USA.
quire complex termination arguments and supporting safety prop-
Copyright c 2009 ACM 978-1-60558-379-2/09/01. . . $5.00
erties that are best constructed by automatic tools.
2.
Informal development
struct Node {
void init() {
We start by informally describing our method for verifying liveness
value_t data;
S = NULL;
properties and surveying the main results of the paper.
Node *next;
}
};
Example.
Figure 1 contains a simple non-blocking implemen-
Node *S;
value_t pop() {
tation of a concurrent stack due to Treiber [33], written in a C-
Node *t, *x;
like language. A client using the implementation can call several
void push(value_t v) {
do {
push or pop operations concurrently. To ensure the correctness of
Node *t, *x;
t = S;
the algorithm, we assume that it is executed in the presence of a
x = new Node();
if (t == NULL) {
garbage collector (see [17, Section 10.6] for justification). We also
x->data = v;
return EMPTY;
assume that single word reads and writes are executed atomically.
do {
}
The stack is stored as a linked list, and is updated by compare-and-
t = S;
x = t->next;
swap (CAS) instructions. CAS takes three arguments: a memory
x->next = t;
} while(!CAS(&S,t,x));
address, an expected value and a new value. It atomically reads the
} while(!CAS(&S,t,x));
return t->data;
memory address and updates it with the new value when the ad-
}
}
dress contains the expected value; otherwise, it does nothing. In C
syntax this might be written as follows:
Figure 1. Treiber’s non-blocking stack
int CAS(WORD *addr, WORD v1, WORD v2) {
atomic {
if (*addr == v1) { *addr = v2; return 1; }
We denote non-deterministic choice with nondet(). The definition
else { return 0; }
of lock-freedom of the data structure requires that for all m in any
}
(infinite) execution of the data structure’s most general client C(m)
}
defined below, some operation returns infinitely often:
m
In most architectures an efficient CAS (or an equivalent operation)
C(m) =
while (true) { op }
is provided natively by the processor.
i=1
The operations on the stack are implemented as follows. The
We now show that this is the case if and only if for all k the
function init initialises the data structure. The push operation (i)
following program C (k) terminates:
allocates a new node x; (ii) reads the current value of the top-of-the-
stack pointer S; (iii) makes the next field of the newly created node
k
point to the read value of
C (k) =
S; and (iv) atomically updates the top-of-
op
(2.2)
i=1
the-stack pointer with the new value x. If the pointer has changed
between (ii) and (iv) and has not been restored to its initial value,
The proof in the “only if” direction is by contrapositive: a non-
the CAS fails and the operation is restarted. The pop operation is
terminating execution of C (k) can be straightforwardly mapped to
implemented in a similar way.
an execution of C(k) violating lock-freedom in which the while
loops make at most one iteration executing the same operations
Liveness properties of non-blocking algorithms.
Notice that a
with the same parameters as in C (k). For the “if” direction note
push or pop operation of Treiber’s stack may not terminate if
that any infinite execution of C(m) violating lock-freedom has
other threads are continually modifying S: in this case the CAS
only finitely many (say, k) operations started: those that complete
instruction may always fail, which will cause the operation to
successfully, those that are suspended by the scheduler and never
restart continually. Thus, the algorithm is not wait-free. However, it
resumed again, and those that do not terminate. Such an execution
is lock-free: if push and pop operations execute concurrently, some
can then be mapped to a non-terminating execution of C (k), in
operation will always terminate.
which the operations are completed, suspended or non-terminating
We note that an additional requirement in the definitions of the
as above.
liveness properties given in Section 1 is that the properties have
Thus, to check lock-freedom of an algorithm, we have to check
to be satisfied under any scheduler, including an unfair one that
the termination of an arbitrary number of its operations running in
suspends some threads and never resumes them again: in this case
parallel.
the remaining threads still have to satisfy the liveness properties.
The properties form a hierarchy [10]: if an algorithm is wait-free,
Rely-guarantee reasoning and interference specifications.
We
it is also lock-free, and if it is lock-free, it is also obstruction-
prove termination of the program C (k) using rely-guarantee rea-
free. Note also that even the weakest property, obstruction-freedom,
soning [18, 29]. Rely-guarantee avoids direct reasoning about all
prevents the use of spinlocks, because if a thread has acquired a
possible thread interactions in a concurrent program by specifying
lock and is then suspended, another thread may loop forever trying
a relation (the guarantee condition) for every thread restricting how
to acquire that lock.
it can change the program state. For any given thread, the union of
We first describe our approach for verifying lock-freedom.
the guarantee conditions of all the other threads in the program (its
rely condition) restricts how those threads can interfere with it, and
Reducing lock-freedom to termination.
We show that the check-
hence, allows reasoning about this thread in isolation.
ing of lock-freedom can be reduced to the checking of termina-
The logic we develop in this paper uses a variant of rely-
tion in the spirit of [36]. Consider a non-blocking data structure
guarantee reasoning proposed in RGSep [35]—a logic for reason-
with operations op1, . . . , opn. Let op be the command that non-
ing about safety properties of concurrent heap-manipulating pro-
deterministically executes one of the operations on the data struc-
grams, which combines rely-guarantee reasoning with separation
ture with arbitrary parameters:
logic. RGSep partitions the program heap into several thread-local
parts (each of which can only be accessed by a given thread) and the
op =
shared part (which can be accessed by all threads). The partition-
if (nondet()) op1; else if (nondet()) op2; . . . else opn;
ing is defined by proofs in the logic: an assertion in the code of a
(2.1)
thread restricts its local state and the shared state. Additionally, the
partitioning is dynamic, meaning that we can use ownership trans-
two contending push and pop operations can eliminate each other
fer to move some part of the local state into the shared state and
without modifying the stack if pop returns the value that push is
vice versa. Rely and guarantee conditions are then specified with
trying to insert. An operation determines the existence of another
sets of actions, which are relations on the shared state determin-
operation it could eliminate itself with by selecting a random slot
ing how the threads change it. This is in contrast with the original
pos in the collision array, and atomically reading that slot and
rely-guarantee method, in which rely and guarantee conditions are
overwriting it with its thread identifier MYID. The identifier of an-
relations on the whole program state. Thus, while reasoning about
other thread read from the array can be subsequently used to per-
a thread, we do not have to consider local states of other threads.
form elimination. The corresponding code does not affect the lock-
For example, using RGSep we can prove memory safety (no
freedom of the algorithm and is therefore elided in Figure 2. The
invalid pointer dereferences) and data structure consistency (the
algorithm implements the atomic read-and-write operation on the
linked list is well-formed) of Treiber’s stack [34]. The proof con-
collision array in a lock-free fashion using CAS1. This illus-
siders the linked list with the head pointed to by the variable S to be
trates a common pattern, when one lock-free data structure is used
in the shared state. When a push operation allocates a new node x,
inside another.
it is initially in its local state. The node is transferred to the shared
An RGSep safety proof of the HSY stack would consider the
state once it is linked into the list with a successful CAS instruction.
data structures of the elimination scheme shared and describe in-
The proof specifies interference between threads in the shared state
terference on the shared state using the actions introduced for
with three actions, Push, Pop, and Id, with the following informal
Treiber’s stack and two additional actions: Xchg (which denotes
meaning: Push corresponds to pushing an element onto the stack
the effect of the successful operation on the collision array de-
(a successful CAS in push); Pop to removing an element from the
scribed above) and Others (which includes all the operations on
stack (a successful CAS in pop); and Id represents the identity ac-
the other data structures of the elimination scheme). Given this in-
tion that does not change the shared state (a failed CAS and all the
terference specification, the informal proof of lock-freedom of the
other commands in the code of the threads).
algorithm is as follows: in a parallel composition of several threads
each executing one push or pop operation,
Proving lock-freedom.
Using the splitting of the heap into local
and shared parts and the interference specification for Treiber’s
I. No thread executes Push or Pop actions infinitely often.
stack described above, we can establish its lock-freedom as follows.
II. push and pop do not execute the Xchg action infinitely often if
As we showed above, it is sufficient to prove termination of a fixed
no other thread executes Push or Pop actions infinitely often.
but arbitrary number of threads each executing a single push or
This is because a thread can only execute Xchg infinitely
pop operation with an arbitrary parameter. The informal proof of
often if its outer while loop does not terminate. This can only
this (formalised in Section 4) is as follows:
happen if some other thread executes Push or Pop infinitely
often.
I. No thread executes Push or Pop actions infinitely often.
This is because a Push or Pop action corresponds to a success-
III. push and pop terminate if no other thread executes Push,
ful CAS, and once a CAS succeeds, the corresponding while
Pop, or Xchg actions infinitely often.
loop terminates.
This is because in this case both inner and outer while loops
eventually terminate.
II. The while loop in an operation terminates if no other thread
executes Push or Pop actions infinitely often.
From Statements I and II, we get that no thread executes Push,
This is because the operation does not terminate only when its
Pop, or Xchg actions infinitely often. Hence, by Statement III every
CAS always fails, which requires the environment to execute
thread terminates.
Push or Pop actions infinitely often.
The above proof is done in a layered style, i.e., starting from the
weak guarantee provided by Statement I and strengthening it using
Hence, every thread terminates.
already established guarantees until it implies termination. This is
The above proof uses rely-guarantee reasoning: it consists of
informally illustrated in Figure 3 for the case of two operations (op1
proving several thread-local judgements, each of which establishes
and op2) running in parallel. The validity of the property of Thread
a property of a thread under an assumption about the interfer-
1 in the middle layer depends on the validity of the counterpart
ence from the environment. Properties of a parallel composition of
property of Thread 2 and vice versa. However, it is unsound to
threads are then derived from the thread-local judgements. This is
remove the upper layer of Figure 3 and justify the guarantee in the
done by first establishing the guarantee provided by Statement I and
middle layer by circular reasoning, i.e., by observing that a thread
then using it to prove termination of the operations. This pattern—
satisfies the guarantee if the other thread does.
establishing initial guarantees and then deriving new guarantees
We have found that the proof method described above was
from them—is typical for proofs of lock-freedom. We now con-
applicable in all of the examples of lock-free algorithms that we
sider a more complicated example in which the proof consists of
have considered. In the next two sections we develop a logic for
more steps of this form.
formalising proofs following the method.
Hendler, Shavit, and Yerushalmi [14] have presented an im-
proved version of Treiber’s stack that performs better in the case of
Automating lock-freedom proofs.
The above informal proofs of
higher contention between threads. Figure 2 shows an adapted and
lock-freedom use guarantee conditions of a restricted form that
abridged version of their algorithm. The implementation combines
specifies two sets of actions: those that a thread can execute and
two algorithms: Treiber’s stack and a so-called elimination scheme
those that it cannot execute infinitely often. We have found that
(partially elided). A push or a pop operation first tries to modify
guarantee conditions of this form were sufficient to prove lock-
the stack as in Treiber’s algorithm, by doing a CAS to change the
freedom for all the examples we considered. This observation al-
shared top-of-the-stack pointer. If the CAS is successful then the
lows us to automate proving lock-freedom of an algorithm by sys-
operation terminates. If the CAS fails (because of interference from
tematically searching for termination proofs for a program consist-
another thread), the operation backs off to the elimination scheme.
If this scheme fails, the whole operation is restarted.
1 Such an operation could be implemented with an atomic exchange instruc-
The elimination scheme works on data structures that are sepa-
tion. The reason for implementing it with CAS is that in some architectures
rate from the list implementing the stack. The idea behind it is that
the atomic exchange instruction is either not available or slow.
struct Node {
value_t data;
Node *next;
value_t pop() {
};
Node *t, *x;
Node *S;
while (1) {
int collision[SIZE];
t = S;
if (t == NULL) {
void push(value_t v) {
return EMPTY;
Node *t, *x;
}
x = new Node();
x = t->next;
x->data = v;
if (CAS(&S,t,x)) {
while (1) {
return t->data;
t = S;
}
x->next = t;
// Elimination scheme
if (CAS(&S,t,x)) { return; }
// ...
// Elimination scheme
int pos = GetPosition();
// ...
// 0 ≤ pos ≤ SIZE-1
int pos = GetPosition();
int hisId = collision[pos];
// 0 ≤ pos ≤ SIZE-1
while (!CAS(&collision[pos],hisId,MYID)) {
int hisId = collision[pos];
hisId = collision[pos];
while (!CAS(&collision[pos],hisId,MYID)) {
}
hisId = collision[pos];
// ...
}
}
// ...
}
}
}
Figure 2. The HSY non-blocking stack
Consider a program consisting of an arbitrary number of the al-
op1
op2
gorithm’s operations running in parallel. First, using existing tools
for verifying safety properties of non-blocking algorithms [6], we
?> Thread 1 does not =<
?> Thread 2 does not =<
can infer a splitting of the program state into local and shared
89
execute Push or
execute Push or
parts and a set of actions describing how the operations change the
Pop infinitely often :;
89 Pop infinitely often :;
shared state ({Push, Pop, Xchg, Others, Id} for the HSY stack).
The set defines the initial guarantee provided by every operation
v
mmmmmmmmmmmm(Q
Q
Q
Q
Q
Q
Q
Q
Q
Q
Q
Q
in the program that ensures that the operation changes the shared
?>
state only according to one of the actions. Note that if several opera-
Thread 1 does not
=<
?> Thread 2 does not =<
tions satisfy this guarantee, then so does their parallel composition.
89 execute Push, Pop, or
execute Push, Pop, or
Xchg infinitely often :;
89 Xchg infinitely often :;
Hence, while checking a property of an operation in the program,
we can rely on its environment satisfying the guarantee. The guar-
antee, however, is too weak to establish termination of the oper-
v
nnnnnnnnnnnn(
P
P
P
P
P
P
P
P
P
P
P
P
ations. We therefore try to strengthen it by considering every ac-
?>
=<
?>
=<
tion in turn and attempting to prove that no operation executes the
action infinitely often in an environment satisfying the guarantee.
89 Thread 1 terminates :;
89 Thread 2 terminates :;
In our running example, we will be able to establish that the op-
erations do not execute the actions Push and Pop infinitely often
(but not Xchg and Others). Again, if several operations satisfy the
Figure 3. An informal proof argument where an arrow from state-
guarantee strengthened in this way, then so does their parallel com-
ment A to statement B means that A is used as a rely condition
position. Hence, we can check properties of the operations in the
while establishing the guarantee B.
program assuming that their environment satisfies the guarantee.
An attempt to prove their termination in this way fails again, and
we have to strengthen the guarantee one more time. Namely, we
try to prove that the operations do not execute the remaining ac-
ing of an arbitrary number of the algorithm’s operations running
tions Xchg and Others infinitely often. In this case, the available
in parallel: we search for proofs that follow the pattern described
guarantee is strong enough to prove that the Xchg action is not exe-
above and use rely and guarantee conditions of the restricted form.
cuted infinitely often. Finally, the strengthened guarantee allows us
Our proof search procedure performs a forward search, construct-
to establish termination of the operations.
ing proof graphs like the one in Figure 3 top-down. It is able to con-
struct proofs that the programs C (k) terminate for all k at once,
Proving obstruction-freedom and wait-freedom.
Obstruction-
because our guarantee conditions are such that if several threads
freedom of an operation ensures its termination in an environ-
satisfy a guarantee, then so does their parallel composition. We now
ment that eventually stops executing. Therefore, when proving
informally describe the procedure using the HSY stack as the run-
obstruction-freedom, infinite behaviours of the environment are ir-
ning example (the details are provided in Section 5).
relevant and the necessary environment guarantee can always be
Values = {. . . , −1, 0, 1, . . .}
Locs = {1, 2, . . .}
Separation logic assertions denote sets of program states Σ repre-
Vars = {x, y, . . . , &x, &y, . . .}
Stores = Vars → Values
sented by store-heap pairs (Figure 4). A store is a function from
Heaps = Locs
variables to values; a heap is a finite partial function from locations
fin Values
Σ = Stores × Heaps
to values. We omit the standard formal semantics for most of the as-
sertions [30]. Informally, emp describes the states where the heap
Figure 4. Program states Σ
is empty; e1 → e2 describes the states where the heap contains a
single allocated location at the address e1 with contents e2; p∗q de-
scribes the states where the heap is the union of two disjoint heaps,
represented by a safety property. For example, the operations of
one satisfying p and the other satisfying q. The formal semantics of
Treiber’s stack guarantee that they modify the shared state accord-
the assertion p ∗ q is defined using a partial operation · on Σ such
ing to one of the actions Push, Pop, and Id. To prove obstruction-
that for all (t1, h1), (t2, h2) ∈ Σ
freedom of a push or pop operation2, it is thus sufficient to prove
its termination in an environment that executes only finitely many
(t1, h1) · (t2, h2) = (t1, h1 h2)
such actions. This is true because, as in the proof of lock-freedom,
in such an environment the CAS in the code of the operation
if t1 = t2 and h1 h2 is defined, and (t1, h1)·(t2, h2) is undefined
will eventually succeed. In general, we can automatically check
otherwise. Then
obstruction-freedom by checking termination of every operation
u ∈ [p ∗ q] ⇔ ∃u1, u2. u = u1 · u2 ∧ u1 ∈ [p] ∧ u2 ∈ [q]
in the environment satisfying the safety guarantee inferred by the
safety verification tool and the additional assumption that it exe-
As we argued in Section 2, while reasoning about concurrent
cutes only finitely many actions. We describe this in more detail in
heap-manipulating programs it is useful to partition the program
Section 5.
state into thread-local and shared parts. Therefore, assertions in our
To the best of our knowledge, loops in all practical wait-free
logic denote sets of pairs of states from Σ. The two components
non-blocking algorithms have constant upper bounds on the num-
represent the state local to the thread in whose code the assertion
ber of their iterations. For example, Simpson’s algorithm [32] con-
is located and the shared state. We use the assertion language of
sists of straight-line code only, and the number of loop iterations
RGSep [35], which describes the local and shared components with
in the wait-free find operation of a non-blocking linked list [37] is
separation logic assertions and is defined by following grammar:
bounded by the number of distinct keys that can be stored in the list.
P, Q ::= p | p | P ∗ Q | true | false | P ∨ Q | P ∧ Q | ∃x.P
For this reason, termination of operations in wait-free algorithms
can be justified by considering only safety properties guaranteed by
An assertion p denotes the local-shared state pairs with the local
operations’ environment. The automatic check for wait-freedom is
state satisfying p; p the pairs with the shared state satisfying
similar to the one for obstruction-freedom (see Section 5).
p and the local state satisfying emp; P ∗ Q the pairs in which
the local state can be divided into two substates such that one
3.
Specifying liveness properties
of them together with the shared state satisfies P and the other
together with the shared state satisfies Q. The formal semantics
Our logic for reasoning about liveness properties is a Hoare-style
of P ∗ Q is defined using a partial operation
on Σ2 such that for
logic, which combines ideas from rely-guarantee reasoning and
all (l
separation logic. It generalises a recent logic for reasoning about
1, s1), (l2, s2) ∈ Σ2
safety properties of non-blocking algorithms, RGSep [35]. As any
(l1, s1) (l2, s2) = (l1 · l2, s1)
Hoare logic, ours consists of two formal systems: an assertion
language and a proof system for Hoare triples. In this section, we
if s1 = s2 and l1 · l2 is defined, and (l1, s1) (l2, s2) is undefined
describe the assertion language, define the form of judgements in
otherwise. Thus,
our logic, and show how to specify wait-freedom, lock-freedom,
σ ∈ [P ∗ Q] ⇔ ∃σ1, σ2. σ = σ1 σ2 ∧ σ1 ∈ [P ] ∧ σ2 ∈ [Q]
and obstruction-freedom in it. The next section presents the logic’s
proof system.
Note that by abuse of notation we denote the connectives inter-
preted by · and with the same symbol ∗. It should be always clear
Programming language.
We consider heap-manipulating pro-
from the structure of the assertion which of the two connectives is
grams written in the following simple programming language.
being used. We denote global states from Σ with small Latin let-
Commands C are given by the grammar
ters (u, l, s), and local-shared state pairs from Σ2 with small Greek
C ::= C1; C2 | if (e) C1 else C2 | while (e) C | x = new()
letters (σ).
| x = e | x = *e1 | *e1 = e2 | delete e | atomic C
Judgements.
The judgements in our logic include rely and guar-
where e ranges over arithmetic expressions, including non-
antee conditions determining how a command or its environment
deterministic choice nondet(). The command atomic C exe-
change the shared state. These represent languages of finite and
cutes C in one indivisible step. Programs consist of initialisa-
infinite words over the alphabet Σ2 of relations on shared states
tion code followed by a top-level parallel composition of threads:
and are denoted with capital calligraphic letters (R, G, . . .). A word
C0; (C1 . . . Cn).
in any of the languages describes the sequences of changes to the
To avoid side conditions in our proof rules, we treat each pro-
shared state. Thus, relies and guarantees can define liveness proper-
gram variable x as a memory cell at the constant address &x. Thus,
ties. This generalises the RGSep logic, in which rely and guarantee
any use of x in the code is just a shorthand for *(&x). Similarly, we
conditions define safety properties and are therefore represented
interpret field expressions x->next as *(x + offset of next). In
with relations on the shared state. Our proof system has two kinds
our examples, we also use other pieces of C syntax.
of judgements:
Assertion language.
Let p, q and r be separation logic assertions:
• R, G
{P } C {Q}: if the initial state satisfies P and the
p, q, r ::= emp | e1 → e2 | p ∗ q | false | p ⇒ q | ∃x.p | . . .
environment changes the shared state according to R, then
the program is safe (i.e., it does not dereference any invalid
2 This example is used here for illustrative purposes only: obstruction-
pointers), changes the shared state according to G, and the final
freedom of Treiber’s stack follows from its lock-freedom.
state (if the program terminates) satisfies Q.
• R, (G1, G2)
{P } C1 C2 {Q}: if the initial state satisfies P
or a guarantee all its prefixes also belong to it. Additionally, we
and the environment changes the shared state according to R,
require that relies and guarantees represent nonempty languages.
then the program C1 C2 is safe, C1 changes the shared state
Splitting states into local and shared parts.
We now informally
according to G1, C2 changes the shared state according to G2,
describe the split-state semantics of programs that splits the pro-
and the final state (if both threads terminate) satisfies Q. Dis-
gram state into local and shared parts (formalised in Section 6).
tinguishing the guarantee of each thread is crucial for liveness
We partition all the atomic commands in the program into those
proofs done according to the method described in Section 2.
that access only local state of the thread they are executed by and
The informal definition of judgements’ validity given above as-
those that can additionally access the shared state. By convention,
sumes a semantics of programs that distinguishes between the lo-
the only commands of the latter kind are atomic blocks. For
cal and the shared state. We sketch how such a semantics can be
example, in the operations of Treiber’s stack, all the commands
defined later in this section. In Section 6 we give formal definitions
except for CASes access only the local state. Further, we annotate
of the semantics and the notion of validity, and relate them to the
each atomic block with an action p
q determining how it
standard ones, which consider the program state as a whole.
treats the shared state, written atomicp q C. These annotations
are a part of proofs in our logic. For the logic to be sound, all
Specifying rely and guarantee conditions.
As noted above, a rely
the judgements used in a proof of a program have to agree on
or a guarantee condition defines sequences of atomic changes to
the treatment of the shared state. We therefore require that the
the shared state. We describe each such change with the aid of
same annotation be used for any fixed atomic block throughout
actions [35] of the form p
q, where p and q are separation logic
the proof.
assertions. Informally, this action changes the part of the shared
In the split-state semantics the command atomicp q C exe-
state that satisfies p into one that satisfies q, while leaving the rest
cutes as follows: it combines the local state of the thread it is exe-
of the shared state unchanged. Formally, its meaning is a binary
cuted by and the part of the shared state satisfying p, and runs C on
relation on shared states:
this combination. It then splits the single resulting state into local
[p
q] = {(s
and shared parts, determining the shared part as the one that satis-
1 · s0, s2 · s0) | s1 ∈ p ∧ s2 ∈ q }
fies q. The new shared state is thus this part together with the part
It relates some initial state s1 satisfying the precondition p to a
of the shared state untouched by C. For the splittings to be defined
final state s2 satisfying the postcondition q. In addition, there may
uniquely (and for our logic to be sound), we require that p and q in
be some disjoint state s0 that is not affected by the action.
all annotations be precise assertions [28]. An assertion r is precise
For example, we can define the three actions used in the proof of
if for any state u there exists at most one substate satr(u) satisfy-
lock-freedom of Treiber’s stack mentioned in Section 2 as follows:
ing r: u = satr(u) · restr(u) for some restr(u). The assertions in
&
all the actions used in this paper are precise.
S→y
&S→x ∗ x→Node(v, y) (Push)
Let
&
CASA(addr, v1, v2) be defined as follows:
S→x ∗ x→Node(v, y)
&S→y ∗ x→Node(v, y) (Pop)
if (nondet())
emp
emp
(Id)
atomicA {
Here x→Node(v, y) is a shortcut for x→v ∗ (x + 1)→y. Recall
assume(*addr == v1); *addr = v2; return 1;
that the algorithm is assumed to be executed in the presence of a
}
garbage collector. Hence, the node removed from the list by Pop
else
stays in the shared state as garbage.
atomicId { assume (*addr != v1); return 0; }
In our examples, we syntactically define rely and guarantee con-
where the assume command acts as a filter on the state space of
ditions using linear temporal logic (LTL) with actions as atomic
programs—e is assumed to evaluate to 1 after assume(e) is ex-
propositions. Let False and True be the actions denoting the re-
ecuted. The above definition of CAS is semantically equivalent
lations ∅ and Σ2 respectively. We denote temporal operators “al-
to the definition in Section 2, but allows different action anno-
ways”, “eventually”, and “next” with
,
, and
, respectively.
tations for the successful and the failure cases. We annotate the
Their semantics on infinite words is standard [22]. The semantics
CAS commands in the push and pop operations of Treiber’s stack
on finite words is defined as follows [20]: for m ≥ 0
as CASPush(&S, t, x) and CASPop(&S, t, x), respectively. Similarly,
δ
we annotate the CAS in the inner loop of the HSY stack as
1 . . . δm |=
Ψ ⇔ ∀i. 1 ≤ i ≤ m ⇒ δi . . . δm |= Ψ
δ
CASXchg(&collision[pos], hisId, MYID).
1 . . . δm |=
Ψ ⇔ ∃i. 1 ≤ i ≤ m ∧ δi . . . δm |= Ψ
δ
Specifying wait-freedom, lock-freedom, and obstruction-freedom.
1 . . . δm |=
Ψ ⇔ m ≥ 2 ⇒ δ2 . . . δm |= Ψ
A non-blocking data structure is given by an initialisation routine
Note that here
is the weak next operator: it is true if there is
init and operations op1, . . . , opn on the data structure. We require
no next state to interpret its argument over. For example,
R,
that the initialisation routine satisfy the triple
where R ⊆ Σ2, denotes the language of words in which every
letter is from R (including the empty word), ¬
R denotes words
Id, True
emp
init
Inv
which contain only finitely many letters from R (including all finite
words), and
False denotes exactly all finite words. We specify
for some data structure invariant Inv restricting only the shared
termination of a command by requiring that it satisfy the guarantee
state: the routine creates an instance of the data structure in the
False, i.e., we interpret termination as the absence of infinite
shared state when run in isolation (i.e., in the environment that
computations. This is adequate for programs in the language intro-
does not change the shared state). For Treiber’s stack an invariant
duced above, since they cannot deadlock.
maintained by all the operations on the data structure is that S
The semantics of triples in our logic makes no assumptions
points to the head of a linked list. We can express this in our
about the scheduler. In particular, it can be unfair with respect
assertion language using an inductive predicate assertion lseg(x, y)
to the command in the triple: the guarantee condition includes
of separation logic that represents the least predicate satisfying
words corresponding to the command being suspended and never
lseg(x, y) ⇔ (x = y ∧ emp)
executed again. For this reason, all rely and guarantee conditions in
this paper are prefix-closed, i.e., for any word belonging to a rely
∨ (∃z. x = y ∧ x→Node( , z) ∗ lseg(z, y))
Thus, lseg(x, NULL) represents all of the states in which the heap
R CL(G2), G1 {P1} C1 {Q1}
has the shape of a (possibly empty) linked list starting from location
R CL(G
x
1), G2
{P2} C2 {Q2}
and ending with NULL. The invariant can then be expressed as
P
R, (G
AR-C
1, G2)
{P1 ∗ P2} C1 C2 {Q1 ∗ Q2}
Inv = ∃x. &S → x ∗ lseg(x, NULL) ∗ true
(3.1)
R, G {P } C {Q}
In our logic, we can express the liveness properties of non-
P ⇒ P R ⊆ R G ⊆ G
Q ⇒ Q
blocking algorithms we introduced before as follows. Wait-
CONSEQ
freedom of an operation op
R , G
{P } C {Q }
i is captured by the triple
R,
R, (G
False
{Inv} op
1, G2)
{P1 ∗ P2} C1 C2 {true}
i {true}
(3.2)
R G2, G1 {P1} C1 {Q1}
which ensures termination of the operation under the interfer-
R G
ence from the environment allowed by the rely condition R.
1, G2
{P2} C2 {Q2}
PAR-NC
Obstruction-freedom of an operation opi can be expressed as
R, (G1, G2) {P1 ∗ P2} C1 C2 {Q1 ∗ Q2}
R ∧
False,
False
{Inv} op
R, (G
i {true}
(3.3)
1, G2)
{P } C1 C2 {Q} PAR-MERGE
Here R describes the allowed interference from the operation’s
R, G1 G2 {P } C1 C2 {Q}
environment, and the conjunct
False ensures that eventually all
R , G
{P } C {Q }
the threads in the environment will be suspended. As we showed in
Section 2, lock-freedom can be reduced to proving termination of
R , G
{P } C {Q }
CONJ
several operations run in isolation, which is ensured by the validity
R ∩ R , G ∩ G
{P ∧ P } C {Q ∧ Q }
of the triples
Id,
False
{Inv} C (k) {true}
(3.4)
Figure 5. Proof rules for reasoning about liveness properties of
for all k, where the program C (k) is defined by (2.2).
heap-manipulating programs. G denotes either G or (G1, G2) de-
Note that obstruction-freedom and wait-freedom are directly
pending on whether the triple distinguishes between the guarantees
compositional properties and can thus be specified for every op-
provided by the different threads. In the latter case operations on
eration separately. The specification of lock-freedom considers all
(G1, G2) are done componentwise.
operations at once, however, as we show in the next section, we can
still reason about lock-freedom in a compositional way.
parallel and consider the general case later. To prove this, we have
4.
Compositional proof system for liveness and
to derive the triple
heaps
Id,
False
{Inv} opi1 opi2 {true}
(4.1)
To reason about judgements of the form introduced in the previ-
for the data structure invariant Inv defined by (3.1).
ous section we need (i) a method for proving thread-local triples
Statement I.
(i.e., those giving a specification to a single thread) and (ii) a proof
Formally, the statement says that every thread has to
system for combining thread-local triples into triples about paral-
satisfy the guarantee
lel compositions. We describe an automatic method for proving
G = (Push ∨ Pop ∨ Id) ∧ ¬
(Push ∨ Pop)
thread-local triples in Section 5 (the THREADLOCAL procedure
and Figure 7). In this section, we present the second component—a
where the actions Push, Pop, and Id are defined in Section 3. The
compositional proof system for reasoning about liveness properties
first conjunct specifies the actions that the thread can execute, and
of heap-manipulating programs, shown in Figure 5. We explain the
the second ensures that it cannot execute the actions Push and Pop
proof rules by example of formalising the informal proofs of lock-
infinitely often. In order to establish this guarantee, we do not have
freedom from Section 2. In Section 5 we show how to construct
to make any liveness assumptions on the behaviour of other threads;
such proofs automatically, and in Section 6 we prove the proof rules
just knowing the actions they can execute (Push, Pop, and Id) is
sound with respect to an interleaving semantics.
enough. We therefore use the rule PAR-C to establish G. It is a
We first introduce two operations on languages used by the
circular rely guarantee rule [1] adapted for reasoning about heaps.
rules. Let L(A) denote the language of all finite and infinite words
It allows two threads to establish their guarantees simultaneously,
over an alphabet A. We denote the concatenation of a finite word
while relying on the safety closure of the other thread’s guarantee
α ∈ L(A) and a word (either finite or infinite) β ∈ L(A) with
that is being established. Note that without the safety closure the
αβ. The safety closure CL(G) of a language G ⊆ L(A) is the
circular rules like this are unsound for liveness properties [1]. Note
smallest language defining a safety property that contains G [2].
also that pre- and postconditions of threads in the premises of the
In the setting of this paper, where all the languages considered are
rule are ∗-conjoined in the conclusion: according to the semantics
prefix-closed, CL(G) can be defined as the set of words α such that
of the assertion language, this takes the disjoint composition of the
every prefix of α is in G:
local states of the threads and enforces that the threads have the
same view of the shared state. It is this feature of our proof rules
CL(G) = {α ∈ L(A) | ∀β, γ. α = βγ ⇒ β ∈ G}
that allows us to reason modularly in the presence of heap.
For two words α, β ∈ L(A), we denote the set of their fair
Applying PAR-C with G1 = G2 = G and R = CL(G), we get:
interleavings with α β (we omit the standard definition [4]). We
CL(G) CL(G), G
{Inv} op
lift this to languages G
i1 {true}
1, G2 ⊆ L(A) as follows:
CL(G) CL(G), G
{Inv} opi2 {true}
(4.2)
G1 G2 =
{α β | α ∈ G1 ∧ β ∈ G2}
CL(G), (G, G)
{Inv ∗ Inv} opi1 opi2 {true ∗ true}
4.1
Proving lock-freedom of Treiber’s non-blocking stack
Taking the safety closure of G removes the second conjunct repre-
senting the liveness part of G:
We start by proving termination of any two operations with arbi-
trary parameters (which we denote with opi1 and opi2) running in
CL(G) = (Push ∨ Pop ∨ Id)
Additionally, CL(G) CL(G) = CL(G), so that the premises sim-
This allows us to prove by induction on k that
plify to triples
CL(G), G
{Inv} opi1 . . . opik {true}
CL(G), G
{Inv} opj {true}, j ∈ {i1, i2}
(4.3)
G,
False
{Inv} opi1 . . . opik {true}
(4.10)
which ensure that the thread does not execute Push and Pop actions
is derivable in our proof system for any k ≥ 1. For k = 1
infinitely often, provided the environment executes only actions
the triples are established by (4.3) and (4.7). For the induction
Push, Pop, and Id. We show how to discharge such triples in
step, we just repeat the previous derivation with opi1 replaced by
Section 5. Their proof would formalise the informal justification
opi1 . . . opik and opi2 replaced by opi(k+1).
of Statement I given in Section 2 and would use the annotations at
Applying CONSEQ to (4.10), we get (3.4), which entails lock-
atomic blocks introduced in Section 3 to determine the splitting of
freedom of Treiber’s stack.
states into local and shared parts.
Note that instead of doing induction on the number of threads,
Since Inv restricts only the shared state, Inv ∗ Inv ⇔ Inv, hence,
we could have formulated our proof rules for k threads. To simplify
the conclusion of (4.2) is equivalent to
the presentation, we chose the minimalistic proof system.
CL(G), (G, G)
{Inv} opi1 opi2 {true}
(4.4)
4.2
Proving lock-freedom of the HSY non-blocking stack
Since G ⊆ CL(G), we can then apply a variation on the rule of con-
The action Xchg used in the informal proof of lock-freedom of the
sequence of Hoare logic, CONSEQ, which allows us to strengthen
HSY stack (Section 2) can be formally defined as follows:
the rely condition to G:
G, (G, G) {
0 ≤ i ≤ SIZE − 1 ∧ collision[i]→
collision[i]→
Inv} opi1 opi2 {true}
(4.5)
(Xchg)
Statement II.
Termination is captured by the guarantee
False,
The abridged data structure invariant is:
which says that eventually the program does not execute any tran-
sitions. To prove this guarantee, we use the non-circular rely-
∃x. &
Inv =
S → x ∗ lseg(x, NULL) ∗ true
guarantee rule PAR-NC, which allows the first thread to replace
∗ SIZE−1
i=0
collision[i] → ∗ . . .
its guarantee with a new one based on the already established
(We elided some of the data structures of the elimination scheme.)
guarantee of the other thread, and vice versa. Note that the first
We now formalise the informal proof from Section 2 for two
premise need only establish the postcondition true, since the post-
operations
condition Q
op
1 ∗ Q2 of the conclusion is implied by the other two
i1 and opi2 running in parallel.
premises. Applying PAR-NC with R = G1 = G2 = G and
Statement I.
The statement requires us to establish the guarantee
G1 = G2 =
False, we get:
G = (Push ∨ Pop ∨ Xchg ∨ Others ∨ Id) ∧ ¬
(Push ∨ Pop)
G, (G, G) {Inv} opi1 opi2 {true}
where Others describes the interference caused by the elimination
G G,
False
{Inv} opi1 {true}
scheme (elided). As before, we can do this using PAR-C. Given the
G G,
(4.6)
False
{Inv} opi2 {true}
thread-local triples (4.3) with the newly defined opi1, opi2, G, and
G, (
False,
False)
{Inv} op
Inv, we can again derive (4.5) and (4.9), where
i1 opi2 {true}
We have already derived the first premise. Since G G = G, we need
CL(G) = (Push ∨ Pop ∨ Xchg ∨ Others ∨ Id)
to discharge the following thread-local triples (again postponed to
Statement II.
Now, provided that a thread satisfies the guarantee
Section 5):
G, we have to prove that the other thread satisfies the guarantee
G,
False
{Inv} opj {true}, j ∈ {i1, i2}
(4.7)
G = (Push ∨ Pop ∨ Xchg ∨ Others ∨ Id) ∧ ¬
Xchg
We no longer need to distinguish between the guarantees of the
To this end, we use the non-circular rely-guarantee rule PAR-NC:
two threads in the conclusion of (4.6). Hence, we use the rule PAR-
M
G, (G, G) {Inv}
ERGE, which merges the guarantees provided by the threads into
opi1 opi2 {true}
a single guarantee provided by their parallel composition:
G G, G
{Inv} opi1 {true}
G G, G
{
G, (
Inv} op
False,
False)
{Inv} op
i2 {true}
i1 opi2 {true}
G, (G , G ) {
G, (
Inv} op
False) (
False)
{Inv} op
i1 opi2 {true}
i1 opi2 {true}
We thus have to establish the following thread-local triples:
The conclusion is equivalent to
G, G
{Inv} op
G,
j {true}, j ∈ {i1, i2}
(4.11)
False
{Inv} opi1 opi2 {true}
(4.8)
We can now use the conjunction rule, CONJ, to combine the guar-
from which (4.1) follows by CONSEQ. This proves termination of
antees G and G into a single guarantee G :
the two operations.
G, (G, G) {Inv} opi1 opi2 {true}
Arbitrary number of operations.
We can generalise our proof
G, (G , G ) {Inv} opi1 opi2 {true}
(4.12)
to an arbitrary number of operations as follows. First, note that
G, (G , G ) {Inv} op
applying PAR-MERGE on (4.4), we get:
i1 opi2 {true}
where
CL(G), G
{Inv} opi1 opi2 {true}
(4.9)
G = G ∧ G = (
Hence, the proof for two operations establishes (4.9) and (4.8)
Push ∨ Pop ∨ Xchg ∨ Others ∨ Id)
given (4.3) and (4.7), i.e., it shows that the parallel composition
∧ ¬
(Push ∨ Pop ∨ Xchg)
opi1 opi2 preserves the properties (4.3) and (4.7) of its constituent
This combines Statements I and II. Applying CONSEQ, we get:
operations. Note that this derivation is independent of the particular
definitions of opi1 and opi2 satisfying (4.3) and (4.7).
G , (G , G ) {Inv} opi1 opi2 {true}
procedure LOCKFREE(init, op)
defined by (2.2) for an arbitrary k. All rely and guarantee condi-
(Inv, G1) := SAFETYGUARANTEE(init, op)
tions used in the examples of such proofs in Section 4 had a re-
G
stricted form
A1 ∧ ¬
A2, where A1 and A2 are sets (disjunc-
2 := ∅
tions) of actions and A2 ⊆ A1. Here A1 is the set of all actions
do
that a thread can perform, whereas A2 is the set of actions that the
G := G1 ∧ ¬
G2
thread performs only finitely often. In fact, for all the non-blocking
if THREADLOCAL(G,
False
{Inv} op {true})
algorithms we have studied, it was sufficient to consider rely and
return “Lock-free”
guarantee conditions of this form to prove lock-freedom.
G0
We prove termination of C (k) by searching for proofs of
2 := G2
triple (3.4) in our proof system in the style of those presented in
for each A ∈ (G1 \ G02) do
Section 4 with relies and guarantees of the form
A1 ∧ ¬
A2.
if THREADLOCAL(G, ¬
A
{Inv} op {true})
There are several ways in which one can organise such a proof
G2 := G2 ∪ {A}
search. The strategy we use here is to perform forward search as
while G0
explained informally in Section 2.
2 = G2
Figure 6 contains our procedure L
return “Don’t know”
OCKFREE for proving lock-
freedom. It is parameterised with two auxiliary procedures, whose
Figure 6. Proof search procedure for lock-freedom
implementation is described later:
• SAFETYGUARANTEE(init, op) computes supporting safety
properties for our liveness proofs, namely, a data structure in-
Thus, we have strengthened the guarantee G of Statement I to the
variant Inv such that
guarantee G .
Id, True
emp
init
Inv
(5.1)
Statement III.
Finally, we can formalise Statement III by apply-
ing the rule PAR-NC to establish termination:
and an initial safety guarantee provided by every operation,
G , (G , G ) {
which is defined by a set of actions G
Inv} op
1 = {A1, . . . , An} such
i1 opi2 {true}
G G ,
that
False
{Inv} opi1 {true}
G G ,
G
False
{Inv} op
1,
G1 {Inv} op {Inv}
(5.2)
i2 {true}
G , (
False,
False)
{Inv} op
•
i1 opi2 {true}
THREADLOCAL(R, G
{Inv} op {true}) attempts to prove
the thread-local triple R, G
{Inv} op {true} valid. The no-
provided we can establish the thread-local triples
tion of validity of thread-local triples used by THREADLOCAL
G ,
False
{Inv} op
corresponds to the informal explanation given in Section 3 and
j {true}, j ∈ {i1, i2}
(4.13)
is formalised in Section 6.
By PAR-MERGE we then get:
LOCKFREE first calls SAFETYGUARANTEE to compute the
G ,
False
{Inv} opi1 opi2 {true}
(4.14)
data structure invariant Inv and the safety guarantee
G1. In our
which proves the termination of the two operations.
proofs of liveness properties, rely and guarantee conditions are
then represented using LTL formulae with actions in G1 as atomic
Arbitrary number of operations.
From the conclusion of (4.12),
propositions. A side-effect of SAFETYGUARANTEE is that it an-
by PAR-MERGE we get:
notates atomic blocks in op with actions from G1 as explained in
G, G
{
Section 3. These annotations are used by the subsequent calls to
Inv} opi1 opi2 {true}
(4.15)
THREADLOCAL, which ensures that all thread-local reasoning in
Thus, the above derivation establishes (4.9), (4.15), and (4.14)
the proof of lock-freedom uses the same splitting of the program
given (4.3), (4.11), and (4.13). As before, this allows us to prove
state into local and shared parts.
by induction on k that the following triples are derivable in our
Having computed the safety guarantee, we enter into a loop,
logic for any k ≥ 1:
where on every iteration we first attempt to prove termination of
op using the available guarantee. If this succeeds, we have proved
CL(G), G
{Inv} opi1 . . . opik {true}
lock-freedom. Otherwise, we try to strengthen the guarantee
G1∧
G, G
{Inv} opi1 . . . opik {true}
¬
G2 by considering each action in G1 \G2 and trying to prove
G ,
False
{Inv} opi1 . . . opik {true}
that it is executed only finitely often using the current guarantee as
a rely condition. If we succeed, we update the guarantee by adding
The last one implies lock-freedom of the HSY stack.
the action to the set of finitely executed actions G2. If we cannot
prove that any action from G1 \ G2 is executed only finitely often,
5.
Automation
we give up the search and exit the loop.
In this section we describe our automatic prover for liveness prop-
This procedure scales because in practice the set of actions G1
erties of non-blocking concurrent algorithms. Our tool’s input is
computed by SAFETYGUARANTEE is small. This is due to the fact
a liveness property to be proved and a program in a C-like lan-
that actions p
q are local in the sense that p and q describe only
guage consisting of the code of operations
the parts of the shared state modified by atomic blocks.
op1, . . . , opn of a non-
blocking algorithm, together with a piece of initialisation code
It is possible to show that a successful run of LOCKFREE con-
init. We remind the reader that we denote with op the command,
structs proofs of triples (3.4) for all k. We can construct the proofs
defined by (2.1), that non-deterministically executes one of the op-
for any number of threads uniformly because the guarantees G used
erations op
in them are such that G G = G. The construction follows the
i on the data structure. We first describe how our tool
handles lock-freedom.
method of Section 4. The only difference is that the proofs con-
structed by LOCKFREE first apply the rule PAR-C to triples (5.2)
Proving lock-freedom via proof search.
Recall that to prove
with G1 = G2 =
G1. Since CL( G1) = G1, this establishes
lock-freedom, we have to prove termination of the program C (k)
the initial safety guarantee
G1, which is then strengthened using
R, G {
sition of op
Inv}
async R and the automaton ¬G synchronising on ac-
op {true}
tions of op. Intuitively, fair infinite executions of the program
/.
(
()
op asyncR) sync¬G correspond to the executions of op in an
Automata-theoretic framework [36]
-,
*+
environment satisfying the rely R that violate the guarantee G.
Its fair termination implies that there are no such executions.
Fair termination of (op asyncR) sync¬G
• To check fair termination of (op asyncR) sync¬G, we analyse
it with the abstract interpreter of SMALLFOOTRG [6], which
/.
()
SMALLFOOTRG [6]
-,
*+
produces an abstract transition system over-approximating the
program’s behaviour. The interpreter uses the annotations at
atomic blocks computed by S
Abstract transition system
AFETYGUARANTEE to choose
the splitting of the heap into local and shared parts.
76
01
•
Translation to arithmetic programs [3, 21] 54
23
Using the techniques of [3, 21], from this transition system
we then extract an arithmetic program (i.e., a program with-
out the heap with only integer variables), whose fair termina-
Equiterminating arithmetic program
tion implies fair termination of (op asyncR) sync¬G. The arith-
/.
metic program makes explicit the implicit arithmetic infor-
()
TERMINATOR with fairness [8]
-,
*+
mation present in the heap-manipulating program that can be
used by termination provers to construct ranking functions. For
Valid/Don’t know
example, it contains integer variables tracking the lengths of
linked lists in the original program.
Figure 7. The high-level structure of the THREADLOCAL proce-
• Finally, we run a termination prover (TERMINATOR with fair-
dure for discharging thread-local triples
ness [8]) to prove fair termination of the arithmetic program.
We note that proofs of thread-local statements may be more
the rule PAR-NC. In the proofs of Section 4, these two steps were
complicated then the ones in the examples of Section 2, which were
performed with one application of the rule PAR-C.
based on control-flow arguments. For example, for Michael’s non-
We now describe the two auxiliary procedures used by LOCK-
blocking linked list algorithm [25] they involve reasoning about
FREE—SAFETYGUARANTEE and THREADLOCAL.
lengths of parts of the shared data structure. Furthermore, the
proofs may rely on complex supporting safety properties that en-
The SAFETYGUARANTEE procedure.
We implement the proce-
sures that the data structure is well-formed. Automatic tool support
dure using the SMALLFOOTRG tool for verifying safety properties
is indispensable in constructing such proofs.
of non-blocking algorithms [6]. SMALLFOOTRG computes a data
structure invariant and an interference specification by performing
Proving obstruction-freedom and wait-freedom.
As we showed
abstract interpretation of the code of init and op over an abstract
in Section 2, proving obstruction-freedom or wait-freedom of an
domain constructed from RGSep formulae. This abstract interpre-
operation in a non-blocking algorithm usually requires only safety
tation is thread-modular, i.e., it repeatedly analyses separate threads
guarantees provided by the operation’s environment. In our tool, we
without enumerating interleavings using an algorithm similar to the
use the guarantee
G1 inferred by SAFETYGUARANTEE. Namely,
one described in [12]. For the invariant and interference specifica-
we prove obstruction-freedom of an operation opi by establishing
tions computed by SMALLFOOTRG to be strong enough for use in
triple (3.3) with R = G1 via a call to
liveness proofs, its abstract domain has to be modified to keep track
THREADLOCAL( G1∧
False,
False
{Inv} op
of the lengths of linked lists as described in [21].
i {true})
RGSep judgements can be expressed in our logic by triples with
We can prove wait-freedom of an operation opi by establishing
rely and guarantee conditions of the form
A, where A is a set of
triple (3.2) with R = G1 via a call to
actions. SMALLFOOTRG proves the validity of RGSep judgements
THREADLOCAL( G
that, when translated to our logic in this way, yield (5.1) and (5.2).
1,
False
{Inv} opi {true})
Experiments.
Using our tool, we have proved a number of non-
The THREADLOCAL procedure.
We prove a thread-local triple
R, G
{
blocking algorithms lock-free and have found counterexamples
Inv} op {true} using a combination of several existing
demonstrating that they are not wait-free. The examples we anal-
methods and tools, as shown in Figure 7. For technical reasons,
ysed include a DCAS-based stack, Treiber’s stack [33], the HSY
in this procedure we assume that R and G consist of only infi-
stack [14], a non-blocking queue due to Michael and Scott [26] and
nite words and op has only infinite computations. This can always
its optimised version due to Doherty et al. [9], a restricted double-
be ensured by padding the finite words in R and G with a special
compare single-swap operation (RDCSS) [13], and Michael’s non-
dummy action and inserting an infinite loop at the end of op exe-
blocking linked list [25]. In all cases except Michael’s algorithm
cuting the action. To prove the triple R, G
{Inv} op {true}:
the tool found a proof of lock-freedom in less then 10 minutes.
• We first represent R and ¬G as B¨uchi automata, whose tran-
Verification of Michael’s algorithm takes approximately 8 hours,
sitions are labelled with actions from the set G1 computed by
which is due to the unoptimised arithmetic program generator and
SAFETYGUARANTEE and apply the automata-theoretic frame-
the inefficient version of the termination prover that we currently
work for program verification [36]. This reduces proving the
use.
triple to proving that the program (op asyncR) sync¬G termi-
We have also tested our tool by proving the obstruction-freedom
nates when run from states satisfying the precondition Inv un-
of the above lock-free algorithms. (Obstruction-free algorithms that
der the fairness assumptions extracted from the accepting con-
are not lock-free typically traverse arrays, handling which is be-
ditions of the automata for R and ¬G. Here op asyncR is the
yond the scope of the shape analysis that we use.) Additionally, we
asynchronous parallel composition interleaving the executions
have checked that the deletion operation of a linked list algorithm
of op and the automaton R in all possible ways. The pro-
by Vechev and Yahav [37, Figure 2] is not obstruction-free (as ob-
gram (op asyncR) sync¬G is the synchronous parallel compo-
served by the authors), even though it does not use locks.
We do not report any results for wait-free algorithms in this
6.2
Split-state semantics
paper. Operations consisting of straight-line code only are trivially
We now show that given an LTS we can construct a split LTS that
wait-free. Proving termination of wait-free find operations in non-
distinguishes between the local and the shared state. To this end,
blocking linked lists mentioned in Section 2 requires tracking the
we assume a labelling function π that maps each transition in an
keys stored in the list, which is not handled by our shape analysis.
LTS to either Local for operations that only access the local state,
or Shared(p
q) for operations that access both the local and
6.
Semantics and soundness
the shared state. Note that for a program C we can construct such a
labelling π
We give semantics to programs with respect to labelled transi-
C from the annotations we introduced in Section 3: com-
mands outside
tion systems with states representing the whole program heap. The
atomic blocks are mapped to Local and annotations
at
proof of soundness of our logic with respect to this global seman-
atomic blocks give the parameters of Shared.
Given a labelling π for an LTS (Σ, , Φ, T ), we can define the
tics is done in two steps. We first show that, given a transition sys-
corresponding split LTS as (Σ2, , Φ × Σ, T ), where T consists
tem denoting a program, we can construct another transition system
of fresh copies of transitions τ for every transition τ ∈ T . The
operating on states that distinguish between local and shared heap,
program counter of a thread is always in its local state, hence,
according to the informal description given in Section 3. Interpre-
the set of states of the split LTS in which it has the final value is
tation of judgements in this split-state semantics is straightforward.
Φ × Σ. The transition functions for the split LTS are defined as
We then relate the validity of judgements in the split-state seman-
follows. If π(τ ) = Local, then τ executes τ on the local state and
tics to the standard global notion of validity. The results in this sec-
preserves the shared state: f
tion do not follow straightforwardly from existing ones, however,
τ (l, s) = fτ (l) × {s} if fτ (l) =
,
and f
the techniques used to formulate the split-state semantics and prove
τ (l, s) =
otherwise. If π(τ ) = Shared(p
q), then the
execution of τ follows the informal description of the execution of
the soundness theorems are the same as for the RGSep logic [35].
atomic blocks in Section 3:
6.1
Global semantics
fτ (l, s) =
We represent denotations of programs as a variant of labelled tran-
{(
sition systems (LTS).
restq(u), satq(u) · restp(s)) | u ∈ fτ (l · satp(s))}
if sat
D
p(s) is defined, fτ (l · satp(s)) =
, and satq(u) is defined
EFINITION 1 (LTS). A labelled transition system (LTS) is a
for all u ∈ f
quadruple S = (Σ, , Φ, T ), where
τ (l · satp(s)); otherwise, τ faults: fτ (l, s) =
.
• Σ is the set of non-erroneous states of the transition system,
6.3
Validity in the split-state semantics
•
/∈ Σ is a distinguished error state (arising, for example, when
To define validity of triples in the split-state semantics, we have
a program dereferences an invalid pointer),
to define the meaning of interleaving computations of a split LTS
• Φ ⊆ Σ is the set of final states, and
(Σ2, , Φ × Σ, T ) with actions of an environment changing the
• T is the set of transitions such that every τ ∈ T is associated
shared state according to a rely condition R ⊆ L(Σ2). We repre-
with a transition function fτ : Σ → P(Σ) ∪ { }, where P(Σ)
sent these computations with traces α ∈ L(Σ2 × (Σ2 ∪ { }) ×
is the powerset of Σ.
({e} ∪ T )). The first two components of every letter in a trace de-
fine how the state of the LTS changes. The third component defines
DEFINITION 2 (Computation of an LTS). A computation of an
if the change was made by a transition of the LTS (τ ∈ T ) or the
LTS (Σ, , Φ, T ) starting from an initial state u0 ∈ Σ is a maxi-
environment (e). We require that the environment does not change
mal sequence u0, u1, . . . of states ui ∈ Σ ∪ { } such that for all i
the local state and does not fault, i.e., all e-letters in a trace are of
there exists a transition τ ∈ T such that ui+1 =
if fτ (ui) =
the form ((l, s), (l, s ), e).
and ui+1 ∈ fτ (ui) otherwise.
We often need to project a trace α without error states to a word
that records how the shared state is changed by a particular set of
Given a thread C in the programming language of Section 3,
transitions U ⊆ {e} ∪ T . We define such a projection α U ∈
we can construct the corresponding LTS C in the following
L(Σ2) as the image of α under the following homomorphism
way. Let us assume for the purposes of this construction that the
program counter of the thread is a memory cell at a distinguished
h : Σ2 × Σ2 × ({e} ∪ T ) → L(Σ2)
address &pc, implicitly modified by every primitive command.
As the set of states Σ of the LTS we take the one defined in
h((l, s), (l , s ), τ) =
(s, s ), τ ∈ U;
Figure 4. The final states are those in which the program counter
ε,
otherwise
has a distinguished final value. Every atomic command in the
where ε is the empty word. We write α↓σ if α is a nonempty trace
thread, including atomic blocks, corresponds to a transition in the
and its last letter is of the form (σ , σ, τ ) for some σ and τ .
LTS. Conditions in if and while commands are translated in the
standard way using assume commands. The transition functions
DEFINITION 4 (Traces). For a rely condition R ⊆ L(Σ2) and a
are then just the standard postcondition transformers [30, 5].
split LTS S = (Σ2, , Φ × Σ, T ), the set tr(S, R, σ0) of traces of
The denotation of a parallel composition of threads is the paral-
S executed in an environment satisfying R starting from an initial
lel composition of their denotations, defined as follows.
state σ0 ∈ Σ2 is defined as the set of traces α ∈ L(Σ2 × (Σ2 ∪
{ }) × ({e} ∪ T )) of the following two forms:
DEFINITION 3 (Parallel composition of LTSes). The
paral-
lel composition of two LTSes S
•
1
= (Σ, , Φ1, T1) and
finite or infinite traces α = (σ0, σ1, τ0)(σ1, σ2, τ1) . . ., where
S2 = (Σ, , Φ2, T2), where T1 ∩ T2 = ∅, is defined as the LTS
σi = , α {e} ∈ R, and if τi = e, then σi+1 ∈ fτ (σ
i
i); and
S1 S2 = (Σ, , Φ1 ∩ Φ2, T1 T2).
• finite traces α = β(σn, , τn) for some β = (σ0, σ1, τ0)
(σ1, σ2, τ1) . . . (σn−1, σn, τn−1) such that β {e} ∈ R,
As follows from Definitions 2 and 3, the parallel composition
fτ
(σi).
interleaves transitions from two LTSes on the same memory Σ
n (σn) =
, and if τi = e for i < n, then σi+1 ∈ fτi
without any fairness constraints. Note that we can always satisfy
The first case in this definition corresponds to safe traces, and
T1 ∩ T2 = ∅ by renaming transitions appropriately.
the second to unsafe traces, i.e., those in which both the program
and its environment stop executing after the program commits a
Theorem 2 and Corollary 1 show that the provability of
memory fault (the treatment of the later case relies on R being
triple (3.4) from valid thread-local triples in the proof system of
prefix-closed). Note that, since we assume that the scheduler is
Section 4 implies that the program C (k) terminates, and hence,
possibly unfair, the set of traces in this definition includes those
the corresponding algorithm is lock-free. Similar soundness results
in which S is preempted and is never executed again. Hence, the
can be formulated for obstruction-freedom and wait-freedom.
set of projections α T of traces α ∈ tr(S, R, σ0) on the transitions
of the LTS S, representing the guarantee condition provided by S,
7.
Related work
is prefix-closed.
Let F0(C), respectively Ff (C), be the ∗-conjunction over all
Our proof system draws on the classical circular and non-circular
the threads in a program C of formulae &pc → pc
rely-guarantee rules for shared-variable concurrency [18, 29, 1]
0, respectively
&pc → pc
to achieve compositionality, and on separation logic (specifically,
f , where pc is the program counter of the thread, pc0
is its initial value, and pc
RGSep—a combination of rely-guarantee and separation logic [35,
f is the final one. Note that F0(C) and
Ff (C) do not restrict the shared state.
11, 34]) to achieve modular reasoning in the presence of heap.
Its technical novelty over previous rely-guarantee proof systems
DEFINITION 5 (Validity).
lies in our method of combining applications of circular and non-
R, G |= {P } C {Q} ⇔
circular rules using judgements that distinguish between guarantees
provided by different threads in a parallel composition.
∀σ0 ∈ [P ∗ F0(C)]]. ∀α ∈ tr(S, R, σ0). ∀σ.
Colvin and Dongol [7] have recently proved the most basic non-
(α↓σ ⇒ σ = ) ∧
(safety)
blocking algorithm, Treiber’s stack [33], to be lock-free. They did
(α↓σ ∧ σ ∈ (Φ × Σ) ⇒ σ ∈ [Q ∗ Ff (C)]]) ∧ (correctness)
this by manually constructing a global well-founded ordering over
(α
program counters and local variables of all the threads in the al-
T ∈ G)
(guarantee)
gorithm’s most general client. Unfortunately, their method requires
where S = (Σ2, , Φ × Σ, T ) is the split LTS constructed out of
each operation to have at most one lock-free loop, which rules out
the LTS C using the labelling πC .
more modern non-blocking algorithms, such as the HSY stack and
R, (G
Michael’s list algorithm. Moreover, because their well-founded or-
1, G2) |= {P } C1 C2 {Q} ⇔
∀σ
dering is over the whole program, their method is non-modular and
0 ∈ [ P ∗ F0(C1 C2)]]. ∀α ∈ tr(S1 S2, R, σ0). ∀σ.
does not scale to the more realistic examples of the kind we con-
(α↓σ ⇒ σ = ) ∧
(safety)
sider in Section 5. In contrast, our method is modular, both in the
(α↓σ ∧ σ ∈ ((Φ1 ∩ Φ2) × Σ) ⇒ σ ∈ [Q ∗ Ff (C1 C2)]]) ∧
treatment of threads and heaps. We can reason about every thread
(correctness)
separately under simple assumptions about its environment that do
(α
not consider parts of the heap local to other threads. Furthermore,
T ∈ G
∈ G
1
1) ∧ (α T2
2)
(guarantee)
our method is fully automatic.
where S1 = (Σ2, , Φ1 × Σ, T1) and S2 = (Σ2, , Φ2 × Σ, T2)
Kobayashi and Sangiorgi [19] have recently proposed a type-
are the split LTSes constructed out of the LTSes C1 and C2
based method for checking lock-freedom in π-calculus. Their pro-
using the labellings πC1 and πC2 , respectively.
gramming model and the notion of lock-freedom are different
from the ones used for non-blocking data structures, which makes
THEOREM 1. The proof rules in Figure 5 preserve validity.
their results incomparable to ours. Moore [27] presents a proof of
a progress property for a non-blocking counter algorithm in the
COROLLARY 1. If R, G
{P } C {Q} is derived from valid
ACL2 proof assistant. His proof is thread-modular, but the algo-
thread-local triples using the rules in Figure 5, then R, G |=
{P } C {Q}
rithm considered is extremely simple. McMillan [24] addresses the
.
issue of circular dependencies among a class of liveness properties
in the context of finite-state hardware model checking. He takes a
6.4
Soundness
different approach from ours to resolving the circularities by doing
We now relate the notion of validity with respect to a split LTS to
induction over time.
validity with respect to the global LTS used to construct the split
one. For a closed program (i.e., a program executing in isolation),
8.
Conclusion
we can formulate a global notion of validity of triples without rely
and guarantee conditions as follows.
Wait-freedom, lock-freedom, and obstruction-freedom are the es-
sential properties that make “non-blocking algorithms” actually
DEFINITION 6 (Validity with respect to a global LTS). For p, q ⊆
non-blocking. We have proposed the first fully automatic tool that
Σ and a command C such that C = (Σ, , Φ, T ) we define
allows their developers to verify these properties. Our success was
|= {p} C {q} if for all u0 ∈ p and for any computation u0, u1, . . .
due to choosing a logical formalism in which it was easy to ex-
of C we have ui =
, and if the computation is finite and
press proofs about non-blocking algorithms and then observing that
ending with a state u ∈ Φ, then u ∈ q. We define |= [p] C [q]
proofs of the liveness properties in it follow a particular pattern.
if |= {p} C {q} and every computation of C starting from a
We conclude by noting some limitations of our tool; lifting
state in p is finite.
these presents interesting avenues for future work. First, we prove
the soundness of our logic with respect to an interleaving seman-
THEOREM 2. Let C
= (Σ, , Φ, T ) and S = (Σ2, , Φ ×
tics, which is inadequate for modern multiprocessors with weak
Σ, T ) be a corresponding split LTS with respect to any labelling
memory models. It happens that even proving safety properties of
πC. Then
programs with respect to a weak memory model is currently an
• R, G |= {P } C {Q} implies
open problem. Moreover, the published versions of concurrent al-
|= {γ([[P ∗ F
gorithms assume a sequentially consistent memory model. In fact,
0(C)]])} C {γ([[Q ∗ Ff (C)]])},
•
most of non-blocking algorithms are incorrect when run on mul-
If R,
False |= {P } C {Q} implies
|= [γ([[P ∗ F
tiprocessors with weak memory models as published: one has to
0(C)]])] C [γ([[Q ∗ Ff (C)]])],
insert additional fences or (on x86) locked instructions for them
where γ(P ) = {l · s | (l, s) ∈ P } for any assertion P .
to run correctly. In the future, we hope to address this problem,
building on a recent formalisation of weak memory model seman-
chronization: Double-ended queues as an example. In ICDCS’03:
tics [31]. Second, our tool can currently handle only list-based algo-
International Conference on Distributed Computing Systems, pages
rithms, because we use an off-the-shelf shape analysis. We believe
522–529. IEEE, 2003.
that the methods described in this paper should be applicable to
[17] M. Herlihy and N. Shavit. The Art of Multiprocessor Programming.
more complicated data structures as well, provided the necessary
Morgan Kaufmann, 2008.
shape analysis infrastructure is available.
[18] C. B. Jones. Specification and design of (parallel) programs. In IFIP
The above-mentioned limitations notwithstanding, this paper
Congress, pages 321–332. North-Holland, 1983.
presents the first successful attempt to give modular proofs of
[19] N. Kobayashi and D. Sangiorgi. A hybrid type system for lock-
liveness properties to complex heap-manipulating concurrent pro-
freedom of mobile processes. In CAV’08: Conference on Computer
grams.
Aided Verification, volume 5123 of LNCS, pages 80–93. Springer,
Acknowledgements. We would like to thank Josh Berdine, Mike
2008.
Dodds, Tim Harris, Michael Hicks, Andreas Podelski, Moshe
[20] O. Lichtenstein, A. Pnueli, and L. D. Zuck. The glory of the past.
Vardi, Eran Yahav, and the anonymous reviewers for comments and
In Conference on Logics of Programs, volume 193 of LNCS, pages
discussions that helped to improve the paper.
196–218. Springer, 1985.
[21] S. Magill, J. Berdine, E. M. Clarke, and B. Cook. Arithmetic
References
strengthening for shape analysis.
In SAS’07: Static Analysis
Symposium, volume 4634 of LNCS, pages 419–436. Springer, 2007.
[1] M. Abadi and L. Lamport. Conjoining specifications. ACM Trans.
[22] Z. Manna and A. Pnueli. The Temporal Logic of Reactive and
Program. Lang. Syst., 17(3):507–534, 1995.
Concurrent Systems: Specification. Springer-Verlag, Berlin, 1992.
[2] B. Alpern and F. B. Schneider. Defining liveness. Inf. Process. Lett.,
[23] H. Massalin and C. Pu. A lock-free multiprocessor OS kernel
21(4):181–185, 1985.
(Abstract). Operating Systems Review, 26(2):108, 1992.
[3] J. Berdine, B. Cook, D. Distefano, and P. W. O’Hearn. Automatic
[24] K. L. McMillan. Circular compositional reasoning about liveness.
termination proofs for programs with shape-shifting heaps. In
In CHARME’99: Conference on Correct Hardware Design and
CAV’06: Conference on Computer Aided Verification, volume 4144
Verification Methods, volume 1703 of LNCS, pages 342–345.
of LNCS, pages 386–400. Springer, 2006.
Springer, 1999.
[4] S. D. Brookes.
Full abstraction for a shared-variable parallel
[25] M. M. Michael. High performance dynamic lock-free hash tables and
language. Inf. Comput., 127(2):145–163, 1996.
list-based sets. In SPAA’02: Symposium on Parallelism in Algorithms
[5] C. Calcagno, P. O’Hearn, and H. Yang. Local action and abstract
and Architectures, pages 73–82. ACM, 2002.
separation logic. In LICS’07: Symposium on Logic in Computer
[26] M. M. Michael and M. L. Scott. Simple, fast, and practical non-
Science, pages 366–378. IEEE, 2007.
blocking and blocking concurrent queue algorithms. In PODC’96:
[6] C. Calcagno, M. J. Parkinson, and V. Vafeiadis. Modular safety
Symposium on Principles of Distributed Computing, pages 267–275.
checking for fine-grained concurrency. In SAS’07: Static Analysis
ACM, 1996.
Symposium, volume 4634 of LNCS, pages 233–248. Springer, 2007.
[27] J. S. Moore. A mechanically checked proof of a multiprocessor
[7] R. Colvin and B. Dongol. Verifying lock-freedom using well-founded
result via a uniprocessor view. Formal Methods in System Design,
orders. In ICTAC’07: International Colloquium on Theoretical
14(2):213–228, 1999.
Aspects of Computing, volume 4711 of LNCS, pages 124–138.
[28] P. W. O’Hearn. Resources, concurrency and local reasoning. Theor.
Springer, 2007.
Comput. Sci., 375(1-3):271–307, 2007.
[8] B. Cook, A. Gotsman, A. Podelski, A. Rybalchenko, and M. Y. Vardi.
[29] A. Pnueli. In transition from global to modular temporal reasoning
Proving that programs eventually do something good. In POPL’07:
about programs. In Logics and Models of Concurrent Systems, pages
Symposium on Principles of Programming Languages, pages 265–
123–144. Springer, 1985.
276. ACM, 2007.
[30] J. Reynolds. Separation logic: A logic for shared mutable data
[9] S. Doherty, L. Groves, V. Luchangco, and M. Moir.
Formal
structures. In LICS’02: Symposium on Logic in Computer Science,
verification of a practical lock-free queue algorithm. In FORTE’04:
pages 55–74. IEEE, 2002.
Conference on Formal Techniques for Networked and Distributed
Systems, volume 3235 of LNCS, pages 97–114. Springer, 2004.
[31] S. Sarkar, P. Sewell, F. Zappa Nardelli, S. Owens, T. Ridge,
T. Braibant, M. Myreen, and J. Alglave. The semantics of x86
[10] B. Dongol.
Formalising progress properties of non-blocking
multiprocessor machine code. This volume.
programs.
In ICFEM’06: Conference on Formal Engineering
Methods, volume 4260 of LNCS, pages 284–303. Springer, 2006.
[32] H. Simpson. Four-slot fully asynchronous communication mecha-
nism. IEE Proceedings, 137(1):17–30, 1990.
[11] X. Feng, R. Ferreira, and Z. Shao. On the relationship between
concurrent separation logic and assume-guarantee reasoning. In
[33] R. K. Treiber. Systems programming: Coping with parallelism.
ESOP’07: European Symposium on Programming, volume 4421 of
Technical Report RJ 5118, IBM Almaden Research Center, 1986.
LNCS, pages 173–188. Springer, 2007.
[34] V. Vafeiadis. Modular fine-grained concurrency verification. PhD
[12] A. Gotsman, J. Berdine, B. Cook, and M. Sagiv. Thread-modular
Thesis, University of Cambridge Computer Laboratory, 2008.
shape analysis. In PLDI’07: Conference on Programming Language
[35] V. Vafeiadis and M. J. Parkinson. A marriage of rely/guarantee
Design and Implementation, pages 266–277. ACM, 2007.
and separation logic. In CONCUR’07: Conference on Concurrency
[13] T. L. Harris, K. Fraser, and I. A. Pratt. A practical multi-word
Theory, volume 4703 of LNCS, pages 256–271. Springer, 2007.
compare-and-swap operation. In DISC’02: Symposium on Distributed
[36] M. Vardi. Verification of concurrent programs—the automata-
Computing, volume 2508 of LNCS, pages 265–279. Springer, 2002.
theoretic framework. Ann. Pure Appl. Logic, 51:79–98, 1991.
[14] D. Hendler, N. Shavit, and L. Yerushalmi. A scalable lock-free stack
[37] M. T. Vechev and E. Yahav. Deriving linearizable fine-grained
algorithm. In SPAA’04: Symposium on Parallelism in Algorithms and
concurrent objects.
In PLDI’08: Conference on Programming
Architectures, pages 206–215. ACM, 2004.
Languages Design and Implementation, pages 125–135. ACM, 2008.
[15] M. Herlihy. Wait-free synchronization. ACM Trans. Program. Lang.
[38] I. William N. Scherer, D. Lea, and M. L. Scott. Scalable synchronous
Syst., 13(1):124–149, 1991.
queues. In PPoPP’06: Symposium on Principles and Practice of
[16] M. Herlihy, V. Luchangco, and M. Moir. Obstruction-free syn-
Parallel Programming, pages 147–156. ACM, 2006.