Local View Memory Model

From X10Wiki

(Redirected from Local View Memory Models)
Jump to: navigation, search

Memory Model Table of Contents

The purpose of the framework of Local View (LV) Memory Models is to give a very elementary setting for the discussion of a variety of memory models.

The intuitive idea behind the LV memory models is to view the ensemble of n processors interacting with shared memory as an ensemble with certain rules governing their interactions. Interactions are described by events. A typical event is a load or a store performed by a processor against memory. These events may be seen by other processors in the system (be performed with respect to them) -- but not necessarily in the same order. Thus each processor has a view of the events in the system, and a memory model specifies rules that govern these views.

Let us fix some notation. These definitions are essentially due to Sebastian Burckhardt.

P -- set of processors 1..n 
A -- set of addresses
V -- set of values
i(a) in V -- initial value of location a in A
E -- set of events 
L -- set of load events
S -- set of store events
p(e) in P -- processor that executes the event e in E
a(e) in A -- storage addressed used by event e in E
v(e) in V -- value loaded or stored by event e in E

A memory model in the LV family is specified by describing the conditions for a valid execution in the memory model. An execution Z is a tuple (E, <1,..., <n, <) where

E is the set of events for the execution,
<p (for p in 1..n) <p is a total order on E. It expresses a per-processor view of the ordering between events.
< is a relation on E, the global view. It is said to be global because it must be respected by all processors: d < e implies d <p e for all processors p.

For example, < may include what is called program order, the order of events performed by a particular processor.

Below we shall abuse notation and for events d,e in E write d <d e to mean d <p(d) e.

A program is presented as a set of events E, and some facts that lie in <, e.g. facts representing the program order for all processes. An execution for the program is an valid execution in the memory model whose set of events is E and whose global view relation extends the given <.

Properties of LV Memory Models

Note: We are still experimenting with the right orthogonal decomposition of properties. So this section is changing. Please bear with us.

The following properties of an execution Z are of great interest:

Cache Coherence
< totally orders S(a), for all a in A.
v(l)=v(s) where s is the last <p-last store to a(l), if there is one, and the initial value i(a) otherwise.
Load Coherence
l <l s, a(s)=a(l) implies l < s

Load coherence implies that {l} u S(a(l)) is totally ordered by <, for all l in L.

Cumulativity
Say that l in loads(s) if s in S, l in L(a(s)) and phi(s)=phi(l).
(A-cumulativity) c <d d < e implies c < e, for e in E(p(d))
(B-cumulativity) c < s,l < e implies c < e, for l in loads(s), e in E(p(l))
Causal transitivity
c < d < e implies c < e, for c,d,e in E.
Pairwise Causal transitivity
c < d < e implies c < e, for c,d,e in E(p) u E(q), p,q in P
Causality = Cumulativity + Causal Transitivity
Pairwise Causality = Cumulativity + Pairwise Causal Transitivity
SL-restricted Causality
s < l implies there exists k in L(a(s)): s < k < l
Consistency
There is a single total order in which events are executed by a processor, and stores are performed wrt to it.
<p totally orders E(p) u S, for each p in P.
Total consistency
There is a single total order in which events are executed by a processor, or performed wrt to it.
<p totally orders E, for each p in P.
Load Completeness
When a load completes (returns its value to the processor executing it), it has been performed with respect to all processors. Once a load has been performed with respect to a processor, no store by the processor can affect the value returned by the load.
k <k l implies k <l l, for all loads k,l
s <l l implies s <s l, for all stores s and loads l in L(a(s))

See the discussion in PPCL. This is a property of the PowerPC model.

Various LV Memory Models

We say that a memory model is sequentially consistent (SC, for short) if from every permissible execution we can create a total ordering of all events which respects < and produces the same values for all loads.

Definitions
We define the following local view memory models:
SC: executions must satisfy the property that < is a total order of E. (SC is sequentially consistent by definition.)
CCC: executions must satisfy Cache-coherence, Consistency.
CCCC: executions must satisfy CCC and Causality.
Strong CCCC: executions must satisfy CCC, Pairwise Transitivity and Load Coherence.
Weak CCCC: executions must satisfy CCCC and SL-restricted Causality.
PPCL-HW: executions must satisfy CCCC, Total Consistency, and Load Completion. (This models the PowerPC with hwsyncs between each pair of loads/stores.)

It is hard to argue that a memory model should not satisfy Cache-coherence or Consistency. So all models of interest will be CCC.

Theorem
CCCC and Load Coherence is SC
Theorem
PPCL-HW is SC
Theorem
Strong CCCC is not SC. It gives a non-SC execution for IRIW.
Theorem
CCCC and Weak CCCC are not SC. They give non-SC executions for Dekker (and, therefore, IRIW).

So we see that there is a tradeoff between Causal Transitivity and Load Coherence. If we add both to CCC+Cumulativity, we get SC. If we add the first we get (effectively) Weak CCCC. If we add the second (and Binary Transitivity) we get Strong CCCC.

Q1: Is there any practical difference between CCCC and Weak CCCC? Does the SL-restricted causality condition in Weak CCCC prevent any useful idioms? (The SL-restricted causality condition is present to accurately model the PPC with lwsync fences.)

(Need to revise PPCL to permit hwsync and lwsync. Now we can prove Weak CCCC can be realized with the load lwsync and lwsync store pattern.)

Q2: Can Strong CCCC be realized by succeeding each lwsync load and lwsync store hwsync?

Q3: Is CCCC+Load Completion SC? (i.e. is total consistency really needed)

Cache-coherent local view memory models

We shall only be concerned with local view memory models that are Cache Coherent. For such models one can define a notion of phase phi(e) for each event e. The phase of a store s, phi(s), is the rank of the store in the total order on S(a). The phase of a load l, phi(l), is 0 if the value returned is the initial value, otherwise it is the phase of the store whose value was returned by the load.

Let Z = (E,<1,..., <n,<) be an execution.

We label each element in E with a phase. The phase of a store s is its rank in the total order of S(a(s)) given by <. The phase of a load l is 0 if it reads i(a(l)), otherwise it is the rank of the store s whose value l returns. For each event e we let phi(e) denote its phase.

Execution graph G(Z) of an execution Z=(E,<1,...,<n,<).

The set of nodes of G(Z) is E. There are two kinds of edges in G(Z):

cross-edges
for all stores s, t and loads l: s -> t if a(s)=a(t) and phi(t)=phi(s)+1,
s -> l if a(s)=a(l) and phi(t)=phi(s),
l -> s if a(s) = a(l)+1 and phi(s)=phi(l)+1
down-edges
for all processors p and events d, e in E(p): d -> e if d <p e (and there is no other event in between).

Say that a total ordering << of E is valid if d << e implies d ->* e, and d < e implies d << e.

Proposition: Given an execution Z=(E,<1,...,<n,<) and a valid total ordering <<.Then Z1=(E,<1,...,<n,<<) is an SC execution with the same value for each load as Z.

Therefore for any cache-coherent local view memory model MM in order to show that MM is sequentially consistent we merely need to show that for any execution Z of MM, G(Z) does not have a cycle. For, if it doesnt have a cycle we can enumerate all its vertices in some total order

The following definition is often useful in proving sequential consistency.

Definition
Phase-order. Say that a sequence of events Q is in phase order if the phase of each address does not decrease as the sequence is traversed and stays the same only at loads. Formally, for any d,e, if d precedes e in Q, then either phi(d) < phi(e) or (phi(d)=phi(e) and d,e in L).
Personal tools