Multiprocessor Architectures Using Multi Hop Multi Ops Lightwave ...
Multiprocessor Architectures Using Multi-Hop Multi-OPS
Lightwave Networks and Distributed Control
David Coudert
Afonso Ferreira
Xavier Muñoz
LIP ENS-Lyon
CNRS-Projet SLOOP
Matemàtica Aplicada i Telemàtica
46, allée d’Italie
Laboratoire I3S & INRIA
UPC
69364 Lyon Cedex 07, France
Sophia-Antipolis, France
08034 Barcelona, Spain
David.Coudert@lip.ens-lyon.fr
Afonso.Ferreira@sophia.inria.fr
xml@mat.upc.es
Abstract
termediate nodes to reach the destination. This allows the
usage of statically tuned transmitters and receivers, but on
Advances in optical technology have increased the inter-
the other hand the processing of the information by the in-
est for multiprocessor architectures based on lightwave net-
termediate nodes causes a loss of speed. Furthermore, it
works because of the vast bandwidth available. In this pa-
is clear that the smaller the number of intermediate nodes,
per we propose a passive star multi-hop lightwave network
the faster communications take place in the network. The
called stack-Kautz, based on the Kautz graph. We show that
choice of the topology for the network, at both physical and
this architecture is very cost-effective with respect to its re-
logical levels, is thus crucial. For a fixed number of trans-
sources requirements. We also propose control protocols for
mitters and receivers per node, and a fixed number of nodes
accessing the optical passive star couplers, which improve
in the network, the number of intermediate nodes a mes-
on the bit complexity of the control sequence proposed in
sage is required to hop through should be minimized. How-
the literature for the Partitioned Optical Passive Star net-
ever, other parameters have to be taken into account, like
work. Finally, we show through simulation that these con-
the simplicity of control and routing protocols, as well as
trol protocols efficiently implement shortest path routing on
the easiness of the optical realization.
the stack-Kautz network.
Networks based on OPS’s can be further classified ac-
cording to the number of optical couplers used, being
single-OPS or multi-OPS [6]. Under current optical tech-
1
Introduction
nology, however, the latter seem more viable and cost-
effective. Therefore, in this paper we address design issues
in multi-hop multi-OPS architectures and introduce a new
The advances in optical technology, such as low energy
optical interconnection network for multiprocessor systems,
loss optical passive star couplers (OPS) [8], as well as tun-
called stack-Kautz, allowing one-to-many communications
able optical transmitters and receivers [13], have increased
at every communication step.
It is based on the Kautz
the interest for optical interconnection networks for multi-
graphs (constant degree
and diameter
, for
processor systems because of their large bandwidth [4, 5].
d
blog
N
c
N
d
nodes) [1, 7, 9, 12] and on the stack-graphs [4, 6]. We study
The topologies proposed for such networks can be di-
design characteristics of the stack-Kautz network, its scal-
vided in two classes, according to the number of inter-
ability, and give control protocols for accessing the shared
mediate processors a message has to visit before deliv-
OPS’s. Finally, we show through simulation that these con-
ery [10, 11]. In a single-hop network, the nodes commu-
trol protocols efficiently implement shortest path routing on
nicate with each other in only one step. Such topologies
this network.
require either a large number of transmitters and receivers
per node, or rapidly tunable transmitters and receivers. Un-
fortunately, the delay which is presently needed to tune the
transmitters is quite large (roughly a micro-second) [13]
2
Preliminaries
compared to the transmission delay for a message, which
causes important latencies in communication.
In a multi-hop architecture, there is no direct path be-
In this section we present the OPS coupler, the POPS
tween all pairs of nodes: a communication should use in-
network, stack-graphs and the Kautz-graph.
2.1
Optical passive star
2.3
A model for multi-OPS networks
An optical passive star coupler is a single-hop one-to-
In this section we recall a model which allows us to study
many optical transmission device. An OPS(
) has
in-
more easily multi-OPS networks. The stack-graphs were
s;
z
s
puts and
outputs. In the case where
equals , the OPS is
first defined in an early version of [4] as hyper-graphs built
z
s
z
said to be of degree
(see Figure 1). When one of the input
from graphs. Briefly, stack-graphs are obtained by piling
s
processors sends a message through an OPS coupler, the
up copies of the original graph and subsequently replacing
s
output processors have access to it. Throughout this paper,
each stack of edges by one hyper-edge. A formal definition
we will use single-wavelength OPS couplers of degree .
is as follows.
s
Consequently, only one processor can send an optical sig-
nal through it per time step. An OPS coupler is a passive
Definition 1 [4] Let
be a directed graph. The
GV
;
A
optical system, i.e. it requires no power source. It is com-
stack-graph
is as follows,
being
&
s;
G
=
V
;
A
s
&
&
posed of an optical multiplexer followed by an optical fiber
called stacking-factor of the stack-graph.
or a free optical space and an optical de-multiplexer that di-
vides the incoming light signal into
equal signals of a -th
1. The set of nodes
of
is
V
&
s;
G
V
=
f0;
;
s
,
s
s
&
&
of the incoming optical power. Note that only one optical
,
.
1g
V
s
1
beam has to be guided through the circuit [8]. A practical
realization of an OPS coupler using a hologram at the out-
2. Let
be the projection function defined from
onto
V
&
puts, is described in [3].
such that
, for
and
.
V
i;
v
=
v
0
i
s
v
2
V
0
4
def
Optical passive star coupler
3. The set of hyper-arcs
of
is then
A
&
s;
G
A
=
&
&
,1
,1
.
fa
=
u;
v
ju;
v
2
Ag
&
1
5
Sources
An OPS coupler can thus be modeled as a hyper-arc as
2
6
Destinations
shown below.
3
7
Optical Passive Star coupler
0
4
0
4
1
5
Figure 1. An OPS coupler of degree 4.
1
5
Sources
2
6
2
6
Destinations
3
7
3
7
2.2
A single-hop multi-OPS network
Figure 3. Modeling an OPS by a hyper-arc.
The Partitioned Optical Passive
Star network
We refer the interested reader to [2], where modeling
, introduced in [5], is composed of
P
O
P
S
t;
g
N
=
tg
POPS by stack-graphs can be found, as well as further ap-
processors and 2 OPS couplers of degree . The processors
g
t
plications.
are divided into
groups of size (see Figure 2). Each OPS
g
t
coupler is labeled by a pair of integers
,
.
i;
j
0
i;
j
g
2.4
Kautz graphs
The input of the OPS
is connected to the -th group of
i;
j
i
processors, and the output to the -th group of processors.
j
Stack-graphs represent a powerful tool for modeling
Optical passive star couplers
multi-OPS networks and also for guiding their design. In-
0
0
deed, we can use them to build one-to-many lightwave net-
(0,0)
1
1
works based on graphs having good properties, like the con-
nection of a large number of nodes with respect to the con-
Group 0
2
2
(0,1)
stant degree and a small diameter. The Kautz graph has such
3
3
good properties and is defined as follows.
Sources
4
4
(1,0)
Destinations
5
5
Definition 2 [9] The directed Kautz graph
of de-
K
d;
k
gree
and diameter
is the digraph defined as follows (see
d
k
Group 1
6
6
(1,1)
Figure 4).
7
7
1. A vertex is labeled with a word of length
,
k
Figure 2. POPS(4,2).
, on the alphabet
,
x
;
;
x
=
f0;
;
dg
jj
=
1
k
, in which
, for
.
d
+
1
x
6
=
x
1
i
k
,
1
i
i+1
and
is the label of a processor in this group. Since the
y
stack-Kautz network inherits most of the properties of the
20
01
Kautz graph, like shortest path routing, fault tolerance and
others, we chose it as a good candidate for the topology of
02
10
an OPS-based lightwave network. In the following we will
investigate further these properties.
21
0 1 2
10
12
01
3 4 5
Figure 4. The Kautz graph
.
K
2;
2
12
20
6 7 8
9 10 11
2. There is an arc from a vertex
to
21
02
x
=
x
;
;
x
1
k
all vertices
such that
,
,
12 13 14
15 16 17
y
y
=
x
;
;
x
;
z
z
2
2
k
.
z
6
=
x
k
The Kautz graph
has
k ,1
nodes,
Figure 5. The stack-Kautz network
.
S
K
3;
2;
2
K
d;
k
N
=
d
d
+
1
constant degree
and diameter
. It is both
d
k
=
blog
N
c
d
Eulerian and Hamiltonian and the best known with respect
to the number of nodes if
[9]. As an example,
d
2
K
5;
4
has
nodes, degree
and diameter .
3.1
Characteristics
N
=
3750
5
4
It is important to note that routing on the Kautz graph is
very simple, since a shortest path routing algorithm (every
An OPS-based network with the topology of a stack-
path is of length at most ) is induced by the label of the
k
Kautz network
has
k ,1
proces-
S
K
s;
d;
k
N
=
sd
d
+
1
nodes [7, 12].
sors divided in
k ,1
groups of size . It is possi-
g
=
d
d
+
1
s
ble to preserve a small diameter
and have a large number
k
3
A multi-hop multi-OPS network
of nodes. For instance,
has
pro-
S
K
12;
5;
5
N
=
45000
cessors and diameter 5.
We now have of a good model for multi-OPS networks
Each group of
processors has an output degree
,
s
d
+
1
(the stack-graphs) and also a graph having good properties
hence it is connected to the input of
OPS couplers
d
+
1
as a multi-hop network model (the Kautz graph). In this sec-
of degree . The stack-Kautz network
requires
s
S
K
s;
d;
k
tion we introduce a multi-hop multi-OPS architecture based
2
k ,1
OPS couplers of degree . Notice that the
d
d
+
1
s
on the stack-Kautz.
number of OPS’s is independent of the stacking-factor.
Let the reflective Kautz graph
be the Kautz
Each processor has one transceiver per link. So there are
K
d;
k
r
graph
of degree
and diameter
in which we add
transmitters and receivers per processor and a total of
K
d;
k
d
k
d
+
1
an arc from every node to itself (loop). The number of nodes
2
k ,1
transceivers in the network.
sd
d
+
1
in
is the same as in
, each node has degree
For the sake of illustration, the network
K
d;
k
K
d;
k
r
P
O
P
S
t;
g
, and the node labels are the same. A reflective Kautz
with
groups of size
has 2 OPS couplers for
d
+
1
g
t
g
N
=
tg
graph has the same properties as a Kautz graph
.
processors and each processor has
transmitters and re-
K
d;
k
g
Thus, we can define the stack-Kautz graph as follows.
ceivers. It is clear that, for a same number of nodes, the
number of OPS couplers in
is smaller than in
S
K
s;
d;
k
Definition 3 The stack-Kautz graph
is the
S
K
s;
d;
k
and analogously for the individual and total
P
O
P
S
t;
g
stack-graph
of stacking-factor , degree
&
s;
K
d;
k
s
d
+
1
r
number of transceivers.
and diameter .
k
The stack-Kautz network has the topology of the stack-
3.2
Power budget and scalability
Kautz graph
and
k ,1
nodes (see
S
K
s;
d;
k
N
=
sd
d
+
1
Figure 5). Each node is a processor labeled by a pair
The power budget corresponds to the cost of sending a
x;
y
where
is the label of the stack in
and
is an
message from one processor to another, in terms of energy.
x
K
d;
k
y
integer
, i.e.,
is the label of a processor group
The goal is to minimize this value, that roughly equals the
0
y
s
x
number of OPS couplers crossed by the message times the
A simple control protocol can be implemented in any
degree of each coupler (i.e.,
in the stack-Kautz).
multi-OPS network with a bit complexity of
sk
s
log
d
+
1
+
s
In order to proportionally decrease the power budget in
bits: The processor which is in charge of the control of the
a stack-Kautz network
, the number of groups
group has
counters, one for each processor. Each counter
s
S
K
s;
d;
k
must be large with respect to the group size. Also, it is
is increased of 1 when the corresponding processor receives
better to increase the diameter of the network in order to
a refusal. A counter is set to 0 when the corresponding pro-
minimize the number of transmitters and receivers per pro-
cessor receives an acknowledgment. The larger the counter
cessor. Thus, by increasing the diameter of the network,
value, the higher the corresponding processor priority. Its
the power budget and the resources are proportionally re-
time complexity for a group of size
and degree
is
s
d
+
1
duced with respect to the number of processors. However it
.
O
s
is necessary to preserve
to have more processors in
An advanced control protocol for multi-OPS networks
s
d
the network than OPS couplers.
can be considered under the following hypothesis.
Table 1 gives numerical evidence of the resources re-
Each processor has
buffers of messages to be trans-
d
+
1
quired by the two multi-OPS networks (POPS and stack-
mitted, one per OPS coupler. At most
messages can
d
+
1
Kautz) in order to interconnect 1800 processors with 900
be proposed per processor and per communication step.
OPS’s.
The processor in charge of the control of the group has
counters (
per processor), i.e., 1 per OPS
sd
+
1
d
+
1
coupler for each processor. It adds 1 to each counter which
corresponds to a processor receiving a refusal and sets it to
S
K
12;
5;
3
P
O
P
S
60;
30
Groups
0 when the processor receives an acknowledgment.
150
30
Processors
Let
be the processor in charge of the control of a group
p
1800
1800
0
Degree OPS’s
of
processors. The protocol is as follows.
s
12
60
OPS’s
900
900
All processors of the group send successively a word
Tr.|Rec. per proc.
6
30
of
bits to
, encoding the presence or not of a
Tr.|Rec. total
d
+
1
p
0
10800
54000
message to be transmitted in each of its
buffers
d
+
1
Power budget
48
60
of messages (one for each OPS).
Broadcast (time steps)
4
2
Processor
realizes a maximum matching between
p
Advanced control
0
the processors and the OPS couplers using the weight
of bits
106
561
induced by the counters. This maximum matching is
realized using a standard algorithm.
Table 1. Examples of required resources.
Processor
sends a word of
bits to all
p
s
logd
+
2
0
processors in its group, encoding for each processor
an acknowledgment (index of an OPS coupler) or a
refusal (special word).
3.3
Control protocols
The bit complexity of this control protocol is sd + 1 +
A single wavelength OPS coupler can transmit only one
bits. Since the time complexity of the maximum
s
logd
+
2
2
message per communication step. Since
processors share
matching algorithm is
, we have
O
sd
+
1
s
OPS couplers,
, an efficient control protocol
d
+
1
s
d
Proposition 1 The time complexity of the advanced con-
is required. One such protocol was proposed in [5] for the
trol protocol for a group of size
and degree
is
POPS network
with
groups of size . It sup-
s
d
+
1
P
O
P
S
t;
g
g
t
2
.
poses that a processor can receive messages on all its re-
O
sd
+
1
ceivers at the same time and that it can send a message on
only one link per communication step. Each group of pro-
4
Simulation
t
cessors contains one node in charge of the control of the
group. The total bit complexity of the control protocol is
We built a simulator for the stack-Kautz network which
bits.
implements a shortest path routing algorithm and guaran-
t
log
g
+
g
log
t
+
t
+
g
For the control protocols of our mono-wavelength stack-
tees that a path is shorter or equal to the diameter of the
Kautz network, we suppose, as in [5], that a processor can
network. Our simulator can use either the simple or the ad-
receive messages on all its links at the same time. By read-
vanced control protocol.
ing the header of a message, a processor can decide whether
We kept the load
of the network
at
S
K
12;
5;
3
=
it has to process it or not. The control protocol has thus just
by “injecting” new messages, during 1000 communi-
0:5
to avoid local conflicts, inside a group of processors.
cation steps. Figure 6 shows the accumulated percentage of
15
4
delivered messages out of the total number of messages (left
SK(12,5,3)
SK(12,5,2)
SK(12,5,2)
SK(12,5,3)
3.5
curve), as a function of the number of steps needed, as well
3
as the total number of the delivered messages (right curve).
10
2.5
We remark that the percentages are the same for the two
2
load
load
protocols, but not the total number of delivered messages.
1.5
5
1
The difference between the total number of delivered mes-
0.5
sages in the right curve is explained by the fact that even
0
0
0
100
200
300
400
500
600
700
800
900
1000
0
50
100
150
200
250
300
350
400
450
communication steps
communication steps
though the load is kept at the same value for the two pro-
tocols, the “speed” of the messages is not the same, and
Figure
7.
Network
load
evolution
for
therefore, the total number of injected messages is not the
and
, when the prob-
S
K
12;
5;
2
S
K
12;
5;
3
same either. Thus, the advanced control protocol is much
ability of message creation changes.
better than the simple control protocol, with respect to the
number of delivered messages.
4
x 10
100
18
[4] H. Bourdin, A. Ferreira, and K. Marcus. A performance
Simple control
90
16
Advanced control
comparison between graps and hypergraph topologies for
80
14
passive star WDM lightwave networks. Computer Networks
70
12
60
Simple control
and ISDN Systems, To appear. A preliminary version ap-
10
Advanced control
50
8
peared in the Proceedings of the 2nd IEEE MPPOI, pp 257-
40
6
264.
30
Cumulative % messages completed
Cumulative number messages completed
4
20
[5] D. Chiarulli, S. Levitan, R. Melhem, J. Teza, and G. Graven-
10
2
0
0
streter.
Partitioned Optical Passive Star (POPS) Topolo-
0
5
10
15
20
25
30
35
40
0
5
10
15
20
25
30
35
40
sequence step
sequence step
gies Multiprocessor Interconnection Networks with Dis-
Figure 6. Cumulative percentage and number
tributed Control. IEEE Journal of Lightwave Technology,
14(7):1601–1612, 1996.
of messages completed.
[6] A. Ferreira. Towards effective models for optical passive star
based lightwave networks. In P. Berthomé and A. Ferreira,
Finally, it is also interesting to study the load of the net-
editors, Optical Interconnections and Parallel Processing:
works
and
, when the probability
Trends at the Interface. Kluwer Academic Publisher, Boston
S
K
12;
5;
3
S
K
12;
5;
2
of having a new message is sharply increasing. This models,
(USA), 1997.
for instance, cases where a global exchange is performed in
[7] M.A. Fiol, J.L.A. Yebra, and I. Alegre. Line digraphs itera-
the middle of a normal state of the network.
tions and the (d,k) digraph problem. IEEE Trans. on Com-
In Figure 7, the load
of the network is induced by the
puters C-33, 400-403, 1984.
probability
that a processor creates a new message at ev-
[8] S. Gardner, P. Harvey, L. Hendrick, P. Marchand, and S. Es-
p
ery step. For the curve on the left, we set
during
ener. Photorefractive Beamsplitter For Free Space Optical
p
=
0:1
200 steps, then
during another 200 steps, and finally
Interconnections.
SPIE Photonics West’97, Optoelectron-
p
=
0:2
back to
. For the curve on the right, we set
,
ics’97, Diffractive and Holographic Optics Technology IV,
p
=
0:1
p
=
0:1
San Jose (USA), February 1997.
then
for three steps and back to
. The results
p
=
1
p
=
0:1
show that the stack-Kautz is not blocked by either slow or
[9] W.H. Kautz. Bounds on directed (d,k) graphs. Theory of cel-
sharp rises of the network load, and that the load stabilizes
lular logic networks and machines. AFCRL-68-0668 Final
report, 20-28, 1968.
again in time.
[10] B. Mukherjee. WDM-Based Local Lightwave Networks Part
I: Single-Hop Systems. IEEE Networks, 6(3):12–27, may
References
1992.
[11] B. Mukherjee. WDM-based local lightwave networks part
[1] C. Berge. Hypergraphs. North Holland, 1989.
II: Multihop systems. IEEE Networks, pp. 20-32, July 1992.
[2] P. Berthomé and A. Ferreira. Improved embeddings in POPS
[12] G. Smit, P. Havinga, and P.Jansen. An Algorithm for Gen-
networks through stack-graph models. Third Internationnal
erating Node Disjoint Routes in Kautz Digraphs. Proceed-
Workshop on Massively Parallel Processing using Optical
ing Fifth International Parallel Processing Symphosium, pp.
Interconnections,pp. 130-135, July, 1996. IEEE Press.
102-107, 1991.
[3] M. Blume, F. McCormick, P. Marchand, and S. Esener. Ar-
[13] F. Sugihwo, M. Larson, and J. Harris. Low Threshold Con-
ray interconnect systems based on lenslets and CGH. SPIE
tinuously Tunable Vertical-Cavity-Surface-Emitting-Lasers
International Symposium on Optical Science, Engineering
with 19.1nm Wavelength Range. Applied Physics Letters 70,
and Instrumentation, Paper 2537-22, San Diego (USA),
page pp. 547, 1997.
1995.