Original PDF Flash format understanding-availability  


Understanding Availability

Understanding Availability
 
Ranjita Bhagwan, Stefan Savage and Geoffrey M. Voelker
Department of Computer Science and Engineering
University of California, San Diego
Abstract
who join and leave the system independently of their own
volition. Moreover, these components can be time varying.
This paper addresses a simple, yet fundamental question
For example, a peer-to-peer system may replicate some file
in the design of peer-to-peer systems: What does it mean
¢
on
machines at time . However, by time
some
£
¤
¤¦¥¨§
when we say “availability” and how does this understand-
©
machines may be turned off as their owners go to work,
ing impact the engineering of practical systems? We ar-
returning at some later time. The availability of the hosts
gue that existing measurements and models do not capture
is therefore dependent on time of day, and hence, the avail-
the complex time-varying nature of availability in today’s
¢
ability of the file
is a function of time. Another issue is
peer-to-peer environments. Further, we show that unfore-
whether the availability of a host is dependent on the avail-
seen methodological shortcomings have dramatically biased
ability of another host, or, whether two host availabilities are
previous analyses of this phenomenon. As the basis of our
interdependent. This issue is important since many peer-to-
study, we empirically characterize the availability of a large
peer systems [4, 12] are designed on the assumption that a
peer-to-peer system over a period of 7 days, analyze the de-
random selection of hosts in a P2P network do not all fail
pendence of the underlying availability distributions, mea-
together at the same time.
sure host turnover in the system, and discuss how these re-
Consequently, host availability is not well modeled as
sults may affect the design of high-availability peer-to-peer
a single stationary distribution, but instead is a combina-
services.
tion of a number of time-varying functions, ranging from
the most transient (e.g., packet loss) to the most permanent
(e.g., disk crash). Traditionally, distributed systems have as-
1
Introduction
sumed that transient failures are short enough to be transpar-
ently masked and only the long-term components of avail-
Inevitably, real systems stop working. At some point, disks
ability require explicit system engineering. In peer-to-peer
fail, hosts crash, networks partition, software miscalculates,
systems, though, this abstraction is grossly insufficient. A
administrators misconfigure or users misuse. Consequently,
new “intermittent” component of availability is introduced
the principal challenge in designing highly available sys-
by users periodically leaving and joining the system again at
tems is to tolerate each failure as it occurs and recover from
a later time. Moreover, the set of hosts that comprise the sys-
its effects. However, engineering such systems efficiently
tem is continuously changing, as new hosts arrive the system
requires the designer to make informed decisions about the
and existing hosts depart it permanently on a daily basis.
availability of individual system components.
A peer-to-peer system designed on this substrate will need
Webster’s dictionary defines availability as “the quality
to incorporate arriving hosts into it without much overhead,
of being present or ready for immediate use”. However,
while being able to provide all the functionality it promises
this seemingly simple definition can conceal tremendous
to provide in the face of regular departures.
complexity. In traditional data storage systems, the compo-
We were motivated to study peer-to-peer host availability
nents of interest are devices like disks, SCSI interfaces, and
in part to shape the design and evaluation of a highly avail-
NVRAM buffers, each of which have well-understood sta-
able, wide-area peer-to-peer storage system [15]. A primary
tistical failure properties that are usually assumed fail-stop
goal of the system is to provide efficient, highly available file
and independent (e.g., redundant disk arrays [5]). In peer-
storage even when the system is comprised of hosts with rel-
to-peer storage systems, however, the component of interest
atively poor and highly variable availability. Even so, our re-
is the host, whose availability is poorly understood by com-
sults can apply to any peer-to-peer system constructed from
parison.
a similar collection of hosts.
While the failure of individual hardware components can
The remainder of this paper examines these issues em-
still compromise the availability of a host, a peer-to-peer
pirically by characterizing host availability in a large de-
system designer must also anticipate transient software fail-
ployed peer-to-peer file sharing system over a 7 day period.
ures, partial or total communication interruption, and users
We make four principal contributions in this work: First,
¡
Department of Electrical and Computer Engineering, UCSD
we show that a minor methodological limitation of previous
1

availability measurements has lead to underestimated P2P
3.1
Overnet
host availability [3,13,14]. Second, we show that host avail-
The Overnet peer-to-peer file-sharing system is based on
ability in peer-to-peer systems is a complex enough metric
Kademlia [9] and is structured on a distributed hash table
to warrant specification by more than just a fractional value
(DHT). Overnet has no hierarchy, i.e., all hosts have identi-
between 0 and 1. Thirdly, we show that, for the purposes of
cal functionality. When a new client joins Overnet, it ran-
storage placement, the availability of hosts is dependent on
domly generates an ID for itself. This is the client’s ID, and
time-of-day, but is roughly independent of the availability
it remains unchanged on subsequent joins and leaves of the
of other hosts. Finally, we measure the system-wide dynam-
client until the user deletes the file containing the client’s
ics of the P2P network by calculating the rate at which new
preferences. For lookup and routing purposes, each host
hosts arrive the system for the first time, and existing ones
maintains a list of neighbors and their IP addresses. The
depart. We conclude with a summary of our findings.
details of this list can be found in [9].
We measure the Overnet system to model host availability
2
Related Work
for two reasons:
 
Overnet users are identified by immutable IDs, en-
Saroiu et al. [13] and Chu et al. [3] have characterized host
abling us to track hosts by ID rather than by IP address.
availability in the Gnutella [6] peer-to-peer network by ac-
Using IDs eliminates the problem of host aliasing via
tively probing TCP/IP addresses gathered using a Gnutella
DHCP and NATs.
crawler. Sen and Wang [14] described a similar study us-
ing passive measurement of flow-level data from multiple
 
Host availability studies need to use a sufficiently
routers across a large tier-1 ISP backbone. Finally, Long
widely-deployed peer-to-peer network for measure-
et al. [8] measured workstation availability in the Internet
ments to be valid and acceptable. To our knowledge,
using an active probing methodology. Unfortunately, all of
Overnet is the only widely deployed DHT-based peer-
these approaches rely on IP addresses to uniquely identify
to-peer network.
individual hosts over time. This assumption was likely ac-
curate for Long’s 1995 study, but modern dynamic address
Note that Overnet is not an open-source system or proto-
assignment protocols such as DHCP can easily cause the
col. As a result, to perform our availability measurements
same host to be counted multiple times and thereby under-
we had to reverse engineer various aspects of the Overnet
estimate host availability. Moreover, the growth in the use of
protocol. Other popular open-source systems would have
NAT boxes can affect the correctness of a TCP/IP address-
been more convenient to measure, but they do not meet the
probing technique.
requirements of our study (in particular, identifying hosts by
Weatherspoon et al. [16] analyzed the impact of failure
unique ID).
on peer-to-peer block storage systems using a model based
on disk mean time-to-failure. In a separate paper, the au-
3.2
Methodology
thors also address the issue of independence between host
failures. They mention the need to quantify dependence
Our measurement infrastructure consists of two compo-
between failures at a coarse-grained level, i.e., failures due
nents; the crawler and the prober. The crawler provides us
to network disconnectivity, OS version, etc. [17] However,
with a global view of host membership in the system. The
these efforts do not capture the complexity of the peer-to-
prober allows us to get detailed and fine-grained information
peer environment – particularly the user-controlled transient
on individual host behaviour.
outages that dominate host availability in most real sys-
Crawler: The purpose of the crawler is to collect a snap-
tems.
The dependence of host availabilities on time and
shot of the IDs of the active hosts in the network at a par-
whether there is any interdependence between host avail-
ticular point in time. It does so by repeatedly requesting
abilities stemming from user behavior have yet to be studied
50 randomly generated IDs. These requests lead to the dis-
in detail.
covery of some number of hosts. The crawler repeats the
process by sending requests for the same 50 IDs to all the
newly discovered hosts. Thus the crawler uses a recursive
3
Experimental Methodology
algorithm to discover the IDs of hosts in the network by per-
forming lookups to as many hosts as it can find. The crawler
To study host availability in peer-to-peer systems, we ac-
runs once every 4 hours to minimize impact on the system
tively measure the availability of hosts in the Overnet file-
as it locates these hosts.
sharing network [11]. In this section, we describe Overnet
Prober: The purpose of the prober is to periodically
and our reasons for choosing it over other popular systems.
probe a set of hosts to determine whether they are available
We then describe our experimental methodology for period-
in the system or not at that particular time. It uses a ran-
ically identifying hosts and subsequently probing them.
dom subset of host IDs discovered by the crawler and probes
2

them every 20 minutes. We chose only a subset of hosts to
60
probe because the overhead of probing hosts limits the fre-
quency at which we can cycle through them. The prober
50
 
determines the availability of a host with ID
by perform-
 
ing a lookup for . The lookup succeeds only if the host with
40
 
ID
responds to the probes. So a successful lookup implies
an available host running an Overnet peer.
30
All our probes look exactly like normal Overnet protocol
traffic. This is in contrast to previous measurement studies
20
of peer-to-peer networks [3, 13] that use TCP SYN packets.
Percentage of hosts (%)
10
This strategy has two main advantages. First, it eliminates
the problem of IP address aliasing due to the use of DHCP,
0
NAT, and multiple users using the same machine. Second,
0
1
2
3
4
5
6
7
8
due to Overnet’s lookup procedure, we do not repeatedly
No. of days
send probes to hosts that have been unavailable for long pe-
Figure 1: Percentage of hosts that have more than one IP
riods of time, thus keeping our experiments non-intrusive.
address across different periods of time.
4
Results
the main cause is the use of DHCP. It is common that, when
In this section, we present the results of our measurements
a host leaves the system and joins it at a later time, it does
and the inferences that we can draw from them.
First,
so with a different IP address. NATs also introduce alias-
we summarize the data obtained from the crawler and the
ing into the network by making use of private IP addresses
prober.
Then, we show the effects of aliasing on mea-
for hosts behind them. Another possible cause of aliasing is
sured availability. Next, we show how the distribution of
multiple users using the same machine; they will have dif-
host availability varies depending on the time over which
ferent unique IDs, but the same IP address.
it is calculated. We then characterize time-of-day effects
Figure 1 provides more insight into the nature of aliasing.
on host availability and characterize host availability inter-
It shows the percentage of hosts that have more than one
dependence. Finally, we measure global host membership
IP address over varying periods of time. For example, even
turnover in terms of host arrivals and departures in Overnet.
over just one day, almost 40% of all probed hosts use more
than one IP address. This number increases to 50% after
4.1
Experiment summary
4 days. These results show that measuring host availability
by probing hosts using IP addresses can be very mislead-
The crawler ran every four hours for a period of 15 days,
ing. In fact, 32% of all probed hosts used five or more IP
from January 14 through January 28, 20031. Each pass of
addresses, and 12% used 10 or more! These numbers will
the crawler yielded approximately 40,000 host IDs, while in
only get larger with longer periods of probing. So, probing
a single day, or six passes of the crawler, between 70,000
by IP address does not accurately capture the availability
and 90,000 host IDs were seen
characteristics of the hosts. IP address probing would con-
Out of the roughly 84,000 IDs that the crawler gathered
sider each IP address a new host, thus greatly overestimating
on the first day, We chose 2,400 at random for the prober to
the number of hosts in the system and underestimating their
trace at fine-grained time intervals. . It probed these hosts
availability.
every 20 minutes for 7 days, from January 15 to January 21,
To evaluate the implications of IP address aliasing on
2003. Out of the 2,400 hosts, 1,468 responded at least once
modeling host availability, we derive the host availability
to the probes.
distributions for both probing techniques. Figure 2 shows
the cumulative distribution of host availability calculated
4.2
Aliasing effects
over seven days. We calculate host availability for each host
by dividing the number of probes that a host responds to
Although only 1,468 unique hosts responded to the prober,
by the total number of probes. The darker line shows the
a total of 5,867 unique IP addresses responded to the prober.
accurate distribution obtained by using host IDs to identify
This results in a unique host ID to IP address ratio of approx-
individual hosts, while the lighter line shows the distribution
imately 1:4. Clearly, host IP address aliasing is a significant
calculated by using the first IP address that was seen for each
issue in deployed peer-to-peer systems such as Overnet. The
host ID; the lighter curve is reminiscent of the availability
aliasing effects could be due to various reasons. Most likely,
curve in the popular Gnutella study [13]. There is a signifi-
1 There was a disruption on of January 21 for roughly 24 hours due to
cant difference between the two curves, with the IP address-
storage problems.
based calculation greatly underestimating host availability.
3

100
1
80
0.8
60
0.6
40
0.4
Host availability
Percentage of hosts (%)
20
0.2
10 hours
Considering Aliasing
4 days
Ignoring Aliasing
7 days
0
0
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
0
10
20
30
40
50
60
70
80
90
100
Host availability
Percentage of hosts
Figure 2: Host availability derived using unique host ID
Figure 3: The dynamic nature of the availability distribu-
probes vs. IP address probes.
tion. It varies with the time period over which availability is
calculated.
The difference in these distributions demonstrates the ex-
tent to which using IP addresses to identify hosts is inac-
tinuing our probing measurements for longer periods of time
curate. For example, the darker curve shows that 50% of
to validate this hypothesis.
hosts have availability 0.3 or less, but, if we had used the
The implication is that, when using an availability distri-
IP address-based probing denoted by the lighter curve, we
bution to characterize hosts in the system or simply using a
would conclude that 50% of all hosts have availability 0.07
fractional value to reflect mean host availability in models of
or less. Using IP addresses would thus understimate avail-
system behavior (e.g., [4]), one needs to also specify the pe-
ability by a factor of four. If we had used the IP address
riod of time over which the availability measurement holds.
methodology to parameterize the design of a highly avail-
Also, the fact that host availability decreases over longer pe-
able storage system, we would make more replicas of files
riods of time motivates the need for periodic file refreshes
than required to compensate for the apparently low avail-
(redistributions, or re-insertions) in the system.
ability and waste storage space. For example, in one model
of peer-to-peer file availability [2], the number of file repli-
cas required to maintain a 99% file availability given a mean
4.4
Time-of-day effects
host availability of 0.07 is five times the storage overhead
Next, we characterize the effect of time-of-day on host avail-
compared to the number of replicas required given a mean
ability. To do this we need to see how the host availability
host availability of 0.3.
pattern varies with local time, where local time is based on
the geographic location of each host. To calculate local time
4.3
Host availability
for each host, we use CAIDA’s Netgeo tool [10] to deter-
mine the longitude of the host using its current IP address at
We calculated host availability in Figure 2 over seven days,
the time at which it was probed. We then map host longitude
the entire period of our active probe measurements. How-
to a local time zone.
ever, the period of time over which host availability is cal-
Figure 4 shows the number of available hosts as a func-
culated can change the distribution. To determine the extent
tion of local time at the hosts’ geographic location. The ticks
of this effect, we varied the time period over which we cal-
on the x-axis correspond to midnight on the days that are la-
culate host availability. Figure 3 shows the results of this ex-
beled, and this applies to all following time-series graphs.
periment. Over a period of 10 hours, the distribution curve
As with other studies of peer-to-peer systems, the graph
is slightly concave, while for a period of 4 days, the dis-
shows a diurnal pattern [3]. The difference between the
tribution curve becomes convex. Over 7 days, the convex-
maximum and minimum number of available hosts in a sin-
ity of the curve increases. And we suspect that with longer
gle day is roughly 100.
periods of measurement this will only increase. Put differ-
We also found that on average, there were 9413 host joins
ently, the distribution curve moves more and more to the left
and leaves per day, or 6.4 joins and leaves per host per day.
as the period over which availability is calculated increases.
This figure is considerable, given that the number of hosts
This stems from the fact that we are probing the same hosts
that were probed and responded to probes were only 1468.
over the entire period, and the longer the period of time, the
In a system such as CFS, which actively replicates objects
greater the chances of a host being unavailable. We are con-
immediately when it learns of a host join or leave, this could
4

700
35
600
30
500
25
400
20
300
15
No. of hosts up 200
10
100
Percentage of host pairs (%)
5
0
0
-1
-0.5
0
0.5
1
01/15
01/16
01/17
01/18
01/19
01/20
01/21
01/22
01/23
Local time
Difference between P(Y=1/X=1) and P(Y=1)
Figure 4: Diurnal patterns in number of available hosts.
Figure 5: Probability density function of the difference be-
tween P(Y=1/X=1) and P(Y=1).
cause a large amount of overhead in terms of the amount of
data transferred between hosts.
pick a small subset of hosts randomly, it is highly unlikely
The other feature to notice in this graph is the steady de-
that the availabilities of all of them are strongly dependent
crease in the total number of hosts that are available over
on each other, even though each may show a strong correla-
subsequent days, which was reflected in our availability dis-
tion with time of day. For example, in CFS, the size of this
tribution measurement. Although limited by a short trace
subset is 6, while in Kademlia, it is 20. The probability of
duration, the trend indicates a decay of about 32 hosts per
all these hosts failing together would be very low.
day.
The fact that there is a steady decay in the num-
ber of hosts with time indicates that in a system such as
Oceanstore [7], frequent and periodic file refreshes are re-
4.6
Arrivals and departures
quired to maintain high file availability.
Host turnover is important for peer-to-peer systems that rely
upon long-term host membership. For example, archival
4.5
Host availability interdependence
peer-to-peer storage systems like Oceanstore use a high de-
gree of redundancy to mask host failures and departures over
The diurnal pattern indicates that availability varies with
time. The rate of host turnover fundamentally determines
time-of-day. At non-peak hours a number of hosts become
the rate at which the system must refresh and restore the re-
unavailable. Most structured peer-to-peer storage systems,
dundancy in the system to maintain file availability [2, 16],
e.g., [4, 9], assume that this happens with very low proba-
and the overhead that this process entails.
bility, failing which, objects stored in the system could be
To characterize host membership turnover in Overnet, we
lost forever. To our knowledge, this is the first study to in-
would like to determine the rate at which new hosts enter
vestigate the extent that this assumption holds.
the system for the first time (arrive) and the rate at which
We characterize the dependence between every host pair
existing hosts leave the system permanently (depart). Note
using conditional probabilities. Consider two hosts X and
the distinction with host joins and leaves, which refer to in-
Y. We need to determine the conditional probability of Y
termittent disconnections of hosts from the system. We esti-
being available given that X is available for a given time-of-
mate arrival and departure rates in Overnet using the 15-day
day . Call this value P(Y=1/X=1). If this is equal to the
crawler trace of active hosts. We consider the first occur-
¤
probability that Y is available whether or not X is available,
rence of a host ID in the trace as an arriving host, and the
or P(Y=1), then X and Y are independent. If independent,
last occurrence of a host ID as a departing host.
then the availability of X at time
does not imply anything
Figure 6 shows host arrivals and departures as a fraction
¤
about the availability of Y at that time.
of the number of active hosts in the system for each day
We calculated P(Y=1/X=1) and P(Y=1) for every host
in the trace. For perspective, during this period the crawler
pair from our empirical data for each hour in the trace. Fig-
found that roughly 85,000 hosts in Overnet were active each
ure 5 shows the probability density function of the difference
day. From the graph, we see that Overnet has a significant
between these two values. The graph shows that more than
degree of turnover. Each day, new hosts never seen before
30% of all pairs have 0 difference. Further, 80% of all host
in the trace comprise over 20% of the system (or roughly
pairs lie between +0.2 and -0.2, indicating that there is sig-
17,000 hosts/day). At the same time, existing hosts are de-
nificant independence between host pairs. So if we were to
parting at roughly the same rate. As a result, the overall
5

1
Proceedings of FuDiCo: Future directions in Distributed
Arrivals
Computing, June 2002.
0.9
Departures
[2] R. Bhagwan, S. Savage, and G. M. Voelker.
Replication
0.8
strategies for highly available peer-to-peer systems. Tech-
0.7
nical Report CS2002-0726, University of California, San
Diego, Nov 2002.
0.6
[3] J. Chu, K. Labonte, and B. Levine. Availability and locality
0.5
measurements of peer-to-peer file systems. In Proceedings of
ITCom: Scalability and Traffic Control in IP Networks
, July
0.4
Fraction of hosts
2002.
0.3
[4] F. Dabek, M. Kaashoek, D. Karger, R. Morris, and I. Stoica.
0.2
Wide-area cooperative storage with cfs. In proceedings of
the 18th ACM Symposium on Operating System Principles

0.1
01/15
01/17
01/19
01/21
01/23
01/25
01/27
01/29
(SOSP) , 2001.
Time
[5] G. Gibson. Redundant Disk Arrays: Reliable, Parallel Sec-
ondary Storage.
PhD thesis, University of California at
Figure 6: New host arrivals and existing host departures in
Berkeley, 1990. Report UCB/CSD 91/613.
Overnet as a fraction of all hosts in the system ( approxi-
[6] Gnutella homepage, http://www.gnutella.com.
mately 85,000 during this period). The high values at the
[7] J. Kubiatowicz, D. Bindel, Y. Chen, P. Eaton, D. Geels,
beginning and end of the period are artifacts of starting and
R. Gummadi, S. Rhea, H. Weatherspoon, W. Weimer,
ending the trace.
C. Wells, and B. Zhao.
Oceanstore: An architecture for
global-scale persistent storage. In Proceedings of ACM ASP-
LOS
, 2000.
size of Overnet stayed constant over our trace period. Since
[8] D. Long, A. Muir, and R. Golding. A longitudinal study of in-
our trace is only 15 days, though, these results only capture
ternet host reliability. In Proceedings of the Fourteenth Sym-
posium on Reliable Distributed Systems
, September 1995.
short-term turnover. We are continuing our trace to capture
[9] P. Maymounkov and D. Mazieres. Kademlia: A peer-to-peer
long-term turnover as well.
information system based on the xor metric. In Proceedings
of the 1st International Workshop on Peer-to -Peer Systems
(IPTPS’02)
, March 2002.
5
Summary
[10] Netgeo
-
the
internet
geographic
database,
http://www.caida.org/tools/utilities/netgeo/.
In this paper we studied several characteristics of host avail-
[11] Overnet website, http://www.overnet.com.
ability in the Overnet peer-to-peer file sharing system, and
[12] A. Rowstron and P. Druschel.
Storage management and
discussed the implications of our findings on the design and
caching in past, a large-scale, persistent peer-to-peer storage
operation of peer-to-peer systems. We found that IP address
utility. In Proceedings of the 18th ACM Symposium on Oper-
aliasing is a significant problem in these systems, and that
ating Systems Principles(SOSP ’01), 2001.
[13] S. Saroiu, P. K. Gummadi, and S. D. Gribble. A measurement
measurements according to host IP address significantly un-
study of peer-to-peer file sharing systems. In Proceedings of
derestimate peer-to-peer host availability. We also argue that
MMCN, 2002.
availability is not well-modeled by a single-parameter distri-
[14] S. Sen and J. Wang. Analyzing peer-to-peer traffic over large
bution, but instead is a combination of two time-varying dis-
networks. In Proceedings of ACM SIGCOMM Internet Mea-
tributions: (1) short-term daily joins and leaves of individ-
surement Workshop, November 2002.
ual hosts, and (2) long-term host arrivals and departures. In
[15] Total recall website, http://ramp.ucsd.edu/projects/recall/.
our Overnet trace, both behaviors significantly impact host
[16] H. Weatherspoon and J. Kubiatowicz. Erasure coding v/s
availability. For a given set of hosts probed at a fine time
replication: a quantitative approach. In Proceedings of the
granularity, each host joined and left the system 6.4 times
First International Workshop on Peer-to-peer Systems, 2002.
a day on average. For a global crawl of all active hosts in
[17] H. Weatherspoon, T. Moscovitz, and J. Kubiatowicz. Intro-
the system at a coarser granularity, we also found that host
spective failure analysis: Avoiding correlated failures in peer-
to-peer systems. In Proceedings of the International Work-
turnover in the system is considerable: over 20% of the hosts
shop on Reliable Peer-to-peer Distributed Systems, October
in system arrive and depart every day. Peer-to-peer systems
2002.
must take into account both sources of host unavailability to
gracefully and efficiently provide highly available service.
References
[1] R. Bhagwan, D. Moore, S. Savage, and G. M. Voelker. Repli-
cation strategies for highly available peer-to-peer storage. In
6