Yale University
Department of Computer Science
Inoculation Strategies for Victims of Viruses
and the Sum-of-Squares Partition Problem
James Aspnes
12
Kevin Chang
13
Aleksandr Yampolskiy
14
YALEU/DCS/TR-1295
July 2004
1
Department of Computer Science, Yale University, New Haven, CT 06520-8285, USA.
2
Email: aspnes@cs.yale.edu. Supported in part by NSF grants CCR-0098078 and CNS-
0305258.
3
Email: kevin.chang@yale.edu. Supported by NSF grant CCR-0331548.
4
Email: aleksandr.yampolskiy@yale.edu. Supported by NSF grants CCR-0098078 and ANI-
0207399.
Inoculation Strategies for Victims of Viruses
and the Sum-of-Squares Partition Problem
James Aspnes†
Kevin Chang‡
Aleksandr Yampolskiy§
Abstract
We propose a simple game for modeling containment of the spread of viruses in a graph of n
nodes. Each node must choose to either install anti-virus software at some known cost C, or risk
infection andaloss Lifavirusthat starts at a random initial point in the graph can reach it
without being stopped by some intermediate node. The goal of individual nodes is to minimize
their individual expected cost. We prove many game theoretic properties of the model, including
an easily applied characterization of Nashequilibria, culminating in our showing that allowing
selfish users to choose Nashequilibrium strategies is highly undesirable, because the price of
anarchy is an unacceptable
(
n) in the worst case. This shows in particular that a centralized
solution can give a much better total cost than an equilibrium solution. Though it is NP-hard
to compute such a social optimum, we show that the problem can be reduced to a previously
unconsidered combinatorial problem that we call the sum-of-squares partition problem.
Using a greedy algorithm based on sparse cuts, we show that this problem can be approximated
to within a factor of O (log2 n), giving the same approximation ratio for the inoculation game.
1
Introduction
Consider a system in whichnmachines, each of which mayor may not be protected against
viruses, are connected by a network in the form of a graph, and any virus that infects some
machine eventually infects all of its unprotected neighbors. If anti-virus software is available, a
natural response would be to protect all the machines—but perhaps the anti-virus software itself
creates costs, both in money and time to purchase and install the software and in reducedeciency
or usability of the protected machine. Suppose that protectinga machine by installing anti-virus
software costs the owner of the machine C, but that having the machine be infected costs L, which
mayor may not be greater than C. If the virus spreads from some initial machine chosen uniformly
at random, on which machines does it make sense to install the anti-virus software?
The answer will depend on whether we adopt the perspective of the owner of a single machine
or of the society as a whole. When the anti-virus software costs more than the loss from infection,
no economically rational machine owner will install the anti-virus software, every machine will be
infected, and the system will incur asocial cost of Ln. But for many graphs, selective inoculation
of a few central machines can limit the spread of infection to a small subset of the graph, greatly
reducing the total cost of infection in return fora small investment in anti-virus software. We can
ask how much of an improvementa centralized solution can provide, and how easy it is to find a
good centralized solution.
Department of Computer Science, Yale University, New Haven, CT 06520-8285, USA.
†
Email: aspnes@cs.yale.edu. Supported in part by NSF grants CCR-0098078 and CNS-0305258.
‡
Email: kevin.chang@yale.edu. Supported by NSF grant CCR-0331548.
§
Email: aleksandr.yampolskiy@yale.edu. Supported by NSF grants CCR-0098078 and ANI-0207399.
1
After discussing some previous work on related problems (in Section 2), we give a complete
characterization of the Nashequilibria for an anti-virus software installation game in which each
machine'sownerseparately chooses whether or not to install the software, without regard to the
eect
on other machines. (This game is defined in Section 3.) We show (in Section 4) that
finding either the most or least expensive equilibrium is NP-hard, but that some Nashequilibrium
can be computed in O (n3) time and that any population of nodes will quickly converge toa Nash
equilibrium by updating their strategies locally based on the other nodes'strategies. Unfortunately,
the cost of any such Nash equilibrium maybe badly suboptimal; the price of anarchy for this
game is
(
n) in the worst case. This shows that for many graphs and values of Cand L, letting
the users choose individually whether or not to inoculate their machines will give bad results.
We then consider (in Section 5) the possibility of a centralized solution in which a dictator
computes and enforces an optimal inoculation plan. We show that essentially the same argument
that shows that extreme Nashequilibria are hard to find applies to the optimal solution as well.
However, we show that the problem of finding an optimal inoculation plan reduces to a graph
partition problem in which we are asked to partition the graph by removingmnodes; the quality
of the partition is measured by the sum of the squares of the sizes of its components. We give (in
Section6) a polynomial-time approximation algorithm that removes O (log2 n) mnodesinorderto
achievea partition with quality within O (1) of the optimum.
Conclusions and open problems appear in Section 7.
2
Related work
In this section, we describe three classes of work related to this paper: virus propagation models,
economic models of investment insecurity, and game-theoretic models of security. We then discuss
some work on the graph partition problem that is related to the sum-of-squares partition problem
we consider in Section 6.
2.1 Viruspropagation models
Traditional epidemiological models characterize the viral infection in terms of birthrate and death
rate of the virus[4,9]. Usually, these models assume that an infected individual is equally likely
to infect any other individual in the population; meanwhile, computer viruses usually spread via
localized interactions. Kephartand White extended the traditional model by transferring it onto
a directed random graph[16]. Later works (e.g. , [15,17,25]) studied virus propagation over other
kinds of graphs, including Internet-like power-law graphs[22,23,26]. We do not restrict the network
topology in anyway and consider a general undirected graph. Our model is in someways closer to
models in percolation theory (see[18]): an infected node infects all of its unprotected neighbors,
spreading infection throughout the graph until it is blocked by an anti-virus software.
2.2 Economicmodels of security
Our work is motivated in part by an observation that security technologies exhibit network externalities[1].
Specifically, the benefit obtained by using security technology (anti-virus software in
our case) does not accrue only to the user of the security technology but rather to all users of the
network. Aspnesetal. [3]also consider anti-virus immunization, and proposed studying how to
encourage highly connected nodes touse anti-viral techniques.
2
We assume that costs of installation and infection are known. Alternatively, one could use risk
analysis to estimate the costs and benefits from installing a security technology (see, for example,
[13]), or estimate values based on empirical studies of the costs of security breaches[6,10].
2.3 Game-theoretic models of security
Application of game theory to network security has yielded interesting results [11,12,24]. For
example, Bell uses a simple game to study network reliability. In the game, the router tries to find
a least cost path and a network tester tries to maximize this cost by failing links[5]. Kunreuther
and Heal recently introduced the notion of interdependent security (IDS) games, in which
decisions to adopt security technology by one agentaect
other agents[20]. Kearnsand Ortiz
subsequently extended their paper and gave an algorithm for finding approximate Nashequilibria
in this model[14].
Our work is similar to work on IDS games in certain respects: each agent in both our game and
an IDS game makes a decision whether or not to invest money in a security technology, and this
decisionaects
other agents. The maindierences
are that we assume that installing anti-virus
software protects against all badeects
of viruses, while the IDS work concentrates on negative
side-eects
of security breaches even on protected parties; and we assumea restricted network
topology that contains the spread of viruses, while the IDS work assumes a complete topology.
2.4 Graphpartition problems
In Section6, we describe and provide an approximate solution fora new graph partitioning
problem. Previous work on other forms of graph partitioning includes the approximation algo-
rithmofLeighton and Rao[21]for sparsest cut, from which the same authors derive apseudo-
approximation algorithm forb-balanced cuts, where each side of the cut must have sizeb|V|or
greater. The case ofb=1/2 is graph bisection, for which Feigeand Krauthgamer[8]giveagood
approximation algorithm. Even et al. [7]give O (logn)-ratio pseudo-approximation algorithms for
several balanced partitioning problems, including the-separator problem and thek-balanced
partitioning problem.
3
Our model
We represent network topology by an undirected graph G= (V,E), where V={0,1,...,n 1}is
a set of network hosts and E
V?Visasetof (bidirectional) communication links. Our basic
model for installing anti-virus software is a one-round game:
Players. Our game hasnplayers corresponding to nodes of the graph. Initially, all nodes are
insecure and vulnerable to infection.
Strategies. We denote the strategy of i by a
i
. Each node i has two possible actions: do nothing
and risk being infected or inoculate itself by installing anti-virus software. Node i 'sstrategy
a
i
is the probability that it inoculates itself.
Nodes'choices can be summarized in a strategy profile~a 2[0,1]n. Ifa
i
is 0or1, we say
that node i adopts a pure strategy; otherwise, its strategy is mixed. We call nodes that
install anti-virus software secure and denote the set of such nodes by I
~a
. We associate an
3
attack graph G
~a
=GI
~a
with~a. It is essentially the network graph with secure nodes
and their edges removed (see also Figure 1). Note that both I
~a
andG
~a
are random variables
unless all strategies are pure.
Attack model. After the nodes made their choices, the adversary picks some node uniformly
at random as a starting point for infection. Infection then propagates through the network
graph. Node i gets infected if it has no anti-virus software installed and if any of its neighbors
become infected.
Individual costs. Suppose it costs Ctoinstallanti-virus software. If anode is infected, it suers
alossequaltoL. For simplicity, we assume that both Cand Lareknownandare the same
for all nodes; we discuss possible consequences of removing these assumptions in Section 7.
The cost of a mixed strategy~a 2 [0,1]ntonodeiis
cost
i
(~a) =a
i
C+ (1 a
i
) L·p
i
(~a) .
Herep
i
(~a) is the probability of node i being infected given the strategy profile~a, conditioned
on the event that node i does not install the anti-virus software. It is equal to the probability
that some vulnerable node reachable from i without passing through a secure node is the
initial point of infection. For pure strategies, this is justk
i
/n, where k
i
is the size of the
connected component containing i in the attack graph G
~a
.
1
2
3
0
4
5
G
1
2
3
0
4
5
G
a
Figure1: Sample graph Ganditsattack graph G
~a
for~a=010100.
Social cost. The total social cost of a strategy profile is the sum of the individual costs. For pure
strategies, there is a simple characterization of the total social cost in terms of the attack
graphG
~a
. Because each node in the same component of G
~a
has the same chance of infection,
thek
i
nodes in the i -thcomponent between them face a loss of k
i
· (Lk
i
/n) = (L/n) k2
i
. So
the social cost is
cost(~a) =
n
1
X
j=0
cost
j
(~a)
=
n
1
X
j=0
a
j
C+ (1 a
j
) L·p
j
(~a)
=C|I
~a
|+
L
n
l
X
i=1
k2
i
,
4
wherek
1
,k
2
,...k
'
are the sizes of the components in G
~a
.
4
Nashequilibria
We consider first the choices that the nodes will make in the absence of coordination, by examining
theNash equilibria of the game defined in Section 3. The assumption that the nodes will reach
a Nashequilibrium is a very strong one, as it requires assuming that the nodes are aware of each
other'schoicesto install or not and that the nodes can evaluate C (printed on the box for the antivirus
software) and L (which is more problematic). It also assumes that the nodes can compute
a Nashequilibrium in a reasonable amount of time, which is not always possible for some games.
However, we can show that Nashequilibria for our game are easily characterized in terms of the
sizes of the components of the attack graph (Section 4.1), and that a population will converge to
someNash equilibrium quickly even though finding the best or worst pure equilibrium as measured
by total cost is NP-hard (Section 4.2).
We can further imagine that some of thediculties
of limited information could be overcome
by considering an iterated game where nodes pay Ctorenttheanti-virus software in each round
and update their strategies based on observations of losses to viruses and the strategies of other
nodes in previous rounds; though we do not analyze this multi-round game formally, a simplified
version is implicit in our convergence result. We also show that the hardness of finding the worst-
case equilibrium does not prevent obtaining further information about its behavior; for example,
its total cost is nondecreasing as a function of the inoculation cost C (Section 4.3).
Unfortunately, selfish behavior proves to be highly undesirable, because the cost ofa Nash
equilibrium solution maybe very far from the social optimum. In Section 4.4, we prove that while
the price of anarchy, defined as the ratio of total cost between the worst Nashequilibrium and
the social optimum never exceedsn, this bound is tight up to constant factors for some graphs and
choicesofCand L.
4.1 Characterization of mixed and pure equilibria
A useful feature of the Nashequilibrium for our game is its simple characterization: there is always
a component-size thresholdt=Cn/Lsuchthateach node will install the anti-virus software if it
would otherwise end up in a component of vulnerable nodes with expected size greater thant, and
will not install the software if it would otherwise end up in a component with expected size less
thant. When the expected component size equalst, the node isindierent
between installing and
not installing and may adoptamixed strategy. The threshold arises in a natural way: it is the
break-even point at which the cost Cofinstallingthe software equals the expected loss L (t/n) of
not installing.
We define~a[i/x]to be the strategy vector that is identical to~a, except the i 'thcomponenta
i
isreplacedbyx. Note that attack graph G
~a[i/0]
is the attack graph in which player i never installs
the anti-virus software.
Theorem1 (Characterization of mixed equilibria): Suppose S (i) is the expected size of the insecure
component that contains node i of the attack graph G
~a[i/0]
, (i.e. S (i) =np
i
(~a) ).
FixC,L. Let the threshold bet=Cn/L. A strategy profile ~aisa Nashequilibrium if and only
if
5
(a) For all i such that a
i
=1, S (i) t.
(b) For all i such that a
i
=0, S (i) t.
(c) For all i such that 0<a
i
<1, S (i) =t.
Proof:
Suppose~aisa Nashequilibrium and consider node i. The expected cost to node i is a
i
C+(1
a
i
)(L/n) S (i).
1. Supposea
i
=0. Then nodeihasexpected cost (L/n) S (i). If (L/n) S (i) >C, then i will
want find the situationa
i
=1 withcostCpreferable. Thus, we must have S (i) CL/n=t.
2. Supposea
i
=1. Then nodeihasexpected cost C. If (L/n) S (i) <C, then i would find
the situationa
i
=0 withexpected cost (L/n) S (i) <Cpreferable. Thus, we must have
S (i) CL/n=t.
3. Suppose0<a
i
<1. If (L/n) S (i) >C, then i will find the situationa
i
=1 preferable.
If (L/n) S (i) <C, then i will find the situationa
i
=0 preferable. Thus, we must have
S (i) =CL/n=t.
Thus, ~asatisfies condition (a), (b), and (c) above.
Conversely, suppose ~asatisfies conditions (a), (b) and (c) of the theorem. Consider node i.
1. Supposea
i
=0. Then nodeiwillhave expected cost (L/n) S (i) <C, and thus will not want
to switch to a dierenta
i
that puts any weight on installing at cost C.
2. Supposea
i
=1. Then nodeiwillhave cost C, and thus will not want to switch to a dierent
a
i
that puts any weight on being insecure at expected cost (L/n) S (i) C.
3. Suppose0<a
i
<1. Then nodeiwillhave expected costa
i
C+ (1 a
i
)(L/n) S (i) =C.
Switching to any other strategy will have the same expected cost.
Thus, ~aisaNash equilibrium.
A special case of Theorem 1 is the following characterization for pure Nashequilibria. Because
nodes in a pure Nash equilibrium do not make randomized choices, the attack graph is nota random
object, but a determined graph. We have the same threshold conditions as before, but the removal
of randomness simplifies the situation.
Corollary2 (Characterization of pure equilibria) Fix C,L. Let the threshold bet=Cn/L. A
strategy profile~aisapure Nashequilibrium if and only if
(a) Every component in attack graph G
~a
hassizeatmostt.
(b) Inserting any secure nodej 2 I
~a
and its edges into G
~a
yields a component of size at leastt.
For example, let C=0.5 and L=1, and consider the graph in Figure 1. The threshold for this
graph is t=Cn/L=3. Then Corollary 2 tells us that pure strategy ~a=010100 must bea Nash
equilibrium for these Cand L.
6
4.2 Computing pure Nashequilibria
Designing algorithms for finding mixed Nashequilibriaor proving hardness results for finding
optimized mixed equilibria would most likely involve estimating or otherwise manipulating the
expected value of the sizes of components in the attack graph, which is at the very least anon-
trivial problem. Furthermore, in the absence of central control, nodes attempting to calculate their
best strategy based on a mixed strategy paradigm would possibly run into similar computational
issues.
Thus, we turn our attention to the computation and hardness of pure Nashequilibria. Corollary
2givesusa powerful tool with which to reason about pure Nashequilibria. We now show
that computing the best or worst pure Nashequilibriais hard, but that finding some intermediate
Nashequilibrium is easy. A consequence of this algorithm is that the existence of a pure Nash
equilibrium is always guaranteed. (The existence of a mixed Nashequilibrium is a consequence of
Nash'stheorem.)
Theorem 3Both computing the pure Nashequilibrium with lowest cost and computing the pure
Nashequilibrium with highest cost are NP-hard problems.
Proof: We reduce vertex cover to the decision problem"Does there exist a pure Nashequilib-
riumwithcostless thanc?"and we reduce independent dominating set to"Does there exist
a pure Nash equilibrium with cost greater thanc?"
Fix some graph G= (V,E), and set C/L=1.5/nsothatt=Cn/L=1.5, where tis the
component size threshold from Corollary 2. Then from Corollary 2, in any Nashequilibrium the
components of the attack graph all have size at most 1, and any secure node is adjacent to some
insecure node (as otherwise it coulduninstall its software and be in a component of size at most
1). It follows that ina Nashequilibrium (a) every vulnerable node is either isolated or has all
neighbors secure, and (b) every secure node has an insecure neighbor.
We now argue that Ghasavertexcover of size kif and only if the inoculation game on
Gwiththeabove settings of Cand Lhasa Nashequilibrium with kor fewer secure nodes, or
equivalently an equilibrium with social cost Ck+(n
k) L/nor less, as each insecure node must be
in a component of size 1 and contribute exactly L/nexpectedcost. Given a minimal vertex cover
V0
V, observe that installing the software on all nodes in V0 satisfies condition (a) because V0
is a vertex cover, and (b) because V0 is minimal. Conversely, if V0 is the set of secure nodes ina
Nashequilibrium, then V0 is a vertex cover by condition (a). This concludes the proof that finding
a minimum-cost Nashequilibrium is NP-hard.
Fora maximum cost equilibrium, consider the set of insecure vertices. These consist of isolated
nodes (which are already in components of size 1) and nodes that do not install the software because
all their neighbors do. Given an independent dominating set V0
V, installing the software on all
nodes except the nodes in V0 satisfies condition (a) because V0 is independent and (b) because V0
is a dominating set. Conversely, the insecure nodes in any Nashequilibrium are independent by
condition (a) and dominating by condition (b). This shows that Ghasan independent dominating
set of size kif and only if it hasa Nashequilibrium with no more thank insecure nodes, which
occurs only if it hasa Nashequilibrium with at leastn
ksecurenodesor, equivalently, a cost of
at least C (n
k) + (L/n)(k).
Theorem 3saysthatfinding extreme pure equilibria is hard. But what if we just want to
converge to some equilibrium, but we don'tcarewhichone? Suppose we implement the process
7
of convergence implied by the Nashequilibrium: at each step, some participant, whose current
strategy is suboptimal given the others'strategies, switches. This is an easy process to implement
because each participant can detect if its strategy is suboptimal using the t=Cn/Lcomponent
size threshold from Corollary 2.1 But does this process converge toa Nashequilibrium? If it does,
how long does it take?
By choosing an appropriate potential function, we can show that this process does indeed
converge to a Nash equilibrium quickly:
Theorem 4Starting from any pure strategy profile~a, if at each step some participant witha
suboptimal strategy switches its strategy, the system converges to a pure Nashequilibrium in no
more than 2nsteps.
Proof: Lett=Cn/L. For any strategy profile~a, consider the set S
big
(~a) of "big"components
ofG
~a
of size greater thantandtheset S
small
(~a) of "small"components of G
~a
of size less than or
equal tot. Define a potential function
by
(~a) =
X
A 2S
big
(~a)
|A|
X
A 2S
small
(~a)
|A|.
It is easy to see thatn
(~a) nforany~a. We will now show that each step of the process
reduces
by at least one. There are two main cases:
1. Some node i switches from insecure to secure. In this case i was previously an element
of a component inS
big
ofsizem>t. This former component becomes one or more new
components with total sizem
1; if all of the resulting components are big,
is reduced by
exactly one; otherwise,
is reduced by more than one as some components move from the
positive to the negative side of the ledger.
2. Some node i switches from secure to insecure. In this case the resulting component containing
ihasmtelements, and it replaces one or more old components with total sizem
1. As
both the new component and the old components are small, the neteect
on
it by one.
If each step reduces
by one, the number of steps must be less than thedierence
initial and final value of
, which is at most
n
(n) =2n.
As a special case, we can start with ~a=1nandconvergeto an equilibrium from above by
checking each node once. Each such test requires computing the size of the component in the
attack graph, which takes time O (|V|+|E|) =O (n2) using depth-first search; this gives:
Corollary 5ANash equilibrium can be computed intime O (n3) .
It is not hard to see that the 2 nin Theorem 4 is close to the best possible bound, although
a more careful analysis might reduce it slightly. A lower bound of n steps is trivial: in a system
with C<L/nandnoplayers secure in the initial strategy profile, it takesnstepsfor all players to
1
We must assume in this implementation either that the choice to install software or not is reversible, or that each
player can observe the other players'intended actions and respond accordingly.
8
install the anti-virus software. To get closer to 2 n, consider aline witht=
p
n
1
2
. Now consider
an execution of the process where initially players 1 throughn
p
n, in increasing order, install
to escape the single overlarge component; but then all players not at positionsk
p
nforsomek
uninstall; this takes 2 n
2
p
nsteps.
We also have:
Corollary 6Apure Nashequilibrium always exists.
4.3 Consequences of changes in the inoculation cost
Though Theorem 3 suggests that we cannot hope to characterize the worst pure Nashequilibrium
exactly, we can giveadescription of how it reacts to changes in the inoculation cost C.
Theorem 7Thecost of the worst pure Nashequilibrium is anon-decreasing function of Cwhen
Crangesover[2 L/n,L) .
Proof: Fix some price of anti-virus software, C
2L/nsothatb Cn/Lc
2.
Suppose we increase the price from Cto C0=C+ (
>
0). We denote the worst-cost Nash
equilibrium when the price is Cby~aandtheworst-cost equilibrium when the price is C0 by
~
b.
If price increment is
L/n, then the threshold (in Theorem 1) increases by at most one; that
is, bC0n/Lc
b
Cn/Lc+1. We consider two cases:
Case1: ~aisa Nashequilibrium for C0. This case is easy. Because
~
bis a worst-cost Nash
equilibrium forC0, we have:
cost
C
(~a) <cost
C
0
(~a) cost
C
0
(
~
b) .
Case2: ~ais nota Nashequilibrium for C0. This can happen only ifb C0 n/Lc=bCn/Lc+1.
Specifically, there must exist anodew 2 I
~a
such that adding it into attack graph G
~a
yieldsa
component of size bCn/Lcbutnotb C0 n/Lc. Let us denote the sizes of components adjacent
towinG by k
1
,...,k
s
.2 Wethenhave:
P
s
i=1
k
i
=bCn/Lc1.
We define anew strategy ~a0=~a[w/0], which is the same as~aexceptweno longer install
anti-virus software on nodew. Moreover,
cost
C0
(
~
a0) cost
C
(~a) =
L
n
bCn/Lc2
C+
L
n
s
X
i=1
k2
i
!
L
n
0
@
bCn/Lc2
s
X
i=1
k
i
!
2
1
A
C
=
L
n
bCn/Lc2
(bCn/Lc
1)2C. (1)
Equation (1) is non-negative whenever
2b Cn/Lc
1
Cn/L,
2
We say that a component K
Vis adjacent to nodewif 9 v 2 Ks.t. (v,w) 2 E.
9
which always holds by assumption for all C
2L/n.
We repeat this process until there do not exist any nodes violating Nashequilibrium condition.
At each step, the cost of our new strategy does not decrease. Therefore, if at the end we get
a Nashequilibrium
~
d, then
cost
C
(~a) cost
C0
(~a) cost
C0
(
~
d) cost
C0
(
~
b) .
Hence,
cost
C0
(
~
b) cost
C
(~a)
cost
C0
(
~
d) cost
C
(~a)
cost
C
0
(
~
a0) cost
C
(~a)
0.
Because we chose Carbitrarily, our argument holds for all values of.
4.4 Priceofanarchy
The notion of the price of anarchy was introduced by Koutsoupiasand Papadimitriouin[19].
It is defined as the worst-case ratio between the cost ofa Nashequilibrium and the cost of the
optimal solution, and is thus a measure of how faraway a Nashequilibrium can be from the social
optimum.3 Inourgame, we show that the price of anarchy is quite high,
(
n).
This is a consequence of two simple lemmas:
Lemma8 (Lower bound). Let Gbethestargraph K
1,n
. Let the price of the anti-virus software
beC=L (n
1) /n. Then
(G,C,L) =n/2.
Proof: The given Cand Lsatisfyt=Cn/L=n
1. From Corollary 2, it follows that installing
anti-virus software on exactly one node isa Nashequilibrium. If pure Nashstrategy~ainstallsanti-
virus software on some node that is not the center node, the cost will be C+L (n
1)2/n=L (n1).
An optimal strategy for the star with the given Cand Lis~a= (1,0,..., 0) (i.e. , only the
center node installs anti-virus software.) Its cost is C+L (n
1) /n=2L (n
1) /n.
The price of anarchy is therefore
L (n
1)
2L (n
1) /n
=
n
2
.
Lemma9 (Upper bound). Fix any graph Gandcosts C,L. Then
(G,C,L) n.
3
Because our game has a random component, the cost is an expected cost.
10
Proof: Let~adenotethe optimum solution.
IfC>L, no node ina Nashequilibrium will install anti-virus software. Hence, there is only one
Nashequilibrium~a=0n, whose cost is Ln. If the optimum solution contains at least one secure
node, then then cost(~a) C>L. (Otherwise, ~a=0nand (G,C,L) =1.) We thus have:
(G,C,L)
Ln
L
=n.
IfC
L, then the expected cost of the worst Nashequilibrium~ais at most Cn, because the
expected cost to each node is at most C (if the expected cost to anode is greater than C, then it
will want to switch to installing the software with probability 1.) If the optimum solution contains
at least one secure node, then cost(~a) C. Otherwise, the optimum solution contains no secure
nodes and hence cost(~a) LC.
(G,C,L)
Cn
C
=n.
5
Optimization
Allowing users to selfishly choose whether or not to install anti-virus software may be grossly inefficient,
relative to the social optimum. An alternative approach to this problem is fora benevolent
dictator to attempt to maximize social welfare by centrally computinga solution and imposing it
on all nodes.
Diculties
with this approach arise from the hardness of computing the optimum solution to the
inoculation problem. In the first two sections, we givea characterization of the optimum solution
and use it to show that the inoculation problem is NP-hard.
This suggests computing an approximate solution. We can find in polynomial time a solution
with approximation ratio at most O (log2 n); such a solution is substantially better than the
(
n)
ratio derived from the worst Nashequilibrium.
5.1 Characterization
We have a graph-theoretic characterization of optimum strategies, similar to our characterization
of Nashequilibria in Theorem 1:
Theorem 10Fix C,Land lett=Cn/L. If~ais an optimum strategy, then every component in
attack graph G
~a
has size at most max(1, (t+1) /2) .
Proof: Strategy ~apartitions Gintodisjoint components. Pick some component from attack
graph with at least two nodes; we call it K
V (k=|K|). (If we can'tfindacomponent with at
least two nodes, then all components in attack graph must have size one, and the theorem follows.)
If we install an anti-virus on some node of K, we may getmnew components in G
~a
, where
0
m
k
1. We denote component sizes byk
1
,...,k
m
, where
P
m
i=1
k
i
=k1. Because~ais an
11
optimal strategy, installing anti-virus on an extra node cannot improve the total cost. Therefore,
we have:
C+
L
n
m
X
i=1
k2
i
!
Lk2
n
,
k2
m
X
i=1
k2
i
!
t.
(2)
Ifm=0, then Equation (2) becomes:
k
p
t
(t+1) /2.
Meanwhile, form>0,
k2
m
X
i=1
k2
i
!
k2
m
X
i=1
k
i
!
2
=k2 (k1)2
=2 k1.
(3)
Combining Equations (2) and (3), we get:
k
(t+1) /2.
5.2 Hardnessof optimal solution
Unfortunately, the optimal solution is hard to compute.
Theorem 11Itis NP-hard to compute an optimal strategy.
Proof: The proof is by reduction from vertex cover and is similar to the proof of Theorem 3.
5.3 Reduction to sum-of-squares partition
Because it is unlikely that we can find an optimal solution, we naturally consider approximation
algorithms.
The optimization problem that defines the inoculation problem can be posed as follows: choose
the set of secure nodes I
~a
that minimizes the following objective function:
C|I
~a
|+
L
n
X
V2 (I
~a
)
|V|2,
where (I
~a
) is the set of connected components created by the removal of nodes in I
~a
.
12
For the purposes of our approximation algorithm for the inoculation problem, we assume that
we can"guess"m=|I
~a
|, the number of secure nodes in an optimum configuration. This assumption
is without loss of generality, because we can run our algorithm on all possible choices ofm=1,...n
and take the best solution.
Thus, a solution to the inoculation problem is reduced to finding a solution to the problem
of removing mnodes from a given graph to minimize the sum of the squares of the sizes of the
surviving components. We discuss this problem in Section 6.
6
Sum-of-squares partitions
In Section5.3, we came across the following problem, which we now analyze in more detail.
Problem 12 (Sum-of-Squares Partition Problem) By removing a set Fofatmostmnodes, par-
titionthegraph into disconnected components H
1
,...,H
k
, such that
P
i
|H
i
|2 isminimum.
Though we have arrived at this combinatorial optimization problem via our study of containing
computer viruses, it maybe of independent interest. Note that it inherits NP-hardness from the
inoculation problem. The edge version of the sum-of-squares-partition problem is similar, but asks
for the removal of m edges, rather than nodes, to disconnect the graph.
Our goal in this section is to prove the following theorem:
Theorem 13Let OPT be the optimum objective function value for the Sum-of-Squares Partition
Problem on Gwith the removal of at mostmnodes. We can find a set Fof O (log2 n) mnodes, such
that their removal creates disconnected components H
1
,...,H
l
such that
P
i
|H
i
|2O (1) ·OPT.
An immediate consequence of this theorem is the existence of an approximation algorithm for
the inoculation problem:
Corollary 14If OPT is the cost of the optimum solution for the inoculation problem, there exists
a polynomial-time approximation algorithm that finds a solution with cost at most O (log2 n) ·OPT.
Proof: Suppose an optimum solution contains msecure nodes, and the sizes of the insecure node
components arek
1
,...,k
p
, sothatOPT=Cm+L/n
P
i
k2
i
. Using our approximation algorithm
for the sum-of-squares partition problem, we can find a set of O (log2 n) msecurenodessuch that
the sum of the squares of the corresponding insecure components is at most O (1)
P
i
k2
i
. Thus, the
cost of the approximate solution is:
O (log2n) ·Cm+O (1) ·
L
n
X
i
k2
i
O (log2n) ·Cm+O (log2 n) ·
L
n
X
i
k2
i
=O (log2n) ·OPT.
13
6.1 ProofofTheorem 13
Our proof of Theorem 13 is based on the algorithm PartitionGraph given in Figure 2. It uses
known algorithms for sparse cuts. The sparse cut literature has focused on edge cuts. However, in
the case of our problem, cuts that involve removing nodes in order to disconnect the graph are more
applicable. Fortunately, the Leighton-Raoapproximation algorithm for finding sparse cuts extends
to node cuts: there is a well known process for converting anode cut algorithm in an undirected
graph to an edge cut algorithm in a directed graph. Since the Leighton-Raosparsecut algorithm
extends to edge cuts for directed graphs, it can be extended to node cuts. In particular, we will
use the following fact, which is implicit in the discussion of balanced node cuts in[21].
Theorem 15There exists an O (logn) -approximation algorithm for the following problem: Given
graphG, find anode cut that partitions the nodes Gintothreesets: two sets defining disconnected
subgraphs with node sets, V
1
andV
2
, and a set of removed nodes R, such that the quantity
(|V
1
|+
|R|
2
)(|V
2
|+
|R|
2
)
|R|
(4)
is maximized.
We refer to the quantity in expression (4) as the sparsity of the cut. In the literature, sparsity
is usually given as the the inverse of expression (4), and finding the sparsest cutis cast as a
minimization problem. We have presented it as a maximization problem, since this is more natural
for our application.
Improvements to the approximation ratio for sparsest (edge) cut have recently been discovered
[2]. However, the authors do not mention whether or not their results extend to sparse cuts for
directed graphs.
Our algorithm for solving the sum-of-squares partition problem, PartitionGraph (see Fig-
ure2), achieves the approximation results claimed in Theorem 13. The general approach of the
algorithm is similar to the greedy logn-approximation algorithm for set cover. A high-leveldescrip-
tionisthatwe repeatedly remove the node cut that gives us the best per-node-benefit, quantified
as its cost-eectiveness.
Suppose we have a connected subgraph Hwithknodes. If node cut Rcreates connected
components with node sets V
1
andV
2
, this cut has decreased the objective function value (
P
size
of connected component2) byk2|V
1
|2|V
2
|2. We thus define the cost-eectivenessof node
cutRby (k2|V
1
|2|V
2
|2) /|R|. The cost-eectiveness
of
Risequalto
k2|V
1
|2|V
2
|2
|R|
=
(|V
1
|+|V
2
|+|R|)2|V
1
|2|V
2
|2
|R|
=
|R|2+2|V
1
||V
2
|+2|R| (|V
1
|+|V
2
|)
|R|
=
2|V
1
||V
2
|
|R|
+|R|+2(k|R|)
=
2|V
1
||V
2
|
|R|
+2k|R|.
We then have the following relation between finding sparse cuts and cost-eectiveness
.
14
Lemma 16LetHbeagraph withknodes. If is the maximum cost-eectiveness
of all node
cutsofH, the Leighton-Raosparsestnode cut algorithm will find a cut with cost-eectiveness
at
least/ (clogk) , for some constantc.
Proof: The sparsity of anode cut that removes node set Rand partitions the remaining nodes
of Hintoconnected components with node sets V
1
andV
2
is given by:
(|V
1
|+
|R|
2
)(|V
2
|+
|R|
2
)
|R|
=
|V
1
||V
2
|+
|R|2
4
+
|R|
2
(|V
1
|+|V
2
|)
|R|
=
|V
1
||V
2
|
|R|
+
|R|
4
+
1
2
(k|R|)
=
|V
1
||V
2
|
|R|
+
k
2
|R|
4
.
We then have the following relationships between the cost-eectiveness
of a cut,
, and its
sparsity, .
=
|V
1
||V
2
|
|R|
+
k
2
|R|
4
=/2k/2+|R|/4/4.
and
>
2.
Thus, we know there exists anode cut with sparsity at least/4 (i.e. the cut with the highest
cost-eectiveness
). The sparsest cut algorithm on
Hwillfindanode cut with sparsity at least
/ (clogk), for some constantc. This node cut will have cost-eectiveness
at least 2
/ (clogk).
Input: AGraphG. A set of marked nodes F.
1. Use the Leighton-Raoalgorithmto find an approximate most cost-eective
cut in eachcon-
nectedcomponent of G.
2. LetH
1
,...,H
k
be the components of Gin which the Leighton-Raoalgorithm found a cut
that removes at most (20 clogn) mnodes, where cis the constant from Lemma 16. If no such
component exists, then halt and output Fas the final set of marked nodes.
3. Otherwise, choose the component H
j
from among those considered in Step 2, for which the
cost-eectiveness
is highest. Let the cut be
H
j
=V
1
[V
2
[R, where Risthefinal set of
marked nodes.
4. SetF=F[R. Replace H
j
byV
1
andV
2
inG. If|F|> (36 clog2 n) m, then halt and output
Fas the set of marked nodes.
5. Otherwise, repeat.
Figure2: Algorithm PartitionGraph
We now give some lemmas that characterize the behavior of the PartitionGraph algorithm.
Fix an optimum solution for the sum-of-squares partition problem; let Fbeanoptimumset ofm
removed nodes.
Lemma 17PartitionGraph outputs at most O (log2 n) mmarkednodes.
15
Proof: Since the algorithm halts as soon as we augment the set of marked nodes such that|F|>
(36 clog2n) m, we know that at the beginning of each iteration, Fcontainsatmost (36 clog2 n) m
marked nodes. The lemma follows, because we add at most (20 clogn) mmarkednodesper iteration.
In the next few proofs, we will abuse notation slightly and denote the order of graph G (i.e. the
number of nodes) by|G|=|V (G) |. We will also denote an "intersection"ofagraph Gandanode
setVasGV, which is the set of nodes that Gand Vshare.
Lemma 18Suppose after a number of iterations, our graph Gconsistsofk connected components
H
1
,...,H
k
, and letS=
P
|H
i
|2.
Either there existsacomponent H
i
such that the Leighton-Raoalgorithmwill find anode cut
inH
i
with at most 20 cmlognremoved nodes and cost-eectiveness
at least
S/ (18 cmlogn) ,
orS 72OPT (or possibly both).
Proof: We assume that S>2OPT. Note that the node cut defined by the set FGdivides G
into a graph with objective function value at most OPT. The node cut thus induces a cost decrease
ofatleastS/2.
DefineF
i
=FH
i
andm
i
=|F
i
|. Also, let the subgraph induced by removing vertices in
F
i
H
i
fromH
i
be composed of connected components Hj
i
forj=1,...,r
i
. (i.e, the optimum set
of marked nodes partitions H
i
into these components.) Note that
P
i
P
j
|Hj
i
|2 OPT.
Since the total reduction in our objective function value from removing[
i
F
i
fromGisatleast
S/2 duetoourassumption that S>2OPT, we have:
X
i
0
@
|H
i
|2
X
j
|Hj
i
|2
1
A
S/2,
(5)
because the outer summand on the left hand side of the inequality is the amount the objective
function is reduced in each component.
Let Ibethesetof indices i for which |H
i
|2
P
r
i
j=1
|Hj
i
|2/m
i
S/ (4m) (i.e. the per node
benefit is at leastS/ (4 m).)
We have two cases.
1. There exists ani 2 I such that for all j=1,...,r
i
, |Hj
i
|1/3|H
i
|. We assume that
m
i
<1/50|H
i
|, because otherwise removing all nodes in H
i
will give usa trivial node cut
with cost-eectiveness
at least
|H
i
|2/ (50m
i
) >S/ (18 cmlogn) for suciently
large
n. With
this assumption, we know that there exists a set R
F
i
that defines anode cut of Hj
i
that
creates two connected components, V
1
andV
2
suchthat1/3|H
i
|
|
V
1
|and1/3|H
i
|
|
V
2
|.
The cost-eectiveness
of this cut will be
2
|V
1
||V
2
|
|R|
+2|H
i
||R|
2|H
i
|2
9m
i
2|H
i
|2
P
r
i
j=1
|Hj
i
|2
9m
i
S/ (18m) .
Lemma 16guarantees that the Leighton-Raoalgorithmwill find a cutin H
i
with cost-
eectiveness
at least
S/ (18 cmlogn). The node cut output by the algorithm cannot contain
more than 20 cmlognnodes. Such anode cut would have cost-eectiveness
at most
S/ (20 cmlogn), since any cutin Gcandecreasethe objective function value by at most S,
which is less than the guaranteed cost-eectiveness
of
S/ (18 cmlogn).
16
2. For all i 2I, there exists a jsuchthat|Hj
i
|>1/3|H
i
|. Also, note that OPT>
P
i2 I
|Hj
i
|2.
Claim:
P
i2 I
|H
i
|2
P
j
|Hj
i
|2
S/8.
Proof of claim: Let I be the set of intervals such that|H
i
|2
P
r
i
j=1
|Hj
i
|2/m
i
S/ (4m).
Recalling equation (5), we get
S/2
X
i2 I
0
@
|H
i
|2
X
j
|Hj
i
|2
1
A
+
X
i2 I
0
@
|H
i
|2
X
j
|Hj
i
|2
1
A
.
Also, we have
X
i2 I
0
@
|H
i
|2
X
j
|Hj
i
|2
1
A
=
X
i2 I
m
i
|H
i
|2
P
j
|Hj
i
|2
m
i
X
i2 I
m
i
S/ (4m)
S/ (4m)
X
i2 I
m
i
S/4.
Combining these two inequalities proves the claim. We have the inequalities:
OPT>
X
i2 I
|Hj
i
|2
X
i2 I
1
9
|H
i
|2
1
9
S
8
,
where we used our claim for the last inequality. Thus, OPT 72 S.
We now give the proof of Theorem 13.
Leta
j
be the number of connected components that comprise the graph at the beginning of
the jthiteration, and let those connected components be Hj
1
,...,Hj
a
j
. LetS
j
=
P
a
j
i=1
|Hj
i
|2 be
the value of the objective function at the beginning of the j'thiteration; thus S
0
n2 isitsinitial
value. Let lbethenumber of iterations the algorithm needs to terminate, and S
l+1
be the objective
function'sfinalvalue.
We wish to show that after the algorithm terminates, we have reduced the objective function
value to S
l+1
=O (1) ·OPT. Let Fbethefinalsetof marked nodes removed from G. If the algorithm
terminates at Step 2 of the l'thiteration because the Leighton-Raoalgorithmonly found node cuts
with more than (20 clogn) mremovednodes, then from Lemma 18 we know that S
l+1
72OPT.
Thus, we assume this does not occur. Furthermore, we assume that S
l+1
72OPT (in order to
apply the"either"part of Lemma 18 to alliterations.)
In order to reason about the decrease in the objective function value after each iteration, we
impute to each node in Fa per-node-decrease in the objective function value, given by the cost-
eectiveness
of its node cut. We then show that the total imputed decrease in the objective function
is at least n2
O (1) ·OPT, from which the theorem will follow.
More formally, suppose the set of marked nodes is given by the sequence F={f
1
,...,f
k
},
where the removed nodes are ordered in the order in which they were removed from the graph:
nodes removed at an earlier iteration occur earlier in the sequence. We havek=|F|=
(log
2
n) m.
17
Letb
j
be the iteration in whichf
j
was removed. We impute tof
j
the value
j
=cost-eectiveness
of cut removed in iterationb
j
. From Lemma18, we know that
j
S
b
j
/ (18 cmlogn).
SetT
0
=S
0
andT
i
=T
i1i
to be the value of the objective function after nodef
i
's
per-node-decrease contribution has been made. Note T
k
=S
l+1
.
Claim: For alli, T
i
T
i1
T
i1
/ (18 cmlogn)
Proof of claim: Proving the claim reduces to proving that
i
=T
i1
TiT
i1
/ (18 cmlogn).
Fix ani. We have two cases.
1. b
i
=b
i1
(i.e. f
i
andf
i1
were removed in the same iteration.) Then
i
S
b
i
/ (18 cmlogn),
butS
b
i
>T
i
, sinceS
b
i
is the objective function value at the beginning of iterationb
i
, whereas
T
i
is the objective function value "during"iterationb
i
.
2. b
i
=b
i1
+1 (i.e. f
i
was removed in the iteration afterf
i
was removed.) Then
i
S
b
i
/ (18 cmlogn) =T
i
/ (18 cmlogn), since in this case T
i
is the objective function value at the
start of iteration b
i
.
This proves the claim.
We therefore have T
k
T
0
(11/ (18 cmlogn))kn2 (11/ (18 cmlogn))k. Sincek>36 cmlog2 n,
it follows thatS
l+1
=T
k
=O (1) O (1) ·OPT.
The algorithm given above can be adapted ina straightforward way to yield an algorithm for
the edge cut version of the sum-of-squares partition problem (instead of taking sparse node cuts,
take sparse edge cuts), from which an analog to Theorem 13 maybe derived. The analysis of an
edge cut algorithm would perhaps be slightly easier than our analysis for node cuts, since node
cuts modify the node set, causing minor complications. Furthermore, such an algorithm can use
sparse cut algorithms with better approximation ratios, and thus remove only O (log
3/2
n) mnodes
to achieve the same constant approximation ratio for the sum of squares.
7
Conclusions and future research
We have described a simple economic game that represents thedicult
problem of choosing on
which nodes to install anti-virus software to contain the spread of computer viruses in a network.
The Nashequilibria of this game have a simple characterization, and we can show that in the worst
case, the ratio between the social cost ofa Nashequilibrium and asocial optimum can be linear in
the number of nodes.
Our model makes some very strong simplifying assumptions: every infected node eventually
infects all unprotected neighbors; the costs of installing the anti-virus software and becoming in-
fectedareknown and equal for all nodes; the virus imposes no costs on protected nodes; and
nodes can observe which of the other nodes intend to install the anti-virus software and adjust
their own strategies in response. None of these assumptions correspond completely to reality, but
we believe that as a first step the resulting model is a reasonable compromise between accuracy
and analyzability, and that the results obtained with the model (especially the characterization of
Nashequilibria) are similar to what one might expect with a more complex model that took into
account limited information and learning by individual nodes. The natural next step is to incor-
poratemore details in the model and see if such changes aect
the results; this might involve both
theoretical work to predict theeect
of changes and experimental or observational work to study
how real-world decision-makers choose whether or not to deploy specific security mechanisms.
18
We have also shown how anear-optimal deployment of anti-virus software can be computed by
reduction to the sum-of-squares partition problem, anew variant of classical graph partitioning
problems where the goal is to removemvertices so as to minimize the sum of the squares of the
sizes of surviving components. Though it is NP-hard to solve this problem exactly, we give a
polynomial-time O (log2 n)-approximation algorithm for sum-of-squares partition, which yields a
corresponding approximation algorithm for anti-virus software deployment. This algorithm maybe
of use as a network administration tool for choosing how to deploy anti-virus software to minimize
the combined costs of deployment and infection and as a public-health tool for designing inoculation
strategies for containing outbreaks of highly-infectious diseases when a good approximation to the
interaction graph can be computed but the initial source of contagion is unknown. Whether or not
a polynomial-time algorithm with abetter approximation ratio exists remains open.
8
Acknowledgments
The authors would like to thank Joan Feigenbaum, Hong Jiang, and Yang Richard Yang for many
useful discussions.
References
[1]R. Anderson. Why information security is hard-an economic perspective, 2001. Available at
http://www.cl.cam.ac.uk/~rja14/econsec.html.
[2]S. Arora, S. Rao, and U. Vazirani. Expander flows, geometric embeddings and graph partitioning.
In Proceedings of the 36thAnnualACM Symposium on the Theory of Computing,
pages 222-231,2004.
[3]J. Aspnes, J. Feigenbaum, M. Mitzenmacher, and D. Parkes. Towards better definitions and
measures of Internet security. In Workshop on Large-Scale Network Security and Deployment
Obstacles, 2003.
[4]N. T. Bailey. The Mathematical theory of infectious diseases and its applications. Hafner Press,
1975.
[5]M. G. H. Bell. The measurement of reliability in stochastic transport networks. In Proceedings
of 2001Intelligent Transportation Systems, pages 1183-1188,2001.
[6]K. Campbell, L. A. Gordon, M. P. Loeb, and L. Zhou. The economic cost of publicly announced
information security breaches: Empirical evidence from the stock market. Journal of Computer
Security, 2003.
[7]G. Even, J. Naor, S. Rao, and B. Schieber. Fast approximate graph partitioning algorithms.
SIAMJournalon Computing, 28:2187-2214,1999.
[8]U. Feigeand R. Krauthgamer. Apolylogarithmic approximation of the minimum bisection.
SIAMJournalon Computing, 31:1090-1118,2002.
[9]J. C. Frauenthal. Mathematical modeling in epidemiology. Springer-Verlag, New York, 1980.
19
[10]L. A. Gordonand M. P. Loeb. The economics of information security investment. ACM
transactions on information and system security, pages 438-457,2002.
[11]S. N. Hamilton, W. L. Miller, A. Ott, and O. S. Saydjari. Challenges in applying game
theory to the domain of information warfare. In 4th Information survivability workshop (ISW-
2001/2002) , Vancouver, Canada, 2002.
[12]S. N. Hamilton, W. L. Miller, A. Ott, and O. S. Saydjari. The role of game theory in information
warfare. In 4th Information survivability workshop (ISW-2001/2002) , Vancouver, Canada,
2002.
[13]K. Hoo. How much is enough? A risk-management approach to computer security. Consortium
for Research on Information Security Policy (CRISP) Working Paper., 2000.
[14]M. Kearnsand L. Ortiz. Algorithms for interdependent security games. InS. Thrun, L. Saul,
andB. Sch¨olkopf, editors, Advances in Neural Information Processing Systems 16. MIT Press,
Cambridge, MA, 2004.
[15]J. O. Kephart, D. M. Chess, and S. R. White. Computers and epidemiology. In IEEE Spectrum,
pages 20-26,1993.
[16]J. O. Kephartand S. R. White. Directed-graph epidemiological models of computer viruses.
In IEEESymposium on Security and Privacy, pages 343-361,1991.
[17]J. O. Kephartand S. R. White. Measuring and modeling computer virus prevalence. In
Proceedings of the IEEE Symposium on Security and Privacy, 1993.
[18]H. Kesten. Percolation Theory for Mathematicians, volume 2. Birkh¨auser, Boston, 1982.
[19]E. Koutsoupiasand C. Papadimitriou. Worst-case equilibria. In Proceedings of the 16thannual
symposium on theoretical aspects of computer science, pages 403-413,1999.
[20]H. Kunreutherand G. Heal. Interdependent security. Journal of Risk and Uncertainty (Special
Issue on Terrorist Risks) , 2003.
[21]T. Leightonand S. Rao. Multicommodity max-flowmin-cut theorems and their use in designing
approximation algorithms. Journal of the Association for Computing Machinery, 46(6):787-
832,1999.
[22]R. Pastor-Satorrasand A. Vespignani. Epidemics and immunization scale-free networks. In
S. Bornholdtand H. Schuster, editors, Handbook of graphs and networks: from the genome to
theInternet, pages 113-132, Berlin, 2002. Wiley-VCH.
[23]R. Pastor-Satorrasand A. Vespignani. Immunization of complex networks. In Physical Review
Letters, volume 65,2002.
[24]P. F. Syverson. A dierent
look at secure distributed computation. In
IEEE Computer Security
Foundations Workshop (CSFW 10) , pages 109-115. IEEE Computer Society Press, June 1997.
[25]C. Wang, J. C. Knight, and M. C. Elder. On computer viral infection and the eect
of
immunization. In ACSAC, pages 246-256,2000.
20
[26]C. Zou, D. Towsley, and W. Gong. Emailvirus propagation modeling and analysis. Technical
Report CSE-03-04, University of Massachusetts, Amherst, 2002.
21