Randomized Algorithms (Fall 2011): Probability Space, and Advanced Algorithms (Fall 2016): Nonconstructive Proof of Lovász Local Lemma

In probability theory class we have learned the basic concepts of '''events''' and '''random variables'''. The following are some examples:
::{|border="1"
| '''event''' <math>\mathcal{E}</math>:
| '''random variable''' <math>X</math>:
|-
|<math>\mathcal{E}_1</math>: 75 out of 100 coin flips are HEADs
| <math>X_1</math> is the number of HEADs in the 100 coin flips
|-
|<math>\mathcal{E}_2</math>: the result  of rolling a 6-sided die is an even number (2, 4, 6)
| <math>X_2</math> is the result of rolling a die
|-
| <math>\mathcal{E}_3</math>: tomorrow is rainy
| <math>X_3=1</math> if tomorrow is rainy and <math>X_3=0</math> if otherwise
|}


Events and random variables can be related:
* An event can be defined from random variables <math>X, Y</math> by a predicate <math>P(X, Y)</math>. (<math>\mathcal{E}</math>: <math>X>Y</math>)
* A Boolean random variable <math>X</math> can be defined from an event <math>\mathcal{E}</math> as follows:
:<math>\begin{align}X=\begin{cases}1& \mathcal{E} \text{ occurs}\\
0& \text{ otherwise}\end{cases}\end{align}</math>
We say that <math>X</math> ''indicates'' the event <math>\mathcal{E}</math>. We will later see that this indicator trick is very useful for probabilistic analysis.
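
To see the indicator trick at work, here is a minimal Python sketch (the event and the sample size are our own choices for illustration). Since <math>\mathbf{E}[X]=\Pr[\mathcal{E}]</math> for the indicator <math>X</math> of <math>\mathcal{E}</math>, averaging independent samples of <math>X</math> estimates <math>\Pr[\mathcal{E}]</math>.
<pre>
import random

# Toy simulation: the indicator X of an event E satisfies E[X] = Pr[E],
# so averaging independent samples of X estimates Pr[E].  Here E is the
# (illustrative) event "at least 60 of 100 coin flips are HEADs".
def indicator():
    heads = sum(random.randint(0, 1) for _ in range(100))
    return 1 if heads >= 60 else 0   # X = 1 iff the event occurs

n = 50_000
print(sum(indicator() for _ in range(n)) / n)  # roughly 0.028
</pre>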




The formal definitions of events and random variables are related to the foundation of probability theory, which is not the topic of this class. Here we only give an informal description (you may ignore it if you feel confused):
* A '''probability space''' is a set <math>\Omega</math> of elementary events, also called '''samples'''. (The 6 sides of a die.)
* An '''event''' is a subset of <math>\Omega</math>, i.e. a set of elementary events. (<math>\mathcal{E}_2</math>: sides 2, 4, 6 of the die.)
* A '''random variable''' is not really a variable, but a function! The function maps elementary events to values. (<math>X_2</math> maps the first side of the die to 1, the second side to 2, the third side to 3, ...)
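
The die example can be written out literally in a few lines of Python (just a sketch of the informal description above):
<pre>
# Sample space, an event as a subset, and a random variable as a function.
Omega = {1, 2, 3, 4, 5, 6}                 # six sides of a die
E2 = {s for s in Omega if s % 2 == 0}      # event: the result is even
X2 = {s: s for s in Omega}                 # random variable: side s maps to value s
print(E2, X2[3])                           # {2, 4, 6} 3
</pre>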


The axiomatic foundation of probability theory was laid by [http://en.wikipedia.org/wiki/Andrey_Kolmogorov Kolmogorov], one of the greatest mathematicians of the 20th century, who advanced many very different fields of mathematics.


=Axioms of Probability=
{{Theorem|Definition (Probability space)|
A '''probability space''' is a triple <math>(\Omega,\Sigma,\Pr)</math>.
*<math>\Omega</math> is a set, called the '''sample space'''.
*<math>\Sigma\subseteq 2^{\Omega}</math> is the set of all '''events''', satisfying:
*:(A1) <math>\Omega\in\Sigma</math> and <math>\emptyset\in\Sigma</math>. (The ''certain'' event and the ''impossible'' event.)
*:(A2) If <math>A,B\in\Sigma</math>, then <math>A\cap B, A\cup B, A-B\in\Sigma</math>. (The intersection, union, and difference of two events are events.)
* A '''probability measure''' <math>\Pr:\Sigma\rightarrow\mathbb{R}</math> is a function that maps each event to a nonnegative real number, satisfying
*:(A3) <math>\Pr(\Omega)=1</math>.
*:(A4) If <math>A\cap B=\emptyset</math> (such events are called ''disjoint'' events), then
*::<math>\Pr(A\cup B)=\Pr(A)+\Pr(B)</math>.
}}
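
As a sanity check, the fair die gives a finite probability space with <math>\Sigma=2^{\Omega}</math> and <math>\Pr(A)=|A|/6</math>; the axioms can be verified mechanically (a toy Python sketch of our own):
<pre>
# Toy example: the fair six-sided die as a finite probability space,
# with Sigma = 2^Omega and Pr(A) = |A|/6.  We spot-check (A3) and (A4).
Omega = frozenset({1, 2, 3, 4, 5, 6})

def Pr(A):
    return len(A) / len(Omega)

assert Pr(Omega) == 1                         # (A3): the certain event
even, odd = frozenset({2, 4, 6}), frozenset({1, 3, 5})
assert even & odd == frozenset()              # disjoint events
assert Pr(even | odd) == Pr(even) + Pr(odd)   # (A4): additivity
</pre>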


= Independent events =
{{Theorem
|Definition (Independent events)|
:Two events <math>\mathcal{E}_1</math> and <math>\mathcal{E}_2</math> are '''independent''' if and only if
::<math>\begin{align}
\Pr\left[\mathcal{E}_1 \wedge \mathcal{E}_2\right]
&=
\Pr[\mathcal{E}_1]\cdot\Pr[\mathcal{E}_2].
\end{align}</math>
}}
This definition can be generalized to any number of events:
{{Theorem
|Definition (Independent events)|
:Events <math>\mathcal{E}_1, \mathcal{E}_2, \ldots, \mathcal{E}_n</math> are '''mutually independent''' if and only if, for any subset <math>I\subseteq\{1,2,\ldots,n\}</math>,
::<math>\begin{align}
\Pr\left[\bigwedge_{i\in I}\mathcal{E}_i\right]
&=
\prod_{i\in I}\Pr[\mathcal{E}_i].
\end{align}</math>
}}


Note that in probability theory, "mutual independence" is <font color="red">not</font> equivalent to "pair-wise independence", which we will learn in the future.
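
The standard counterexample is easy to verify by brute force. The following Python sketch (our own illustration) checks that two fair bits together with their XOR are pair-wise independent but not mutually independent:
<pre>
from itertools import product

# Sample space: two fair coin bits (x, y); each outcome has probability 1/4.
outcomes = list(product([0, 1], repeat=2))

def Pr(event):
    return sum(1 for o in outcomes if event(o)) / len(outcomes)

E1 = lambda o: o[0] == 1            # first bit is 1
E2 = lambda o: o[1] == 1            # second bit is 1
E3 = lambda o: (o[0] ^ o[1]) == 1   # XOR of the bits is 1

# Every pair multiplies (pair-wise independence) ...
assert Pr(lambda o: E1(o) and E3(o)) == Pr(E1) * Pr(E3)
# ... but the triple does not (no mutual independence):
assert Pr(lambda o: E1(o) and E2(o) and E3(o)) == 0 != Pr(E1) * Pr(E2) * Pr(E3)
</pre>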

= The union bound =
We are familiar with the [http://en.wikipedia.org/wiki/Inclusion–exclusion_principle principle of inclusion-exclusion] for finite sets.
{{Theorem
|Principle of Inclusion-Exclusion|
:Let <math>S_1, S_2, \ldots, S_n</math> be <math>n</math> finite sets. Then
::<math>\begin{align}
\left|\bigcup_{1\le i\le n}S_i\right|
&=
\sum_{i=1}^n|S_i|
-\sum_{i<j}|S_i\cap S_j|
+\sum_{i<j<k}|S_i\cap S_j\cap S_k|\\
& \quad -\cdots
+(-1)^{\ell-1}\sum_{i_1<i_2<\cdots<i_\ell}\left|\bigcap_{r=1}^\ell S_{i_r}\right|
+\cdots
+(-1)^{n-1} \left|\bigcap_{i=1}^n S_i\right|.
\end{align}</math>
}}


The principle can be generalized to '''probability events'''.
{{Theorem
|Principle of Inclusion-Exclusion|
:Let <math>\mathcal{E}_1, \mathcal{E}_2, \ldots, \mathcal{E}_n</math> be <math>n</math> events. Then
::<math>\begin{align}
\Pr\left[\bigvee_{1\le i\le n}\mathcal{E}_i\right]
&=
\sum_{i=1}^n\Pr[\mathcal{E}_i]
-\sum_{i<j}\Pr[\mathcal{E}_i\wedge \mathcal{E}_j]
+\sum_{i<j<k}\Pr[\mathcal{E}_i\wedge \mathcal{E}_j\wedge \mathcal{E}_k]\\
& \quad -\cdots
+(-1)^{\ell-1}\sum_{i_1<i_2<\cdots<i_\ell}\Pr\left[\bigwedge_{r=1}^\ell \mathcal{E}_{i_r}\right]
+\cdots
+(-1)^{n-1}\Pr\left[\bigwedge_{i=1}^n \mathcal{E}_{i}\right].
\end{align}</math>
}}


The principle is implied by the axiomatic formalization of probability. The following inequality is implied (nontrivially) by the principle of inclusion-exclusion:
{{Theorem
|Theorem (the union bound)|
:Let <math>\mathcal{E}_1, \mathcal{E}_2, \ldots, \mathcal{E}_n</math> be <math>n</math> events. Then
::<math>\begin{align}
\Pr\left[\bigvee_{1\le i\le n}\mathcal{E}_i\right]
&\le
\sum_{i=1}^n\Pr[\mathcal{E}_i].
\end{align}</math>
}}
The name of this inequality is [http://en.wikipedia.org/wiki/Boole's_inequality Boole's inequality]. It is usually referred to by its nickname, the "'''union bound'''". The bound holds for arbitrary events, even if they are dependent. Due to this generality, the union bound is extremely useful for analyzing the union of events.
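
To see the union bound at work on dependent events, here is a small Python sketch (the three events are our own toy choice):
<pre>
from fractions import Fraction
from itertools import product

# Toy dependent events over two fair dice: E_i = "some die shows i".
outcomes = list(product(range(1, 7), repeat=2))

def Pr(event):
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

events = [lambda o, i=i: i in o for i in (1, 2, 3)]

union = Pr(lambda o: any(E(o) for E in events))
bound = sum(Pr(E) for E in events)
print(union, "<=", bound)   # 3/4 <= 11/12, despite heavy dependence
</pre>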


== Lovász Local Lemma ==
Given a sequence of events <math>A_1,A_2,\ldots,A_n</math>, we use the '''dependency graph''' to describe the dependencies between these events.

{{Theorem
|Definition|
:Let <math>A_1,A_2,\ldots,A_n</math> be a sequence of events. A graph <math>D=(V,E)</math> on the set of vertices <math>V=\{1,2,\ldots,n\}</math> is called a '''dependency graph''' for the events <math>A_1,\ldots,A_n</math> if for each <math>i</math>, <math>1\le i\le n</math>, the event <math>A_i</math> is mutually independent of all the events <math>\{A_j\mid (i,j)\not\in E\}</math>.
}}

The notion of mutual independence between an event and a set of events is formally defined as follows.

{{Theorem
|Definition|
:An event <math>A</math> is said to be '''mutually independent''' of events <math>B_1,B_2,\ldots, B_k</math>, if for any disjoint <math>I^+,I^-\subseteq\{1,2,\ldots,k\}</math>, it holds that
::<math>\Pr\left[A\mid \bigwedge_{i\in I^+}B_i\wedge \bigwedge_{i\in I^-}\overline{B_i}\right]=\Pr[A]</math>.
}}
;Example
:Let <math>X_1,X_2,\ldots,X_m</math> be a set of ''mutually independent'' random variables. Each event <math>A_i</math> is a predicate defined on a number of variables among <math>X_1,X_2,\ldots,X_m</math>. Let <math>v(A_i)</math> be the unique smallest set of variables which determine <math>A_i</math>. The dependency graph <math>D=(V,E)</math> is defined by
:::<math>(i,j)\in E</math> iff <math>v(A_i)\cap v(A_j)\neq \emptyset</math>.
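
This construction is mechanical. The following Python sketch (the helper name and the toy input are ours, for illustration) builds the edge set directly from the variable sets <math>v(A_i)</math>:
<pre>
# Hypothetical helper: build the dependency graph by connecting events
# whose variable sets intersect.
def dependency_graph(vsets):
    """vsets[i] is v(A_i), the set of variables determining A_i."""
    n = len(vsets)
    return {(i, j) for i in range(n) for j in range(i + 1, n)
            if vsets[i] & vsets[j]}

# Three events on variables {X1,...,X4}: A_0 and A_1 share X2,
# while A_2 shares no variable with the others, so vertex 2 is isolated.
print(dependency_graph([{1, 2}, {2, 3}, {4}]))  # {(0, 1)}
</pre>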

The following lemma, known as the Lovász local lemma, first proved by Erdős and Lovász in 1975, is an extremely powerful tool, as it supplies a way of dealing with rare events.

{{Theorem
|Lovász Local Lemma (symmetric case)|
:Let <math>A_1,A_2,\ldots,A_n</math> be a set of events, and assume that the following hold:
:#for all <math>1\le i\le n</math>, <math>\Pr[A_i]\le p</math>;
:#the maximum degree of the dependency graph for the events <math>A_1,A_2,\ldots,A_n</math> is <math>d</math>, and
:::<math>ep(d+1)\le 1</math>.
:Then
::<math>\Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right]>0</math>.
}}
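
A classical illustration (our own aside, in the standard <math>k</math>-CNF setting): a uniformly random assignment violates a clause of <math>k</math> literals with probability <math>p=2^{-k}</math>, so if every clause shares variables with at most <math>d</math> other clauses and <math>\mathrm{e}\cdot 2^{-k}\cdot(d+1)\le 1</math>, the lemma guarantees a satisfying assignment. A one-function Python sketch of the condition check:
<pre>
import math

# Symmetric LLL applied to k-CNF: a random assignment violates a
# k-literal clause with probability 2^-k, so a satisfying assignment
# exists whenever e * 2^-k * (d+1) <= 1.
def lll_guarantees_sat(k, d):
    p = 2.0 ** -k
    return math.e * p * (d + 1) <= 1

print(lll_guarantees_sat(3, 1))   # True: sparse 3-CNF is satisfiable
print(lll_guarantees_sat(3, 10))  # False: condition fails, no guarantee
</pre>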

We will prove a general version of the local lemma, where the events <math>A_i</math> are not symmetric. This generalization is due to Spencer.

{{Theorem
|Lovász Local Lemma (general case)|
:Let <math>D=(V,E)</math> be the dependency graph of events <math>A_1,A_2,\ldots,A_n</math>. Suppose there exist real numbers <math>x_1,x_2,\ldots, x_n</math> such that <math>0\le x_i<1</math> and for all <math>1\le i\le n</math>,
::<math>\Pr[A_i]\le x_i\prod_{(i,j)\in E}(1-x_j)</math>.
:Then
::<math>\Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right]\ge\prod_{i=1}^n(1-x_i)</math>.
}}
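
The hypothesis of the general case can be tested numerically. Below is a Python sketch (the function name and the toy instance are our own) that checks the condition and returns the guaranteed lower bound <math>\prod_{i=1}^n(1-x_i)</math>, or nothing if the hypothesis fails:
<pre>
# Check the hypothesis of the general LLL and return the guaranteed
# lower bound on Pr[no event occurs], or None if the hypothesis fails.
def general_lll_bound(p, x, edges):
    """p[i] = Pr[A_i]; x[i] in [0,1); edges = set of undirected pairs."""
    n = len(p)
    adj = {i: set() for i in range(n)}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    for i in range(n):
        rhs = x[i]
        for j in adj[i]:
            rhs *= 1 - x[j]
        if p[i] > rhs:
            return None  # hypothesis fails; the lemma gives no guarantee
    bound = 1.0
    for xi in x:
        bound *= 1 - xi
    return bound

# Path on three events, each with Pr[A_i] <= 1/8 and x_i = 1/4:
print(general_lll_bound([0.125] * 3, [0.25] * 3, {(0, 1), (1, 2)}))  # 0.421875
</pre>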

To see that the general LLL implies the symmetric LLL, we set <math>x_i=\frac{1}{d+1}</math> for all <math>i=1,2,\ldots,n</math>. Then we have <math>\left(1-\frac{1}{d+1}\right)^d>\frac{1}{\mathrm{e}}</math>, since <math>\left(1-\frac{1}{d+1}\right)^d=\left(1+\frac{1}{d}\right)^{-d}</math> and <math>\left(1+\frac{1}{d}\right)^d<\mathrm{e}</math>.

Assume the condition in the symmetric LLL:

:#for all <math>1\le i\le n</math>, <math>\Pr[A_i]\le p</math>;
:#<math>ep(d+1)\le 1</math>;

then it is easy to verify that for all <math>1\le i\le n</math>,

:<math>\Pr[A_i]\le p\le\frac{1}{e(d+1)}<\frac{1}{d+1}\left(1-\frac{1}{d+1}\right)^d\le x_i\prod_{(i,j)\in E}(1-x_j)</math>.

Due to the general LLL, we have

:<math>\Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right]\ge\prod_{i=1}^n(1-x_i)=\left(1-\frac{1}{d+1}\right)^n>0</math>.

This proves the symmetric LLL.

Now we prove the general LLL by the original induction proof.

{{Proof|
First, apply the chain rule. We have
:<math>\Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right]=\prod_{i=1}^n\Pr\left[\overline{A_i}\mid \bigwedge_{j=1}^{i-1}\overline{A_{j}}\right]=\prod_{i=1}^n\left(1-\Pr\left[{A_i}\mid \bigwedge_{j=1}^{i-1}\overline{A_{j}}\right]\right)</math>.

Next we prove by induction on <math>m</math> that for any set of <math>m</math> events <math>A_{i_1},\ldots,A_{i_m}</math>,

:<math>\Pr\left[A_{i_1}\mid \bigwedge_{j=2}^m\overline{A_{i_j}}\right]\le x_{i_1}</math>.

The local lemma then follows immediately from the chain rule above.

For <math>m=1</math>, this is obvious because

:<math>\Pr[A_{i_1}]\le x_{i_1}\prod_{(i_1,j)\in E}(1-x_j)\le x_{i_1}</math>.

For general <math>m</math>, assume (renaming indices if necessary) that <math>i_2,\ldots,i_k</math> are those among <math>i_2,\ldots,i_m</math> adjacent to <math>i_1</math> in the dependency graph, i.e. the event <math>A_{i_1}</math> is mutually independent of <math>A_{i_{k+1}},A_{i_{k+2}},\ldots, A_{i_{m}}</math>. By conditional probability, we have

:<math>
\Pr\left[A_{i_1}\mid \bigwedge_{j=2}^m\overline{A_{i_j}}\right]
=\frac{\Pr\left[ A_{i_1}\wedge \bigwedge_{j=2}^k\overline{A_{i_j}}\mid \bigwedge_{j=k+1}^m\overline{A_{i_j}}\right]}
{\Pr\left[\bigwedge_{j=2}^k\overline{A_{i_j}}\mid \bigwedge_{j=k+1}^m\overline{A_{i_j}}\right]}
</math>.

First, we bound the numerator. Since <math>A_{i_1}</math> is mutually independent of <math>A_{i_{k+1}},A_{i_{k+2}},\ldots, A_{i_{m}}</math>, we have

:<math>
\begin{align}
\Pr\left[ A_{i_1}\wedge \bigwedge_{j=2}^k\overline{A_{i_j}}\mid \bigwedge_{j=k+1}^m\overline{A_{i_j}}\right]
&\le\Pr\left[ A_{i_1}\mid \bigwedge_{j=k+1}^m\overline{A_{i_j}}\right]\\
&=\Pr[A_{i_1}]\\
&\le x_{i_1}\prod_{(i_1,j)\in E}(1-x_j).
\end{align}
</math>

Next, we bound the denominator. Applying the chain rule, we have

:<math>
\Pr\left[\bigwedge_{j=2}^k\overline{A_{i_j}}\mid \bigwedge_{j=k+1}^m\overline{A_{i_j}}\right]
=\prod_{j=2}^k\Pr\left[\overline{A_{i_j}}\mid \bigwedge_{\ell=j+1}^m\overline{A_{i_\ell}}\right]
</math>

which, by the induction hypothesis (each factor involves at most <math>m-1</math> events), is at least

:<math>
\prod_{j=2}^k(1-x_{i_j})=\prod_{\{i_1,i_j\}\in E}(1-x_{i_j})
</math>

where <math>E</math> is the set of edges in the dependency graph.

Altogether, we prove the induction hypothesis

:<math>
\Pr\left[A_{i_1}\mid \bigwedge_{j=2}^m\overline{A_{i_j}}\right]
\le\frac{x_{i_1}\prod_{(i_1,j)\in E}(1-x_j)}{\prod_{\{i_1,i_j\}\in E}(1-x_{i_j})}\le x_{i_1}.
</math>

Due to the chain rule, it holds that

:<math>
\begin{align}
\Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right]
&=\prod_{i=1}^n\Pr\left[\overline{A_i}\mid \bigwedge_{j=1}^{i-1}\overline{A_{j}}\right]\\
&=\prod_{i=1}^n\left(1-\Pr\left[A_i\mid \bigwedge_{j=1}^{i-1}\overline{A_{j}}\right]\right)\\
&\ge\prod_{i=1}^n\left(1-x_i\right).
\end{align}
</math>
}}