# Conditional Probability

In probability theory, the word "condition" is used as a verb: "conditioning on an event" means assuming that the event occurs.

 Definition (conditional probability) The conditional probability that event ${\displaystyle {\mathcal {E}}_{1}}$ occurs given that event ${\displaystyle {\mathcal {E}}_{2}}$ occurs is ${\displaystyle \Pr[{\mathcal {E}}_{1}\mid {\mathcal {E}}_{2}]={\frac {\Pr[{\mathcal {E}}_{1}\wedge {\mathcal {E}}_{2}]}{\Pr[{\mathcal {E}}_{2}]}}.}$

The conditional probability is well-defined only if ${\displaystyle \Pr[{\mathcal {E}}_{2}]\neq 0}$.
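On a finite sample space, the definition can be checked by direct enumeration. The sketch below (the two-dice setting and the particular events are illustrative choices, not from the text) computes ${\displaystyle \Pr[{\mathcal {E}}_{1}\mid {\mathcal {E}}_{2}]}$ as the ratio of counts:

```python
from fractions import Fraction

# Sample space: all outcomes of rolling two fair dice, each equally likely.
omega = [(a, b) for a in range(1, 7) for b in range(1, 7)]

def pr(event):
    """Probability of an event (a predicate on outcomes) under the uniform measure."""
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

# Illustrative events: E1 = "the sum is 8", E2 = "the first die shows at least 4".
e1 = lambda w: w[0] + w[1] == 8
e2 = lambda w: w[0] >= 4

# Pr[E1 | E2] = Pr[E1 ∧ E2] / Pr[E2]; well-defined since Pr[E2] = 1/2 > 0.
cond = pr(lambda w: e1(w) and e2(w)) / pr(e2)
print(cond)  # 1/6
```

Here ${\displaystyle \Pr[{\mathcal {E}}_{1}\wedge {\mathcal {E}}_{2}]=3/36}$ (outcomes $(4,4),(5,3),(6,2)$) and ${\displaystyle \Pr[{\mathcal {E}}_{2}]=1/2}$, giving $1/6$.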

For independent events ${\displaystyle {\mathcal {E}}_{1}}$ and ${\displaystyle {\mathcal {E}}_{2}}$, it holds that

${\displaystyle \Pr[{\mathcal {E}}_{1}\mid {\mathcal {E}}_{2}]={\frac {\Pr[{\mathcal {E}}_{1}\wedge {\mathcal {E}}_{2}]}{\Pr[{\mathcal {E}}_{2}]}}={\frac {\Pr[{\mathcal {E}}_{1}]\cdot \Pr[{\mathcal {E}}_{2}]}{\Pr[{\mathcal {E}}_{2}]}}=\Pr[{\mathcal {E}}_{1}].}$

This matches our intuition that for two independent events, the occurrence of one does not affect the chance of the other.
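The identity ${\displaystyle \Pr[{\mathcal {E}}_{1}\mid {\mathcal {E}}_{2}]=\Pr[{\mathcal {E}}_{1}]}$ can likewise be verified by enumeration. In this sketch the events are independent by construction, since each depends on a different die (again an illustrative choice):

```python
from fractions import Fraction

omega = [(a, b) for a in range(1, 7) for b in range(1, 7)]

def pr(event):
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

# E1 depends only on the first die, E2 only on the second, so they are independent.
e1 = lambda w: w[0] % 2 == 0   # first die is even
e2 = lambda w: w[1] == 6       # second die shows 6

cond = pr(lambda w: e1(w) and e2(w)) / pr(e2)
assert cond == pr(e1) == Fraction(1, 2)   # conditioning on E2 changes nothing
```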

# Law of total probability

The following fact is known as the law of total probability. It computes the probability by averaging over all possible cases.

 Theorem (law of total probability) Let ${\displaystyle {\mathcal {E}}_{1},{\mathcal {E}}_{2},\ldots ,{\mathcal {E}}_{n}}$ be mutually disjoint events with ${\displaystyle \Pr[{\mathcal {E}}_{i}]>0}$ for each ${\displaystyle i}$, and ${\displaystyle \bigvee _{i=1}^{n}{\mathcal {E}}_{i}=\Omega }$ the sample space. Then for any event ${\displaystyle {\mathcal {E}}}$, ${\displaystyle \Pr[{\mathcal {E}}]=\sum _{i=1}^{n}\Pr[{\mathcal {E}}\mid {\mathcal {E}}_{i}]\cdot \Pr[{\mathcal {E}}_{i}].}$
Proof.
 Since ${\displaystyle {\mathcal {E}}_{1},{\mathcal {E}}_{2},\ldots ,{\mathcal {E}}_{n}}$ are mutually disjoint and ${\displaystyle \bigvee _{i=1}^{n}{\mathcal {E}}_{i}=\Omega }$, events ${\displaystyle {\mathcal {E}}\wedge {\mathcal {E}}_{1},{\mathcal {E}}\wedge {\mathcal {E}}_{2},\ldots ,{\mathcal {E}}\wedge {\mathcal {E}}_{n}}$ are also mutually disjoint, and ${\displaystyle {\mathcal {E}}=\bigvee _{i=1}^{n}\left({\mathcal {E}}\wedge {\mathcal {E}}_{i}\right)}$. Then ${\displaystyle \Pr[{\mathcal {E}}]=\sum _{i=1}^{n}\Pr[{\mathcal {E}}\wedge {\mathcal {E}}_{i}],}$ which according to the definition of conditional probability, is ${\displaystyle \sum _{i=1}^{n}\Pr[{\mathcal {E}}\mid {\mathcal {E}}_{i}]\cdot \Pr[{\mathcal {E}}_{i}]}$.
${\displaystyle \square }$

The law of total probability provides a standard tool for breaking a probability into sub-cases, which often simplifies the analysis.

# A Chain of Conditioning

By the definition of conditional probability, ${\displaystyle \Pr[A\mid B]={\frac {\Pr[A\wedge B]}{\Pr[B]}}}$. Thus, ${\displaystyle \Pr[A\wedge B]=\Pr[B]\cdot \Pr[A\mid B]}$. This suggests that we can compute the probability of the AND of events via conditional probabilities. Formally, we have the following theorem:

 Theorem Let ${\displaystyle {\mathcal {E}}_{1},{\mathcal {E}}_{2},\ldots ,{\mathcal {E}}_{n}}$ be any ${\displaystyle n}$ events. Then ${\displaystyle \Pr \left[\bigwedge _{i=1}^{n}{\mathcal {E}}_{i}\right]=\prod _{k=1}^{n}\Pr \left[{\mathcal {E}}_{k}\,\middle|\,\bigwedge _{i<k}{\mathcal {E}}_{i}\right].}$
Proof.
 It holds that ${\displaystyle \Pr[A\wedge B]=\Pr[B]\cdot \Pr[A\mid B]}$. Thus, let ${\displaystyle A={\mathcal {E}}_{n}}$ and ${\displaystyle B={\mathcal {E}}_{1}\wedge {\mathcal {E}}_{2}\wedge \cdots \wedge {\mathcal {E}}_{n-1}}$; then ${\displaystyle \Pr[{\mathcal {E}}_{1}\wedge {\mathcal {E}}_{2}\wedge \cdots \wedge {\mathcal {E}}_{n}]=\Pr[{\mathcal {E}}_{1}\wedge {\mathcal {E}}_{2}\wedge \cdots \wedge {\mathcal {E}}_{n-1}]\cdot \Pr \left[{\mathcal {E}}_{n}\,\middle|\,\bigwedge _{i<n}{\mathcal {E}}_{i}\right].}$ Recursively applying this equation to ${\displaystyle \Pr[{\mathcal {E}}_{1}\wedge {\mathcal {E}}_{2}\wedge \cdots \wedge {\mathcal {E}}_{n-1}]}$ until only ${\displaystyle {\mathcal {E}}_{1}}$ is left proves the theorem.
${\displaystyle \square }$
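As a sanity check, the chain rule can be verified by exhaustive enumeration on a small sample space. In the sketch below the three events on two dice are arbitrary illustrative choices; the empty conjunction is treated as the whole sample space, so the ${\displaystyle k=1}$ factor is just ${\displaystyle \Pr[{\mathcal {E}}_{1}]}$:

```python
from fractions import Fraction

omega = [(a, b) for a in range(1, 7) for b in range(1, 7)]

def pr(event):
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

# Three arbitrary events for illustration.
events = [
    lambda w: w[0] + w[1] >= 6,   # E1: the sum is at least 6
    lambda w: w[0] <= 3,          # E2: the first die is at most 3
    lambda w: w[1] % 2 == 0,      # E3: the second die is even
]

def all_of(evs):
    """Conjunction of a list of events; the empty conjunction is the sure event."""
    return lambda w: all(ev(w) for ev in evs)

lhs = pr(all_of(events))          # Pr[E1 ∧ E2 ∧ E3]

# Right-hand side: product over k of Pr[E_k | E_1 ∧ ... ∧ E_{k-1}].
rhs = Fraction(1)
for k in range(len(events)):
    rhs *= pr(all_of(events[:k + 1])) / pr(all_of(events[:k]))

assert lhs == rhs
```

Note that the product telescopes: each denominator cancels the previous numerator, which is exactly the recursive argument in the proof above.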