随机算法 (Spring 2013)/Introduction and Probability Space and Fitness: Difference between pages

From TCS Wiki
=Introduction=
This course will study ''Randomized Algorithms'', the algorithms that use randomness in computation.  
;Why do we use randomness in computation?
* Randomized algorithms can be simpler than deterministic ones.  
:(median selection, load balancing, etc.)
* Randomized algorithms can be faster than the best known deterministic algorithms.  
:(min-cut, checking matrix multiplication, primality testing, etc.)
* Randomized algorithms can do things that deterministic algorithms cannot do.  
:(routing, volume estimation, communication complexity, data streams, etc.)
* Randomized algorithms may lead us to smart deterministic algorithms.  
:(hashing, derandomization, SL=L, Lovász Local Lemma, etc.)
* Randomness is present in the input.
:(average-case analysis, smoothed analysis, learning, etc.)
* Some deterministic problems are random in nature.
:(counting, inference, etc.)
* ...


;How is randomness used in computation?
* To hit a witness/certificate.  
:(identity testing, fingerprinting, primality testing, etc.)
* To avoid worst case or to deal with adversaries.  
:(randomized quick sort, perfect hashing, etc.)
* To simulate random samples.
:(random walk, Markov chain Monte Carlo, approximate counting etc.)
* To enumerate/construct solutions.
:(the probabilistic method, min-cut, etc.)
* ...


== Principles in probability theory ==
The course is organized by the sophistication of the probabilistic tools used. We do this for two reasons: first, for randomized algorithms, the analysis is usually more difficult and involved than the algorithm itself; and second, getting familiar with these probability principles will help you understand the real reasons for which these smart algorithms were designed.
* '''Basic probability theory''': probability space, events, the union bound, independence, conditional probability.
* '''Moments and deviations''': random variables, expectation, linearity of expectation, Markov's inequality, variance, second moment method.
* '''The probabilistic method''': averaging principle, threshold phenomena, Lovász Local Lemma.
* '''Concentrations''': Chernoff-Hoeffding bound, martingales, Azuma's inequality, bounded difference method.
* '''Markov chains and random walks''': Markov chains, random walks, hitting/cover time, mixing time.


=Probability Space=
The axiomatic foundation of probability theory was laid by [http://en.wikipedia.org/wiki/Andrey_Kolmogorov Kolmogorov], one of the greatest mathematicians of the 20th century, who advanced many very different fields of mathematics.


{{Theorem|Definition (Probability Space)|
A '''probability space''' is a triple <math>(\Omega,\Sigma,\Pr)</math>.
*<math>\Omega</math> is a set, called the '''sample space'''.
*<math>\Sigma\subseteq 2^{\Omega}</math> is the set of all '''events''', satisfying:
*:(K1). <math>\Omega\in\Sigma</math> and <math>\emptyset\in\Sigma</math>. (The ''certain'' event and the ''impossible'' event.)
*:(K2). If <math>A,B\in\Sigma</math>, then <math>A\cap B, A\cup B, A-B\in\Sigma</math>. (The intersection, union, and difference of two events are events.)
* A '''probability measure''' <math>\Pr:\Sigma\rightarrow\mathbb{R}</math> is a function that maps each event to a nonnegative real number, satisfying
*:(K3). <math>\Pr(\Omega)=1</math>.
*:(K4). If <math>A\cap B=\emptyset</math> (such events are called ''disjoint'' events), then <math>\Pr(A\cup B)=\Pr(A)+\Pr(B)</math>.
*:(K5*). For a decreasing sequence of events <math>A_1\supset A_2\supset \cdots\supset A_n\supset\cdots</math> of events with <math>\bigcap_n A_n=\emptyset</math>, it holds that <math>\lim_{n\rightarrow \infty}\Pr(A_n)=0</math>.
}}


;Remark
* In general, the sample space <math>\Omega</math> may be uncountable, but we only consider '''discrete''' probability in this lecture; thus we assume that <math>\Omega</math> is either finite or countably infinite.
* Sometimes it is convenient to assume <math>\Sigma=2^{\Omega}</math>, i.e. the set of events enumerates all subsets of <math>\Omega</math>. But in general, a probability space is well-defined for any <math>\Sigma</math> satisfying (K1) and (K2). Such a <math>\Sigma</math> is called a <math>\sigma</math>-algebra defined on <math>\Omega</math>.
* The last axiom (K5*) is redundant if <math>\Sigma</math> is finite, thus it is only essential when there are infinitely many events. The role of axiom (K5*) in probability theory is like [http://en.wikipedia.org/wiki/Zorn's_lemma Zorn's Lemma] (or equivalently the [http://en.wikipedia.org/wiki/Axiom_of_choice Axiom of Choice]) in axiomatic set theory.


Useful laws for probability can be deduced from the ''axioms'' (K1)-(K5).
{{Theorem|Proposition|
# Let <math>\bar{A}=\Omega\setminus A</math>. It holds that <math>\Pr(\bar{A})=1-\Pr(A)</math>.
# If <math>A\subseteq B</math> then <math>\Pr(A)\le\Pr(B)</math>.
}}
{{Proof|
# The events <math>\bar{A}</math> and <math>A</math> are disjoint and <math>\bar{A}\cup A=\Omega</math>. By Axioms (K4) and (K3), <math>\Pr(\bar{A})+\Pr(A)=\Pr(\Omega)=1</math>.
# The events <math>A</math> and <math>B\setminus A</math> are disjoint and <math>A\cup(B\setminus A)=B</math> since <math>A\subseteq B</math>. By Axiom (K4), <math>\Pr(A)+\Pr(B\setminus A)=\Pr(B)</math>, thus <math>\Pr(A)\le\Pr(B)</math>.
}}
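These axioms and the two derived laws can be checked mechanically on a small discrete space. The sketch below is our own illustration (the names <code>Omega</code>, <code>Pr</code>, <code>A</code>, <code>B</code> are ours): a fair six-sided die with the uniform measure, with <math>\Sigma=2^{\Omega}</math> implicit.

```python
from fractions import Fraction

# A tiny discrete probability space: a fair six-sided die.
# Omega is the sample space; Sigma is implicitly all subsets of Omega,
# and Pr is the uniform probability measure.
Omega = frozenset({1, 2, 3, 4, 5, 6})

def Pr(event):
    """Uniform measure: Pr(A) = |A| / |Omega|, computed exactly."""
    assert event <= Omega, "an event must be a subset of the sample space"
    return Fraction(len(event), len(Omega))

A = frozenset({2, 4, 6})        # "the roll is even"
B = frozenset({2, 4, 5, 6})     # a superset of A

assert Pr(Omega) == 1                 # axiom (K3)
assert Pr(A) + Pr(Omega - A) == 1     # complement rule: Pr(A-bar) = 1 - Pr(A)
assert Pr(A) <= Pr(B)                 # monotonicity, since A is a subset of B
```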


;Notation
An event <math>A\subseteq\Omega</math> can be represented as <math>A=\{a\in\Omega\mid \mathcal{E}(a)\}</math> with a predicate <math>\mathcal{E}</math>.  


The predicate notation of probability is  
:<math>\Pr[\mathcal{E}]=\Pr(\{a\in\Omega\mid \mathcal{E}(a)\})</math>.


During the lecture, we mostly use the predicate notation instead of subset notation.


== Independence ==
{{Theorem
|Definition (Independent events)|
:Two events <math>\mathcal{E}_1</math> and <math>\mathcal{E}_2</math> are '''independent''' if and only if
::<math>\begin{align}
\Pr\left[\mathcal{E}_1 \wedge \mathcal{E}_2\right]
&=
\Pr[\mathcal{E}_1]\cdot\Pr[\mathcal{E}_2].
\end{align}</math>
}}
This definition can be generalized to any number of events:
{{Theorem
|Definition (Independent events)|
:Events <math>\mathcal{E}_1, \mathcal{E}_2, \ldots, \mathcal{E}_n</math> are '''mutually independent''' if and only if, for any subset <math>I\subseteq\{1,2,\ldots,n\}</math>,
::<math>\begin{align}
\Pr\left[\bigwedge_{i\in I}\mathcal{E}_i\right]
&=
\prod_{i\in I}\Pr[\mathcal{E}_i].
\end{align}</math>
}}


Note that in probability theory, "mutual independence" is <font color="red">not</font> equivalent to "pairwise independence", which we will learn about later.
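As a preview of that distinction, here is a classic small example (our own illustration, not from the lecture): two independent fair bits together with their XOR are pairwise independent but not mutually independent.

```python
from itertools import product
from fractions import Fraction

# Sample space: two independent fair bits (x, y), uniform measure.
Omega = list(product([0, 1], repeat=2))

def Pr(pred):
    """Probability of the event {a in Omega | pred(a)} under the uniform measure."""
    return Fraction(sum(1 for a in Omega if pred(a)), len(Omega))

# Three events: E1 = "x = 1", E2 = "y = 1", E3 = "x XOR y = 1".
def E1(a): return a[0] == 1
def E2(a): return a[1] == 1
def E3(a): return a[0] ^ a[1] == 1

# Every pair of events multiplies: pairwise independence holds.
assert Pr(lambda a: E1(a) and E2(a)) == Pr(E1) * Pr(E2)
assert Pr(lambda a: E1(a) and E3(a)) == Pr(E1) * Pr(E3)
assert Pr(lambda a: E2(a) and E3(a)) == Pr(E2) * Pr(E3)

# But the triple intersection is empty, so mutual independence fails:
# Pr[E1 and E2 and E3] = 0, while Pr[E1] * Pr[E2] * Pr[E3] = 1/8.
assert Pr(lambda a: E1(a) and E2(a) and E3(a)) == 0
assert Pr(E1) * Pr(E2) * Pr(E3) == Fraction(1, 8)
```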
= Model of Computation =
 
= Checking Matrix Multiplication =
Let <math>\mathbb{F}</math> be a field (you may think of it as the field <math>\mathbb{Q}</math> of rational numbers, or the finite field <math>\mathbb{Z}_p</math> of integers modulo a prime <math>p</math>). We assume that each field operation (addition, subtraction, multiplication, division) has unit cost. This model is called the '''unit-cost RAM''' model, an idealized abstraction of real computers.
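The four field operations in <math>\mathbb{Z}_p</math> can be sketched concretely (the prime <math>p=101</math> and the helper names are our own, for illustration); division is multiplication by the inverse, obtained via Fermat's little theorem.

```python
# Arithmetic in the finite field Z_p for a prime p: each of the four
# field operations counts as one step in the unit-cost RAM model.
p = 101  # a small prime, chosen only for illustration

def add(a, b):
    return (a + b) % p

def sub(a, b):
    return (a - b) % p

def mul(a, b):
    return (a * b) % p

def div(a, b):
    # By Fermat's little theorem, b^(p-2) mod p is the inverse of b != 0.
    assert b % p != 0, "division by zero in Z_p"
    return (a * pow(b, p - 2, p)) % p

# Sanity check: (7 / 3) * 3 == 7 in Z_101.
assert mul(div(7, 3), 3) == 7 % p
```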
 
Consider the following problem:
* '''Input''': Three <math>n\times n</math> matrices <math>A</math>, <math>B</math>, and <math>C</math> over the field <math>\mathbb{F}</math>.
* '''Output''': "yes" if <math>C=AB</math> and "no" otherwise.
 
A naive method is to multiply <math>A</math> and <math>B</math> and compare the result with <math>C</math>. [http://en.wikipedia.org/wiki/Strassen_algorithm Strassen's algorithm], discovered in 1969 and now implemented in many numerical libraries, runs in time <math>O(n^{\log_2 7})\approx O(n^{2.81})</math>, and it started the search for fast matrix multiplication algorithms. The [http://en.wikipedia.org/wiki/Coppersmith%E2%80%93Winograd_algorithm Coppersmith–Winograd algorithm], discovered in 1987, runs in time <math>O(n^{2.376})</math>, but it beats Strassen's algorithm only on extremely large matrices because of its very large constant factor. This remained the best known bound for decades, until Stothers gave an <math>O(n^{2.3737})</math> algorithm in his 2010 PhD thesis, and independently Vassilevska Williams gave an <math>O(n^{2.3727})</math> algorithm in 2012. Both improvements are based on generalizations of the Coppersmith–Winograd algorithm. It is unknown whether matrix multiplication can be done in time <math>O(n^{2+o(1)})</math>.
 
== Freivalds' Algorithm ==
The following is a very simple randomized algorithm due to Freivalds, running in <math>O(n^2)</math> time:
 
{{Theorem|Algorithm (Freivalds, 1979)|
*pick a vector <math>r \in\{0, 1\}^n</math> uniformly at random;
*if <math>A(Br) = Cr</math> then return "yes" else return "no";
}}
The product <math>A(Br)</math> is computed by first computing <math>Br</math> and then <math>A(Br)</math>.
The running time of Freivalds' algorithm is <math>O(n^2)</math> because it performs three matrix-vector multiplications in total.
 
If <math>AB=C</math> then <math>A(Br) = Cr</math> for any <math>r \in\{0, 1\}^n</math>, thus the algorithm returns "yes" on any positive instance (<math>AB=C</math>).
But if <math>AB \neq C</math> then the algorithm makes a mistake only if it happens to choose an <math>r</math> with <math>ABr = Cr</math>. The following lemma states that the probability of this event is bounded.
 
{{Theorem|Lemma|
:If <math>AB\neq C</math> then for a uniformly random <math>r \in\{0, 1\}^n</math>,
::<math>\Pr[ABr = Cr]\le \frac{1}{2}</math>.
}}
{{Proof| Let <math>D=AB-C</math>. The event <math>ABr=Cr</math> is equivalent to <math>Dr=\boldsymbol{0}</math>. It then suffices to show that for any <math>D\neq \boldsymbol{0}</math>, it holds that <math>\Pr[Dr = \boldsymbol{0}]\le \frac{1}{2}</math>.
 
Since <math>D\neq \boldsymbol{0}</math>, it must have at least one non-zero entry. Suppose that <math>D_{ij}\neq 0</math>.
 
Now assume that <math>Dr=\boldsymbol{0}</math> occurs. In particular, the <math>i</math>-th entry of <math>Dr</math> is
:<math>(Dr)_{i}=\sum_{k=1}^n D_{ik}r_k=0,</math>
which determines <math>r_j</math> in terms of the other entries:
:<math>r_j=-\frac{1}{D_{ij}}\sum_{k\neq j} D_{ik}r_k.</math>
Once all other entries <math>r_k</math> with <math>k\neq j</math> are fixed, there is at most one value of <math>r_j</math> satisfying this equation. Therefore, the number of <math>r\in\{0,1\}^n</math> satisfying <math>Dr=\boldsymbol{0}</math> is at most <math>2^{n-1}</math>. The probability that <math>ABr=Cr</math> is bounded as
:<math>\Pr[ABr=Cr]=\Pr[Dr=\boldsymbol{0}]\le\frac{2^{n-1}}{2^n}=\frac{1}{2}</math>.
}}
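The counting argument in the proof can be verified exhaustively for a small matrix. The sketch below (our own illustration; the matrix <math>D</math> is an arbitrary nonzero example) enumerates all <math>r\in\{0,1\}^n</math> and checks that at most <math>2^{n-1}</math> of them satisfy <math>Dr=\boldsymbol{0}</math>.

```python
from itertools import product

def count_kernel_vectors(D, n):
    """Number of r in {0,1}^n with Dr = 0 (the all-zero vector)."""
    count = 0
    for r in product([0, 1], repeat=n):
        Dr = [sum(D[i][k] * r[k] for k in range(n)) for i in range(n)]
        if all(x == 0 for x in Dr):
            count += 1
    return count

n = 3
D = [[1, -1, 0],
     [0,  0, 0],
     [0,  0, 0]]   # nonzero, since D[0][0] != 0

# Dr = 0 here iff r_0 = r_1, which leaves r_2 free: exactly 4 solutions,
# matching the bound 2^(n-1), hence Pr[Dr = 0] <= 1/2 for uniform r.
bad = count_kernel_vectors(D, n)
assert bad <= 2 ** (n - 1)
```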
 
When <math>AB=C</math>, Freivalds' algorithm always returns "yes"; and when <math>AB\neq C</math>, it returns "no" with probability at least 1/2.
 
To improve its accuracy, we can run Freivalds' algorithm <math>k</math> times, each time with an ''independent'' random <math>r\in\{0,1\}^n</math>, and return "yes" if and only if all runs return "yes".
 
{{Theorem|Freivalds' Algorithm (multi-round)|
*pick <math>k</math> vectors <math>r_1,r_2,\ldots,r_k \in\{0, 1\}^n</math> uniformly and independently at random;
*if <math>A(Br_i) = Cr_i</math> for all <math>i=1,\ldots,k</math> then return "yes" else return "no";
}}
 
If <math>AB=C</math>, then the algorithm returns "yes" with probability 1. If <math>AB\neq C</math>, then by independence, the probability that all <math>r_i</math> satisfy <math>ABr_i=Cr_i</math> is at most <math>2^{-k}</math>, so the algorithm returns "no" with probability at least <math>1-2^{-k}</math>. For any <math>0<\epsilon<1</math>, choosing <math>k=\left\lceil\log_2 \frac{1}{\epsilon}\right\rceil</math> gives an algorithm that runs in time <math>O(n^2\log_2\frac{1}{\epsilon})</math> with one-sided error (false positive) probability at most <math>\epsilon</math>.
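The multi-round algorithm can be sketched in a few lines of Python (a minimal pure-Python illustration over the integers; variable names are ours, and matrices are lists of rows):

```python
import random

def freivalds(A, B, C, k):
    """Multi-round Freivalds' check: returns "yes" iff all k independent
    rounds pass. O(k n^2) time; false-positive probability <= 2^(-k)."""
    n = len(A)
    for _ in range(k):
        r = [random.randint(0, 1) for _ in range(n)]
        # Three matrix-vector products per round: Br, A(Br), and Cr.
        Br = [sum(B[i][j] * r[j] for j in range(n)) for i in range(n)]
        ABr = [sum(A[i][j] * Br[j] for j in range(n)) for i in range(n)]
        Cr = [sum(C[i][j] * r[j] for j in range(n)) for i in range(n)]
        if ABr != Cr:
            return "no"    # a certificate that AB != C
    return "yes"           # wrong with probability at most 2^(-k)

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C = [[19, 22], [43, 50]]       # C = AB, so the check always passes
assert freivalds(A, B, C, k=10) == "yes"

C_bad = [[19, 22], [43, 51]]   # one wrong entry; caught w.p. >= 1 - 2^(-k)
```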

Latest revision as of 14:26, 23 June 2016

Fitness in biology is the relative ability of an organism to survive and pass on its genes to the next generation.[1]p160 It is a central idea in evolutionary theory. Fitness is usually equal to the proportion of the individual's genes in all the genes of the next generation.

Like all terms in evolutionary biology, fitness is defined in terms of an interbreeding population, which might or might not be a whole species. If differences in individual genotypes affect fitness, then the frequencies of the genotypes will change over generations; the genotypes with higher fitness become more common. This is the process called natural selection.

An individual's fitness is caused by its phenotype, and passed on by its genotype. The fitnesses of different individuals with the same genotype are not necessarily equal. It depends on the environment in which the individuals live, and on accidental events. However, since the fitness of the genotype is an averaged quantity, it reflects the reproductive outcomes of all individuals with that genotype.

Relatedness

Fitness measures the number of the copies of the genes of an individual in the next generation. It doesn't really matter how the genes arrive in the next generation. For an individual, it is equally "beneficial" to reproduce itself, or to help relatives with similar genes to reproduce, as long as a similar number of copies of the individual's genes get passed on to the next generation. Selection which promotes this kind of helper behaviour is called kin selection.

Our closest relatives (parents, siblings, and our own children) share on average 50% (half) of our genes. One step further removed are grandparents. With each of them we share on average 25% (a quarter) of our genes. That is a measure of our relatedness to them. Next come first cousins (children of our parents' siblings). We share 12.5% (1/8) of their genes.[2]p100

Hamilton's rule

William Hamilton added various ideas to the notion of fitness. His rule suggests that a costly action should be performed if:

C < R × B   where:
  • C is the reproductive cost to the altruist,
  • B is the reproductive benefit to the recipient of the altruistic behavior, and
  • R is the probability, above the population average, of the individuals sharing an altruistic gene – the "degree of relatedness".

Fitness costs and benefits are measured in fecundity.[3]

Inclusive fitness

Inclusive fitness is a term which is essentially the same as fitness, but emphasises the group of genes rather than individuals.

Biological fitness says how well an organism can reproduce, and spread its genes to its offspring. The theory of inclusive fitness says that the fitness of an organism is also increased to the extent that its close relatives also reproduce. This is because relatives share genes in proportion to their relationship.

Another way of saying it: the inclusive fitness of an organism is not a property of itself, but a property of its set of genes. It is calculated from the reproductive success of the individual, plus the reproductive success of its relatives, each one weighted by an appropriate coefficient of relatedness.[4]

History

The British social philosopher Herbert Spencer coined the phrase survival of the fittest in his 1864 work Principles of biology to mean what Charles Darwin called natural selection.[5] The original phrase was "survival of the best fitted".

References


  1. King R.C. Stansfield W.D. & Mulligan P.K. 2006. A dictionary of genetics, 7th ed. Oxford.
  2. Maynard Smith, John. 1999. Evolutionary genetics. 2nd ed, Cambridge University Press.
  3. Hamilton W.D. 1964. The genetical evolution of social behavior. Journal of Theoretical Biology 7 (1): 1–52. doi:10.1016/0022-5193(64)90038-4.
  4. Adapted from Dawkins R. 1982. The extended phenotype. Oxford: Oxford University Press, p186. ISBN 0-19-288051-9
  5. Herbert Spencer 1864. Principles of Biology London, vol 1, 444, wrote “This survival of the fittest, which I have here sought to express in mechanical terms, is that which Mr. Darwin has called ‘natural selection’, or the preservation of favoured races in the struggle for life.”