随机算法 (Fall 2011)/Checking distinctness and Carmichael number: Difference between pages

From TCS Wiki
(Difference between pages)
Jump to navigation Jump to search
imported>Etone
No edit summary
 
imported>Addbot
m (Bot: 21 interwiki links moved, now provided by Wikidata on d:q849530)
 
Line 1: Line 1:
= The idea of fingerprinting =
In [[number theory]] a '''Carmichael number''' is a [[composite number|composite]] positive [[integer]] <math>n</math>, which satisfies the [[congruence]] <math>b^{n-1}\equiv 1\pmod{n}</math> for all integers <math>b</math> which are [[coprime|relatively prime]] to <math>n</math>. Being relatively prime means that they do not have common divisors, other than 1. Such numbers are named after [[Robert Daniel Carmichael|Robert Carmichael]].  
Suppose we want to compare two items <math>Z_1</math> and <math>Z_2</math>. Instead of comparing them directly, we compute random '''fingerprints''' <math>\mathrm{FING}(Z_1)</math> and <math>\mathrm{FING}(Z_2)</math> and compare these. The fingerprints has the following properties:
* <math>\mathrm{FING}(\cdot)</math> is a function, which means that if <math>Z_1= Z_2</math> then <math>\mathrm{FING}(Z_1)=\mathrm{FING}(Z_2)</math>.
* If <math>Z_1\neq Z_2</math> then <math>\Pr[\mathrm{FING}(Z_1)=\mathrm{FING}(Z_2)]</math> is small.
* It is much more to compute and compare the fingerprints than to compare <math>Z_1</math> and <math>Z_2</math> directly.


For Freivald's algorithm, the items to compare are two <math>n\times n</math> matrices <math>AB</math> and <math>C</math>, and given an <math>n\times n</math> matrix <math>M</math>, its random fingerprint is computed as <math>\mathrm{FING}(M)=Mr</math> for a uniformly random <math>r\in\{0,1\}^n</math>.
All [[prime number]]s <math>p</math> satisfy <math>b^{p-1}\equiv 1\pmod{p}</math> for all integers <math>b</math> which are relatively prime to <math>p</math>. This has been proven by the famous mathematician [[Pierre de Fermat]]. In most cases if a number <math>n</math> is composite, it does '''not''' satisfy this congruence equation. So, there exist not so many Carmichael numbers. We can say that Carmichael numbers are composite numbers that behave a little bit like they would be a prime number.  
 
{{Math-stub}}
For the Schwartz-Zippel algorithm, the items to compare are two polynomials <math>P_1(x_1,\ldots,x_n)</math> and <math>P_2(x_1,\ldots,x_n)</math>, and given a polynomial <math>Q(x_1,\ldots,x_n)</math>, its random fingerprint is computed as <math>\mathrm{FING}(Q)=Q(r_1,\ldots,r_n)</math> for <math>r_i</math> chosen independently and uniformly at random from some fixed set <math>S</math>.
[[Category:Number theory]]
 
For different problems, we may have different definitions of <math>\mathrm{FING}(\cdot)</math>.
 
 
= Checking distinctness =
Consider the following problem:
*Given a sequence <math>x_1,x_2,\ldots,x_n\in\{1,2,\ldots,n\}</math>, check whether every member of <math>\{1,2,\ldots,n\}</math> appears '''exactly''' once.
 
This problem is called detecting duplicate or checking distinctness. It can be solved in linear time and linear space by a straightforward algorithm.
 
For many real applications, the <math>n</math> is enormously large, and we would like to have an algorithm using very limited extra space.
 
== Fingerprinting multisets ==
A randomized algorithm due to Lipton, checks distinctness by solving a more general problem '''fingerprinting multisets'''.
 
Given a multiset (each member may appear more than once) <math>M=\{x_1,x_2,\ldots,x_n\}</math>, its fingerprint is defined as
:<math>
\mathrm{FING}(M)=\prod_{i=1}^n(r-x_i)\bmod p,
</math>
where <math>p</math> is a random prime chosen from the interval <math>[(n\log n)^2,2(n\log n)^2]</math>, and <math>r</math> is chosen uniformly at random from <math>[p]</math>.
 
We first see that the space reuqired to compute <math>\mathrm{FING}(M)</math> is only <math>O(\log n)</math> bits. We do not need to compute the value of the product <math>\prod_{i=1}^n(r-x_i)</math>. Instead, all the computation can be done in the finite field <math>\mathbb{Z}_p</math> (you need some knowledge in algebra to understand this). So the space requirement is only <math>O(\log p)=O(\log n)</math>.
 
It is easy to see that the above fingerprint function is invariant under permutations, thus one multiset has only one fingerprint. The next theorem due to Lipton, states that the probability that two distinct multisets have the same fingerprint is small.
 
{{Theorem|Theorem (Lipton 1989)|
:Let <math>M_1=\{x_1,x_2,\ldots,x_n\}</math> and <math>M_2=\{y_1,y_2,\ldots,y_n\}</math> be two multisets whose members are from <math>\{1,2,\ldots,n\}</math>. If <math>M_1\neq M_2</math>, then
::<math>\Pr[\mathrm{FING}(M_1)= \mathrm{FING}(M_2)]=O\left(\frac{1}{n}\right)</math>.
}}
{{Proof| Let <math>P_1(u)=\prod_{i=1}^n(u-x_i)</math> and <math>P_2(u)=\prod_{i=1}^n(u-y_i)</math>, and let <math>Q(u)=P_1(u)-P_2(u)</math>.
 
If <math>M_1\neq M_2</math>, then the polynomials <math>P_1</math> and <math>P_2</math> are not identical, and the polynomial <math>Q\not\equiv 0</math>. We only need to show that <math>\Pr[Q(r)\bmod =0]\,</math> is small for our choice of random <math>p</math> and <math>r</math>.
 
We first show that the probability that <math>Q\equiv 0 \bmod p</math> is small. Then apply the Schwartz-Zippel technique to show that conditioning on that <math>Q\not\equiv 0 \bmod p</math>, the probability <math>Q(r) \bmod p=0</math> is small.
 
Expand <math>Q(u)</math> in form of <math>Q(u)=\sum_{k=0}^n a_i u^k</math>. It can be verified by induction on <math>n</math> that for any coefficient <math>a_k</math>, it holds that <math>|a_k|\le 2n\cdot 2^{n}</math>.
 
Since <math>Q\not\equiv 0</math>, it has at least one nonzero coefficient <math>c\neq 0</math>, and it holds that <math>|c|\le 2n\cdot 2^{n}</math>. Then with the same analysis as in the problem of EQ, by our choice of random <math>p</math>,
:<math>\Pr[c\bmod p=0]=O\left(\frac{1}{n}\right)</math>.
 
Conditioning on that <math>Q\not\equiv 0</math> after modulo <math>p</math>, since the degree of <math>Q</math> is at most <math>n</math>, <math>Q</math> has at most <math>n</math> roots. Since <math>p\in[(n\log n)^2,2(n\log n)^2]</math>, therefore <math>r</math> is uniformly distributed over a set of size at least <math>(n\log n)^2</math>. Therefore, the probability that <math>Q(r)\bmod p =0</math> conditioning on that <math>Q\not\equiv 0\bmod p</math> is at most <math>\frac{1}{n(\log n)^2}</math>.
 
By the union bound,
<math>\Pr[\mathrm{FING}(M_1)= \mathrm{FING}(M_2)]=O\left(\frac{1}{n}\right)</math>.
}}
 
== Checking distinctness by fingerprinting multisets ==
We now have a fingerprint function for multisets, which is useful for checking the identities of multisets. To solve the problem of checking distinctness, we just check whether the input multiset <math>M</math> is identical to the set <math>\{1,2,\ldots,n\}</math>.

Latest revision as of 08:08, 11 March 2013

In number theory a Carmichael number is a composite positive integer [math]\displaystyle{ n }[/math], which satisfies the congruence [math]\displaystyle{ b^{n-1}\equiv 1\pmod{n} }[/math] for all integers [math]\displaystyle{ b }[/math] which are relatively prime to [math]\displaystyle{ n }[/math]. Being relatively prime means that they do not have common divisors, other than 1. Such numbers are named after Robert Carmichael.

All prime numbers [math]\displaystyle{ p }[/math] satisfy [math]\displaystyle{ b^{p-1}\equiv 1\pmod{p} }[/math] for all integers [math]\displaystyle{ b }[/math] which are relatively prime to [math]\displaystyle{ p }[/math]. This has been proven by the famous mathematician Pierre de Fermat. In most cases if a number [math]\displaystyle{ n }[/math] is composite, it does not satisfy this congruence equation. So, there exist not so many Carmichael numbers. We can say that Carmichael numbers are composite numbers that behave a little bit like they would be a prime number. Template:Math-stub