随机算法 (Fall 2015)/Fingerprinting

From TCS Wiki
Jump to navigation Jump to search

Fingerprinting

Both the Freivald's algorithm for checking matrix multiplication and Schwartz-Zippel theorem for testing polynomial identity learnt in the previous lecture can be abstracted as the following procedure:

Fingerprinting

Suppose we want to compare two items [math]\displaystyle{ Z_1 }[/math] and [math]\displaystyle{ Z_2 }[/math]. Instead of comparing them directly, we compute random fingerprints [math]\displaystyle{ \mathrm{FING}(Z_1) }[/math] and [math]\displaystyle{ \mathrm{FING}(Z_2) }[/math] of them and compare the fingerprints.

The fingerprints has the following properties:

  • [math]\displaystyle{ \mathrm{FING}(\cdot) }[/math] is a function, so if [math]\displaystyle{ Z_1= Z_2 }[/math] then [math]\displaystyle{ \mathrm{FING}(Z_1)=\mathrm{FING}(Z_2) }[/math].
  • If [math]\displaystyle{ Z_1\neq Z_2 }[/math] then [math]\displaystyle{ \Pr[\mathrm{FING}(Z_1)=\mathrm{FING}(Z_2)] }[/math] is small.
  • It is much easier to compute and compare the fingerprints than to compare [math]\displaystyle{ Z_1 }[/math] and [math]\displaystyle{ Z_2 }[/math] directly.

In Freivald's algorithm, the items to compare are two [math]\displaystyle{ n\times n }[/math] matrices [math]\displaystyle{ AB }[/math] and [math]\displaystyle{ C }[/math], and given an [math]\displaystyle{ n\times n }[/math] matrix [math]\displaystyle{ M }[/math], its random fingerprint is computed as [math]\displaystyle{ \mathrm{FING}(M)=Mr }[/math] for a uniformly random [math]\displaystyle{ r\in\{0,1\}^n }[/math].

In Schwartz-Zippel theorem, the items to compare are two polynomials [math]\displaystyle{ P_1(x_1,\ldots,x_n) }[/math] and [math]\displaystyle{ P_2(x_1,\ldots,x_n) }[/math], and given a polynomial [math]\displaystyle{ Q(x_1,\ldots,x_n) }[/math], its random fingerprint is computed as [math]\displaystyle{ \mathrm{FING}(Q)=Q(r_1,\ldots,r_n) }[/math] for [math]\displaystyle{ r_i }[/math] chosen independently and uniformly at random from some fixed set [math]\displaystyle{ S }[/math].

For different problems, we may have different definitions of [math]\displaystyle{ \mathrm{FING}(\cdot) }[/math].

Communication complexity revisited

Now consider again the communication model where the two players Alice with a private input [math]\displaystyle{ x\in\{0,1\}^n }[/math] and Bob with a private input [math]\displaystyle{ y\in\{0,1\}^n }[/math] together compute a function [math]\displaystyle{ f(x,y) }[/math] by running a communication protocol.

We still consider the communication protocols for the equality function EQ

[math]\displaystyle{ \mathrm{EQ}(x,y)= \begin{cases} 1& \mbox{if } x=y,\\ 0& \mbox{otherwise.} \end{cases} }[/math]

With the language of fingerprinting, this communication problem can be solved by the following generic scheme:

A protocol for EQ by fingerprinting

Alice does:

  • choose a random fingerprint function [math]\displaystyle{ \mathrm{FING}(\cdot) }[/math] and compute the fingerprint [math]\displaystyle{ \mathrm{FING}(x) }[/math] of her input [math]\displaystyle{ x }[/math];
  • send the description of [math]\displaystyle{ \mathrm{FING}(\cdot) }[/math] and the value of [math]\displaystyle{ \mathrm{FING}(x) }[/math] to Bob;

Upon receiving [math]\displaystyle{ \mathrm{FING}(\cdot) }[/math] and [math]\displaystyle{ \mathrm{FING}(x) }[/math], Bob does:

  • check whether [math]\displaystyle{ \mathrm{FING}(x)=\mathrm{FING}(y) }[/math].

The communication cost and the accuracy of this randomized protocol are guaranteed by the performance of the fingerprint function [math]\displaystyle{ \mathrm{FING}(\cdot) }[/math]:

  1. A random [math]\displaystyle{ \mathrm{FING}(\cdot) }[/math] can be described succinctly.
  2. The range of [math]\displaystyle{ \mathrm{FING}(\cdot) }[/math] is small, so the fingerprints are succinct.
  3. If [math]\displaystyle{ x\neq y }[/math], the probability [math]\displaystyle{ \Pr[\mathrm{FING}(x)=\mathrm{FING}(y)] }[/math] is small.

In above application of single-variate PIT, we know that [math]\displaystyle{ \mathrm{FING}(x)=\sum_{i=1}^n x_i r^{i} }[/math], where [math]\displaystyle{ r }[/math] is a random element from a finite field and the additions and multiplications are defined over the finite field, is a good fingerprint function. Now we introduce another fingerprint and hence a new communication protocol.

The new fingerprint function we design is as follows: by treating the input string [math]\displaystyle{ x\in\{0,1\}^n }[/math] as the binary representation of a number from [math]\displaystyle{ [2^n] }[/math], let [math]\displaystyle{ \mathrm{FING}(x)=x\bmod p }[/math] for some random prime [math]\displaystyle{ p }[/math]. The prime [math]\displaystyle{ p }[/math] can uniquely specify a random fingerprint function [math]\displaystyle{ \mathrm{FING}(\cdot) }[/math], thus can be used as a description of the function, and also the range of the fingerprints is [math]\displaystyle{ [p] }[/math], thus we want the prime [math]\displaystyle{ p }[/math] to be reasonably small, but still has a good chance to distinguish different [math]\displaystyle{ x }[/math] and [math]\displaystyle{ y }[/math] after modulo [math]\displaystyle{ p }[/math].

A randomized protocol for EQ

Alice does:

for some parameter [math]\displaystyle{ k }[/math] (to be specified),
  • choose uniformly at random a prime [math]\displaystyle{ p\in[k] }[/math];
  • send [math]\displaystyle{ p }[/math] and [math]\displaystyle{ x\bmod p }[/math] to Bob;

Upon receiving [math]\displaystyle{ p }[/math] and [math]\displaystyle{ x\bmod p }[/math], Bob does:

  • check whether [math]\displaystyle{ x\bmod p=y\bmod p }[/math].

The number of bits to be communicated is [math]\displaystyle{ O(\log k) }[/math]. We then bound the probability of error [math]\displaystyle{ \Pr[x\bmod p=y\bmod p] }[/math] for [math]\displaystyle{ x\neq y }[/math], in terms of [math]\displaystyle{ k }[/math].

Suppose without loss of generality [math]\displaystyle{ x\gt y }[/math]. Let [math]\displaystyle{ z=x-y }[/math]. Then [math]\displaystyle{ z\lt 2^n }[/math] since [math]\displaystyle{ x,y\in[2^n] }[/math], and [math]\displaystyle{ z\neq 0 }[/math] for [math]\displaystyle{ x\neq y }[/math]. It holds that [math]\displaystyle{ x\bmod p=y\bmod p }[/math] if and only if [math]\displaystyle{ z }[/math] is dividable by [math]\displaystyle{ p }[/math]. Note that [math]\displaystyle{ z\lt 2^n }[/math] since [math]\displaystyle{ x,y\in[2^n] }[/math]. We only need to bound the probability

[math]\displaystyle{ \Pr[z\bmod p=0] }[/math] for [math]\displaystyle{ 0\lt z\lt 2^n }[/math], where [math]\displaystyle{ p }[/math] is a random prime chosen from [math]\displaystyle{ [k] }[/math].

The probability [math]\displaystyle{ \Pr[z\bmod p=0] }[/math] is computed directly as

[math]\displaystyle{ \Pr[z\bmod p=0]\le\frac{\mbox{the number of prime divisors of }z}{\mbox{the number of primes in }[k]} }[/math].

For the numerator, we have the following lemma.

Lemma
The number of distinct prime divisors of any natural number less than [math]\displaystyle{ 2^n }[/math] is at most [math]\displaystyle{ n }[/math].
Proof.
Each prime number is [math]\displaystyle{ \ge2 }[/math]. If an [math]\displaystyle{ N\gt 0 }[/math] has more than [math]\displaystyle{ n }[/math] distinct prime divisors, then [math]\displaystyle{ N\ge 2^n }[/math].
[math]\displaystyle{ \square }[/math]

Due to this lemma, [math]\displaystyle{ z }[/math] has at most [math]\displaystyle{ n }[/math] prime divisors.

We then lower bound the number of primes in [math]\displaystyle{ [k] }[/math]. This is given by the celebrated Prime Number Theorem (PNT).

Prime Number Theorem
Let [math]\displaystyle{ \pi(k) }[/math] denote the number of primes less than [math]\displaystyle{ k }[/math]. Then [math]\displaystyle{ \pi(k)\sim\frac{k}{\ln k} }[/math] as [math]\displaystyle{ k\rightarrow\infty }[/math].

Therefore, by choosing [math]\displaystyle{ k=tn\ln tn }[/math] for some [math]\displaystyle{ t }[/math], we have that for a [math]\displaystyle{ 0\lt z\lt 2^n }[/math], and a random prime [math]\displaystyle{ p\in[k] }[/math],

[math]\displaystyle{ \Pr[z\bmod p=0]\le\frac{n}{\pi(k)}\sim\frac{1}{t} }[/math].

We can make this error probability polynomially small and the number of bits to be communicated is still [math]\displaystyle{ O(\log k)=O(\log n) }[/math].

Randomized pattern matching

Consider the following problem of pattern matching, which has nothing to do with communication complexity.

  • Input: a string [math]\displaystyle{ x\in\{0,1\}^n }[/math] and a "pattern" [math]\displaystyle{ y\in\{0,1\}^m }[/math].
  • Determine whether the pattern [math]\displaystyle{ y }[/math] is a contiguous substring of [math]\displaystyle{ x }[/math]. Usually, we are also asked to find the location of the substring.

A naive algorithm trying every possible match runs in [math]\displaystyle{ O(nm) }[/math] time. The more sophisticated KMP algorithm inspired by automaton theory runs in [math]\displaystyle{ O(n+m) }[/math] time.

A simple randomized algorithm, due to Karp and Rabin, uses the idea of fingerprinting and also runs in [math]\displaystyle{ O(n + m) }[/math] time.

Let [math]\displaystyle{ X(j)=x_jx_{j+1}\cdots x_{j+m-1} }[/math] denote the substring of [math]\displaystyle{ x }[/math] of length [math]\displaystyle{ m }[/math] starting at position [math]\displaystyle{ j }[/math].

Algorithm (Karp-Rabin)
pick a random prime [math]\displaystyle{ p\in[k] }[/math];
for [math]\displaystyle{ j = 1 }[/math] to [math]\displaystyle{ n -m + 1 }[/math] do
if [math]\displaystyle{ X(j)\bmod p = y \bmod p }[/math] then report a match;
return "no match";

So the algorithm just compares the [math]\displaystyle{ \mathrm{FING}(X(j)) }[/math] and [math]\displaystyle{ \mathrm{FING}(y) }[/math] for every [math]\displaystyle{ j }[/math], with the same definition of fingerprint function [math]\displaystyle{ \mathrm{FING}(\cdot) }[/math] as in the communication protocol for EQ.

By the same analysis, by choosing [math]\displaystyle{ k=n^2m\ln (n^2m) }[/math], the probability of a single false match is

[math]\displaystyle{ \Pr[X(j)\bmod p=y\bmod p\mid X(j)\neq y ]=O\left(\frac{1}{n^2}\right) }[/math].

By the union bound, the probability that a false match occurs is [math]\displaystyle{ O\left(\frac{1}{n}\right) }[/math].

The algorithm runs in linear time if we assume that we can compute [math]\displaystyle{ X(j)\bmod p }[/math] for each [math]\displaystyle{ j }[/math] in constant time. This outrageous assumption can be made realistic by the following observation.

Lemma
Let [math]\displaystyle{ \mathrm{FING}(a)=a\bmod p }[/math].
[math]\displaystyle{ \mathrm{FING}(X(j+1))\equiv2(\mathrm{FING}(X(j))-2^{m-1}x_j)+x_{j+m}\pmod p\, }[/math].
Proof.
It holds that
[math]\displaystyle{ X(j+1)=2(X(j)-2^{m-1}x_j)+x_{j+m}\, }[/math].

So the equation holds on the finite field modulo [math]\displaystyle{ p }[/math].

[math]\displaystyle{ \square }[/math]

Due to this lemma, each fingerprint [math]\displaystyle{ \mathrm{FING}(X(j)) }[/math] can be computed in an incremental way, each in constant time. The running time of the algorithm is [math]\displaystyle{ O(n+m) }[/math].

Primality Test

In above applications, our algorithm is required to "sample a uniform random prime from [math]\displaystyle{ [N] }[/math]". This can be done by first sample a uniform random integer [math]\displaystyle{ n }[/math] from [math]\displaystyle{ [N] }[/math] and then test whether [math]\displaystyle{ n }[/math] is prime (and repeat the procedure if it fails the test). This requires us to efficiently determines whether a given integer is prime.

A primality test is an algorithm that given as input a number [math]\displaystyle{ n }[/math] determines whether [math]\displaystyle{ n }[/math] is prime. A naive deterministic algorithm is to check for every [math]\displaystyle{ 2\le i\le \sqrt{n} }[/math] whether [math]\displaystyle{ n }[/math] is divisible by [math]\displaystyle{ i }[/math]. The time complexity of this naive algorithm is [math]\displaystyle{ O(\sqrt{n}) }[/math] assuming that a modular computation takes constant time. Note that [math]\displaystyle{ \sqrt{n} }[/math] is obviously bounded by polynomial of [math]\displaystyle{ n }[/math], so do we have a "polynomial-time" algorithm?

The answer depends what exactly we mean by "polynomial-time" algorithms. Rigorously, an algorithm is polynomial-time if its worst-case running time is bounded by a polynomial in the size of the input, measured by the number of bits representing the input. In the case of primality testing, the input is a natural number [math]\displaystyle{ n }[/math] represented in binary code, which takes [math]\displaystyle{ \log n }[/math] bits. So for the problem of primality testing, or similar number theory problems, an algorithm is polynomial time if its running time is in polynomial of [math]\displaystyle{ \log n }[/math] instead of [math]\displaystyle{ n }[/math].

However, much faster and simpler randomized primality testing algorithms were known long before the discovery of efficient deterministic primality test.

Fermat Test

Recall the Fermat's little theorem.

Fermat's little theorem
If [math]\displaystyle{ n\gt 2 }[/math] is prime, then [math]\displaystyle{ a^{n-1}\equiv 1\pmod n }[/math] for every [math]\displaystyle{ a\in\{1,2,\ldots,n-1\} }[/math].

There are several proofs for this famous theorem. We will not prove the theorem but will only use it here.

If we can find an [math]\displaystyle{ a\in\{1,2,\ldots,n-1\} }[/math] such that [math]\displaystyle{ a^{n-1}\not\equiv 1\pmod n }[/math], it will prove that [math]\displaystyle{ n }[/math] is composite. This inspires the following "primality testing" algorithm.

Fermat test
  1. Choose a uniformly random [math]\displaystyle{ a\in\{1,2,\ldots,n-1\} }[/math].
  2. If [math]\displaystyle{ a^{n-1}\not\equiv 1\pmod n }[/math], then return "composite".
  3. Else return "probably prime".

Complexity of Fermat test

The time complexity of this algorithm depends on the computational cost [math]\displaystyle{ a^{n-1} \bmod n }[/math], whose straightforward computing takes [math]\displaystyle{ n-2 }[/math] multiplications, which is too expensive. We describe an efficient way of computing the modular exponent [math]\displaystyle{ a^x\bmod n\, }[/math] where [math]\displaystyle{ x\in[n] }[/math].

We first make the following observations regarding the modular exponentiations:

  • If the values of [math]\displaystyle{ a^x\bmod n }[/math] and [math]\displaystyle{ a^y\bmod n }[/math] are both known, then [math]\displaystyle{ a^{x+y}\bmod n }[/math] can be computed by multiplying (modulo [math]\displaystyle{ n }[/math]) them.
  • [math]\displaystyle{ a^{2^i} }[/math] can be computed by letting [math]\displaystyle{ a_0=a }[/math] and [math]\displaystyle{ a_{j}=a_{j-1}^2 \pmod n }[/math] for [math]\displaystyle{ j=1,2,\ldots, i }[/math], which takes only [math]\displaystyle{ i }[/math] modular multiplications.

Let [math]\displaystyle{ \ell=\lceil\log_2 n\rceil }[/math]. A number [math]\displaystyle{ x\in[n] }[/math] can be represented in its binary form: [math]\displaystyle{ x_\ell x_{\ell-1}\cdots x_1x_0 }[/math], where each [math]\displaystyle{ x_i\in\{0,1\} }[/math], so that [math]\displaystyle{ x=\sum_{i=0}^{\ell}x_i\cdot2^i }[/math].

Combining the above two observations, all [math]\displaystyle{ a^{x_i2^i}\bmod n }[/math] can be computed in [math]\displaystyle{ O(\log n) }[/math] many multiplications, and [math]\displaystyle{ a^x\bmod n }[/math] can be computed by multiplying (modulo [math]\displaystyle{ n }[/math]) them together.

The time complexity of Fermat test thus can be made in polynomial of [math]\displaystyle{ \log n }[/math].

Accuracy of Fermat test

If the output is "composite", then [math]\displaystyle{ a^{n-1}\not\equiv 1\pmod n }[/math] for some [math]\displaystyle{ a\in\{1,2,\ldots,n-1\} }[/math]. By the Fermat's little theorem, [math]\displaystyle{ n }[/math] must be composite. Therefore, for any prime [math]\displaystyle{ n }[/math], the output is always "probably prime".

For composite [math]\displaystyle{ n }[/math], it is possible that the algorithm picks an [math]\displaystyle{ a }[/math] such that [math]\displaystyle{ a^{n-1}\equiv 1\pmod n }[/math] and outputs "probably prime". But if the fraction of such bad [math]\displaystyle{ a }[/math] in [math]\displaystyle{ \{1,2,\ldots,n-1\} }[/math] is small enough, then the testing algorithm may still correctly output "composite" with a good chance. However, there exist (though very rare) such composites, called Carmichael numbers, that may fool the Fermat test.

Definition (Carmichael number)
A composite number [math]\displaystyle{ n }[/math] is a Carmichael number if [math]\displaystyle{ a^{n-1}\equiv 1\pmod n }[/math] for all [math]\displaystyle{ a\in\mathbb{Z}_n^* }[/math].

Here [math]\displaystyle{ \mathbb{Z}_n^* }[/math] is the multiplicative group modulo [math]\displaystyle{ n }[/math], defined as [math]\displaystyle{ \mathbb{Z}_n^*=\{a\mid 1\le a\le n-1\wedge \mathrm{gcd}(a,n)=1\} }[/math].

For non-Carmichael composites, the Fermat test may detect the compositeness with a fairly good chance. Let [math]\displaystyle{ B=\{a\in\mathbb{Z}_n^*\mid a^{n-1}\equiv 1\pmod n\} }[/math]. Note that [math]\displaystyle{ B }[/math] is closed under multiplication (modulo [math]\displaystyle{ n }[/math]), thus [math]\displaystyle{ B }[/math] is a subgroup of [math]\displaystyle{ \mathbb{Z}_n^* }[/math]. Therefore, [math]\displaystyle{ |\mathbb{Z}_n^*| }[/math] is divisible by [math]\displaystyle{ |B| }[/math].

If [math]\displaystyle{ n }[/math] is neither prime nor Carmichael, then [math]\displaystyle{ \mathbb{Z}_n^*\setminus B }[/math] is nonempty, i.e. [math]\displaystyle{ B }[/math] is a proper subgroup of [math]\displaystyle{ \mathbb{Z}_n^* }[/math], thus [math]\displaystyle{ |\mathbb{Z}_n^*|/|B| }[/math] is at least 2 and there are at least half [math]\displaystyle{ a\in\{1,2,\ldots,n-1\} }[/math] satisfying [math]\displaystyle{ a^{n-1}\not\equiv 1 }[/math].

In conclusion,

  • if [math]\displaystyle{ n }[/math] is prime, then the Fermat test returns "probably prime" with probability 1;
  • if [math]\displaystyle{ n }[/math] is non-Carmichael composite, then the Fermat test returns "composite" with probability at least [math]\displaystyle{ 1/2 }[/math];
  • if [math]\displaystyle{ n }[/math] is a Carmichael number, the Fermat test breaks down.

As long as the input is not a Carmichael number, we can repeat the Fermat test independently for [math]\displaystyle{ k }[/math] times and reduce the error probability to [math]\displaystyle{ 2^{-k} }[/math].

The Carmichael numbers are very rare. Let [math]\displaystyle{ c(n) }[/math] be the "Carmichael density" that

[math]\displaystyle{ c(n)=\frac{\text{number of Carmichael numbers }\le n}{n} }[/math].

In 1956, Erdős proved that

[math]\displaystyle{ c(n)\le\exp\left(-\Omega\left(\frac{\log n\log\log\log n}{\log \log n}\right)\right)=n^{-\Omega\left(\frac{\log\log\log n}{\log\log n}\right)} }[/math].

If one only needs to generates a prime number instead of testing the primality of a given number, then we can generates a random number, and apply the Fermat test. Due to the prime number theorem, the number of prime numbers less than or equal to [math]\displaystyle{ n }[/math] is [math]\displaystyle{ \pi(n)\sim\frac{n}{\ln n} }[/math]. This scheme will generates a prime number in a reasonable number of independent trials with a good chance.

Miller-Rabin Test

The Fermat test is based on the following way to prove that a number [math]\displaystyle{ n }[/math] is composite:

  • there exists a number [math]\displaystyle{ a }[/math] such that [math]\displaystyle{ a^{n-1}\not\equiv 1\pmod n }[/math].

The Miller-Rabin primality test is based on an additional way to prove that a number [math]\displaystyle{ n }[/math] is composite:

  • 1 has a nontrivial square root, that is, a number [math]\displaystyle{ a }[/math] satisfying that [math]\displaystyle{ a^2\equiv 1\pmod n }[/math] but [math]\displaystyle{ a\not\equiv \pm 1\pmod n }[/math].

The following theorem states that the existence of nontrivial square root of 1 is a valid proof of compositeness of [math]\displaystyle{ n }[/math].

Theorem
If [math]\displaystyle{ n\gt 2 }[/math] is prime, then [math]\displaystyle{ 1 }[/math] does not have a nontrivial square root.
Proof.

Suppose [math]\displaystyle{ a }[/math] is a square root of 1, that is, [math]\displaystyle{ a^2\equiv1\pmod n }[/math]. Therefore,

[math]\displaystyle{ (a-1)(a+1)=a^2-1\equiv 0\pmod n }[/math],

which means that [math]\displaystyle{ (a-1)(a+1)|n\, }[/math].

If [math]\displaystyle{ a\not\equiv \pm1\pmod n }[/math], then [math]\displaystyle{ n }[/math] divides neither [math]\displaystyle{ (a-1) }[/math] nor [math]\displaystyle{ (a+1) }[/math], which contradicts that [math]\displaystyle{ n }[/math] is prime and divides [math]\displaystyle{ (a-1)(a+1) }[/math].

[math]\displaystyle{ \square }[/math]

The idea of Miller-Rabin test is to find either a Fermat proof of compositeness, or a nontrivial square root of 1.

Miller-Rabin Primality Test
  1. Choose a uniformly random [math]\displaystyle{ a\in\{1,2,\ldots,n-1\} }[/math].
  2. Let [math]\displaystyle{ t }[/math] and [math]\displaystyle{ m }[/math] be such that [math]\displaystyle{ t\ge 1 }[/math], [math]\displaystyle{ m }[/math] is odd, and [math]\displaystyle{ n-1=2^tm }[/math].
  3. Let [math]\displaystyle{ a_0=a^m\bmod\, n\, }[/math]. For [math]\displaystyle{ i=1 }[/math] to [math]\displaystyle{ t }[/math], let [math]\displaystyle{ a_i=a_{i-1}^2 \bmod\, n }[/math].
  4. If [math]\displaystyle{ a_t\not\equiv 1\pmod n }[/math], then return "composite".
  5. If there is an [math]\displaystyle{ i }[/math], [math]\displaystyle{ 1\le i\le t }[/math], such that [math]\displaystyle{ a_i\equiv 1\pmod n }[/math] but [math]\displaystyle{ a_{i-1}\not\equiv \pm 1\pmod n }[/math], then return "composite".
  6. Else return "probably prime".

An easy inductive proof shows that [math]\displaystyle{ a_i=a^{2^im}\bmod\, n }[/math] for all [math]\displaystyle{ i }[/math], [math]\displaystyle{ 0\le i\le t }[/math]. In particular, [math]\displaystyle{ a_t\equiv a^{2^tm}=a^{n-1}\pmod n }[/math].

The original algorithm due to Miller is deterministic, which test all small [math]\displaystyle{ a }[/math] up to an [math]\displaystyle{ O(\log n) }[/math] order. The correctness of this deterministic algorithm relies on the unproven conjecture of Generalized Riemann hypothesis. It was observed by Rabin that the deterministic searching can be replaced by random sampling.

Line 4 of the algorithm is equivalent to that [math]\displaystyle{ a^{n-1}\not\equiv 1\pmod n }[/math], thus line 4 is just the Fermat test. If [math]\displaystyle{ n }[/math] passes the Fermat test, line 5 tries to find a nontrivial square root of 1 in form of [math]\displaystyle{ a^{2^im} }[/math].

If [math]\displaystyle{ n }[/math] is prime, then due to the Fermat little theorem and the fact that prime numbers do not have nontrivial square roots of 1, the conditions in line 4 and line 5 will never hold, thus the algorithm will return "probably prime". If [math]\displaystyle{ n }[/math] is a non-Carmichael composite, as in the Fermat test, line 4 returns "composite" with probability at least [math]\displaystyle{ 1/2 }[/math]. The only remaining case is when [math]\displaystyle{ n }[/math] is a Carmichael number.

We pick the largest [math]\displaystyle{ j }[/math] such that there is a [math]\displaystyle{ b\in\mathbb{Z}_{n}^* }[/math] satisfying [math]\displaystyle{ b^{2^jm}\equiv -1\pmod n }[/math], and define

[math]\displaystyle{ B=\{a\in\mathbb{Z}_n^*\mid a^{2^jm}\equiv \pm 1\pmod n\} }[/math].
Theorem
If [math]\displaystyle{ n }[/math] is a Carmichael number, then the [math]\displaystyle{ B }[/math] defined as above is a proper subgroup of [math]\displaystyle{ \mathbb{Z}_n^* }[/math].

Since [math]\displaystyle{ j }[/math] is fixed, it is easy to verify that [math]\displaystyle{ B }[/math] is closed under multiplication, thus [math]\displaystyle{ B }[/math] is a subgroup of [math]\displaystyle{ \mathbb{Z}_n^* }[/math]. It is a bit complicated to show that [math]\displaystyle{ \mathbb{Z}_n^*\setminus B }[/math] is nonempty and we will not give the full proof here.

The accuracy of Miller-Rabin test on Carmichael numbers is implied by this theorem. Suppose [math]\displaystyle{ n }[/math] is a Carmichael number. We call an [math]\displaystyle{ a\in\{1,2,\ldots, n-1\} }[/math] a liar if it fools the test in line 5, i.e. there is no such [math]\displaystyle{ i }[/math] that [math]\displaystyle{ a^{2^im}\equiv 1\pmod n }[/math] but [math]\displaystyle{ a^{2^{i-1}m}\not\equiv \pm 1\pmod n }[/math].

We claim that all liars belong to [math]\displaystyle{ B }[/math]. Due to the maximality of [math]\displaystyle{ j }[/math], [math]\displaystyle{ a^{2^im}\not\equiv -1 }[/math] for all [math]\displaystyle{ i\gt j }[/math]. Since [math]\displaystyle{ n }[/math] is a Carmichael number, [math]\displaystyle{ a^{n-1}\equiv 1\pmod n }[/math], if [math]\displaystyle{ a }[/math] is a liar then it mus hold that [math]\displaystyle{ a^{2^im}\equiv 1\pmod n }[/math] for all [math]\displaystyle{ i\gt j }[/math] or otherwise [math]\displaystyle{ a }[/math] cannot be a liar. In particular, [math]\displaystyle{ a^{2^{j+1}m}\equiv 1\pmod n }[/math]. Again, since [math]\displaystyle{ a }[/math] is a liar, [math]\displaystyle{ a^{2^jm}\equiv \pm1\pmod n }[/math], therefore [math]\displaystyle{ a\in B }[/math].

We show that when [math]\displaystyle{ n }[/math] is a Carmichael number, all numbers [math]\displaystyle{ a }[/math] that fools the Miller-Rabin test belongs to a proper subgroup of [math]\displaystyle{ \mathbb{Z}_n^* }[/math], therefore the Miller-Rabin test returns a "composite" with probability [math]\displaystyle{ 1/2 }[/math].

In conclusion,

  • if [math]\displaystyle{ n }[/math] is prime, the algorithm returns "probably prime";
  • if [math]\displaystyle{ n }[/math] is a non-Carmichael composite, the algorithm returns "composite" in line 4 with probability at least [math]\displaystyle{ 1/2 }[/math];
  • if [math]\displaystyle{ n }[/math] is a Carmichael number, the algorithm returns "composite" in line 5 with probability at least [math]\displaystyle{ 1/2 }[/math].