# Polynomial identities

Consider the following problem:

• Given as input two multivariate polynomials ${\displaystyle P_{1}(x_{1},\ldots ,x_{n})}$ and ${\displaystyle P_{2}(x_{1},\ldots ,x_{n})}$,
• check whether the two polynomials are identical, denoted ${\displaystyle P_{1}\equiv P_{2}}$.

Obviously, if ${\displaystyle P_{1},P_{2}}$ are written out explicitly, the question is trivially answered in linear time just by comparing their coefficients. But in practice they are usually given in very compact form (e.g., as determinants of matrices), so that we can evaluate them efficiently, but expanding them out and looking at their coefficients is out of the question.

 Example Consider the polynomial ${\displaystyle P(x_{1},\ldots ,x_{n})=\prod _{i<j}(x_{j}-x_{i})}$, the determinant of the Vandermonde matrix. Show that evaluating ${\displaystyle P}$ at any given point can be done efficiently, but that expanding out ${\displaystyle P}$ to find all its coefficients is computationally infeasible even for moderate values of ${\displaystyle n}$.
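The gap between evaluation and expansion can be made concrete with a short sketch, assuming ${\displaystyle P}$ is the Vandermonde determinant ${\displaystyle \prod _{i<j}(x_{j}-x_{i})}$ (the function name below is illustrative):

```python
from itertools import combinations

def vandermonde_poly(xs):
    """Evaluate P(x_1,...,x_n) = prod_{i<j} (x_j - x_i) at a point.

    Costs O(n^2) multiplications, even though the fully expanded
    polynomial has n! monomials, one per permutation (it is the
    determinant of the Vandermonde matrix).
    """
    prod = 1
    for (i, j) in combinations(range(len(xs)), 2):
        prod *= xs[j] - xs[i]
    return prod

# n = 4: evaluation takes 6 multiplications; the expansion has 4! = 24 terms.
print(vandermonde_poly([1, 2, 3, 4]))  # → 12
```

Already at ${\displaystyle n=20}$ the expansion would have ${\displaystyle 20!\approx 2.4\times 10^{18}}$ terms, while evaluation needs only ${\displaystyle \binom{20}{2}=190}$ multiplications.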

Here is a very simple randomized algorithm, due to Schwartz and Zippel. Testing ${\displaystyle P_{1}\equiv P_{2}}$ is equivalent to testing ${\displaystyle P\equiv 0}$, where ${\displaystyle P=P_{1}-P_{2}}$.

 Algorithm (Schwartz-Zippel)
 pick ${\displaystyle r_{1},\ldots ,r_{n}}$ independently and uniformly at random from a finite set ${\displaystyle S\subset \mathbb {F} }$;
 if ${\displaystyle P_{1}(r_{1},\ldots ,r_{n})=P_{2}(r_{1},\ldots ,r_{n})}$ then return “yes” else return “no”;

This algorithm requires only the evaluation of ${\displaystyle P}$ at a single random point. If ${\displaystyle P\equiv 0}$, the algorithm is always correct.
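The algorithm treats both polynomials as black-box evaluators. A minimal Python sketch (the function name and the choice ${\displaystyle |S|=100d}$ are illustrative, not from the source):

```python
import random

def schwartz_zippel(p1, p2, n, d, trials=1):
    """One-sided identity test: p1, p2 are black-box evaluators of
    n-variate polynomials of total degree at most d.

    Each coordinate is drawn from S = {0, ..., 100d - 1}; when the
    polynomials differ, one trial answers "yes" wrongly with
    probability at most d/|S| = 1/100.
    """
    S = range(100 * d)
    for _ in range(trials):
        r = [random.choice(S) for _ in range(n)]
        if p1(*r) != p2(*r):
            return False  # a witness point: definitely not identical
    return True  # probably identical

# (x + y)^2 versus x^2 + 2xy + y^2: identical polynomials, so always "yes".
assert schwartz_zippel(lambda x, y: (x + y) ** 2,
                       lambda x, y: x*x + 2*x*y + y*y, n=2, d=2)
```

Repeating the test with `trials` independent runs drives the one-sided error down to ${\displaystyle (d/|S|)^{\text{trials}}}$.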

In the Theorem below, we’ll see that if ${\displaystyle P\not \equiv 0}$ then the algorithm is incorrect with probability at most ${\displaystyle {\frac {d}{|S|}}}$, where ${\displaystyle d}$ is the total degree of the polynomial ${\displaystyle P}$.

 Theorem (Schwartz-Zippel) Let ${\displaystyle Q(x_{1},\ldots ,x_{n})}$ be a multivariate polynomial of degree ${\displaystyle d}$ defined over a field ${\displaystyle \mathbb {F} }$. Fix any finite set ${\displaystyle S\subset \mathbb {F} }$, and let ${\displaystyle r_{1},\ldots ,r_{n}}$ be chosen independently and uniformly at random from ${\displaystyle S}$. Then ${\displaystyle \Pr[Q(r_{1},\ldots ,r_{n})=0\mid Q\not \equiv 0]\leq {\frac {d}{|S|}}.}$
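For a small example polynomial (chosen here for illustration, not from the source), the probability ${\displaystyle \Pr[Q(r_{1},\ldots ,r_{n})=0]}$ can be computed exhaustively over ${\displaystyle S^{n}}$ and compared against the ${\displaystyle d/|S|}$ bound:

```python
from itertools import product

# Q(x, y) = x*y*(x - y): total degree d = 3, not identically zero.
def Q(x, y):
    return x * y * (x - y)

S = range(10)  # |S| = 10, so the theorem bounds Pr[Q(r) = 0] by 3/10
points = list(product(S, repeat=2))
zeros = sum(1 for (x, y) in points if Q(x, y) == 0)

# Zeros lie on the three lines x = 0, y = 0, x = y: 28 of the 100 points.
print(zeros / len(points))  # → 0.28, within the bound 3/10
assert zeros / len(points) <= 3 / 10
```

Note the union-of-roots structure: each of the three degree-1 factors contributes at most ${\displaystyle |S|}$ zeros per line, matching the intuition behind the ${\displaystyle d/|S|}$ bound.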
Proof.
 The theorem holds if ${\displaystyle Q}$ is a univariate polynomial, because a univariate polynomial ${\displaystyle Q}$ of degree ${\displaystyle d}$ has at most ${\displaystyle d}$ roots, i.e. there are at most ${\displaystyle d}$ choices of ${\displaystyle r}$ with ${\displaystyle Q(r)=0}$, so the theorem follows immediately. For multivariate ${\displaystyle Q}$, we prove the theorem by induction on the number of variables ${\displaystyle n}$. Write ${\displaystyle Q(x_{1},\ldots ,x_{n})}$ as ${\displaystyle Q(x_{1},\ldots ,x_{n})=\sum _{i=0}^{k}x_{n}^{i}Q_{i}(x_{1},\ldots ,x_{n-1})}$ where ${\displaystyle k}$ is the largest exponent of ${\displaystyle x_{n}}$ in ${\displaystyle Q(x_{1},\ldots ,x_{n})}$. By the definition of ${\displaystyle k}$ we have ${\displaystyle Q_{k}(x_{1},\ldots ,x_{n-1})\not \equiv 0}$, and its degree is at most ${\displaystyle d-k}$. Thus by the induction hypothesis, ${\displaystyle \Pr[Q_{k}(r_{1},\ldots ,r_{n-1})=0]\leq {\frac {d-k}{|S|}}}$. Conditioning on the event ${\displaystyle Q_{k}(r_{1},\ldots ,r_{n-1})\neq 0}$, the univariate polynomial ${\displaystyle Q'(x_{n})=Q(r_{1},\ldots ,r_{n-1},x_{n})=\sum _{i=0}^{k}x_{n}^{i}Q_{i}(r_{1},\ldots ,r_{n-1})}$ has degree ${\displaystyle k}$ and ${\displaystyle Q'(x_{n})\not \equiv 0}$, since ${\displaystyle r_{n}}$ is chosen independently of ${\displaystyle r_{1},\ldots ,r_{n-1}}$, the univariate case gives ${\displaystyle {\begin{aligned}&\quad \,\Pr[Q(r_{1},\ldots ,r_{n})=0\mid Q_{k}(r_{1},\ldots ,r_{n-1})\neq 0]\\&=\Pr[Q'(r_{n})=0\mid Q_{k}(r_{1},\ldots ,r_{n-1})\neq 0]\\&\leq {\frac {k}{|S|}}.\end{aligned}}}$
Therefore, by the law of total probability, ${\displaystyle {\begin{aligned}&\quad \,\Pr[Q(r_{1},\ldots ,r_{n})=0]\\&=\Pr[Q(r_{1},\ldots ,r_{n})=0\mid Q_{k}(r_{1},\ldots ,r_{n-1})\neq 0]\Pr[Q_{k}(r_{1},\ldots ,r_{n-1})\neq 0]\\&\quad \,\,+\Pr[Q(r_{1},\ldots ,r_{n})=0\mid Q_{k}(r_{1},\ldots ,r_{n-1})=0]\Pr[Q_{k}(r_{1},\ldots ,r_{n-1})=0]\\&\leq \Pr[Q(r_{1},\ldots ,r_{n})=0\mid Q_{k}(r_{1},\ldots ,r_{n-1})\neq 0]+\Pr[Q_{k}(r_{1},\ldots ,r_{n-1})=0]\\&\leq {\frac {k}{|S|}}+{\frac {d-k}{|S|}}\\&={\frac {d}{|S|}}.\end{aligned}}}$
${\displaystyle \square }$