I wrote in my previous post that Cantor and Zassenhaus' algorithm of polynomial factorisation over finite fields can be divided into three stages:
According to Modern Computer Algebra, "the second stage consumes the bulk of the computing time". In 1998 Erich Kaltofen and Victor Shoup designed a new approach to distinct-degree factorisation which allows to decrease it's cost. This strategy is called "baby/giant step" and exploits the fact from number theory:
- squarefree factorisation
- distinct-degree factorisation
- equal-degree factorisation
(in the standard Cantor-Zassenhaus algorithm the squarefree factorisation step is merged with the algorithm itself)
Lemma. For nonnegative integers i and j, the polynomial xqi−xqj∈Fq[x] is divisible by precisely those irreducible polynomials in Fq[x] whose degree divides i−j.
The proof is easy: if i≥j then xqi−xqj=(xqi−j−x)qj. From the Theorem from the previous post we know that the factorisation of xqk−x consists of all irreducible factors whose degree is a divisor of k.
Now it's possible to formulate the algorithm:
Fast distinct-degree factorisation (DDF)
Input and output
This algorithm takes as input a square-free polynomial f∈Fq[x] of degree n.
The output is f[1],...,f[n]∈Fq[x] such that for 1≤d≤n, f[d] is the product of the monic irreducible factors of f of degree d. The algorithm is parameterized by a constant β, with 0≤β≤1.
Step 1 (compute baby steps)
Let l=⌈nβ⌉ . For 0≤i≤l, compute h_i = x^{q^i} \mod f.
Step 2 (compute giant steps)
Step 2 (compute giant steps)
Let m = \lceil \frac{n}{2l} \rceil . For 1\leq j\leq m, compute H_j = x^{q^{lj}} \mod f.
Step 3 (compute interval polynomials)
For 1\leq j\leq m, compute I_j = \prod\limits_{0\leq i\leq l}(H_j - h_i) \mod f .
(by Lemma the polynomial I_j is divisible by those irreducible factors of f whose degree divides an integer k with (j-1)l < k \leq jl)
Step 4 (compute coarse DDF)
In this step, we compute polynomials F_1, ... , F_m where F_j = f^{[(j-1)l+1]} f^{[(j-1)l+2]} ... f^{[jl]}. This is done as follows.
s\leftarrow f;
for j \leftarrow 1 to m do
\left\{F_j \leftarrow \gcd(s, I_j); s\leftarrow \frac{s}{F_j}\right\}
Step 5 (compute fine DDF)
s\leftarrow f;
for j \leftarrow 1 to m do
\left\{F_j \leftarrow \gcd(s, I_j); s\leftarrow \frac{s}{F_j}\right\}
Step 5 (compute fine DDF)
In this step, we compute the output polynomials f^{[1]}, ... , f ^{[n]}.
First, initialize f^{[1]}, ... , f^{[n]} to 1. Then do the following.
for j \leftarrow1 to m do
{
g\leftarrow F_j ;
for i\leftarrow l - 1 down to 0 do
\left\{ f^{[lj-i]} \leftarrow \gcd(g, H_j - h_i); g\leftarrow \frac{g}{f^{[lj-i]}} \right\}
\left\{ f^{[lj-i]} \leftarrow \gcd(g, H_j - h_i); g\leftarrow \frac{g}{f^{[lj-i]}} \right\}
}
if s \neq 1 then f^{[\deg(s)]}\leftarrow s
Some technical remarks
1. Lemma. For any positive integer r if we are given h = x^{q^r} \mod f \in F_q [x], then for any g \in F_q [x], we can compute g^{q^r} \mod f as g(h) \mod f.
Now to perform Step 2 we set H_1 = h_l and then compute each H_j as H_{j−1} (H_1) \mod f.
2. Computing f(g) \mod h is a so-called "modular composition" problem. For solving this problem one can use e.g. Horner scheme or Brent and Kung's algorithm (this also requires fast matrix multiplication).
By now I implemented the simplest version of the fast CZ algorithm: it uses Horner scheme in Step 2.
The next step might be to implement matrix multiplication and Brent and Kung's algorithm.