Index calculus algorithm

In group theory, the index calculus algorithm is a probabilistic algorithm for computing discrete logarithms. It relies on finding a relatively small factor base, for which most elements of the group can be expressed as products of elements in the factor base. Suitable techniques are known for $\mathbb {F} _{n}^{*}$ , the multiplicative group of a finite field of order n. When n = 2^m, the index-calculus algorithm is the best solution known for computing discrete logarithms. For n prime, a modification based on the general number field sieve provides better asymptotic performance.

Description

Roughly speaking, the discrete log problem asks us to find an x such that $g^{x}\equiv h{\pmod {n}}$ , where g, h, and the modulus n are given.
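
As a baseline for what the problem asks, the following is a minimal sketch of exhaustive search, which is feasible only for tiny moduli. The function name `brute_force_dlog` and the example values are illustrative assumptions, not taken from any library or from the sources cited here.

```python
def brute_force_dlog(g, h, n):
    """Return the smallest x >= 0 with g**x % n == h % n, or None if none exists."""
    target = h % n
    value = 1 % n
    for x in range(n):
        if value == target:
            return x
        value = (value * g) % n  # value now holds g**(x+1) mod n
    return None

print(brute_force_dlog(2, 9, 11))  # 2**6 = 64 = 5*11 + 9, so this prints 6
```

The loop performs up to n multiplications, which is exactly the cost that index calculus is designed to avoid for large moduli.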

The algorithm (described in detail below) applies to the group $(\mathbb {Z} /q\mathbb {Z} )^{*}$ where q is prime. It requires a factor base as input. This factor base is usually chosen to be the number −1 and the first r primes starting with 2. For efficiency we want the factor base to be small, but to solve the discrete log for a large group we need it to be (relatively) large. Practical implementations of the algorithm have to strike a compromise between these conflicting objectives.

It is noteworthy that the lack of a notion of prime elements in the group of points on an elliptic curve makes it impossible to find an efficient factor base for running index calculus in these groups. Therefore this algorithm is incapable of solving discrete logarithms efficiently in elliptic curve groups.

The algorithm is performed in three stages. The first two stages depend only on the generator g and prime modulus q, and find the discrete logarithms of a factor base of r small primes. The third stage finds the discrete log of the desired number h in terms of the discrete logs of the factor base.

The first stage consists of searching for a set of r linearly independent relations between the factor base and powers of the generator g. Each relation contributes one equation to a system of linear equations in r unknowns, namely the discrete logarithms of the r primes in the factor base. This stage is embarrassingly parallel and easy to divide among many computers.
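
As a small worked illustration (the values q = 1019 and g = 2 are chosen here for convenience and do not come from the cited sources), a single relation and the linear equation it contributes look like this:

$$2^{10}=1024\equiv 5{\pmod {1019}}\quad \Longrightarrow \quad \log _{2}5\equiv 10{\pmod {1018}}$$

Here the exponent vector over the factor base has a single nonzero entry, for the prime 5. A typical relation involves several primes at once, and the equation is taken modulo q − 1 = 1018 because 2 generates the group modulo 1019, so exponents of g are only determined modulo the group order.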

The second stage solves the system of linear equations to compute the discrete logs of the factor base. Although this is a minor computation compared to the other stages, a system of hundreds of thousands or millions of equations still requires large amounts of memory, and because the computation is not embarrassingly parallel, a supercomputer is typically used.

The third stage searches for a power s of the generator g which, when multiplied by the argument h, may be factored in terms of the factor base: $g^{s}h=(-1)^{f_{0}}2^{f_{1}}3^{f_{2}}\cdots p_{r}^{f_{r}}$ .

Finally, in an operation too simple to really be called a fourth stage, the results of the second and third stages can be rearranged by simple algebraic manipulation to work out the desired discrete logarithm $x=f_{0}\log _{g}(-1)+f_{1}\log _{g}2+f_{2}\log _{g}3+\cdots +f_{r}\log _{g}p_{r}-s$ .
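
The rearrangement can be written out explicitly. Writing $h\equiv g^{x}$ and taking discrete logarithms to base g on both sides of the stage-three factorization gives the following, where exponents are reduced modulo the group order q − 1 (this assumes g generates the group; the text above leaves the modulus implicit):

$$g^{s}h\equiv g^{s+x}\equiv (-1)^{f_{0}}2^{f_{1}}3^{f_{2}}\cdots p_{r}^{f_{r}}{\pmod {q}}$$

$$s+x\equiv f_{0}\log _{g}(-1)+f_{1}\log _{g}2+f_{2}\log _{g}3+\cdots +f_{r}\log _{g}p_{r}{\pmod {q-1}}$$

$$x\equiv f_{0}\log _{g}(-1)+f_{1}\log _{g}2+f_{2}\log _{g}3+\cdots +f_{r}\log _{g}p_{r}-s{\pmod {q-1}}$$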

The first and third stages are both embarrassingly parallel, and in fact the third stage does not depend on the results of the first two stages, so it may be done in parallel with them.

The choice of the factor base size r is critical, and the details are too intricate to explain here. The larger the factor base, the easier it is to find relations in stage 1 and to complete stage 3, but the more relations are needed before stage 2 can begin, and the more difficult stage 2 becomes. The relative availability of computers suited to the different types of computation required for stages 1 and 2 is also important.
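
One side of this trade-off can be made concrete with a small experiment. The sketch below counts how many residues modulo a toy prime factor completely over the first r primes. It assumes, heuristically, that powers of g behave like random residues, so the count is only a rough proxy for how often a relation is found. The helper names and the prime 1019 are illustrative choices, not part of the algorithm as described.

```python
def first_primes(r):
    """Return the first r primes by trial division (fine for small r)."""
    primes = []
    candidate = 2
    while len(primes) < r:
        if all(candidate % p != 0 for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes

def is_smooth(m, base):
    """True if the positive integer m factors completely over the primes in base."""
    for p in base:
        while m % p == 0:
            m //= p
    return m == 1

def smooth_count(q, r):
    """Count residues in 1..q-1 that are smooth over the first r primes."""
    base = first_primes(r)
    return sum(1 for m in range(1, q) if is_smooth(m, base))

# A larger factor base makes smooth residues more common, but also means
# more unknowns (and so more relations and a larger linear system).
for r in (2, 4, 6, 8):
    print(r, smooth_count(1019, r))
```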

The algorithm

Input: Discrete logarithm generator g, modulus q and argument h. Factor base {−1,2,3,5,7,11,...,p_r}, of length r+1.
Output: x such that g^x ≡ h (mod q).

relations ← empty_list
for k = 1, 2, ...
- Using an integer factorization algorithm optimized for smooth numbers, try to factor $g^{k}$ modulo q using the factor base, i.e. find $e_{i}$ 's such that $g^{k}=(-1)^{e_{0}}2^{e_{1}}3^{e_{2}}\cdots p_{r}^{e_{r}}$
- Each time a factorization is found:
  - Store k and the computed $e_{i}$ 's as a vector $(e_{0},e_{1},e_{2},\ldots ,e_{r},k)$ (this is called a relation)
  - If this relation is linearly independent of the other relations:
    - Add it to the list of relations
    - If there are at least r+1 relations, exit loop
Form a matrix whose rows are the relations
Obtain the reduced echelon form of the matrix
- The first element in the last column is the discrete log of −1 and the second element is the discrete log of 2 and so on
for s = 0, 1, 2, ...
- Try to factor $g^{s}h=(-1)^{f_{0}}2^{f_{1}}3^{f_{2}}\cdots p_{r}^{f_{r}}$ over the factor base
- When a factorization is found:
  - Output $x=f_{0}\log _{g}(-1)+f_{1}\log _{g}2+\cdots +f_{r}\log _{g}p_{r}-s.$
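
The steps above can be sketched in Python for toy-sized primes. This is a minimal illustration under several simplifying assumptions, not a reference implementation: it omits −1 from the factor base and factors the least positive residue by trial division; it assumes g generates the whole group; it skips the linear-independence check and simply keeps collecting relations until the system has a unique solution; and because q − 1 is not prime, it replaces plain reduced echelon form with Euclid-style row operations modulo q − 1. All function names are hypothetical. The modular inverse via `pow(pivot, -1, n)` requires Python 3.8 or later.

```python
from math import gcd

def factor_over_base(m, base):
    """Return the exponents of m over base, or None if m is not smooth."""
    if m <= 0:
        return None
    exps = []
    for p in base:
        e = 0
        while m % p == 0:
            m //= p
            e += 1
        exps.append(e)
    return exps if m == 1 else None

def solve_mod(rows, n, r):
    """Solve r unknowns from augmented rows over Z/nZ.

    Returns the unique solution, or None if the rows do not determine it yet.
    """
    rows = [row[:] for row in rows]
    for col in range(r):
        # Euclid-style row operations leave the gcd of the column in rows[col].
        for i in range(col + 1, len(rows)):
            while rows[i][col] != 0:
                t = rows[col][col] // rows[i][col]
                rows[col] = [(a - t * b) % n for a, b in zip(rows[col], rows[i])]
                rows[col], rows[i] = rows[i], rows[col]
        pivot = rows[col][col]
        if gcd(pivot, n) != 1:
            return None  # pivot not invertible mod n: more relations needed
        inv = pow(pivot, -1, n)
        rows[col] = [(a * inv) % n for a in rows[col]]
        # Clear the column in the rows above to reach reduced form.
        for i in range(col):
            f = rows[i][col]
            rows[i] = [(a - f * b) % n for a, b in zip(rows[i], rows[col])]
    return [rows[i][r] for i in range(r)]

def index_calculus(g, h, q, base):
    """Return x with g**x % q == h % q, for prime q and a generator g."""
    n = q - 1  # group order; all logarithms are taken modulo n
    r = len(base)
    relations, logs = [], None
    # Stages 1 and 2: collect relations until the logs of the base are fixed.
    for k in range(1, q):
        exps = factor_over_base(pow(g, k, q), base)
        if exps is None:
            continue
        relations.append(exps + [k % n])
        if len(relations) >= r:
            logs = solve_mod(relations, n, r)
            if logs is not None:
                break
    if logs is None:
        raise ValueError("could not determine the factor base logarithms")
    # Stage 3: find s such that g**s * h is smooth, then combine the logs.
    for s in range(n):
        exps = factor_over_base(h * pow(g, s, q) % q, base)
        if exps is not None:
            return (sum(f * l for f, l in zip(exps, logs)) - s) % n
    raise ValueError("h does not appear to be a power of g")

q, g = 1019, 2  # toy parameters chosen for illustration
h = pow(g, 777, q)
x = index_calculus(g, h, q, [2, 3, 5, 7, 11, 13])
print(x, pow(g, x, q) == h)
```

Re-solving the system after every new relation is wasteful but keeps the sketch short; a practical implementation would gather a surplus of relations first and use sparse linear algebra.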

Complexity

Assuming an optimal selection of the factor base, the expected running time (using L-notation) of the index-calculus algorithm can be stated as $L_{n}[1/2,c]$ for some constant $c>0$.

For $\mathbb {F} _{2^{m}}^{*}$ , an optimization by Don Coppersmith yields an algorithm with an expected running time of $L_{2^{m}}[1/3,c]\ (0<c<1.587)$ .
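
To give a feel for the gap between these two bounds, recall that L-notation is defined by $L_{n}[\alpha ,c]=\exp \left((c+o(1))(\ln n)^{\alpha }(\ln \ln n)^{1-\alpha }\right)$. The sketch below evaluates this expression with the o(1) term dropped, so the numbers are indicative only. The function name `log2_L` is hypothetical, the constant c = 1.0 used for the 1/2 case is an illustrative assumption (the text only states c > 0), and 1.587 is the upper bound quoted above.

```python
import math

def log2_L(bits, alpha, c):
    """Return log2 of L_n[alpha, c] for n = 2**bits, dropping the o(1) term."""
    ln_n = bits * math.log(2)
    exponent = c * ln_n ** alpha * math.log(ln_n) ** (1 - alpha)
    return exponent / math.log(2)

# Columns: field size in bits, log2 of L[1/2, 1.0], log2 of L[1/3, 1.587].
for bits in (256, 512, 1024):
    print(bits, round(log2_L(bits, 1 / 2, 1.0)), round(log2_L(bits, 1 / 3, 1.587)))
```

Setting alpha to 1 recovers fully exponential cost in the bit length, and alpha equal to 0 gives polynomial cost, which is why values strictly between 0 and 1 are called subexponential.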

History

The first to discover the idea was Kraitchik^[1] in 1922. After the discrete logarithm problem (DLP) became important in 1976 with the creation of the Diffie-Hellman cryptosystem, R. Merkle of Stanford University rediscovered the idea in 1977. The first publications came in the next two years from Merkle's colleagues.^[2]^[3] Finally, Adleman optimized the algorithm and presented it in the form we know today.^[4]

External links

Discrete logarithms in finite fields and their cryptographic significance, by Andrew Odlyzko
Discrete Logarithm Problem, by Chris Studholme, including the June 21, 2002 paper "The Discrete Log Problem".
Handbook of Applied Cryptography. CRC Press. 1997. pp. 107–109. ISBN 0-8493-8523-7.

Notes

[1] M. Kraitchik, Théorie des nombres, Gauthier-Villars, 1922.
[2] S. C. Pohlig, Algebraic and Combinatoric Aspects of Cryptography, Technical report, Stanford University, 1977.
[3] M. E. Hellman and J. M. Reyneri, Fast computation of discrete logarithms in GF(q), Advances in Cryptology: Proceedings of Crypto, 1983.
[4] L. Adleman, A subexponential algorithm for the discrete logarithm problem with applications to cryptography, 20th Annual Symposium on Foundations of Computer Science, 1979.

