123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215 |
- The choice of Diffie-Hellman parameters
- * Background
- Diffie-Hellman key exchange uses two parameters, a prime p and a
- generator g, which are used to derive the public parameters
- y1 = g^x1 (mod p) and y2 = g^x2 (mod p), and then the shared secret
- z = y1^x2 = (g^x1)^x2 = g^(x1*x2) = (g^x2)^x1 = y2^x1 (mod p).
- For the computation to be secure, several conditions must be true.
- The exponent must be big enough, for there is a square-root search
- algorithm to find the exponent. (E.g. a 16-bit exponent can be found
- in about 2^8 = 256 steps.) And then the modulus must be chosen so
- as to make the general discrete log problem difficult.
- The general discrete log problem can be solved for each prime-power
- factor of p-1 independently, so if all of the factors of p-1 are small,
- this is easy to do. Since p-1 is even, it must have a factor of 2, but
- the remaining portion q = (p-1)/2 can be chosen to be prime, making the
- problem as difficult as possible. Finding such numbers is computationally
- expensive, but as they are parameters which are only computed once, this
- is a reasonable up-front cost.
- * Number theory
- A second advantage of prime moduli of this form is that all generators
- g are good. This is because the generator must have a large order in
- the group Z*_p. But that group is of size p-1 = 2*q, and the order of
- any element of a group must divide the size of the group. The only
- divisors this has are 1, 2, q and 2*q. The only element of order 1 is
- 1, and the only element of order 2 is -1. All other elements, from 2
- through -2, have orders of either p-1 or (p-1)/2, which are both large.
- If the generator g has order p-1, it is a generator of the group Z*_p,
- and this is generally how one is advised to generate Diffie-Hellman
- parameters. This explains the similarity in names. However, if g is
- of order p-1, then it must be a quadratic non-residue modulo p. That
- is, it must not be a square of another number. If it were a square,
- then since the size of the group is even, no power of it would ever
- equal its square root, so it could not be a generator.
- If g is indeed of order p-1, then even powers of g are quadratic
- residues (squares, modulo p), and odd powers are quadratic non-residues
- (non-squares). Given a number y and a prime p, the Legendre symbol
- (y/p) is straightforward to compute, and this tells you if y is a
- quadratic residue. If it is, and y = g^x, then x must be even. If not,
- then x would have to be odd. In this way, for a generator which is a
- quadratic non-residue, the low-order bit of the exponent x is easily
- computed.
- If g is a quadratic residue, then the useful values of exponents x is
- more limited, since only the value of the exponent x modulo q = (p-1)/2
- has any effect on the output y = g^x (mod p), but generally exponents x
- are much less than p in any case, so this limitation on range is not an
- issue.
- Essentially, in either case, only the value of x modulo q is secret,
- but if g is a quadratic non-residue, the low-order bit of x is
- available to an attacker, while if it is a quadratic residue, the
- high-order bit is known to be 0.
- Thus, it does not really matter whether g is a quadratic residue or
- not, but if it is not, the exponent x should be chosen one bit larger.
- This adds a trivial amount of work to the computation of y, and for
- that reason it may be preferable to choose g to be a quadratic
- residue. This is not currently done, however.
- * Choice of generator g
- Because any g will do, and the choice of g does not affect the difficulty
- of performing the discrete log computation, choosing it for convenience
- of computation is best, and g = 2 is simplest to compute with.
- If in fact it is desirable to choose a generator which is a quadratic
- residue, then g = 2 can still be used if the prime p is suitably
- chosen. If p = +/-1 (mod 8), then 2 is a quadratic residue. If
- p = +/-3 (mod 8), then 2 is a non-residue.
- * Choice of prime-generation technique
- There may be additional primes of special form for which the discrete
- logarithm problem is particularly easy. The authors are not aware of
- any, but theoretical advances are possible and it would be nice to
- assure users of the system that the prime was not chosen to have any
- hidden special properties: only the published criteria were used.
- David Kravitz of the NSA has suggested a technique for generating
- "kosherized" primes for DSS which has been adapted to generate
- Diffie-Hellman primes.
- The technique uses a string of bytes as a seed to a cryptographically
- strong one-way hash function. This generator produces the initial
- value for a search for a suitable prime.
- David Kravitz' technique generates random numbers from successive seeds
- until one is found to be a suitable prime. This is unbearably slow for
- primes of the special form being sought, but it can be sped up, at a
- negligible cost in uniformity of the chosen primes by generating only a
- starting position for a linear search for a suitable prime. Such a
- search can be carried out particularly efficiently.
- * Details of the technique
- The generator is based on SHA.1, the FIPS 180.1 secure hash algorithm.
- This takes the given seed as input and produces a 160-bit output
- sequence in 20 bytes. These bytes are taken as a big-endian number to
- produce a number n0 from 0 to 2^160-1.
- (I.e. n0 = 2^152 * byte0 + 2^144 * byte1 + ... + 2^8 * byte19 + byte20.)
- Then, the seed is incremented, as a big-endian array of bytes, modulo its
- size (i.e. the last byte is incremented, propagating carry if necessary),
- and hashed again to produce n1, then n2, etc.
- A number of arbitrary size may be constructed by concatenating
- N = n0 + 2^160 * n1 + 2^320 * n2 + .... To get a number no larger
- than 2^k, take the low-order k bits of N, N mod 2^k. Obviously,
- if k is 1024, it is only necessary to compute n0 through n6.
- To generate a k-bit prime p (2^k > p >= 2^(k-1)), take t = N mod 2^(k-2),
- i.e. a number with at most k-2 significant bits. Then add 2^(k-1),
- to force the number into the desired range, and 2^(k-2), to force it
- into the high half of the range. This extra refinement makes an attack
- more expensive, without affecting the time required to do computations
- mod p. Additional high-order 1 bits could be forced, but the incremental
- benefit rapidly diminishes.
- The resultant number t is used as the starting point in a search for a
- suitable prime p. p is chosen to be the first number >= t such that p
- is prime and (p-1)/2 is prime.
- * Choice of seed
- Because SHA.1 is a cryptographic hash, it is computationally infeasible
- to find an input which has a given output. Indeed, there is no known
- technique better than brute-force search to find an input which
- produces an output with any special properties. Assuming that there is
- an unknown class of primes which are easy to solve the discrete
- logarithm problem for, this ensures that the chance of choosing a prime
- p which is a member of that class is no better than random chance,
- regardless of malice on the part of the designer.
- The seed chosen is arbitrary, so was chosen for aesthetic reasons.
- It is the 79 bytes of the ASCII representation of a quote by Mahatma
- Gandhi:
- Whatever you do will be insignificant, but it is very important that you do it.
- * Implementation details
- Obviously, a program was written to find a prime according to these
- rules. To aid anyone who wishes to repeat the search to confirm that
- the published primes were indeed generated in this way, here is a
- description of how it was done. The primes if the desired form have a
- density of about (ln p)^-2. E.g. for 1024-bit p, about one out of
- every 503791 numbers meets these criteria, so a considerable amount of
- searching is required. The following techniques can make the
- computation tolerable.
- First, note that q must be odd and not congruent to 0, modulo 3. Thus,
- q must be congruent to +/-1, modulo 6. Thus p = 2*q+1 must be
- congruent to 2*1+1 = 3 or 2*-1+1 = -1 modulo 12. But p congruent to 3
- mod 12 would be divisible by 3, and not prime, so p must be congruent
- to 11 mod 12.
- Thus, the initial search point t can first be increased until it is
- congruent to 11 modulo 12. Searching from this point forward, only
- every 12th number, t+12*i, needs to be considered.
- If it is desired to choose p so that 2 is a quadratic residue (meaning
- that p is congruent to +/-1 modulo 8), then this additional constraint
- can be met with no additional difficulty by beginning at the next
- number which is congruent to 23 mod 24 and searching in steps of 24.
- But in the following discussion, a step size of 12 is assumed.
- Then, a sieve is built for trial division by a number of small primes
- for a range of following i values. For large primes, a large search
- space is required, so a large sieve is desirable. The value used was
- 65536 bits (8K bytes). It may be necessary to rebuild the sieve
- beginning at t+12*65536 if no suitable prime is found before then, but
- this sieve is large enough that the refilling is infrequent and the
- overhead is negligible.
- Initially, every position in the sieve is marked as a potential prime.
- Then, for the small primes s from 5 through 65521, position i in the
- sieve is marked as unsuitable if t+12*i is divisble by s, i.e.
- definitely not prime. To do this cheaply, consider that t+12*i = 0
- (mod s) if i = -12^-1 * t (mod s). So finding t mod s, then 12^-1 (mod
- s) and multiplying (mod s) will produce the first i value which is
- known to be divisible by s, and then every s positions thereafter in
- the sieve will be divisible. This does the equivalent of a great deal
- of trial division with minimal effort.
- Positions in the sieve are also marked as as unsuitable if (t-1)/2+6*i
- = 0 (mod s), because these positions will have (p-1)/2 divisible by s
- and thus non-prime. This works similarly, and (t-1)/2 mod s can be
- derived from t mod s without actually doing another full division.
- This sieve filters out all but 1/591 of the possible values of i as
- obviously composite, leaving an expected 852 numbers to be checked by
- stronger means before a suitable prime p is found.
- After these two sieving operations have removed all numbers from
- consideration where p or q = (p-1)/2 have small divisors, the remaining
- candidates are subjected to a fast optimized Fermat test, to the base 2,
- once for p and once for q. This eliminates, for practical purposes,
- all composite numbers.
- Special composite numbers can be chosen which pass this test and yet
- are not primes - they are called pseudoprimes - but they are so rare in
- the ranges considered that the chances of finding one without
- deliberate search are utterly negligible. And the stating value for
- the search was carefully chosen to have no hidden special properties.
- If p and q are found to be prime by this test, some extra confirmation
- pseudoprimality tests are performed just to make sure of the conclusion
- and p is returned as the result.
|