Trial division (Java)

From LiteratePrograms
Jump to: navigation, search
Other implementations: C | Java

Trial division, perhaps the simplest algorithm for factoring and primality testing, simply attempts to divide a number by all possible factors. Divisions that have no remainder (go in evenly) reveal factors. We only have to test up to the square root of n, since if x is a factor, so is n/x. If no factors between 2 and \sqrt{n} (inclusive) are found, the number is prime.


Contents

[edit] Implementations

[edit] With floating-point square root

The simplest possible implementation tries each integer between 2 and the square root of n, computed using the API function Math.sqrt(). We use the modulus operator % to find the remainder after division by each trial factor:

<<trial division with sqrt>>=
public static int trialDivisionSqrt(int n) {
    int sqrtN = (int)Math.sqrt((double)n);
    for (int x = 2; x <= sqrtN; x++) {
        if (n % x == 0) {
            return x;
        }
    }
    return IS_PRIME;
}

The reserved value IS_PRIME is not a possible factor:

<<constants>>=
public static final int IS_PRIME = 0;

This works but has a few issues. The square root operation, although only done during initialization, is expensive for small inputs. A more subtle issue is that we must be able to convert any int to a double without too much loss of precision; Java guarantees this, since its int is 32 bits and its double has a 52-bit mantissa (its float, with a 23-bit mantissa, might not).

[edit] With squaring

Rather than testing if x \leq \sqrt{n}, we can square both sides and test if x^2 \leq n. We can additionally avoid performing a multiply in every iteration by noting that (x + 1)2 = x2 + (2(x + 1) − 1). The resulting trial division is ideal for very small numbers:

<<trial division with squaring>>=
static int trialDivisionSquaring(int n) {
    for (int x = 2, xSquared = 4;
        xSquared > 2*x - 1 && xSquared <= n;
        x++, xSquared += 2*x - 1)
    {
        if (n % x == 0) {
            return x;
        }
    }
    return IS_PRIME;
}

There is some subtlety here; we have to ensure that x2 does not overflow before it exceeds n, which might occur if n is close to its maximum value. The approach used above is to make sure that xSquared is larger than the value just added to it; since x never exceeds sqrt(Integer.MAX_VALUE) this is safe.

[edit] With less trial divisors

The approaches so far work, but are much slower than they need to be: if 2 does not divide n, no multiple of 2 will either, but we try all even numbers in the range. We can save time by skipping all of them but 2:

<<trial division with odd divisors only>>=
public static int trialDivisionOdd(int n) {
    int sqrtN = (int)Math.sqrt((double)n);
    if (n % 2 == 0) return 2;
    for (int x = 3; x <= sqrtN; x += 2) {
        if ((n % x) == 0) {
            return x;
        }
    }
    return IS_PRIME;
}

More generally, we only have to try prime divisors. Since non-negative ints have a 31-bit magnitude, there are 4792 possible prime factors. Placing each in an int, our table is only 19168 bytes (shorts are not large enough to hold the larger primes). We could either include a hard-coded list of the primes, or use a simple one-time Sieve of Eratosthenes to generate it, like this:

<<Sieve of Eratosthenes for all trial primes>>=
static int[] primes;

static private void generatePrimeList(int max) {
    BitSet isNotPrime = new BitSet(max+1);
    int numPrimes = 1;
    for (int i = 2; i <= max; i++) {
        if (!isNotPrime.get(i)) {
            numPrimes++;
            for (int j = i + i; j <= max; j += i) {
                isNotPrime.set(j);
            }
        }
    }
    primes = new int[numPrimes];
    int j = 0;
    for (int i = 2; i <= max; i++) {
        if (!isNotPrime.get(i)) {
            primes[j] = i;
            j++;
        }
    }
}

<<import statements>>=
import java.util.BitSet;

We could make the sieve more efficient in time and space at the expense of clarity by omitting all even numbers from the sieve bit array.

We can now write a version of trial division for 32-bit numbers that only checks prime divisors:

<<trial division with prime divisors only>>=
public static int trialDivisionPrimes(int n) {
    int sqrtN = (int)Math.sqrt((double)n);
    for (int primeIdx = 0; primeIdx < primes.length && primes[primeIdx] <= sqrtN; primeIdx++) {
        if (n % primes[primeIdx] == 0) {
            return primes[primeIdx];
        }
    }
    return IS_PRIME;
}

[edit] Long version

Trial division can still tackle the largest 31-bit integers on a modern computer in microseconds. To push it to its limits, we extend it to Java's longs, which have 63-bit magnitude for non-negative numbers. Since only the prime sieving variant is practical in this range, we extend this variant to longs:

<<trial division for longs with prime divisors only>>=
public static int trialDivisionPrimesLong(long n) {
    check input range
    compute square root
    for (int primeIdx = 0; primeIdx < primes.length && primes[primeIdx] <= sqrtN; primeIdx++) {
        if (n % primes[primeIdx] == 0) {
            return primes[primeIdx];
        }
        
    }
    return IS_PRIME;
}

There are no changes to the algorithm itself. However, we have a problem with the list of primes: to factor all longs, we need a list of all primes less than sqrt(Long.MAX_VALUE) = 3037000500, some of which are too large to fit in an int; even if we made our table a table of longs, the BitSet constructor in the sieving method would still not accept such a large input. Moreover, to avoid overflow in the sieving method, we cannot allow the sieve size to exceed Integer.MAX_VALUE/2. To deal with this, we limit the range of input to the square of the largest prime in the sieve, which is in any case well beyond the practical range of trial division:

<<check input range>>=
long maxPrime = primes[primes.length-1];
if (maxPrime*maxPrime > n) {
    throw new IllegalArgumentException("Cannot factor numbers larger than " + maxPrime*maxPrime);
}

A more subtle issue is that converting a long to a double for the purpose of computing its square root may lose precision, since a double only has a 53-bit mantissa. This would result in some large prime factors not being checked. For lack of a better option, we write our own square root computation function based on Newton's method which directly computes the square root of a long in a double:

<<long square root>>=
public static long sqrt(long n) {
    long guess = 2, prev_guess = 0, prev2_guess = 0;
    while (guess != prev_guess && guess != prev2_guess) {
        prev2_guess = prev_guess;
        prev_guess = guess;
        guess = (guess + n/guess)/2;
    }
    return Math.min(guess, prev_guess);
}

This code operates only on longs, repeatedly averaging the current guess with n divided by the current guess. We identify convergence by comparison with the previous two values, since the procedure may end up oscillating between the floor and ceiling of the square root. Since we want the floor, we return the smaller of the final two values.

A subtlety of this implementation is that guess + n/guess never overflows. To see this, call the guess x and assume n ≥ 4. Notice that one of either x or n/x is always no greater than the square root of n and at least 2. In these two cases, we have:

x + \left \lfloor \frac{n}{x} \right \rfloor \leq x + \frac{n}{x} = \frac{1}{x}\left(x^2 + n\right) \leq \frac{1}{2}\left(n + n\right) = n. x + \left \lfloor \frac{n}{x} \right \rfloor \leq x + \frac{n}{x} = \frac{n}{n/x} + \frac{n}{x} = \frac{1}{n/x}\left(n + \left(\frac{n}{x}\right)^2\right) \leq \frac{1}{2}\left(n + n\right) = n.

Now we use the above function to compute the long square root:

<<compute square root>>=
long sqrtN = sqrt(n);

[edit] Why not BigInteger?

Java has support for arbitrary-precision integers in its standard API through a class called java.math.BigInteger. It's not difficult to write trial division for BigInteger objects, but it isn't useful, because the basic operations are considerably slower than with ints or longs, and longs already push trial division beyond its practical limits; in particular, one cannot create a BitSet in Java larger than 231 − 1 bits, which is necessary to construct the prime number sieve for factoring numbers exceeding 262. Factoring such numbers without using a prime sieve is too slow on current hardware.

[edit] Test main

This test main allows us to exercise all of our implementations and compare them on numbers supplied on the command line:

<<TrialDivision.java>>=
import statements

class TrialDivision {

    constants

    trial division with sqrt

    trial division with squaring

    trial division with odd divisors only

    Sieve of Eratosthenes for all trial primes

    trial division with prime divisors only

    long square root

    trial division for longs with prime divisors only

    public static void main(String[] args) {
        long n = Long.parseLong(args[1]);
        if (n <= 0) {
            System.out.println("Please enter only positive integers.");
        } else if (args[0].equals("TrialDivisionSqrt")) {
            for (int i = 0; i < 10000 - 1; i++) trialDivisionSqrt((int)n);
            System.out.println(trialDivisionSqrt((int)n));
        } else if (args[0].equals("TrialDivisionSquaring")) {
            for (int i = 0; i < 10000 - 1; i++) trialDivisionSquaring((int)n);
            System.out.println(trialDivisionSquaring((int)n));
        } else if (args[0].equals("TrialDivisionOdd")) {
            for (int i = 0; i < 10000 - 1; i++) trialDivisionOdd((int)n);
            System.out.println(trialDivisionOdd((int)n));
        } else if (args[0].equals("TrialDivisionPrimes")) {
            generatePrimeList(65536);
            for (int i = 0; i < 10000 - 1; i++) trialDivisionPrimes((int)n);
            System.out.println(trialDivisionPrimes((int)n));
        } else if (args[0].equals("TrialDivisionPrimesLong")) {
            generatePrimeList(Integer.MAX_VALUE/2);
            for (int i = 0; i < 10000 - 1; i++) trialDivisionPrimesLong(n);
            System.out.println(trialDivisionPrimesLong(n));
        } else {
            System.out.println("Invalid algorithm selection.");
        }
    }

}

Since trial division on such small numbers is very quick, we place each call in a long loop for better performance measuring. This also makes the cost of producing the primes table in the last case small on average.

[edit] Performance

Under construction.gif
TODO
This portion of this article is incomplete.
Download code
hijacker
hijacker
hijacker
hijacker