Semiprime equationization

Prime numbers play an important role in cryptography due to the fact that it's hard to factorize a semiprime into its original two factors.


A while ago, RSA put prizes on large semiprimes and challenged the public to crack them. Not surprisingly, most of the numbers remained unfactored until this day.

The entire security system is based on the fact that we don't have a truly efficient factorization algorithm. To give you an example, using the most efficient known algorithm (GNFS), it took 5 months to factorize the RSA-640 number using 80 AMD Opteron CPUs, clocked at 2.2 GHz.

But there is something special about the RSA semiprimes; all of them are the product of two prime numbers which have about the same length, with a deviation of ±1 in some cases. This is a very valuable piece of information, because we can test only a restricted amount of prime numbers that are in the range we are interested in.

For example, RSA-100 is a 100-digit number and is the product of two 50-digit prime numbers. Using this two important properties, we can generate a system of mathematical equations to represent any RSA number. The equations are correct only when we know the number of digits of both prime factors.

To illustrate this, let's consider the product of 71 and 43, which is 3053:

a = (x * y)
b = (a mod 10)
3  = b

a = ((x * y) + a / 10)
a = (x * y)
b = ((a mod 10) + (a mod 10))
5  = (b mod 10)

a = ((x * y) + a / 10)
b = (a / 10 + (a mod 10) + b / 10)
0  = (b mod 10)

b = (a / 10 + b / 10)
3  = b

NOTEx mod 10 can also be written as: x - 10*x / 10.

# Details

We have 4 unknown values: x, x, y and y. All we need, is a clever algorithm to find this values based on all the information that we have about the factors. If such an algorithm will be found, we will be able to factorize any RSA number within days or weeks, instead of months or years.

In this particular example, x=1, x=7, y=3 and y=4, which are the digits of the original prime factors:
  • 71 = 10*x + x
  • 43 = 10*y + y

# Implementations

Below you can find some 'proof of concept' implementations which generate equations in pseudocode:

Comments