Playing with RSA: signing without the RSA private key

13-03-2025

Preliminary note

Contrary to the norm for academic papers, I will not include references to third parties. The text should be sufficient for the purpose of the blog. This content is not intended for cryptographers, they indicated it is too simple, so I left out details, and uninteresting. Security people might want to know that rogue signing without extracting the private key from a TRD/HSM or sustained access to the TRD/HSM, is possible.

Introduction

RSA may be at risk by quantum computing. It also means that the academic world is losing all interest in RSA weaknesses or any protocols based on RSA, like RSA signing. Most introductory textbooks describe the basics: to sign data; you compute a hash of the data, then encrypt the data with an RSA private key.

Implementing RSA signing naively that way has well-known issues.

One issue is that your security also depends on the hash function, in a direct way: hash collisions mean that one signature actually signs two hashes. Practical abuse of hash collisions has been demonstrated. The lower bar today is SHA-1, a 160 bit hash.

A second issue is when the hash computation is seemingly not necessary, like encrypting 128 bit keys, or other data with limited size. Padding the data has been introduced to thwart a number of attacks against unpadded encryption of small data with RSA.

In this blog I report on an experiment I started against common sense to see if a wild idea could be used against a decent hash, a common padding, and RSA-based signing.

Combining encryption results

A simple observation is that you can combine two RSA "raw" encryptions to get the encryption of the product. Another observation is that you can use consecutive prime numbers from 2 as a base to produce infinitely many numbers. If you limit your numbers to a certain maximum, you can compute the probability that a number can be factored with the first n prime numbers. Up to n square …

The other thing that got my attention is that in the basic padding, the first 2 bytes are fixed, then there is random data, and then the actual data. The random data means that when signing, you get to pick a lot of the data.

The idea was to combine both elements and see where that would lead.

Building a solution: first steps, trick one

A small hurdle on the way is that encrypting the lowest prime numbers, you actually violate an important no-no: never encrypt small data with the private key.

To get around that, you need to know some things about modulo computations. Most of us will know about inverse numbers in multiplication: 5 and 0.2 (1/5) as 5*0.2 == 1. In modulo computation with integers it is different. If we compute modulo 7 then 2 and 4 and 3 and 5 are inverse numbers: 2*4 mod 7 == 1, 3*5 mod 7 == 1.

Now we can do some tricks to get around the limitation of private key usage. We pick m as a large enough number and compute the reverse of m, mi. We encrypt mi and get emi. We encrypt m*p and get emp. We multiple emi and emp and get ep.

What is also nice to know is that computing the inverse modulo some prime number is a problem that has been solved long ago: there is a fast algorithm for it. And anyway, that step is not time critical so even brute force probably could work.

The basic approach that can be used at this point are the following steps:

- Prepare:

Pick a masking number n/2 < m < n
Compute m_inv, the inverse of m
Encrypt m_inv giving enc_m_inv
Compute all primes p < sqrt(n)
Encrypt all m*p giving enc_mp and multiply with enc_m_inv, yielding enc_p

- Brute force:

Generate messages according to PKCS#1, basic format:
starting with 0x0001 (or 0x002 – that does not matter for the approach)
ending with 0x00 and the data to be signed
varying the random data in between
try to factor the message with only the primes available

- successful factorization: generate the product of the corresponding encrypted prime numbers, it is the encryption of the message.

As every cryptography student can tell you, this is leading nowhere for even small RSA sizes, you need too many primes. So the smart thing to do was not to pursue this idea any further. I did continue.

Least important first, trick two: how to check factorization

Somewhere along the road it seems to matter to be able to quickly check if prime factorization with the limited set is possible. Instead of using whatever real crypto people would use, I looked for a solution regardless of common knowledge. In the end, an approach that worked well for me, was a strange idea. Make the product of all the prime numbers you want to use (yes, it gets very big) and then look for the greatest common divisor between that huge product and the number you want to check. If the GCD is 1, no luck. Otherwise, simplify both with the GCD. If the number becomes 1, congratulations, if not, repeat.

Why is this the least important? It optimizes one step, but does not allow to solve larger RSA sizes or larger hash sizes. However, using this trick, it did help getting interesting timings later on.

In search for a breakthrough: what if we can assure the message to factor has one very big prime factor that we can choose?

We start with a very simple, small size problem, more specifically at the format of a PKC#1 v1.5 padded "hash" H of 16 bits combined with RSA64, as a demonstration of the technique.

The (randomly picked) hash value for the example: 40261= 0x9d45

The message G to be signed with PKCS#1 padding is the number:

G = (0x0002 << 48) + (randb << 24) + 0x9d45

If LP is a factor of G, then:

G mod LP == 0

Then we can rewrite the equation G mod LP == 0 as:

[ (0x0002 << 48) + (1<<24) *randb + 0x9d45 ] mod LP = 0

[ (0x0002 << 48) + 0x9d45 + (1<<24) * randb ] mod LP = 0

((1<<24) * randb ) mod LP = -((0x0002 << 48) + 0x9d45) mod LP

And finally:

[(1<<24) mod LP ]* randb + LP * i = -((0x0002 << 48) + 0x9d45) mod LP

Once we pick a large prime LP, we get this equation with two unknowns: the randb and the i. We are only interested in the value of randb.

Sadly enough, one equation with two unknowns feels like a major problem.

If we could solve this problem we can ensure that the message G has a large factor, and that we stand a good chance of having low factor factorization for G/LP.

It is possible to use brute force to find a solution for small scale problems like this one, it is not a good idea for larger problem sizes.

Unless there is a better solution…

Learning about a phantastic trick, trick three: an equation with two unknowns has a unique solution that can be computed efficiently !!

The equation is a linear Diophantine equation: a x + b y = c, where all are integers, and a, b and c are known, and we have two unknowns x and y.

A simpler case is called Bézout's identity: if a and b have d as greatest common divisor, then there exist 2 integers x and y, such that a x + b y = d.

As a special case, if a and b are coprime the GCD is 1, and the equation becomes a x + b y = 1.

What is even better, is that there is an efficient algorithm to find the x and y, the extended Euclidean algorithm.

This finding is a true game changer.

If we can choose a large prime factor, the remainder is much smaller and more likely to be factored with limited low primes!

Back to our toy example:

The (randomly picked) hash value: 40261= 0x9d45

We will use the (randomly picked) prime 9456191 as target large prime LP.

From:

[(1<<24) mod LP ]* randb + LP * i = -((0x0002 << 48) + 0x9d45) mod LP

With:

LP == 9456191

(1<<24) mod LP == 7321025

-((0x0002 << 48) + 0x9d45) mod LP == 9119220

The diophantine equation becomes:

7321025 * randbX + 9456191 * i = 9119220

The solution of the equation is randbX: -891786762240 and i: 690425265420

The value of randb == randbX mod LP == 7698588 = 0x75789c

Filling in randb in the right spot of the message:

G == 0x0002 75789c 00 9d45 == 692110827232581

Dividing G by the LP yields 73191291. The factors of 73191291 are: 3, 37, 137, 4813

The full factorization of the message G == 692110827232581 is 3, 37, 137, 4813, 9456191

Adding a last trick, trick four, relative prime is good enough

Are we there yet? No!

Given any LP we still need to have G/LP factored with a limited set of primes. Most of the time that will not be the case. We need to be able to try very many LP values to hit a factorization with low primes.

So we face two potentially blocking properties:

- we need a huge number of primes

- we need a lot of low primes

The more primes we can try, the higher the probability of requiring less low primes. The most restricting requirement is many LP values to try.

The way to have tons of LP with limited encryption requirements is to use combination theory. If we want a LP size 60, we can do with 6 pieces of 10 bits, 12 pieces of 5 bits, …

If we encrypt 15 pieces of 10bits, we can generate comb(15,6) trials of 60 bits. They are not truly prime numbers, but still extremely likely coprime, which is good enough.

Does it work?

Yes it does, and much better than I ever anticipated. The larger the RSA size, the larger large prime can be used. The remainder for low prime factorization is mostly determined by the hash size. So RSA 1024 or RSA 2048 is not that different to handle. That is why MD5 is easy, SHA-1 (160 bit) doable, and then it gets harder. The low primes required for reasonable time on 1 laptop are all primes below 2^25 for the harder cases.

So what?

I was very disappointed by the reaction of the reviewers of the paper I wrote about it. It is true that the best I showed was SHA-1 with RSA1024 with PKCS#1 basic padding, which for instance for banking would be very border line. It is true that I mainly used very old knowledge to solve my problem. It is true that it looks very simple when you see the result.

The key message however is that these are signatures produced without ever having direct access to or knowing the private key. I need to have raw chosen plaintexts encryptions of large sized data, which are not valid padded signatures, in the order of the number of primes below 2^25. That is it. No longer access required.

How to justify these chosen plaintext encryptions? After setting up the TRD/HSM and generating an RSA key pair, run some tests encrypting and decrypting in both directions for QA purposes. Keep some of the results…

For all I know and checked this type of attack was not ever described, although it could have been done for many years.

Baseline: there are two assumptions that are violated:

- it is necessary to have the private key to be able to sign

- lacking the private key, you need to have rogue access to the TRD/HSM at the time of signing

Appendix: example

This result is a demonstration to create signatures using RSA1024 signatures with PKCS#1 v1.5 padding of 160-bit hashes (SHA-1).

The parameters are these:

RSA 1024, it means the total size of signature is 1024 bit
Using a SHA-1 hash means the payload is 160 bits
Random choice: 0x cfeb766e8b3e68f5 20d6b4cb9ded4c9f 4836f7c0
We have 16 bits at the start: 0x0002
We have 8 bits zero preceding the payload: 0x00
The random part (remaining bits) is 1024-16-8-160 = 840 bits

Some options for the parts that will contribute to the semi-large-prime

210 x 4, 4 out of 90 parts, enabling 1,581,580 tries
105 x 8, 8 out of 30 parts, enabling 5,852,925 tries – the example used this choice

The generated message in hex:

0002

226457fbd858e118 5dc544da5783dda6 429d100fcca5b7a8 22b2bc9cf8c3fe21 d54c73f0f567b126 5b440a727f57a3d6

2c3d8f1a51e9553f b62474eb7bec7cef 40cee50f42cfab7f 47e5a8dab4e01957 1a3abbff4ab8c80c 9dcc3922b50ae0bd

86dfee6ab8f40aef d0

cfeb766e8b3e68f5 20d6b4cb9ded4c9f 4836f7c0

The large prime used for this result is based on the product of 8 primes, size (about) 105 bits each:

39563663226838690568503084408837
23242250157768908462311826520001
34926775727968560055566565572643
25727904021271952351997116404669
37458968540915288523765949862771
37465149896471340326378283037627
32523183626275222243986820950419
28499187030230046620274429275287

Remainder to be factorized: 5446952503637007931101775999236838306461489879882816

Factors of the remainder:

[2, 2, 2, 2, 2, 2, 3, 17, 19, 29, 43, 67, 757, 8629, 34693, 222163, 713663, 813817, 2524343, 14241979]

14241979 is the 924,753th prime, the best result of this test set.